This PR adds an `IntArray` module, which supports parallel initialisation for Integer Arrays mentioned in #6. It uses @stedolan's prototype, which uses strings as the base for IntArray.

Benchmarks
I have rewritten the parallel benchmarks `quicksort_multicore.ml` and `mergesort_multicore.ml` from sandmark to use `Domainslib.IntArray` proposed in this PR, along with adding parallel initialisation to them. The `4.10.0+multicore` compiler variant was used to build and run the benchmarks.

Quicksort
n = 10_000_000
Observations: The single-core version of `IntArray` is faster than the `Array` version, and there is speedup up to 8 cores, after which the `IntArray` version slows down.

When the number of cores is greater than or equal to 8, there is unusually high GC activity, which I suspect explains the slowdown. More specifically, the overheads seem to be at `caml_stw_empty_minor_heap` and `stw_handler`. @stedolan mentioned that strings are not scanned by the GC; I'm not sure whether that is somehow related to the increased GC activity.

This is a part of the eventlog for execution on 24 cores. Most other processes look similar to the ones seen here.
Mergesort
n = 1_000_000
Observations: Contrary to the quicksort benchmark, there is no surge in GC activity in mergesort, which makes me all the more uncertain about its cause in the quicksort benchmark. The speedup of the `IntArray` version on its own is quite close to what would be expected. But there is a huge slowdown on 1 core compared to the `Array` version. The overheads lie at `set` and `get` of IntArray (others are present in the `Array` version as well). This is a part of `perf report` on 1 core.

Would appreciate any insights on these benchmarks.
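
For readers unfamiliar with the string-backed layout, here is a minimal sketch of what `get` and `set` on such an array might look like. It assumes 8 bytes per element and native-endian storage through the standard `Bytes` accessors; the names `create`, `length`, `get` and `set` are illustrative and the actual prototype in this PR may be implemented differently.

```ocaml
(* Hypothetical sketch, not the code from this PR:
   an int array stored in a Bytes buffer, 8 bytes per element. *)
type t = Bytes.t

let bytes_per_int = 8

(* Allocate room for n ints; contents are uninitialised. *)
let create n : t = Bytes.create (n * bytes_per_int)

let length (a : t) = Bytes.length a / bytes_per_int

(* Each access converts between int and int64. The Bytes accessors
   perform their own bounds check and raise Invalid_argument on a
   bad index. *)
let get (a : t) i =
  Int64.to_int (Bytes.get_int64_ne a (i * bytes_per_int))

let set (a : t) i v =
  Bytes.set_int64_ne a (i * bytes_per_int) (Int64.of_int v)
```

In this sketch every access goes through an `int`/`int64` conversion and a bounds check, which is one place to look when profiling `get` and `set`; whether that matches the overhead seen in `perf report` is not something the sketch can confirm.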
To-Do
The module needs more Array functions such as `fill`, `make`, `map`, `iter`, etc., and possibly some performance tuning. I shall keep adding them to this PR.
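
As a rough idea of the shape these functions could take, the sketch below layers `make`, `fill`, `iter` and `map` over a `Bytes`-backed representation. The module name `IntArraySketch`, the 8-bytes-per-element layout and the signatures (modelled on the stdlib `Array` ones, with `map` restricted to `int -> int`) are assumptions for illustration, not the final API of this PR.

```ocaml
(* Hypothetical sketch of the To-Do functions; names and layout are
   assumptions, not the module proposed in this PR. *)
module IntArraySketch = struct
  type t = Bytes.t

  let length (a : t) = Bytes.length a / 8
  let get (a : t) i = Int64.to_int (Bytes.get_int64_ne a (i * 8))
  let set (a : t) i v = Bytes.set_int64_ne a (i * 8) (Int64.of_int v)

  (* make n x: a fresh array of n elements, all equal to x. *)
  let make n x : t =
    if n < 0 then invalid_arg "IntArraySketch.make";
    let a = Bytes.create (n * 8) in
    for i = 0 to n - 1 do set a i x done;
    a

  (* fill a ofs len x: store x in elements ofs .. ofs + len - 1. *)
  let fill (a : t) ofs len x =
    if ofs < 0 || len < 0 || ofs > length a - len then
      invalid_arg "IntArraySketch.fill";
    for i = ofs to ofs + len - 1 do set a i x done

  (* iter f a: apply f to every element in index order. *)
  let iter f (a : t) =
    for i = 0 to length a - 1 do f (get a i) done

  (* map f a: a fresh array holding f applied to each element. *)
  let map f (a : t) : t =
    let n = length a in
    let b = Bytes.create (n * 8) in
    for i = 0 to n - 1 do set b i (f (get a i)) done;
    b
end
```

These loops are sequential; a parallel variant would split the index range into chunks and hand each chunk to a task.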