
mpl-1brc

Using MPL to solve the 1 Billion Row Challenge (site, GitHub).

See Sam's blog post for some discussion and details.

To run it, you need smlpkg and mpl installed.

$ git clone https://github.com/shwestrick/mpl-1brc
$ cd mpl-1brc
$ smlpkg sync
$ make
$ ./main @mpl procs 8 -- data/1M.txt --verbose   # run on example data set of 1 million measurements

The above command uses the included data/1M.txt input file, which contains 1 million measurements. For the full challenge, you need the file with 1 billion measurements. See the 1brc GitHub repo for instructions on how to generate it.
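
For reference, the upstream generator works roughly as follows (commands paraphrased from the 1brc repo's README; check that repo for the current instructions, since these may have changed):

$ git clone https://github.com/gunnarmorling/1brc
$ cd 1brc
$ ./mvnw clean verify
$ ./create_measurements.sh 1000000000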

Contributions are welcome if anyone wants to help improve performance! Feel free to submit PRs. See the ideas for performance improvements below. (I'm not sure about any of these, but they would be interesting to try.)

Current results

Here are my current results on 72 cores (144 hyperthreads).

With bounds checking disabled:

$ hyperfine --warmup 1 './main @mpl procs 144 set-affinity -- /usr3/data/1brc/measurements.txt --unsafe-no-bounds-checks'
Benchmark 1: ./main @mpl procs 144 set-affinity -- /usr3/data/1brc/measurements.txt --unsafe-no-bounds-checks
  Time (mean ± σ):      2.342 s ±  0.020 s    [User: 324.751 s, System: 2.546 s]
  Range (min … max):    2.317 s …  2.380 s    10 runs

And, with bounds checking enabled:

$ hyperfine --warmup 1 './main @mpl procs 144 set-affinity -- /usr3/data/1brc/measurements.txt'
Benchmark 1: ./main @mpl procs 144 set-affinity -- /usr3/data/1brc/measurements.txt
  Time (mean ± σ):      2.443 s ±  0.018 s    [User: 339.081 s, System: 2.687 s]
  Range (min … max):    2.417 s …  2.475 s    10 runs

These timings are not directly comparable with the timings reported in the competition, because of differences in hardware. (E.g., I'm using a much larger number of cores here.)

Key optimizations

I ended up making four key optimizations, which together yielded an improvement of approximately 10x over my initial working implementation.

The key optimizations were:

  1. Don't allocate any intermediate strings. Instead, use file offsets as keys in the hash table, and compare keys by reading directly from the contents of the file. (commit; sketched below)
  2. Don't tokenize the file. Instead, parse entries on-the-fly in the main loop. (commit)
  3. Reduce contention in the hash table by sharding the values into multiple copies. (commit; sketched below)
  4. Don't load the file into an array. Instead, operate directly on the mmap'ed file contents. (commits: a, b, c)
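
To make optimization 1 concrete, here is a minimal sketch (illustrative, not the actual code): equality and hashing walk the file in place, so no key strings are ever allocated. Here sub is an assumed function that reads byte i of the file contents, and a station name runs from its offset up to the ';' separator.

(* Compare two keys (byte offsets) by reading the file directly. *)
fun keyEquals sub (off1, off2) =
  let
    fun loop (i, j) =
      let val (c1, c2) = (sub i, sub j)
      in
        if c1 <> c2 then false        (* mismatch, or one name ended early *)
        else if c1 = #";" then true   (* both names ended together *)
        else loop (i + 1, j + 1)
      end
  in
    loop (off1, off2)
  end

(* Hash a key in place, again without allocating a string. The
   multiplier and seed are placeholders, not the real hash. *)
fun hashKey sub off =
  let
    fun loop (i, h) =
      let val c = sub i
      in
        if c = #";" then h
        else loop (i + 1, 0w31 * h + Word.fromInt (Char.ord c))
      end
  in
    loop (off, 0w7)
  end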
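
And a rough sketch of optimization 3 (again illustrative; the real code also needs atomic updates, which are elided here): each key's accumulated weight is split into several shards, an update touches only the shard chosen by the current processor, and the shards are combined once at the end.

(* weight = (count, sum, min, max) over integer-tenths temperatures *)
type weight = int * int * int * int

val emptyWeight : weight = (0, 0, 1000, ~1000)

fun combine ((c1, s1, mn1, mx1), (c2, s2, mn2, mx2)) : weight =
  (c1 + c2, s1 + s2, Int.min (mn1, mn2), Int.max (mx1, mx2))

val numShards = 8  (* an assumed tuning knob *)

(* Processor p updates only shard (p mod numShards), so concurrent
   updates to a hot key mostly hit different memory locations. *)
fun add (shards : weight array) p temp =
  let
    val s = p mod numShards
    val (c, sm, mn, mx) = Array.sub (shards, s)
  in
    Array.update (shards, s,
      (c + 1, sm + temp, Int.min (mn, temp), Int.max (mx, temp)))
  end

(* Combine all shards when reporting results. *)
fun total (shards : weight array) : weight =
  Array.foldl combine emptyWeight shards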

Potential improvements still TODO

  • Work directly on the mmap'ed file instead of loading it into an array first. (Done: this is optimization 4 above.)
  • Better hash function?
  • Faster parsing? There are some cool ideas floating around in discussions of other 1brc solutions.
  • Use block-local hash tables to avoid contention?
  • Shard the hash table to avoid contention. (Done: this is optimization 3 above.)
  • Option for disabling bounds checks. (Done: see the --unsafe-no-bounds-checks flag above.)
  • Store the components of the hash table values ("weights" in the code) in SoA style?
  • Pack the min and max components of the weights into a smaller value? (These don't need nearly as many bits as we're using currently... see the sketch after this list.)
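
On that last point, here is a minimal sketch of the packing idea (illustrative, not the actual code): temperatures are integer tenths in [-999, 999], so min and max each fit in 16 bits and can share a single 32-bit word.

(* Bias values into [0, 1998] so each fits in 16 unsigned bits. *)
fun pack (min, max) =
  Word32.orb
    (Word32.<< (Word32.fromInt (min + 999), 0w16),
     Word32.fromInt (max + 999))

fun unpack w =
  ( Word32.toInt (Word32.>> (w, 0w16)) - 999        (* min *)
  , Word32.toInt (Word32.andb (w, 0wxFFFF)) - 999   (* max *)
  )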

At the moment, my guess is that better parsing and hashing would lead to the biggest performance improvements. (We're getting great scalability, but single-core performance is a little slow.)
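
For what it's worth, here is one branch-light way to parse a temperature (illustrative, not the actual code). 1brc temperatures all have the form -?d?d.d, so a value can be read as integer tenths with a handful of byte reads. As before, sub is an assumed byte-reader over the file contents.

(* Parse a temperature of the form -?d?d.d as integer tenths.
   Returns (value, offset just past the trailing newline). *)
fun parseTemp sub i =
  let
    fun digit j = Char.ord (sub j) - Char.ord #"0"
    val (neg, i) = if sub i = #"-" then (true, i + 1) else (false, i)
    val (v, i) =
      if sub (i + 1) = #"." then             (* d.d  *)
        (10 * digit i + digit (i + 2), i + 3)
      else                                   (* dd.d *)
        (100 * digit i + 10 * digit (i + 1) + digit (i + 3), i + 4)
  in
    (if neg then ~v else v, i + 1)           (* i points at '\n' *)
  end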

I also suspected that avoiding copying the whole file into an array would be a big win, and it was: done, for a ~20% improvement (optimization 4 above).
