
[External] Adding ankerl unordered_dense #12861

Open · wants to merge 5 commits into master

Conversation

loumalouomega
Member

📝 Description

Introduction

Adding ankerl unordered_dense, which provides top-performance hashed containers: https://martin.ankerl.com/2022/08/27/hashmap-bench-01/. It is MIT-licensed and header-only.

Initial testing was done during the efforts to modernize data_value_container. In the end, the current brute-force solution is still faster for a small number of variables, but among the hash containers I tried, ankerl was the fastest in my tests.

The idea is to use them to replace our current containers (they are not used anywhere yet).

Key Features

  1. Performance:

    • Optimized for fast lookups and insertions.
    • Minimizes memory usage by densely packing the data.
  2. Robin Hood Hashing:

    • Ensures even distribution of elements, reducing clustering.
    • Backward shift deletion minimizes gaps in the table when elements are removed.
  3. Template Customization:

    • Supports custom hash functions, key equality checks, allocators, and bucket types.
    • Provides both map (key-value pairs) and set (keys only) interfaces.
  4. Hashing Algorithm:

    • Based on wyhash, a fast and high-quality hashing algorithm.
    • Provides built-in and extensible hash function templates.
  5. API Compatibility:

    • Follows the conventions of the standard library's std::unordered_map and std::unordered_set.
    • Offers additional non-standard features like extract for moving data and replace for bulk updates.
  6. Exception Safety:

    • Designed with robust exception handling in mind.
    • Ensures no memory leaks or corruption even during exceptions.
  7. Modular and Extensible:

    • Offers segmentation options for memory management.
    • Integrates with polymorphic memory resources (PMR) for custom allocation strategies.
  8. C++17 and Higher:

    • Requires C++17 or newer due to the use of features like std::optional, std::tuple, and advanced template metaprogramming.
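
For reference, a minimal usage sketch of the interface described above; it assumes the upstream single-header layout (`#include <ankerl/unordered_dense.h>`) and is only an illustration, not Kratos code:

```cpp
#include <ankerl/unordered_dense.h>

#include <cstdint>
#include <iostream>
#include <string>

int main() {
    // Same interface as std::unordered_map, but the values live densely in a
    // contiguous container, which is what makes iteration fast.
    ankerl::unordered_dense::map<std::string, std::uint64_t> counters;

    counters.try_emplace("DISPLACEMENT", 3);
    counters["PRESSURE"] += 1;

    if (auto it = counters.find("PRESSURE"); it != counters.end()) {
        std::cout << it->first << " -> " << it->second << '\n';
    }

    // Iteration walks the contiguous value container (insertion order as long
    // as nothing has been erased).
    for (const auto& [name, count] : counters) {
        std::cout << name << ": " << count << '\n';
    }
    return 0;
}
```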

Main Components

  1. Hashing:

    • Customizable via the hash template, supporting standard types, strings, and custom objects.
    • Uses a combination of mixing and bit manipulation for uniform distribution.
  2. Buckets:

    • The Bucket structure stores metadata (distance and fingerprint) and an index into the value container.
  3. Data Storage:

    • Utilizes a segmented_vector or std::vector to store data contiguously.
    • The segmentation option (segmented_map or segmented_set) improves memory management for large datasets.
  4. Load Factor:

    • Maintains a default maximum load factor of 0.8, adjustable by the user.
    • Automatically grows the table to maintain performance.
  5. Transparent Lookup:

    • Supports heterogeneous lookups (e.g., std::string_view for std::string keys).
  6. Iterators:

    • Provides standard iterators for traversal.
    • Iterator invalidation rules are similar to std::unordered_map.
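
As a sketch of the transparent-lookup point above, following the pattern documented in the upstream README (the `string_hash` helper below is illustrative, not part of the library itself):

```cpp
#include <ankerl/unordered_dense.h>

#include <cstdint>
#include <functional>
#include <string>
#include <string_view>

// Illustrative transparent hash: the is_transparent/is_avalanching tags allow
// lookups with std::string_view without constructing a temporary std::string.
struct string_hash {
    using is_transparent = void;
    using is_avalanching = void;

    auto operator()(std::string_view str) const noexcept -> std::uint64_t {
        return ankerl::unordered_dense::hash<std::string_view>{}(str);
    }
};

using string_map =
    ankerl::unordered_dense::map<std::string, int, string_hash, std::equal_to<>>;

int main() {
    string_map map;
    map.try_emplace("DISPLACEMENT", 1);

    std::string_view key = "DISPLACEMENT";  // no std::string allocated here
    return map.contains(key) ? 0 : 1;
}
```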

🆕 Changelog

@matekelemen left a comment
Contributor


I like the idea of finally using a hash table with better memory layout, but sorry in advance: I'm going to be a picky ass here. This is a very basic data structure and there's a mountain of options to choose from so I don't want us to make a poor decision.

I'd be wary of any advertising based on benchmarks someone does on their own libraries. I skimmed through the comparison you linked, and am missing some things:

  • I didn't find the source code of the tests
  • He didn't provide any scaling studies
  • lack of hardware diversity

I ask you to write benchmarks using Google's benchmark library for a few hash map implementations and run them on some different hardware.

What to compare

Specifically I'd like the following implementations compared:

What to benchmark

As for what to benchmark, we're almost exclusively inserting/searching and practically never erasing anything from existing tables, so I'd like to see

  • an insertion benchmark
  • a search benchmark
    • with integer keys
    • with std::string keys. Specifically, longer ones that don't benefit from the short-string optimization (make sure that
      sizeof(std::string) < key.size())
    • concurrent search with all physical cores participating

What's important is that you run this with different sizes so we can get an idea of how these operations scale.
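
For concreteness, a sketch of what the string-key search case could look like with Google Benchmark; the key prefix, size range, and `BM_FindString` name are placeholders, and the same template would be instantiated once per candidate implementation:

```cpp
#include <benchmark/benchmark.h>

#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Build keys long enough to defeat the short-string optimization,
// i.e. key.size() > sizeof(std::string).
static std::string MakeKey(std::size_t i) {
    return "a_sufficiently_long_key_prefix_to_defeat_sso_" + std::to_string(i);
}

template <class Map>
static void BM_FindString(benchmark::State& state) {
    const auto size = static_cast<std::size_t>(state.range(0));
    Map map;
    std::vector<std::string> queries;
    queries.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        map.emplace(MakeKey(i), i);
        queries.push_back(MakeKey(i));
    }
    std::size_t i = 0;
    for (auto _ : state) {
        // Only the lookup is measured; key construction happens above.
        benchmark::DoNotOptimize(map.find(queries[i]));
        i = (i + 1) % size;
    }
}

using StdStringMap = std::unordered_map<std::string, std::size_t>;

// Different sizes give the scaling study; adding ->Threads(N) with one thread
// per physical core would cover the concurrent-search case.
BENCHMARK_TEMPLATE(BM_FindString, StdStringMap)
    ->RangeMultiplier(8)
    ->Range(1 << 10, 1 << 22);

BENCHMARK_MAIN();
```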

Hardware

I'm interested in benchmarks running on

  • a decent desktop with an x86-based CPU
  • a NUMA cluster if you have access to one (if you don't I can run it on one)
  • some shitty laptop or a raspberry pi (optional, not super important but good to know because we have a lot of student users)
  • a Mac with an M-chip (optional, I'm just curious. I can run it on my machine if you don't have access to one)

I know this is a lot of work, but I think it's absolutely necessary for such a basic data structure.

If you are not familiar with google's benchmark framework, I can shoot you an example with std::unordered_map and you can build on top of that for the other implementations.
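
A minimal sketch of what such a starter example might look like (not the offered example itself; the integer keys and size range are arbitrary assumptions):

```cpp
#include <benchmark/benchmark.h>

#include <cstdint>
#include <unordered_map>

// Insertion benchmark: build a fresh map with state.range(0) integer keys on
// every iteration; swapping the map type compares other implementations.
static void BM_InsertInt(benchmark::State& state) {
    const auto size = static_cast<std::int64_t>(state.range(0));
    for (auto _ : state) {
        std::unordered_map<std::int64_t, std::int64_t> map;
        for (std::int64_t i = 0; i < size; ++i) {
            map.emplace(i, i);
        }
        benchmark::DoNotOptimize(map);
    }
    state.SetItemsProcessed(state.iterations() * size);
}

BENCHMARK(BM_InsertInt)->RangeMultiplier(8)->Range(1 << 10, 1 << 22);

BENCHMARK_MAIN();
```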

@loumalouomega
Member Author

> * [`tsl::robin_map`](https://github.com/Tessil/robin-map)

This one is super slow at least for moderate sizes in my own tests.

@loumalouomega
Member Author

> If you are not familiar with google's benchmark framework, I can shoot you an example with std::unordered_map and you can build on top of that for the other implementations.

I would like to have something standardized in Kratos instead of just hand-made code each time. Is it not possible to reuse our GTest infrastructure?

@matekelemen
Contributor

matekelemen commented Nov 21, 2024

> * [`tsl::robin_map`](https://github.com/Tessil/robin-map)
>
> This one is super slow at least for moderate sizes in my own tests.

Maybe, but that just highlights the dangers of taking devs' benchmarks of their own libs at face value. The author of tsl::robin_map also did a benchmark that painted his own implementation in a rather flattering light.

@matekelemen
Contributor

> If you are not familiar with google's benchmark framework, I can shoot you an example with std::unordered_map and you can build on top of that for the other implementations.
>
> I would like to have something standardized in Kratos instead of just hand-made code each time. Is it not possible to reuse our GTest infrastructure?

I'm all for something standardized, but GTest is definitely not a benchmarking library. Google's framework is pretty simple and very popular, but I'm open to other suggestions.

@loumalouomega
Member Author

> * [`tsl::robin_map`](https://github.com/Tessil/robin-map)
>
> This one is super slow at least for moderate sizes in my own tests.
>
> Maybe, but that just highlights the dangers of taking devs' benchmarks of their own libs at face value. The author of tsl::robin_map also did a benchmark that painted his own implementation in a rather flattering light.

Yes, in fact this was the first one I tried, and it was the slowest of all the ones I tried.

@loumalouomega
Member Author

> If you are not familiar with google's benchmark framework, I can shoot you an example with std::unordered_map and you can build on top of that for the other implementations.
>
> I would like to have something standardized in Kratos instead of just hand-made code each time. Is it not possible to reuse our GTest infrastructure?
>
> I'm all for something standardized, but GTest is definitely not a benchmarking library. Google's framework is pretty simple and very popular, but I'm open to other suggestions.

Maybe we can at least add a cmake loop to compile the benchmarks ...

@matekelemen
Contributor

> Maybe we can at least add a cmake loop to compile the benchmarks ...

I'd put the benchmarks in a different repo, similar to how we deal with examples.

@loumalouomega
Member Author

> Maybe we can at least add a cmake loop to compile the benchmarks ...
>
> I'd put the benchmarks in a different repo, similar to how we deal with examples.

Usually benchmark code is not very different from test code. Examples are huge in comparison.

@RiccardoRossi
Member

To add my two cents to Mate's comments:

  • memory occupation, particularly for the database, is of paramount importance. One simple way to improve efficiency is to decrease the load factor, but we cannot afford that in the database
  • portability
  • maintenance: is there a team behind the lib?

@loumalouomega
Member Author

Merging master after #12867, I will write a benchmark...

Projects
Status: 👀 Next meeting TODO