Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't recreate stream and locale in loop #287

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Osyotr
Copy link

@Osyotr Osyotr commented Sep 12, 2024

Using std::locale::classic() instead of creating a new std::locale object on each iteration of the loop gives a massive performance boost. Moving std::stringstream out of the loop further improves performance.

These are results from my benchmark:

Run on (20 X 2112 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x10)
  L1 Instruction 32 KiB (x10)
  L2 Unified 1280 KiB (x10)
  L3 Unified 25600 KiB (x1)
----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
NewLocaleEveryIteration          15269 ns        15346 ns        44800
LocaleClassicEveryIteration        560 ns          558 ns      1120000
StreamClearStr                     379 ns          377 ns      1866667
Format                            37.0 ns         36.8 ns     18666667
fmt_lib                           58.4 ns         58.6 ns     11200000
to_chars                          12.8 ns         12.8 ns     56000000

Using `std::locale::classic()` instead of creating a new `std::locale` object on each iteration of the loop gives a massive performance boost.
Moving `std::stringstream` out of the loop further improves performance.
@ifcapps
Copy link
Collaborator

ifcapps commented Sep 13, 2024

Hmm, doesn't it crash in the parallel for loop?
Did you test it in release mode?

Then the loop runs in parallel threads, each using the same tempStream instance:
#define FOR_EACH_LOOP std::for_each( std::execution::par,

It could be optimized such that there is a
vector<std::stringstream> tempStreamPerThread with
int numThreads = std::thread::hardware_concurrency();

so that each thread re-uses the same std::stringstream

To prevent race condition
@Osyotr
Copy link
Author

Osyotr commented Sep 13, 2024

You are correct, this is a race condition in release builds. I've moved it back into the loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants