
Execution order changes benchmark results #3766

Open
snnsnn opened this issue Dec 24, 2024 · 7 comments · May be fixed by #3777

snnsnn commented Dec 24, 2024

Hi there! 👋

It appears that the execution order of items in the benchmark suite significantly impacts the reported outcomes. When running the same benchmark multiple times, the fastest library changes depending on the order in which the benchmarks are executed. Below are three separate runs of the handle-event benchmark, with varying results:

Run 1

$:hono/benchmarks/handle-event$ node index.mjs 
Hello hey
worktop x 140,112 ops/sec ±3.25% (88 runs sampled)
custom-implementation x 365,763 ops/sec ±2.15% (88 runs sampled)
Hono x 317,173 ops/sec ±4.31% (83 runs sampled)
itty-router x 321,088 ops/sec ±11.34% (72 runs sampled)
Fastest is custom-implementation

Run 2

$:hono/benchmarks/handle-event$ node index.mjs 
Hello hey
custom-implementation x 374,533 ops/sec ±1.89% (90 runs sampled)
worktop x 149,032 ops/sec ±1.58% (93 runs sampled)
Hono x 320,424 ops/sec ±4.13% (84 runs sampled)
itty-router x 307,371 ops/sec ±18.18% (72 runs sampled)
Fastest is custom-implementation

Run 3

$:hono/benchmarks/handle-event$ node index.mjs 
Hello hey
itty-router x 312,020 ops/sec ±17.62% (74 runs sampled)
custom-implementation x 105,017 ops/sec ±37.42% (36 runs sampled)
worktop x 16,306 ops/sec ±65.94% (30 runs sampled)
Hono x 82,493 ops/sec ±43.59% (34 runs sampled)
Fastest is itty-router

Libraries earlier in the execution order tend to perform better, while those executed later show significant performance degradation, likely due to resource constraints or runtime factors like garbage collection.

Could you consider implementing a solution to improve the accuracy and reliability of these benchmarks? It would ensure fair comparisons and provide more actionable performance insights.
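One possible mitigation, sketched below with Benchmark.js (the suite library the handle-event benchmark already uses), is to shuffle the registration order on every run so no library consistently benefits from going first. The case bodies here are placeholders, not the actual benchmark code:

```js
import Benchmark from 'benchmark'

// Placeholder cases standing in for the real handlers; each entry is
// [name, fn] where fn would dispatch one request through the router.
const cases = [
  ['Hono', () => { /* dispatch a request through Hono */ }],
  ['itty-router', () => { /* dispatch through itty-router */ }],
  ['worktop', () => { /* dispatch through worktop */ }],
]

// Fisher-Yates shuffle, so the registration (and thus execution) order
// differs on every process run.
for (let i = cases.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1))
  ;[cases[i], cases[j]] = [cases[j], cases[i]]
}

const suite = new Benchmark.Suite()
for (const [name, fn] of cases) suite.add(name, fn)

suite
  .on('cycle', (event) => console.log(String(event.target)))
  .on('complete', function () {
    console.log('Fastest is ' + this.filter('fastest').map('name'))
  })
  .run()
```

Averaging several shuffled runs would then separate genuine router performance from position effects.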

Thank you!

yusukebe (Member) commented Dec 24, 2024

Hi @snnsnn

I can't reproduce those behaviors on my machine (Apple M1 Pro, 32 GB RAM). From fastest to slowest, the results are Hono, sunder, itty-router, and worktop, and they stay the same even when I change the execution order.

EdamAme-x (Contributor) commented
I have previously asked the author of mitata about this issue. He said:

It is possible for the results to vary depending on the order of execution. However, that is in environments with significantly less memory; in an environment with sufficient memory, most of the overhead is shaved off by GC, etc., so the results rarely change depending on the order.

snnsnn (Author) commented Dec 24, 2024

@yusukebe The runtime I used is Node v20.16.0 (x64-linux) with 16 GB RAM. To run the benchmarks, I cd into the benchmarks/handle-event folder and run pnpm run start after installing dependencies and modifying the files. I removed sunder because it is no longer maintained and had module-related issues.

It is probably GC-related: when I tried to run the benchmark with Deno v2.1.4, it ran out of memory:

<--- JS stacktrace --->
#
# Fatal JavaScript out of memory: Reached heap limit
#
==== C stack trace ===============================

    deno(+0x2d8b133) [0x5c91184c4133]
    deno(+0x2d8a9fb) [0x5c91184c39fb]
    deno(+0x2d85f18) [0x5c91184bef18]
    deno(+0x2ddc1ac) [0x5c91185151ac]
    deno(+0x2f90467) [0x5c91186c9467]
    deno(+0x2f8e613) [0x5c91186c7613]
    deno(+0x2f85211) [0x5c91186be211]
    deno(+0x2f85ce2) [0x5c91186bece2]
    deno(+0x2f6b55a) [0x5c91186a455a]
    deno(+0x33e0d22) [0x5c9118b19d22]
    deno(+0x426ad36) [0x5c91199a3d36]
Trace/breakpoint trap (core dumped)

With Bun, the benchmark failed to execute due to an error:

bun index.js 
64 |   }
65 | };
66 | var tryDecodeURI = (str) => tryDecode(str, decodeURI);
67 | var getPath = (request) => {
68 |   const url = request.url;
69 |   const start = url.indexOf("/", 8);
                     ^
TypeError: undefined is not an object (evaluating 'url.indexOf')
      at getPath (/home/user/hono/dist/utils/url.js:69:17)
      at #dispatch (/home/user/hono/dist/hono-base.js:170:18)
      at /home/user/hono/benchmarks/handle-event/index.js:131:19
      at fn (/home/user/hono/benchmarks/handle-event/index.js:130:12)
      at /home/user/hono/benchmarks/handle-event/index.js:143:1

Edit: Updating Bun to v1.1.42 fixed the issue.
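For what it's worth, one way to probe the GC hypothesis is to force a full collection after each case completes, so heap pressure from earlier benchmarks cannot spill into later ones. A minimal sketch (the workload is a stand-in, and global.gc only exists on Node when started with --expose-gc):

```js
import Benchmark from 'benchmark'

// Run with: node --expose-gc index.mjs
const suite = new Benchmark.Suite()

suite
  .add('alloc-heavy', () => {
    // stand-in workload that allocates roughly like a router dispatch
    JSON.parse(JSON.stringify({ path: '/user/lookup/username/hey' }))
  })
  .on('cycle', (event) => {
    console.log(String(event.target))
    // global.gc is only defined when Node is started with --expose-gc
    if (global.gc) global.gc()
  })
  .run()
```

If the position effect disappears with forced collections, GC pressure is the likely culprit.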

yusukebe (Member) commented
I am unable to reproduce this, so I cannot do anything about this issue.

snnsnn (Author) commented Dec 26, 2024

I have previously asked the author of mitata about this issue. He said:

It is possible for the results to vary depending on the order of execution. However, that is in environments with significantly less memory; in an environment with sufficient memory, most of the overhead is shaved off by GC, etc., so the results rarely change depending on the order.

The handle-event benchmark uses the benchmark package (Benchmark.js), not mitata:

import Benchmark from 'benchmark'

snnsnn (Author) commented Dec 26, 2024

@yusukebe Perhaps these tests could be migrated to the latest version of mitata, which provides configurations to mitigate external factors like garbage collection (GC): Mitata#garbage-collection-pressure.
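A minimal sketch of what that migration might look like, assuming the bench().gc('inner') option described in mitata's README; the workloads below are placeholders:

```js
import { bench, run } from 'mitata'

// gc('inner') asks mitata to collect garbage before each iteration,
// reducing heap pressure carried over from earlier benchmarks.
// On Node this needs --expose-gc; Bun exposes gc natively.
bench('Hono', () => {
  // placeholder: dispatch one request through Hono
}).gc('inner')

bench('itty-router', () => {
  // placeholder: dispatch one request through itty-router
}).gc('inner')

await run()
```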

Additionally, using more realistic test cases and increasing the number of tests to better reflect real-world scenarios might yield more reliable results. I’m not sure if using tsx impacts the benchmarks, but running the tests against compiled outputs might help eliminate any inconsistencies.

On the topic of realistic data, some benchmarks use GitHub API paths for testing. For instance, this Rust-based example could provide useful paths: Matchit Benchmark. Extracting similar paths could enhance the relevance of the tests.

At the moment, I’m on a tight schedule and don’t have in-depth knowledge of Hono’s routers. I came across these tests while searching for a fast router implementation in JavaScript. Please feel free to close the issue or keep it open in case someone else decides to tackle these improvements.

Thank you for the great work on Hono!

Best regards

EdamAme-x linked a pull request (#3777) on Dec 26, 2024 that will close this issue
EdamAme-x (Contributor) commented
Hi @snnsnn

I sent a PR about this (#3777). Can you check it?
