
proper benchmarking #92

Merged: 19 commits into main from hw/benchmarking, Aug 4, 2021

Conversation

@hexaeder (Member) commented Jul 3, 2021

This is a proposal on how to do proper benchmarking for future versions using PkgBenchmark.jl and BenchmarkTools.jl.

I've added a README to the benchmark folder explaining how to run them.

In short

$ cd NetworkDynamics/benchmarks
$ ./run_benchmarks.jl v0.5.0    # this will take a few minutes

will create three markdown files with the results of the benchmarks:

  • the target version, i.e. the current state of the repo
  • the baseline version (in this case v0.5.0) for the comparison
  • a comparison with relative differences for each of the timings.

You can see those markdown files in this gist.
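
For orientation, here is a minimal sketch of the PkgBenchmark.jl calls a script like this can be built on; the file names and the way the baseline ref is passed are placeholders for illustration, not necessarily what run_benchmarks.jl does internally.

```julia
# Minimal sketch (assumed workflow, not the exact contents of run_benchmarks.jl):
# benchmark the current repo state and a baseline tag, then export three reports.
using PkgBenchmark

target   = benchmarkpkg("NetworkDynamics")            # current state of the repo
baseline = benchmarkpkg("NetworkDynamics", "v0.5.0")  # baseline ref for comparison

export_markdown("target.md", target)
export_markdown("baseline.md", baseline)
export_markdown("judgement.md", judge(target, baseline))
```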

The most important part is probably this section, where you can see the nice speedup from #80:

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|----|------------|--------------|
| ["diffusion", "ode_edge", "assemble", "10"] | 0.97 (5%) | 0.87 (1%) ✅ |
| ["diffusion", "ode_edge", "assemble", "100"] | 0.85 (5%) ✅ | 0.41 (1%) ✅ |
| ["diffusion", "ode_edge", "assemble", "1000"] | 0.32 (5%) ✅ | 0.07 (1%) ✅ |
| ["diffusion", "ode_edge", "assemble", "10000"] | 0.06 (5%) ✅ | 0.01 (1%) ✅ |
| ["diffusion", "ode_edge", "call_mt", "10"] | 0.93 (5%) ✅ | 1.01 (1%) |
| ["diffusion", "ode_edge", "call_mt", "100"] | 0.64 (5%) ✅ | 1.00 (1%) |
| ["diffusion", "static_edge", "assemble", "10"] | 0.95 (5%) ✅ | 0.86 (1%) ✅ |
| ["diffusion", "static_edge", "assemble", "100"] | 0.84 (5%) ✅ | 0.40 (1%) ✅ |
| ["diffusion", "static_edge", "assemble", "1000"] | 0.31 (5%) ✅ | 0.07 (1%) ✅ |
| ["diffusion", "static_edge", "assemble", "10000"] | 0.06 (5%) ✅ | 0.01 (1%) ✅ |
| ["diffusion", "static_edge", "call", "10"] | 1.51 (5%) ❌ | 1.00 (1%) |
| ["diffusion", "static_edge", "call", "1000"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["diffusion", "static_edge", "call_mt", "10"] | 0.82 (5%) ✅ | 1.00 (1%) |
| ["diffusion", "static_edge", "call_mt", "100"] | 0.44 (5%) ✅ | 0.99 (1%) |
| ["diffusion", "static_edge", "call_mt", "1000"] | 0.53 (5%) ✅ | 1.00 (1%) |
| ["diffusion", "static_edge", "call_mt", "10000"] | 0.88 (5%) ✅ | 0.99 (1%) |
| ["kuramoto", "heterogeneous", "assemble", "10"] | 0.96 (5%) | 0.86 (1%) ✅ |
| ["kuramoto", "heterogeneous", "assemble", "100"] | 0.84 (5%) ✅ | 0.40 (1%) ✅ |
| ["kuramoto", "heterogeneous", "assemble", "1000"] | 0.32 (5%) ✅ | 0.07 (1%) ✅ |
| ["kuramoto", "heterogeneous", "assemble", "10000"] | 0.03 (5%) ✅ | 0.01 (1%) ✅ |
| ["kuramoto", "heterogeneous", "call", "10"] | 0.88 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call", "100"] | 0.85 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call", "1000"] | 0.85 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call", "10000"] | 0.83 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call_mt", "10"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call_mt", "100"] | 0.88 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call_mt", "1000"] | 0.85 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "heterogeneous", "call_mt", "10000"] | 0.87 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "homogeneous", "assemble", "10"] | 0.97 (5%) | 0.87 (1%) ✅ |
| ["kuramoto", "homogeneous", "assemble", "100"] | 0.85 (5%) ✅ | 0.43 (1%) ✅ |
| ["kuramoto", "homogeneous", "assemble", "1000"] | 0.34 (5%) ✅ | 0.08 (1%) ✅ |
| ["kuramoto", "homogeneous", "assemble", "10000"] | 0.03 (5%) ✅ | 0.01 (1%) ✅ |
| ["kuramoto", "homogeneous", "call", "10"] | 0.75 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "homogeneous", "call", "1000"] | 0.90 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "homogeneous", "call_mt", "100"] | 1.22 (5%) ❌ | 1.00 (1%) |
| ["kuramoto", "homogeneous", "call_mt", "1000"] | 0.66 (5%) ✅ | 1.00 (1%) |
| ["kuramoto", "homogeneous", "call_mt", "10000"] | 0.97 (5%) | 0.99 (1%) ✅ |

@lindnemi (Collaborator) left a comment


This is absolutely great and could be merged right away. I was wondering about a few more things though

  • It would be nice to benchmark an inhomogeneous system with algebraic constraints in mass matrix form. Thus we would also benchmark stiff solvers, which should improve with the upcoming Jacobian-vector product. (A sketch of what such a benchmark could look like follows after this list.)

  • Would it be possible to add these benchmarks to CI? Or should we run them on our machines manually each time we release a new version?

  • Just for clarification: if I get this right, we could also benchmark against different Julia versions in order to catch potential regressions in Base?
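
As a rough sketch of the constrained-system benchmark mentioned in the first point above (illustration only: the toy right-hand side, the mass matrix, and the group names are made up and not part of the existing suite):

```julia
# Hypothetical sketch: a stiff solve of a system with an algebraic constraint
# expressed via a singular mass matrix. All names here are placeholders.
using BenchmarkTools, OrdinaryDiffEq, LinearAlgebra

# toy system in mass-matrix form: x' = -x + y, 0 = x - y
function constrained_rhs!(du, u, p, t)
    du[1] = -u[1] + u[2]
    du[2] =  u[1] - u[2]
end

M = Diagonal([1.0, 0.0])   # zero row makes the second equation algebraic
f = ODEFunction(constrained_rhs!; mass_matrix = M)
prob = ODEProblem(f, [1.0, 1.0], (0.0, 10.0))

SUITE = BenchmarkGroup()
SUITE["constrained"] = BenchmarkGroup()
SUITE["constrained"]["solve_rodas4"] = @benchmarkable solve($prob, Rodas4())
```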

Project.toml Outdated
@@ -4,6 +4,7 @@ authors = ["Frank Hellmann <[email protected]>, Michael Lindner <michaelli
version = "0.5.4"

[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
Collaborator commented:

Would it be possible to add this dependency to a local environment in the benchmarking folder instead?

@hexaeder (Member, Author) replied:

Unfortunately, this is not that trivial. Normally, PkgBenchmark will take the benchmark files from the specific commit you are benchmarking, which I think is not what we want here. While benchmarking a different commit, the script will

  • copy the NetworkDynamics repo to a temp location
  • copy the benchmark folder to a different temp location
  • check out the commit we want to benchmark in the temp NetworkDynamics folder

At this point, it is not clear how to reach the temp ND folder from an environment living inside the benchmark folder; therefore it is much simpler to run everything from the ND environment.

@hexaeder (Member, Author) added:

On second thought, this works with [email protected], where there was no BenchmarkTools in the deps. We might get away without the explicit dependency, as I do right now with PkgBenchmark (which you have to have in your global environment, so it doesn't show up in the ND deps).

@hexaeder (Member, Author) commented Jul 5, 2021

* It would be nice to benchmark an inhomogeneous system with algebraic constraints in mass matrix form. Thus we would also benchmark stiff solvers, which should improve with the upcoming Jacobian-vector product.

Well, that's the idea: having a common place to add as many benchmarks as we like. Right now they are also a bit redundant; I'm not sure whether we should test everything for so many values of N.

* Would it be possible to add these benchmarks to CI? Or should we run them on our machines manually each time we release a new version?

In theory, yeah, totally possible. We can test this out at some point. For now, I think running it locally will be sufficient. The default config will always compare to master. I think once you are happy with your PR, run ./run_benchmarks.jl on your PR branch and put the judgement file in the comments.

* Just for clarification: if I get this right, we could also benchmark against different Julia versions in order to catch potential regressions in Base?

Yes. Right now I am adding more parameters to the benchmark script so you can do something like

$ ./run_benchmarks.jl --target directory --baseline directory --tcommand julia-nightly --bcommand julia

to benchmark the current directory on julia-nightly vs julia.
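
Presumably this maps onto PkgBenchmark's BenchmarkConfig roughly as follows (a sketch under that assumption; the exact option handling inside run_benchmarks.jl may differ):

```julia
# Sketch: comparing the current directory state under two Julia binaries.
# The actual wiring in run_benchmarks.jl may differ from this.
using PkgBenchmark

target   = BenchmarkConfig(id = nothing, juliacmd = `julia-nightly`)  # id = nothing: current state
baseline = BenchmarkConfig(id = nothing, juliacmd = `julia`)

judgement = judge("NetworkDynamics", target, baseline)
export_markdown("judgement.md", judgement)
```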

@lindnemi (Collaborator) commented:

Unfortunately I can't run the benchmarks on my machine. I get:
ERROR: LoadError: AssertionError: Captured sysimg looks weird? using Revise...
-J/home/micha/julia/julia-1.6.2/lib/julia/sys.so

@lindnemi (Collaborator) commented Aug 4, 2021

OK, great. The benchmarks make a lot of sense like that: diffusion benchmarks a star graph and kuramoto a sparse graph.

At some point we should benchmark different solver calls, especially when it comes to evaluating the performance of JVPs. We could add that with the JVP PR #95.

@lindnemi merged commit 643a662 into main on Aug 4, 2021
@lindnemi deleted the hw/benchmarking branch on August 4, 2021 11:19