This repository contains sets of micro-benchmarks designed to run on a single machine, to help Apache Flink's developers assess the performance implications of their changes.
The main methods defined in the various classes (test cases) use the JMH micro-benchmark suite to define runners that execute those test cases. You can execute the default benchmark suite (which takes ~1 hour) at once:
mvn -Dflink.version=1.5.0 clean install exec:exec
There is also a separate benchmark suite for the state backend. You can execute this suite (which also takes ~1 hour) with the command below:
mvn -Dflink.version=1.5.0 clean package exec:exec \
-Dexec.executable=java -Dexec.args="-jar target/benchmarks.jar -rf csv org.apache.flink.state.benchmark.*"
If you want to execute just one benchmark, the best approach is to execute the selected main function manually. There are two main ways:
- From your IDE (hint: there is a plugin for IntelliJ IDEA).
  - In this case don't forget to select flink.version; the default value for the property is defined in pom.xml.
- From the command line, using a command like:
mvn -Dflink.version=1.5.0 clean package exec:exec \
-Dexec.executable=java -Dexec.args="-jar target/benchmarks.jar <benchmark_class>"
We also support running each benchmark once (with only one fork and one iteration) for testing, using the command below:
mvn test -P test
The recommended code structure is to define all benchmarks in Apache Flink and only wrap them here, in this repository, into executor classes.
This structure is due to the use of the GPL2-licensed JMH library for the actual execution of the benchmarks. Ideally we would prefer to have all of the code moved to Apache Flink.
- If you cannot measure the performance difference, then just don't bother (avoid premature optimisations).
- Make sure that you are not disturbing the benchmarks. While benchmarking, you shouldn't be touching the machine that's running the benchmarks. Scrolling a web page in a browser or switching windows (alt/cmd + tab) can seriously affect the results.
- If in doubt, verify the results more than once, like:
  - measure the baseline
  - measure the change, for example a +5% performance improvement
  - switch back to the baseline and make sure that the result is worse
  - go back to the change and verify the +5% performance improvement
  - if something doesn't show the results that you were expecting, investigate and don't ignore this! Maybe there is some larger performance instability and your previous results were just lucky/unlucky flukes.
- Some results can show up over the benchmarking noise only in long-term trends.
- Please tune the length of the benchmark (usually by the number of processed records). The fewer records, the faster the benchmark and the more iterations can be executed, but the higher the chance of setup overheads skewing the results. The rule of thumb is to increase the number of processed records up to the point where results stop improving visibly, while trying to keep a single benchmark execution under 1 second.
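The baseline/change/baseline/change procedure above can be sketched as a small check. This is only an illustration: the class AbabCheck, its method names, the thresholds and the scores are all hypothetical and not part of this repository, and it assumes the scores are throughputs (higher is better).

```java
/** Hypothetical helper: sanity-checks a baseline/change/baseline/change series. */
final class AbabCheck {

    /** Relative difference of {@code change} over {@code baseline}, e.g. 0.05 for +5%. */
    static double relativeGain(double baseline, double change) {
        return (change - baseline) / baseline;
    }

    /**
     * Returns true when both runs of the same code agree within {@code noise}
     * and both baseline-to-change steps show a gain of at least {@code minGain}.
     * Assumes throughput-style scores, where higher is better.
     */
    static boolean isReproducible(
            double baseline1, double change1, double baseline2, double change2,
            double minGain, double noise) {
        boolean baselinesAgree = Math.abs(relativeGain(baseline1, baseline2)) <= noise;
        boolean changesAgree = Math.abs(relativeGain(change1, change2)) <= noise;
        boolean firstGain = relativeGain(baseline1, change1) >= minGain;
        boolean secondGain = relativeGain(baseline2, change2) >= minGain;
        return baselinesAgree && changesAgree && firstGain && secondGain;
    }

    public static void main(String[] args) {
        // Made-up scores, purely for illustration.
        // Stable series: both baselines and both changes agree, gain is repeated.
        System.out.println(isReproducible(1000, 1052, 1003, 1049, 0.03, 0.02));
        // Unstable series: the second baseline is as fast as the change.
        System.out.println(isReproducible(1000, 1052, 1048, 1050, 0.03, 0.02));
    }
}
```

If the second call's situation shows up in practice (the baseline suddenly matches the change), that is the "investigate and don't ignore this" case from the list above.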
Regarding the naming of benchmark methods, there is one important thing. When uploading the results to the codespeed web UI, the uploader uses just the benchmark's method name combined with the parameters to generate the visible name of the benchmark in the UI. Because of that it is important to:
- Have the method name explanatory enough so we can quickly recognize what the given benchmark is all about, using just the benchmark's method name.
- Be as short as possible to not bloat the web UI (so we must drop all of the redundant suffixes/prefixes, like benchmark, test, ...).
- Class name is completely ignored by the uploader.
- Method names should be unique across the project.
Good examples of how to name benchmark methods are:
networkThroughput
sessionWindow
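The naming rules above could be checked mechanically. The following is a minimal sketch, assuming the method names have already been collected into a list. The class BenchmarkNameCheck and its methods are hypothetical and not part of this repository or of the uploader, and the camelCase affix detection is only a heuristic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical lint for benchmark method names; not part of this repository. */
final class BenchmarkNameCheck {

    // Redundant affixes called out in the naming guidelines.
    private static final List<String> REDUNDANT_AFFIXES = List.of("benchmark", "test");

    /** True if {@code name} starts with the affix as a camelCase word or ends with it capitalized. */
    static boolean hasRedundantAffix(String name, String affix) {
        String capitalized = Character.toUpperCase(affix.charAt(0)) + affix.substring(1);
        boolean prefix = name.startsWith(affix)
                && (name.length() == affix.length()
                        || Character.isUpperCase(name.charAt(affix.length())));
        boolean suffix = name.endsWith(capitalized);
        return prefix || suffix;
    }

    /** Returns human-readable problems; an empty list means the names look fine. */
    static List<String> findProblems(List<String> methodNames) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String name : methodNames) {
            for (String affix : REDUNDANT_AFFIXES) {
                if (hasRedundantAffix(name, affix)) {
                    problems.add(name + ": redundant '" + affix + "' prefix/suffix");
                }
            }
            // The class name is ignored by the uploader, so method names alone must be unique.
            if (!seen.add(name)) {
                problems.add(name + ": duplicate method name");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "networkThroughput", "sessionWindow", "sessionWindowBenchmark", "networkThroughput");
        findProblems(names).forEach(System.out::println);
    }
}
```

Matching affixes only at camelCase word boundaries avoids flagging names such as latest, which merely happen to end in the letters "test".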
Please attach the results of your benchmarks.