Running ASV Benchmarks
ASV is a benchmarking tool used by many prominent Python projects to benchmark and compare the performance of a library over time. Prominent users include NumPy, Arrow, and SciPy.
The tool has out-of-the-box support for creating benchmarks and using them to measure performance over time (e.g. per commit, per release, etc.). This is done by checking out, building and benchmarking each version, and the results are used to create graphs of the performance of the various versions on the various benchmarks. The latest benchmarks on the master branch can be seen here.
All of the code for the actual benchmarks is located in the benchmarks folder. Any new benchmarks should be added there, either in one of the existing classes/files or in a new one. We mainly use tests that benchmark runtime (time_...) or peak memory usage (peakmem_...), but ASV supports other benchmark types; you can read more about them in their docs.
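As an illustration of the naming convention, here is a minimal, hypothetical ASV suite (not one of the real ArcticDB benchmarks): ASV times the body of every time_ method and records peak memory usage for every peakmem_ method.

```python
import numpy as np


class NamingConventionExample:
    """Hypothetical minimal suite showing ASV's naming convention."""

    def setup(self):
        # Small array used by the benchmarks below.
        self.data = np.arange(1_000_000)

    def time_sum(self):
        # Methods prefixed with time_ have their runtime measured.
        self.data.sum()

    def peakmem_copy(self):
        # Methods prefixed with peakmem_ have their peak memory usage recorded.
        self.data.copy()
```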
Currently, we have the following 4 major groups of benchmarks:
- Basic functions - for benchmarking operations such as read/write/append/update, their batch variants, etc. against a local storage
- List functions - for benchmarking operations such as listing symbols, versions, etc. against a local storage
- Local query builder - for benchmarking QB against a local storage (e.g. LMDB)
- Persistent query builder - for benchmarking QB against a persistent storage (e.g. AWS S3), so that we can read bigger data sizes
It is important to understand how each benchmark run is set up and torn down (see the sketch after this list):
- setup_cache - is called only once, so any heavy computation should go in here (e.g. prepopulating some results in the DB)
- setup - is called before each benchmark run, so any setup that is light on computation should go in here (e.g. initializing the Arctic client)
- teardown - the opposite of setup; it is called after each benchmark run, so any cleanup that should be performed goes in here (be careful not to clean up a library/symbol that is needed by another benchmark)
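A minimal sketch of how these hooks fit together, assuming a hypothetical LMDB path, library and symbol (it is not one of the real classes in the benchmarks folder):

```python
import pandas as pd
from arcticdb import Arctic


class ReadBenchmarksSketch:
    """Hypothetical lifecycle sketch; paths, names and data sizes are illustrative only."""

    def setup_cache(self):
        # Called only once: do the heavy, shared work here, e.g. prepopulating
        # the library with the data that the benchmarks will read.
        ac = Arctic("lmdb://asv_benchmark_db")
        lib = ac.get_library("read_lib", create_if_missing=True)
        lib.write("large_symbol", pd.DataFrame({"a": range(1_000_000)}))

    def setup(self):
        # Called before each benchmark run: keep this light, e.g. just
        # re-create the Arctic client and library handles.
        self.ac = Arctic("lmdb://asv_benchmark_db")
        self.lib = self.ac.get_library("read_lib")

    def teardown(self):
        # Called after each benchmark run: clean up anything created during the
        # run, but do not delete libraries/symbols that other benchmarks need.
        self.lib = None
        self.ac = None

    def time_read_large_symbol(self):
        self.lib.read("large_symbol")
```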
There is a workflow that automatically benchmarks the latest master commit every night. If you need to run it manually for some reason, you can issue a manual build from here by clicking on the Run workflow menu. This will start a build that benchmarks only the latest version.
If you need to benchmark all commits that are tracked, make sure to select the run_all_benchmarks option.
To run ASV locally, you first need to make sure that you have some prerequisites installed (both can be installed with pip), namely:
- asv
- virtualenv
You also need to change the asv.conf.json file to point to your branch instead of master (e.g. "branches": ["some_branch"]), as shown below. If you have introduced any new hard dependencies, you need to add them to the matrix of dependencies that will be installed.
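For example, the relevant entries in asv.conf.json might look like the sketch below; the branch name and dependency names are placeholders, and the rest of the existing file should stay unchanged:

```json
{
    "branches": ["my_feature_branch"],
    "matrix": {
        "numpy": [""],
        "my_new_dependency": [""]
    }
}
```

In ASV's dependency matrix, an empty string means the latest available version of that package.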
After that you can simply run:
python -m asv run -v --show-stderr HEAD^! - if you want to benchmark only the latest commit
OR
python -m asv run -v --show-stderr HASHFILE:hashes_to_benchmark.txt - if you want to benchmark all commits that are tracked in hashes_to_benchmark.txt
Make sure to add any new commit hashes that need to be benchmarked to that file.
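hashes_to_benchmark.txt is expected to contain one commit hash (or tag/branch name) per line; the hashes below are placeholders only:

```text
1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b
9f8e7d6c5b4a3f2e1d0c9b8a7f6e5d4c3b2a1f0e
```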
The benchmarks take between 30 and 60 minutes per commit to run, depending on the machine. After the benchmarks have run successfully, you can view the results by executing the following commands:
python -m asv publish
python -m asv preview
This will recreate the graphs of the results and will serve them as a web page on localhost.
If you want to benchmark more than one commit (e.g. if you have added new benchmarks), it might be better to run them on a GitHub runner instead of locally. You will again need to change the asv.conf.json file to point to your branch instead of master (e.g. "branches": ["some_branch"]), and if you have introduced any new hard dependencies, you need to add them to the matrix of dependencies that will be installed.
Then push your changes and start a manual build from here. Make sure to select your branch and whether or not you want to run the benchmarks against all commits.
After the build finishes successfully, it will push the latest results to your branch, and you can pull them and view them by executing:
python -m asv publish
python -m asv preview
This will recreate the graphs of the results and will serve them as a web page on localhost.
You need to do this because your changes will not be served to the regular page that is available online; that page is updated only with the results from master.