bitcoin/doc/benchmarking.md

Benchmarking
============

Bitcoin Core has an internal benchmarking framework, with benchmarks
for cryptographic algorithms (e.g. SHA1, SHA256, SHA512, RIPEMD160, Poly1305, ChaCha20), rolling bloom filter, coins selection,
thread queue, wallet balance.

Running
---------------------

For benchmarks purposes you only need to compile `bitcoin_bench`. Beware of configuring without `--enable-debug` as this would impact
benchmarking by unlatching log printers and lock analysis.

    make -C src bitcoin_bench

After compiling bitcoin-core, the benchmarks can be run with:

    src/bench/bench_bitcoin

The output will look similar to:
```
|             ns/byte |              byte/s | error % | benchmark
|--------------------:|--------------------:|--------:|:----------------------------------------------
|               64.13 |       15,592,356.01 |    0.1% | `Base58CheckEncode`
|               24.56 |       40,722,672.68 |    0.2% | `Base58Decode`
...
```

Help
---------------------

    src/bench/bench_bitcoin --help

To print options like scaling factor or per-benchmark filter.

Notes
---------------------
More benchmarks are needed for, in no particular order:
- Script Validation
- Coins database
- Memory pool
- Cuckoo Cache
- P2P throughput

Going Further
--------------------

To monitor Bitcoin Core performance more in depth (like reindex or IBD): https://github.com/chaincodelabs/bitcoinperf

To generate Flame Graphs for Bitcoin Core: https://github.com/eklitzke/bitcoin/blob/flamegraphs/doc/flamegraphs.md
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00			`Benchmarking`
			`============`

			`Bitcoin Core has an internal benchmarking framework, with benchmarks`
[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00			`for cryptographic algorithms (e.g. SHA1, SHA256, SHA512, RIPEMD160, Poly1305, ChaCha20), rolling bloom filter, coins selection,`
			`thread queue, wallet balance.`
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00
Updating benchmarkmarking.md with an updated sample output and help options 2018-01-15 02:53:41 +01:00			`Running`
			`---------------------`
[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00
			For benchmarks purposes you only need to compile `bitcoin_bench`. Beware of configuring without `--enable-debug` as this would impact
			`benchmarking by unlatching log printers and lock analysis.`

[doc] fix Makefile target in benchmarking.md while the resulting binary is called `bench_bitcoin`, the Makefile target is named `bitcoin_bench` (see `src/Makefile.bench.include`) 2019-10-08 22:14:01 +02:00			`make -C src bitcoin_bench`
[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00			`After compiling bitcoin-core, the benchmarks can be run with:`
Updating benchmarkmarking.md with an updated sample output and help options 2018-01-15 02:53:41 +01:00
			`src/bench/bench_bitcoin`
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00
			`The output will look similar to:`
			```
Replace current benchmarking framework with nanobench This replaces the current benchmarking framework with nanobench [1], an MIT licensed single-header benchmarking library, of which I am the autor. This has in my opinion several advantages, especially on Linux: * fast: Running all benchmarks takes ~6 seconds instead of 4m13s on an Intel i7-8700 CPU @ 3.20GHz. * accurate: I ran e.g. the benchmark for SipHash_32b 10 times and calculate standard deviation / mean = coefficient of variation: * 0.57% CV for old benchmarking framework * 0.20% CV for nanobench So the benchmark results with nanobench seem to vary less than with the old framework. * It automatically determines runtime based on clock precision, no need to specify number of evaluations. * measure instructions, cycles, branches, instructions per cycle, branch misses (only Linux, when performance counters are available) * output in markdown table format. * Warn about unstable environment (frequency scaling, turbo, ...) * For better profiling, it is possible to set the environment variable NANOBENCH_ENDLESS to force endless running of a particular benchmark without the need to recompile. This makes it to e.g. run "perf top" and look at hotspots. Here is an example copy & pasted from the terminal output: \| ns/byte \| byte/s \| err% \| ins/byte \| cyc/byte \| IPC \| bra/byte \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 2.52 \| 396,529,415.94 \| 0.6% \| 25.42 \| 8.02 \| 3.169 \| 0.06 \| 0.0% \| 0.03 \| `bench/crypto_hash.cpp RIPEMD160` \| 1.87 \| 535,161,444.83 \| 0.3% \| 21.36 \| 5.95 \| 3.589 \| 0.06 \| 0.0% \| 0.02 \| `bench/crypto_hash.cpp SHA1` \| 3.22 \| 310,344,174.79 \| 1.1% \| 36.80 \| 10.22 \| 3.601 \| 0.09 \| 0.0% \| 0.04 \| `bench/crypto_hash.cpp SHA256` \| 2.01 \| 496,375,796.23 \| 0.0% \| 18.72 \| 6.43 \| 2.911 \| 0.01 \| 1.0% \| 0.00 \| `bench/crypto_hash.cpp SHA256D64_1024` \| 7.23 \| 138,263,519.35 \| 0.1% \| 82.66 \| 23.11 \| 3.577 \| 1.63 \| 0.1% \| 0.00 \| `bench/crypto_hash.cpp SHA256_32b` \| 3.04 \| 328,780,166.40 \| 0.3% \| 35.82 \| 9.69 \| 3.696 \| 0.03 \| 0.0% \| 0.03 \| `bench/crypto_hash.cpp SHA512` [1] https://github.com/martinus/nanobench * Adds support for asymptotes This adds support to calculate asymptotic complexity of a benchmark. This is similar to #17375, but currently only one asymptote is supported, and I have added support in the benchmark `ComplexMemPool` as an example. Usage is e.g. like this: ``` ./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800 ``` This runs the benchmark `ComplexMemPool` several times but with different complexityN settings. The benchmark can extract that number and use it accordingly. Here, it's used for `childTxs`. The output is this: \| complexityN \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| total \| benchmark \|------------:\|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|----------:\|:---------- \| 25 \| 1,064,241.00 \| 939.64 \| 1.4% \| 3,960,279.00 \| 2,829,708.00 \| 1.400 \| 0.01 \| `ComplexMemPool` \| 50 \| 1,579,530.00 \| 633.10 \| 1.0% \| 6,231,810.00 \| 4,412,674.00 \| 1.412 \| 0.02 \| `ComplexMemPool` \| 100 \| 4,022,774.00 \| 248.58 \| 0.6% \| 16,544,406.00 \| 11,889,535.00 \| 1.392 \| 0.04 \| `ComplexMemPool` \| 200 \| 15,390,986.00 \| 64.97 \| 0.2% \| 63,904,254.00 \| 47,731,705.00 \| 1.339 \| 0.17 \| `ComplexMemPool` \| 400 \| 69,394,711.00 \| 14.41 \| 0.1% \| 272,602,461.00 \| 219,014,691.00 \| 1.245 \| 0.76 \| `ComplexMemPool` \| 600 \| 168,977,165.00 \| 5.92 \| 0.1% \| 639,108,082.00 \| 535,316,887.00 \| 1.194 \| 1.86 \| `ComplexMemPool` \| 800 \| 310,109,077.00 \| 3.22 \| 0.1% \|1,149,134,246.00 \| 984,620,812.00 \| 1.167 \| 3.41 \| `ComplexMemPool` \| coefficient \| err% \| complexity \|--------------:\|-------:\|------------ \| 4.78486e-07 \| 4.5% \| O(n^2) \| 6.38557e-10 \| 21.7% \| O(n^3) \| 3.42338e-05 \| 38.0% \| O(n log n) \| 0.000313914 \| 46.9% \| O(n) \| 0.0129823 \| 114.4% \| O(log n) \| 0.0815055 \| 133.8% \| O(1) The best fitting curve is O(n^2), so the algorithm seems to scale quadratic with `childTxs` in the range 25 to 800. 2020-06-13 09:37:27 +02:00			`\| ns/byte \| byte/s \| error % \| benchmark`
			`\|--------------------:\|--------------------:\|--------:\|:----------------------------------------------`
			\| 64.13 \| 15,592,356.01 \| 0.1% \| `Base58CheckEncode`
			\| 24.56 \| 40,722,672.68 \| 0.2% \| `Base58Decode`
[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00			`...`
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00			```

Updating benchmarkmarking.md with an updated sample output and help options 2018-01-15 02:53:41 +01:00			`Help`
			`---------------------`

[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00			`src/bench/bench_bitcoin --help`

			`To print options like scaling factor or per-benchmark filter.`
Updating benchmarkmarking.md with an updated sample output and help options 2018-01-15 02:53:41 +01:00
			`Notes`
			`---------------------`
[Doc] Add benchmarking notes 2016-05-27 10:01:14 +02:00			`More benchmarks are needed for, in no particular order:`
			`- Script Validation`
			`- Coins database`
			`- Memory pool`
[doc] Update and extend benchmarking.md 2019-08-02 18:11:04 +02:00			`- Cuckoo Cache`
			`- P2P throughput`

			`Going Further`
			`--------------------`

			`To monitor Bitcoin Core performance more in depth (like reindex or IBD): https://github.com/chaincodelabs/bitcoinperf`

			`To generate Flame Graphs for Bitcoin Core: https://github.com/eklitzke/bitcoin/blob/flamegraphs/doc/flamegraphs.md`