Bitcoin Core integration/staging tree
Find a file
MarcoFalke 2583966130
Merge #19478: Remove CTxMempool::mapLinks data structure member
296be8f58e Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin)
46d955d196 Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin)

Pull request description:

  Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free.

  Maplinks is particularly peculiar because removing it is not as simple as just moving it's inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is "aware" of the boost multiindex (mapTx) it's coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry.

  It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves.

  The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff).

  ### Simpler ownership model
  No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry.

  ### Memory Usage
  We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure.

  ### Performance
  If you have a CTxMemPoolEntry, you immediately know the address of it's children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in *so many* places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.

  The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster.
  ```
  Before:
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793

  After:
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596
  ```
  The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn't test everything.

  Subbing in 5000 transactions shows a that the advantage isn't completely wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool), but it's only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users.

  ```
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068

  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306
  ```
  I don't think it's possible to come up with an example of where a maplinks based design would have better performance, but it's something for reviewers to consider.

  # Discussion
  ## Why maplinks in the first place?

  I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map.

  ## Is iterator_to weird?

  iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to

  >  iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:
  >
  >     - Interoperability with preexisting APIs based on pointers or references.
  >     - Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
  >     - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
  >     - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.

  In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined).

  Edit by laanwj: removed at sign from the description

ACKs for top commit:
  jonatack:
    re-ACK 296be8f per `git range-diff ab338a19 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean.
  hebasto:
    re-ACK 296be8f58e, only rebased since my [previous](https://github.com/bitcoin/bitcoin/pull/19478#pullrequestreview-482400727) review (verified with `git range-diff`).

Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e
2020-09-07 12:06:55 +02:00
.github doc: Remove label from good first issue template 2020-08-24 09:31:24 +02:00
.tx tx: Bump transifex slug to 020x 2020-03-16 10:52:55 +01:00
build-aux/m4 Merge #17396: build: modest Android improvements 2020-08-24 21:27:29 +08:00
build_msvc Move Win32 defines to configure.ac to ensure they are globally defined 2020-08-20 17:55:06 +00:00
ci util: Hard code previous release tarball checksums 2020-08-29 11:28:53 +03:00
contrib scripted-diff: Move previous_release.py to test/get_previous_releases.py 2020-08-29 11:26:25 +03:00
depends Merge #19685: depends: CMake invocation cleanup 2020-09-02 21:03:05 +08:00
doc Merge #19405: rpc, cli: add network in/out connections to getnetworkinfo and -getinfo 2020-09-04 15:09:37 +02:00
share doc: Use precise permission flags where possible 2020-07-10 15:37:42 +02:00
src Merge #19478: Remove CTxMempool::mapLinks data structure member 2020-09-07 12:06:55 +02:00
test Merge #19619: Remove wallet.dat path handling from wallet.cpp, rpcwallet.cpp 2020-09-07 11:45:36 +12:00
.appveyor.yml Update the vcpkg checkout commit ID in appveyor config. 2020-08-31 08:10:02 +01:00
.cirrus.yml ci: Double tsan CPU and Memory to avoid global timeout 2020-09-05 14:02:55 +02:00
.fuzzbuzz.yml ci: Add fuzzbuzz integration 2020-04-14 16:38:26 +00:00
.gitattributes Separate protocol versioning from clientversion 2014-10-29 00:24:40 -04:00
.gitignore build: Add missed fuzz.coverage/ directory to .gitignore 2020-08-08 23:52:18 +03:00
.python-version .python-version: Specify full version 3.5.6 2019-03-02 12:06:26 -05:00
.style.yapf test: .style.yapf: Set column_limit=160 2019-03-04 18:28:13 -05:00
.travis.yml ci: Run valgrind fuzzer on cirrus 2020-08-17 11:52:02 +02:00
autogen.sh scripted-diff: Bump copyright of files changed in 2019 2019-12-30 10:42:20 +13:00
configure.ac Merge #19803: Bugfix: Define and use HAVE_FDATASYNC correctly outside LevelDB 2020-08-31 13:07:24 +08:00
CONTRIBUTING.md Replace hidden service with onion service 2020-08-07 14:55:02 +02:00
COPYING doc: Update license year range to 2020 2019-12-26 23:11:21 +01:00
INSTALL.md Update INSTALL landing redirection notice for build instructions. 2016-10-06 12:27:23 +13:00
libbitcoinconsensus.pc.in build: remove libcrypto as internal dependency in libbitcoinconsensus.pc 2019-11-19 15:03:44 +01:00
Makefile.am build: add /usr/local/ to LCOV_FILTER_PATTERN for macOS builds 2020-09-02 15:17:13 -04:00
README.md doc: Mention repo split in the READMEs 2020-06-08 10:06:14 -04:00
SECURITY.md doc: Remove explicit mention of version from SECURITY.md 2019-06-14 06:39:17 -04:00

Bitcoin Core integration/staging tree

https://bitcoincore.org

What is Bitcoin?

Bitcoin is an experimental digital currency that enables instant payments to anyone, anywhere in the world. Bitcoin uses peer-to-peer technology to operate with no central authority: managing transactions and issuing money are carried out collectively by the network. Bitcoin Core is the name of open source software which enables the use of this currency.

For more information, as well as an immediately usable, binary version of the Bitcoin Core software, see https://bitcoincore.org/en/download/, or read the original whitepaper.

License

Bitcoin Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Development Process

The master branch is regularly built (see doc/build-*.md for instructions) and tested, but it is not guaranteed to be completely stable. Tags are created regularly from release branches to indicate new official, stable release versions of Bitcoin Core.

The https://github.com/bitcoin-core/gui repository is used exclusively for the development of the GUI. Its master branch is identical in all monotree repositories. Release branches and tags do not exist, so please do not fork that repository unless it is for development reasons.

The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.

Testing

Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.

Automated Testing

Developers are strongly encouraged to write unit tests for new code, and to submit new unit tests for old code. Unit tests can be compiled and run (assuming they weren't disabled in configure) with: make check. Further details on running and extending unit tests can be found in /src/test/README.md.

There are also regression and integration tests, written in Python, that are run automatically on the build server. These tests can be run (if the test dependencies are installed) with: test/functional/test_runner.py

The Travis CI system makes sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.

Manual Quality Assurance (QA) Testing

Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.

Translations

Changes to translations as well as new translations can be submitted to Bitcoin Core's Transifex page.

Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.

Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.

Translators should also subscribe to the mailing list.