Bitcoin Core integration/staging tree
Find a file
Ava Chow 43e71f7498
Merge bitcoin/bitcoin#27432: contrib: add tool to convert compact-serialized UTXO set to SQLite database
4080b66cbe test: add test for utxo-to-sqlite conversion script (Sebastian Falbesoner)
ec99ed7380 contrib: add tool to convert compact-serialized UTXO set to SQLite database (Sebastian Falbesoner)

Pull request description:

  ## Problem description

  There is demand from users to get the UTXO set in form of a SQLite database (#24628). Bitcoin Core currently only supports dumping the UTXO set in a binary _compact-serialized_ format, which was crafted specifically for AssumeUTXO snapshots (see PR #16899), with the primary goal of being as compact as possible. Previous PRs tried to extend the `dumptxoutset` RPC with new formats, either in human-readable form (e.g. #18689, #24202), or most recently, directly as SQLite database (#24952). Both are not optimal: due to the huge size of the ever-growing UTXO set with already more than 80 million entries on mainnet, human-readable formats are practically useless, and very likely one of the first steps would be to put them in some form of database anyway. Directly adding SQLite3 dumping support on the other hand introduces an additional dependency to the non-wallet part of bitcoind and the risk of increased maintenance burden (see e.g. https://github.com/bitcoin/bitcoin/pull/24952#issuecomment-1163551060, https://github.com/bitcoin/bitcoin/issues/24628#issuecomment-1108469715).

  ## Proposed solution

  This PR follows the "external tooling" route by adding a simple Python script for achieving the same goal in a two-step process (first create compact-serialized UTXO set via `dumptxoutset`, then convert it to SQLite via the new script). Executive summary:
  - single file, no extra dependencies (sqlite3 is included in Python's standard library [1])
  - ~150 LOC, mostly deserialization/decompression routines ported from the Core codebase and (probably the most difficult part) a little elliptic curve / finite field math to decompress pubkeys (essentialy solving the secp256k1 curve equation y^2 = x^3 + 7 for y given x, respecting the proper polarity as indicated by the compression tag)
  - creates a database with only one table `utxos` with the following schema:
    ```(txid TEXT, vout INT, value INT, coinbase INT, height INT, scriptpubkey TEXT)```
  - the resulting file has roughly 2x the size of the compact-serialized UTXO set (this is mostly due to encoding txids and scriptpubkeys as hex-strings rather than bytes)

  [1] note that there are some rare cases of operating systems like FreeBSD though, where the sqlite3 module has to installed explicitly (see #26819)

  A functional test is also added that creates UTXO set entries with various output script types (standard and also non-standard, for e.g. large scripts) and verifies that the UTXO sets of both formats match by comparing corresponding MuHashes. One MuHash is supplied by the bitcoind instance via `gettxoutsetinfo muhash`, the other is calculated in the test by reading back the created SQLite database entries and hashing them with the test framework's `MuHash3072` module.

  ## Manual test instructions
  I'd suggest to do manual tests also by comparing MuHashes. For that, I've written a go tool some time ago which would calculate the MuHash of a sqlite database in the created format (I've tried to do a similar tool in Python, but it's painfully slow).
  ```
  $ [run bitcoind instance with -coinstatsindex]
  $ ./src/bitcoin-cli dumptxoutset ~/utxos.dat
  $ ./src/bitcoin-cli gettxoutsetinfo muhash <block height returned in previous call>
  (outputs MuHash calculated from node)

  $ ./contrib/utxo-tools/utxo_to_sqlite.py ~/utxos.dat ~/utxos.sqlite
  $ git clone https://github.com/theStack/utxo_dump_tools
  $ cd utxo_dump_tools/calc_utxo_hash
  $ go run calc_utxo_hash.go ~/utxos.sqlite
  (outputs MuHash calculated from the SQLite UTXO set)

  => verify that both MuHashes are equal
  ```
  For a demonstration what can be done with the resulting database, see https://github.com/bitcoin/bitcoin/pull/24952#pullrequestreview-956290477 for some example queries. Thanks go to LarryRuane who gave me to the idea of rewriting this script in Python and adding it to `contrib`.

ACKs for top commit:
  ajtowns:
    ACK 4080b66cbe - light review
  achow101:
    ACK 4080b66cbe
  romanz:
    tACK 4080b66cbe on signet (using [calc_utxo_hash](8981aa3e85/calc_utxo_hash/calc_utxo_hash.go)):
  tdb3:
    ACK 4080b66cbe

Tree-SHA512: be8aa0369a28c8421a3ccdf1402e106563dd07c082269707311ca584d1c4c8c7b97d48c4fcd344696a36e7ab8cdb64a1d0ef9a192a15cff6d470baf21e46ee7b
2025-02-14 15:22:10 -08:00
.github Merge bitcoin/bitcoin#31428: ci: Allow build dir on CI host 2025-01-30 19:34:00 -05:00
.tx Update Transifex slug for 29.x 2025-02-06 09:38:49 +00:00
ci Merge bitcoin/bitcoin#31844: cmake: add a component for each binary 2025-02-14 14:19:12 +01:00
cmake Merge bitcoin/bitcoin#31359: cmake: Add CheckLinkerSupportsPIE module 2025-02-14 18:02:35 +01:00
contrib Merge bitcoin/bitcoin#27432: contrib: add tool to convert compact-serialized UTXO set to SQLite database 2025-02-14 15:22:10 -08:00
depends depends: avoid an unset CMAKE_OBJDUMP 2025-02-13 13:02:53 +01:00
doc Merge bitcoin/bitcoin#30529: Fix -norpcwhitelist, -norpcallowip, and similar corner case behavior 2025-02-14 15:10:09 -08:00
share build: Rename PACKAGE_* variables to CLIENT_* 2024-10-28 12:35:55 +00:00
src Merge bitcoin/bitcoin#30529: Fix -norpcwhitelist, -norpcallowip, and similar corner case behavior 2025-02-14 15:10:09 -08:00
test Merge bitcoin/bitcoin#27432: contrib: add tool to convert compact-serialized UTXO set to SQLite database 2025-02-14 15:22:10 -08:00
.cirrus.yml ci: Bump fuzz task timeout 2025-02-06 22:21:48 +01:00
.editorconfig code style: update .editorconfig file 2024-09-13 17:55:10 +02:00
.gitattributes Separate protocol versioning from clientversion 2014-10-29 00:24:40 -04:00
.gitignore build: Remove Autotools-based build system 2024-08-30 21:31:39 +01:00
.python-version Bump python minimum supported version to 3.10 2024-08-28 15:53:07 +02:00
.style.yapf Update .style.yapf 2023-06-01 23:35:10 +05:30
CMakeLists.txt Merge bitcoin/bitcoin#31359: cmake: Add CheckLinkerSupportsPIE module 2025-02-14 18:02:35 +01:00
CMakePresets.json cmake: Remove unused BUILD_TESTING variable from "dev-mode" preset 2024-12-19 22:25:11 +00:00
CONTRIBUTING.md doc: remove PR Review Club frequency 2024-11-20 11:16:39 +01:00
COPYING doc: upgrade license to 2025. 2025-01-06 12:23:11 +00:00
INSTALL.md doc: Added hyperlink for doc/build 2021-09-09 19:53:12 +05:30
libbitcoinkernel.pc.in build: use CLIENT_NAME in libbitcoinkernel.pc.in 2025-02-07 16:11:48 +00:00
README.md doc: cmake: prepend and explain "build/" where needed 2024-10-11 11:24:21 -06:00
SECURITY.md Update security.md contact for achow101 2023-12-14 18:14:54 -05:00
vcpkg.json Remove wallet::ParseISO8601DateTime, use ParseISO8601DateTime instead 2024-12-02 15:09:31 +01:00

Bitcoin Core integration/staging tree

https://bitcoincore.org

For an immediately usable, binary version of the Bitcoin Core software, see https://bitcoincore.org/en/download/.

What is Bitcoin Core?

Bitcoin Core connects to the Bitcoin peer-to-peer network to download and fully validate blocks and transactions. It also includes a wallet and graphical user interface, which can be optionally built.

Further information about Bitcoin Core is available in the doc folder.

License

Bitcoin Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Development Process

The master branch is regularly built (see doc/build-*.md for instructions) and tested, but it is not guaranteed to be completely stable. Tags are created regularly from release branches to indicate new official, stable release versions of Bitcoin Core.

The https://github.com/bitcoin-core/gui repository is used exclusively for the development of the GUI. Its master branch is identical in all monotree repositories. Release branches and tags do not exist, so please do not fork that repository unless it is for development reasons.

The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.

Testing

Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.

Automated Testing

Developers are strongly encouraged to write unit tests for new code, and to submit new unit tests for old code. Unit tests can be compiled and run (assuming they weren't disabled during the generation of the build system) with: ctest. Further details on running and extending unit tests can be found in /src/test/README.md.

There are also regression and integration tests, written in Python. These tests can be run (if the test dependencies are installed) with: build/test/functional/test_runner.py (assuming build is your build directory).

The CI (Continuous Integration) systems make sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.

Manual Quality Assurance (QA) Testing

Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.

Translations

Changes to translations as well as new translations can be submitted to Bitcoin Core's Transifex page.

Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.

Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.