This is to mitigate CVE-2017-12842. Along the way, also error when
deserializing transactions that have the witness marker flag set
but have no witnesses. This matches Bitcoin Core's behaviour initially
introduced here https://github.com/bitcoin/bitcoin/pull/14039. Allowing
such transactions is benign, but this makes sure that our parsing code
matches Core's exactly.
In this commit, we update the top-level btcd package to use the latest
version of btcutil and also the chainhash package. With this version
bump, we can now use the new optimized dsha256 routine where applicable.
With this commit, I've covered most of the areas we'll hash an entire
transaction/block/header, but we may want to optimize some other areas
further, in particular, the witness sighash calc.
In this commit, we fix a bug that would cause nodes to be unable to
parse a given block from the wire. The block would be properly accepted
if fed in via other mechanisms.
The issue here is that the old checks for the maximum witness size,
circa segwit v0 where placed in the wire package _as well_ as the tx
engine. This check should only be in the engine, since it's properly
gated by other related scrip validation flags.
The fix itself is simple: limit witnesses only based on the maximum
block size in bytes, or ~4MB.
In this commit, we add a total of 2760 taproot reference tests generated
by the bitcoind functional tests located at:
https://github.com/bitcoin/bitcoin/blob/master/test/functional/feature_taproot.py.
The tests aren't deterministic (fresh private keys are generated), so we
time we go to update the set of tests, we'll end up with fresh hashes
(the file name is the sha1 of the raw json test) and tests.
Summary of changes:
- Add a new const TxFlagMarker to indicate the flag prefix byte.
- Add a new TxFlag type to enumerate the flags supported by the
tx parser.
This allows us to avoid hardcoded magics, and will make it easier
to support new flags in future.
- Improve code comments.
Closes#1598.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level. This was inspired by similar recent
changes in Bitcoin Core.
The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.
The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.
However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.
While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either. Consequently, this also removes the version field of the
gettxout RPC result.
The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.
Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience. The data has already been converted and updated in the
commit.
An overview of the changes are as follows:
- Remove transaction version from both spent and unspent output entries
- Update utxo serialization format to exclude the version
- Modify the spend journal serialization format
- The old version field is now reserved and always stores zero and
ignores it when reading
- This allows old entries to be used by new code without having to
migrate the entire spend journal
- Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
transaction with all remaining utxos
- Optimize for memory usage with an eye towards a utxo cache
- Combine details such as whether the txout was contained in a
coinbase, is spent, and is modified into a single packed field of
bit flags
- Align entry fields to eliminate extra padding since ultimately
there will be a lot of these in memory
- Introduce a free list for serializing an outpoint to the database
key format to significantly reduce pressure on the GC
- Update all related functions that previously dealt with transaction
hashes to accept outpoints instead
- Update all callers accordingly
- Only add individually requested outputs from the mempool when
constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
information with every spent txout
- Introduce code to handle fetching the missing information from
another utxo from the same transaction in the event an old style
entry is encountered
- Make use of a database cursor with seek to do this much more
efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
- Store versions of the utxoset and spend journal buckets
- Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
longer apply, and add new ones for the new expected behavior
- Convert old tests for the legacy utxo format deserialization code to
test the new function that is used during upgrade
- Update the utxostore test data and add function that was used to
convert it
- Introduce a few new functions on UtxoViewpoint
- AddTxOut for adding an individual txout versus all of them
- addTxOut to handle the common code between the new AddTxOut and
existing AddTxOuts
- RemoveEntry for removing an individual txout
- fetchEntryByHash for fetching any remaining utxo for a given
transaction hash
This commit implements most of BIP0143 by adding logic to implement the
new sighash calculation, signing, and additionally introduces the
HashCache optimization which eliminates the O(N^2) computational
complexity for the SIGHASH_ALL sighash type.
The HashCache struct is the equivalent to the existing SigCache struct,
but for caching the reusable midstate for transactions which are
spending segwitty outputs.
This commit implements the new witness encoding/decoding for
transactions as specified by BIP0144. After segwit activation, a
special transaction encoding is used to signal to upgraded nodes that
the transaction being deserialized bares witness data. The prior
BtcEncode and BtcDecode methods have been extended to be aware of the
new signaling bytes and the encoding of witness data within
transactions.
Additionally, a new method has been added to calculate the “stripped
size” of a transaction/block which is defined as the size of a
transaction/block *excluding* any witness data.
This commit modifies the existing wire.Message interface to introduce a
new MessageEncoding variant which dictates the exact encoding to be
used when serializing and deserializing messages. Such an option is now
necessary due to the segwit soft-fork package, as btcd will need to be
able to optionally encode transactions/blocks without witness data to
un-upgraded peers.
Two new functions have been introduced: ReadMessageWithEncodingN and
WriteMessageWithEncodingN which wrap BtcDecode/BtcEncode with the
desired encoding format.
This simplifies the code based on the recommendations of the gosimple
lint tool.
Also, it increases the deadline for the linters to run to 10 minutes and
reduces the number of threads that is uses. This is being done because
the Travis environment has become increasingly slower and it also seems
to be hampered by too many threads running concurrently.
This modifies the NewMsgTx function to accept the transaction version as
a parameter and updates all callers.
The reason for this change is so the transaction version can be bumped
in wire without breaking existing tests and to provide the caller with
the flexibility to create the specific transaction version they desire.
This commit introduces the concept of “sequence locks” borrowed from
Bitcoin Core for converting an input’s relative time-locks to an
absolute value based on a particular block for input maturity
evaluation.
A sequence lock is computed as the most distant maturity height/time
amongst all the referenced outputs within a particular transaction.
A transaction with sequence locks activated within any of its inputs
can *only* be included within a block if from the point-of-view of that
block either the time-based or height-based maturity for all referenced
inputs has been met.
A transaction with sequence locks can only be accepted to the mempool
iff from the point-of-view of the *next* (yet to be found block) all
referenced inputs within the transaction are mature.
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
This commit drastically reduces the number of allocations needed to
deserialize a transaction and its scripts by using the combination of a
free list for initially deserializing the individual scripts along with
copying them into a single contiguous byte slice after the final size is
known and modifying each script in the transaction to point to its
location within the contiguous blob.
The end result is only a single allocation that holds all of the scripts
for a transaction regardless of the total number of scripts it has.
The script free list allows a maximum of 12,500 items with each buffer
being 512 bytes. This implies it will have a peak usage of 6.1MB. The
values were chosen based on profiling data and a desire to allow at
least 100 scripts per transaction to be simultaneously deserialized by
125 peers.
Also, while optimizing, decode directly into the existing previous
outpoint structure of each transaction input in order to avoid the extra
allocation per input that is otherwise caused when the local escapes to
the heap.
The following is a before and after comparison of the allocations
with the benchmarks that did not change removed:
benchmark old allocs new allocs delta
-----------------------------------------------------------
ReadTxOut 1 0 -100.00%
ReadTxIn 2 0 -100.00%
DeserializeTxSmall 7 5 -28.57%
DeserializeTxLarge 11146 6 -99.95%