We used to use a lot of small buffers for serialization, but now we'll
use one buffer large enough, and slice into it when needed.
``
name old time/op new time/op delta
CalcWitnessSigHash-8 31.5µs ± 0% 29.2µs ± 0% -7.05% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
CalcWitnessSigHash-8 19.9kB ± 0% 18.5kB ± 0% -7.14% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
CalcWitnessSigHash-8 801 ± 0% 445 ± 0% -44.44% (p=0.000 n=10+10)
```
In this commit, we optimize the sighash calc further by writing directly
into the buffer used for serialization by the sha256.New() instance
rather than to an intermediate buffer, which is then write to the hash
buffer.
In this commit, we update the top-level btcd package to use the latest
version of btcutil and also the chainhash package. With this version
bump, we can now use the new optimized dsha256 routine where applicable.
With this commit, I've covered most of the areas we'll hash an entire
transaction/block/header, but we may want to optimize some other areas
further, in particular, the witness sighash calc.
The testing function in export_test.go is changed to just export.go so
that callers outside the ffldb package will be able to call the
function.
The main use for this is so that the prune code can be triggered from
the blockchain package. This allows testing code to have less than
1.5GB worth of blocks to trigger the prune.
If the prune will delete block past the last flush hash of the
utxocache, the cache will need to be flushed first to avoid a case
where the utxocache is irrecoverable. The newly added code adds this
flush logic to connectBlock.
flushNeededAfterPrune returns true if the utxocache needs to be flushed
after the pruning of the given slice of block hashes. For the utxo
cache to be recoverable while pruning is enabled, we need to make sure
that there exists blocks since the last utxo cache flush. If there are
blocks that are deleted after the last utxo cache flush, the utxo set is
irrecoverable. The added method provides a way to tell if a flush is
needed.
PruneBlocks used to delete files immediately before the database
transaction finished. By making the prune atomic, we can guarantee that
the database flush will happen before the utxo cache is flushed,
ensuring that the utxo cache is never in an irrecoverable state.
This change is part of the effort to add utxocache support to btcd.
utxo cache is now used by the BlockChain struct. By default it's used
and the minimum cache is set to 250MiB. The change made helps speed up
block/tx validation as the cache allows for much faster lookup of utxos.
The initial block download in particular is improved as the db i/o
bottleneck is remedied by the cache.
The implemented utxocache implements connectTransactions just like
utxoviewpoint and can be used as a drop in replacement for
connectTransactions.
One thing to note is that unlike the utxoViewpoint, the utxocache
immediately deletes the spent entry from the cache. This means that the
utxocache is unfit for functions like checkConnectBlock where you expect
the entry to still exist but be marked as spent.
disconnectTransactions is purposely not implemented as using the cache
during reorganizations may leave the utxo state inconsistent if there is
an unexpected shutdown. The utxoViewpoint will still have to be used
for reorganizations.
This change is part of the effort to add utxocache support to btcd.
fetchInputUtxos had mainly 2 functions:
1: Figure out which outpoints to fetch
2: Call fetchUtxosMain to fetch those outpoints
Functionality for (1) is refactored out to fetchInputsToFetch. This is
done to allow fetchInputUtxos to use the cache to fetch the outpoints
as well in a later commit.
This change is part of the effort to add utxocache support to btcd.
Require the caller to pass in the utxoBucket as the caller may be
fetching many utxos in one loop. Having the caller pass it in removes
the need for dbFetchUtxoEntry to grab the bucket on every single fetch.
This change is part of the effort to add utxocache support to btcd.
connectBlock may have an empty utxoviewpoint as the block verification
process may be using the utxo cache directly. In that case, a nil utxo
viewpoint will be passed in. Just return early on a nil utxoviewpoint.
This change is part of the effort to add utxocache support to btcd.
dbPutUtxoView handled putting and deleting new/spent utxos from the
database. These two functinalities are refactored to their own
functions: dbDeleteUtxoEntry and dbPutUtxoEntry.
Refactoring these out allows the cache to call these two functions
directly instead of having to create a view and saving that view to
disk.
This change is part of the effort to add utxocache support to btcd.
The utxoStateConsistency indicates what the last block that the utxo
cache got flush at. This is useful for recovery purposes as if the node
is unexpectdly shut down, we know which block to start rebuilding the
utxo state from.
This change is part of the effort to add utxocache support to btcd.
Getting the memory usage of an entry is very useful for the utxo cache
as we need to know how much memory all the cached entries are using to
guarantee a cache usage limit for the end user.
This change is part of the effort to add utxocache support to btcd.
The fresh flag indicates that the entry is fresh and that the parent
view (the database) hasn't yet seen the entry. This is very useful as
a performance optimization for the utxo cache as if a fresh entry is
spent, we can simply remove it from the cache and don't bother trying to
delete it from the database.
This change is part of the effort to add utxocache support to btcd.
mapslice allows the caller to allocate a fixed amount of memory for the
utxo cache maps without the mapslice going over that fixed amount of
memory. This is useful as we can have variable sizes (1GB, 1.1GB, 2.3GB,
etc) while guaranteeing a memory limit.
This change is part of the effort to add utxocache support to btcd.
sizehelper introduces code for 2 main things:
1: Calculating how many entries to allocate for a map given a size
in bytes.
2: Calculating how much a map takes up in memory given the entries
were allocated for the map.
These functionality are useful for allocating maps so that they'll be
allocating below a certain number of bytes. Since go maps will always
allocate in powers of B (where B is the bucket size for the given map),
it may allocate too much memory. For example, for a map that can store
8GB of entries, the map will grow to be 16GB once the map is full and
the caller puts an extra entry onto the map.
If we want to give a memory guarantee to the user, we can either:
1: Limit the cache size to fixed sizes (4GB, 8GB, ...).
2: Allocate a slice of maps.
The sizehelper code helps with (2).
To simplify building the release-grade (stripped and
reproducible) binaries from source, we add the install and
release-install make goals. Running either of the commands will create
binaries in the $GOPATH/bin directories.
The main difference between the two goals is that the release-install
will not contain any local paths and no debug information.
The use of the GO111MODULE environment variable doesn't have any effect
anymore and hasn't for a couple of versions. The default was set to "on"
a while back, so we can remove that variable everywhere.
On startup, Ancestor call was taking a lot of time when the node was
loading the blockindex onto memory. This change speeds up the Ancestor
function significantly and speeds up the node during startup.
On testnet3 at blockheight ~2,500,000, the startup was around 30seconds
on current main and was 5 seconds with this change. Below is a benchstat
result showing the significant speedup.
goos: darwin
goarch: arm64
pkg: github.com/utreexo/utreexod/blockchain
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Ancestor-8 120819.301µ ± 5% 7.013µ ± 19% -99.99% (p=0.000 n=10)
│ old.txt │ new.txt │
│ B/op │ B/op vs base │
Ancestor-8 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
│ old.txt │ new.txt │
│ allocs/op │ allocs/op vs base │
Ancestor-8 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
isSyncCandidate is now changed to return true even if the peer is a
pruned node if and only if our chaintip is within 288 blocks of the
peer.
Rationale:
Pruned nodes that signal NODE_NETWORK_LIMITED MUST serve 288 blocks from
their chaintip. If our chaintip is within that range, this peer can be
a sync candidate even if they aren't an archival node.
DoubleHashRaw provides a simple function for doing double hashes. Since
it uses the digest instead of making the caller allocate a byte slice, it
can be more memory efficient vs other double hash functions.
now testing that GetBestBlockHashAsync sends the getbestblockhash command via websocket connection and that the channel returned can be used to send the response when it is received