In this commit, we add test vectors for filter and header construction and
the code to generate them. The included test vectors are for testnet with a
value of 20 for P. The code generates filters and headers for values of 1
through 32 for P using testnet blocks. Currently, to run the code,
the `Roasbeef` fork of `btcd` (at https://github.com/roasbeef/btcd) is
required to be running locally in testnet mode; this will be changed in a
future commit after the code is merged into the `btcsuite` mainline.

In this commit, we modify regular filter construction slightly. Rather
than including each data push in the script, we instead include just
the script itself, which will eventually be hashed. The rationale for
doing this is two-fold:
* Most scripts today and in the foreseeable future will just be a
  commitment.
* Including only the script itself, and not each pushed data item,
  reduces the worst-case filter size. Otherwise, an attacker could
  include a large number of 2-byte data pushes and blow up the filter
  size for all nodes.
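
The worst-case argument above can be illustrated with a rough sketch.
This is not the filter construction code itself; the helper names are
hypothetical, and the parser is simplified to handle only direct push
opcodes (0x01 through 0x4b), ignoring OP_PUSHDATA1/2/4. It only counts
how many filter elements each approach would contribute for one script.

```python
from typing import List, Set


def direct_pushes(script: bytes) -> List[bytes]:
    """Collect data pushed by direct push opcodes 0x01..0x4b.

    Simplification (assumption): OP_PUSHDATA1/2/4 are not handled.
    """
    items = []
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:
            # The opcode value is the number of bytes that follow.
            items.append(script[i:i + op])
            i += op
    return items


def per_push_elements(script: bytes) -> Set[bytes]:
    """Old approach: one filter element per distinct pushed data item."""
    return set(direct_pushes(script))


def per_script_elements(script: bytes) -> Set[bytes]:
    """New approach: the whole script is the single element."""
    return {script}


# An ordinary pay-to-pubkey-hash script: one 20-byte push.
p2pkh = bytes.fromhex("76a914") + bytes(20) + bytes.fromhex("88ac")

# A hostile script made of 1000 distinct 2-byte pushes (3000 bytes).
attack = b"".join(b"\x02" + k.to_bytes(2, "big") for k in range(1000))

print(len(per_push_elements(p2pkh)), len(per_script_elements(p2pkh)))
print(len(per_push_elements(attack)), len(per_script_elements(attack)))
```

For the commitment-style script both approaches yield a single element,
while the hostile script yields one element per push under the old
approach and still only one under the new approach.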

The BIP 39 French wordlist contained two significant technical errors:
- A Byte Order Mark (BOM), U+FEFF, at the beginning of the first line,
  preceding the word "abaisser".
- No newline character ('\n') terminating the last line, after
  "zoologie".
The former may cause users to lose funds. An implementation which
generates a mnemonic phrase and also turns it into a BIP 39 seed value
may feed the string "<U+FEFF>abaisser" to the KDF, while displaying the
word "abaisser" to the user. Of course, the user cannot be expected to
enter "<U+FEFF>abaisser" when attempting to restore a wallet.
In the face of a buggy wordlist, whitespace handling and normalization
cannot be absolutely relied on to remove a notoriously mischievous
character. Those who provide technical support may be well advised to
ask French users with unrestorable wallets, "Did your mnemonic phrase
contain the word 'abaisser'?"
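
A minimal sketch of the failure mode described above, assuming the
BIP 39 seed derivation (PBKDF2-HMAC-SHA512, 2048 iterations, salt
"mnemonic" plus passphrase, NFKD-normalized inputs). The function name
`bip39_seed` and the phrase used are illustrative only; the phrase is
not a checksum-valid mnemonic.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive a 64-byte seed the way BIP 39 specifies."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


BOM = "\ufeff"

# What the user sees and writes down (illustrative phrase only).
shown = " ".join(["abaisser"] * 12)
# What a buggy implementation may actually feed to the KDF.
fed = BOM + shown

# Neither NFKD normalization nor whitespace stripping removes U+FEFF.
print(unicodedata.normalize("NFKD", fed).startswith(BOM))
print(fed.strip().startswith(BOM))

# The two seeds differ, so the written-down phrase restores a
# different wallet than the one that received the funds.
print(bip39_seed(shown).hex()[:16])
print(bip39_seed(fed).hex()[:16])
```

Because U+FEFF is a format character rather than whitespace, a
restoring implementation would have to strip it explicitly.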
The latter broke the shell script I use to massage wordlists into C
sources when building https://github.com/nym-zone/easyseed .
I know of only one commonplace platform where software regularly
prepends UTF-8 files with a spurious U+FEFF, and oftentimes omits a line
terminator on the last line even when asked to create a Unix ('\n') text
file. It is RECOMMENDED that new wordlists be examined for correctness
using standard shell tools on a sane platform.
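
As a complement to shell tools, the same checks can be sketched in
Python. The function name `wordlist_problems` is hypothetical; it
operates on the raw bytes of a wordlist file and checks only for the
two errors described above, plus the 2048-word count that BIP 39
wordlists have.

```python
from typing import List

BOM_UTF8 = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def wordlist_problems(raw: bytes, expected_words: int = 2048) -> List[str]:
    """Return a list of human-readable problems found in a wordlist."""
    problems = []
    if raw.startswith(BOM_UTF8):
        problems.append("starts with a UTF-8 byte order mark")
    if not raw.endswith(b"\n"):
        problems.append("last line is not newline-terminated")
    # Count non-empty lines as words.
    words = [line for line in raw.split(b"\n") if line]
    if len(words) != expected_words:
        problems.append(
            "expected %d words, found %d" % (expected_words, len(words))
        )
    return problems


# Tiny two-word stand-ins for a real 2048-word list.
good = "abaisser\nzoologie\n".encode("utf-8")
bad = BOM_UTF8 + "abaisser\nzoologie".encode("utf-8")

print(wordlist_problems(good, expected_words=2))
print(wordlist_problems(bad, expected_words=2))
```

For a real file, the bytes could be read with
`open(path, "rb").read()`, where `path` is wherever the wordlist is
stored locally.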