mirror of
https://github.com/bitcoin/bips.git
synced 2025-01-18 21:35:13 +01:00
<pre>
  BIP: BIP-0039
  Title: Mnemonic code for generating deterministic keys
  Author: Pavol Rusnak <stick@gk2.sk>
          Marek Palatinus <info@bitcoin.cz>
          Aaron Voisine <voisine@gmail.com>
  Status: Draft
  Type: Standards Track
  Created: 10-09-2013
</pre>

==Abstract==

This BIP proposes a scheme for translating binary data (usually master seeds
for deterministic keys, but it can be applied to any binary data) into a group
of easy-to-remember words, also known as a mnemonic code or mnemonic sentence.

==Motivation==

Such a mnemonic code or mnemonic sentence is much easier to work with than
the binary data itself (or its hexadecimal representation). The sentence
can be written down on paper (e.g. for storage in a secure location such as
a safe), told over the telephone or another voice communication method, or
memorized (this method is called a brainwallet).

==Backwards Compatibility==

At the time of writing, only one Bitcoin client (Electrum) implements mnemonic
codes, but it uses a different wordlist from the one proposed here.

For compatibility reasons we propose adding a checkbox to Electrum that lets
the user indicate during import whether the code being entered is a legacy one
or a new one that is BIP-0039 compatible. For exporting, only the new format
will be used, so this is not an issue.

==Rationale==

Our proposal is inspired by the implementation used in Electrum, but we enhanced
the wordlist and algorithm so that they meet the following criteria:

a) smart selection of words
- the wordlist is created in such a way that typing just the first four
letters is enough to unambiguously identify a word

b) similar words avoided
- word pairs such as "build" and "built", "woman" and "women" or "quick" and
"quickly" not only make remembering the sentence difficult, but are also more
error prone and more difficult to guess (see point below)
- we avoid such pairs by carefully checking each word before it is added to
the wordlist

c) sorted wordlists
- the wordlist is sorted, which allows more efficient lookup of the code words
(i.e. an implementation can use binary search instead of linear search)
- this also allows a trie (prefix tree) to be used, e.g. for better compression

d) localized wordlists
- we would like to allow localized wordlists, so that it is easier for users
to remember the code in their native language
- by using wordlists with no colliding words among languages, it is easy to
determine which language was used just by checking the first word of
the sentence

e) mnemonic checksum
- this leads to a better user experience, because the user can be notified
if the mnemonic sequence is wrong, instead of being shown confusing
data generated from the wrong sequence

f) seed stretching
- before encoding and after decoding, the binary sequence is
stretched using a symmetric cipher (Blowfish) in order to prevent
brute-force attacks in case some of the mnemonic words are leaked
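
Criteria a), c) and d) can be checked mechanically. The following is a minimal Python sketch, not part of the proposal: the six-word `WORDS` list is a stand-in for a real 2048-word list, and the helper names `has_unique_prefixes`, `word_index` and `detect_language` are hypothetical.

```python
from bisect import bisect_left

# Tiny stand-in wordlist (assumption); a real wordlist has 2048 entries.
WORDS = sorted(["abandon", "ability", "able", "about", "above", "absent"])


def has_unique_prefixes(words, n=4):
    """Criterion a): the first n letters identify every word unambiguously."""
    prefixes = [w[:n] for w in words]
    return len(set(prefixes)) == len(prefixes)


def word_index(words, word):
    """Criterion c): binary search in a sorted wordlist; -1 if absent."""
    i = bisect_left(words, word)
    return i if i < len(words) and words[i] == word else -1


def detect_language(wordlists, first_word):
    """Criterion d): with no colliding words among languages, the first
    word of a sentence is enough to pick the wordlist.

    wordlists maps a language name to its sorted wordlist.
    """
    for name, words in wordlists.items():
        if word_index(words, first_word) >= 0:
            return name
    return None
```

A pair such as "build" and "built" shares the prefix "buil", so `has_unique_prefixes(["build", "built"])` is false, which is one way the similar words of criterion b) show up in an automated check.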

==Specification==

<pre>
Our proposal implements two methods - "encode" and "decode".

The first method takes binary data whose length (L) in bytes must be divisible
by four and returns a sentence that consists of (L/4*3) words from the wordlist.

The second method takes a sentence generated by the first method (the number of
words in the sentence has to be divisible by 3) and reconstructs the original
binary data.

Note that the algorithm below is stricter: it requires the input length to be
divisible by 64 bits (8 bytes), the Blowfish block size, so the number of words
is divisible by 6.

Words can appear in the sentence more than once.

The wordlist contains 2048 words (instead of 1626 words in Electrum), so each
word encodes 11 bits. Three words carry 33 bits, which leaves room for the
code to compute the checksum of the whole mnemonic sequence.
Each 32 bits of input data add 1 bit of checksum.

See the following table for the relation between input lengths, output lengths
and checksum sizes for the most common use cases:

+--------+---------+---------+----------+
| input  | input   | output  | checksum |
| (bits) | (bytes) | (words) | (bits)   |
+--------+---------+---------+----------+
| 128    | 16      | 12      | 4        |
| 192    | 24      | 18      | 6        |
| 256    | 32      | 24      | 8        |
+--------+---------+---------+----------+
</pre>
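
The rows of the table follow from two rules stated above: 1 checksum bit per 32 input bits, and 11 bits per word. A small Python sketch of that arithmetic (the function name `mnemonic_sizes` is hypothetical and only used for illustration):

```python
def mnemonic_sizes(input_bits):
    """Return (words, checksum_bits) for an input of input_bits bits."""
    if input_bits <= 0 or input_bits % 32 != 0:
        raise ValueError("input length must be a positive multiple of 32 bits")
    checksum_bits = input_bits // 32            # 1 checksum bit per 32 input bits
    words = (input_bits + checksum_bits) // 11  # each word encodes 11 bits
    return words, checksum_bits


for bits in (128, 192, 256):
    print(bits, bits // 8, *mnemonic_sizes(bits))
```

Since every 32 input bits become 33 encoded bits, the total is always a multiple of 33 and therefore of 11, so the division by 11 is exact.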

===Algorithm:===

<pre>
Encoding:
1. Read input data (I).
2. Make sure its length (L) is divisible by 64 bits.
3. Encrypt the input data 1000x with Blowfish (ECB) using the word "mnemonic" as key;
   the encrypted result is used as I in the following steps.
4. Compute the length of the checksum (LC). LC = L/32
5. Split I into chunks of LC bits (I1, I2, I3, ...).
6. XOR them all together to produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
7. Concatenate I and C into encoded data (E). Length of E is divisible by 33 bits.
8. Keep taking 11 bits from E until there are none left.
9. Treat each 11-bit group as an integer W and add the word with index W to the output.

Decoding:
1. Read input mnemonic (M).
2. Make sure its word count is divisible by 6.
3. Look up the index of each word in the wordlist and output the indexes as binary stream E (11 bits per word).
4. Length of E (L) is divisible by 33 bits.
5. Split E into two parts: B and C, where B is the first L/33*32 bits and C is the last L/33 bits.
6. Make sure C is the checksum of B (computed as in steps 4-6 of the encoding above).
7. If it is not, the mnemonic code is invalid.
8. Treat B as binary data.
9. Decrypt this data 1000x with Blowfish (ECB) using the word "mnemonic" as key.
10. Return the result as output.
</pre>
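
The checksum and word-splitting steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the reference implementation: the Blowfish stretching (encoding step 3, decoding step 9) is left out and replaced by the hypothetical `stretch` / `unstretch` hooks, which default to the identity; bits are read most significant first, which the text above does not specify; and `WORDLIST` is a placeholder, not the real wordlist.

```python
# Placeholder 2048-entry wordlist (assumption); not the real BIP wordlist.
WORDLIST = ["w%04d" % i for i in range(2048)]


def checksum(data):
    """Encoding steps 4-6: XOR all LC-bit chunks of data; returns (C, LC)."""
    total_bits = len(data) * 8
    lc = total_bits // 32                  # checksum length LC = L/32
    value = int.from_bytes(data, "big")
    mask = (1 << lc) - 1
    c = 0
    for _ in range(32):                    # L / LC is always 32 chunks
        c ^= value & mask
        value >>= lc
    return c, lc


def encode(data, wordlist, stretch=lambda b: b):
    """Encoding steps 1-9; stretch stands in for the Blowfish step."""
    if len(data) == 0 or len(data) % 8 != 0:
        raise ValueError("input length must be a non-zero multiple of 64 bits")
    data = stretch(data)
    c, lc = checksum(data)
    e = (int.from_bytes(data, "big") << lc) | c      # E = I || C
    n_words = (len(data) * 8 + lc) // 11
    indexes = [(e >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]
    return " ".join(wordlist[w] for w in indexes)


def decode(mnemonic, wordlist, unstretch=lambda b: b):
    """Decoding steps 1-10; unstretch stands in for the Blowfish step."""
    words = mnemonic.split()
    if len(words) == 0 or len(words) % 6 != 0:
        raise ValueError("word count must be a non-zero multiple of 6")
    e = 0
    for w in words:
        e = (e << 11) | wordlist.index(w)  # raises ValueError for unknown words
    total_bits = len(words) * 11
    lc = total_bits // 33
    b = (e >> lc).to_bytes((total_bits - lc) // 8, "big")
    c = e & ((1 << lc) - 1)
    if checksum(b)[0] != c:
        raise ValueError("invalid mnemonic checksum")
    return unstretch(b)
```

With the identity hooks, `decode(encode(data, WORDLIST), WORDLIST)` returns `data` for any input whose length is a multiple of 8 bytes, and changing a single data bit of the sentence makes `decode` raise `ValueError`, because one flipped bit always flips exactly one bit of the XOR checksum. The linear `wordlist.index` lookup is used for brevity; a sorted wordlist allows binary search instead.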

==Test vectors==

See https://github.com/trezor/python-mnemonic/blob/master/vectors.json

==Reference Implementation==

A reference implementation, including wordlists, is available from

http://github.com/trezor/python-mnemonic