This document proposes '''"TxRef"''', a convenient, human-usable encoding for referring to a '''confirmed transaction position''' within the Bitcoin blockchain. The primary purpose of this encoding is to allow users to refer to a confirmed transaction (and optionally, a particular outpoint index within the transaction) in a standard, reliable, and concise way.
''Please note: Unlike a transaction ID ('''"TxID"'''), where there is a strong cryptographic link between the ID and the actual transaction, a '''TxRef''' provides only a weak link to a particular transaction. A '''TxRef''' locates an offset within a blockchain that may or may not point to an actual transaction, and the transaction at that offset may change with reorganisations. We recommend that '''TxRef'''s not be used for positions within the blockchain having a maturity of less than 100 blocks.''
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [https://tools.ietf.org/html/rfc2119 RFC 2119].
Since the first version of Bitcoin, '''TxID'''s have been a core part of the consensus protocol and are routinely used to identify individual transactions between users.
It is possible to reference transactions not only by their '''TxID''', but also by their location within the blockchain itself. Rather than using the 64-character '''TxID''', an encoding of the position coordinates can be made friendly for occasional human transcription. In this document, we propose a standard for doing this.
A '''confirmed transaction position reference''', or '''TxRef''', is a reference to a particular location within the blockchain, specified by the block height and a transaction index within the block, and optionally, an outpoint index within the transaction.
Therefore, implementers must be careful not to display '''TxRef'''s to users prematurely:
* Applications MUST NOT display '''TxRef'''s for transactions with fewer than 6 confirmations.
* Applications MUST show a warning for '''TxRef'''s for transactions with fewer than 100 confirmations.
** This warning SHOULD state that in the case of a large reorganisation, the '''TxRef'''s displayed may point to a different transaction, or to no transaction at all.
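The display rules above can be sketched as a small policy function. This is only an illustration: the names <code>TxRefDisplay</code> and <code>display_policy</code> are hypothetical and not part of this specification; only the thresholds of 6 and 100 confirmations come from the rules above.

```python
from enum import Enum


class TxRefDisplay(Enum):
    """Hypothetical display decisions; the names are illustrative only."""
    HIDE = "hide"   # MUST NOT display the TxRef
    WARN = "warn"   # display, but MUST show a reorganisation warning
    SHOW = "show"   # display without a warning


# Thresholds taken from the rules above.
MIN_CONFIRMATIONS_TO_DISPLAY = 6
MIN_CONFIRMATIONS_WITHOUT_WARNING = 100


def display_policy(confirmations: int) -> TxRefDisplay:
    """Map a confirmation count to a display decision."""
    if confirmations < 0:
        raise ValueError("confirmations cannot be negative")
    if confirmations < MIN_CONFIRMATIONS_TO_DISPLAY:
        return TxRefDisplay.HIDE
    if confirmations < MIN_CONFIRMATIONS_WITHOUT_WARNING:
        return TxRefDisplay.WARN
    return TxRefDisplay.SHOW
```

An application would call <code>display_policy</code> each time the confirmation count changes, so that a '''TxRef''' moves from hidden, to shown with a warning, to shown without one.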
=== TxRef Format ===
'''TxRef''' MUST use the '''Bech32m'''<ref>'''Why use Bech32 Encoding for Confirmed Transaction References?''' The error detection and correction properties of this encoding format make it very attractive. We expect that it will be reasonable for software to correct a maximum of two characters; however, we haven’t specified this yet.</ref> encoding as defined in [https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP-0173] and later refined in [https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki BIP-0350]. The Bech32m encoding consists of:
The '''HRP''' can be thought of as a label. We have chosen labels to distinguish between Main, Test, and Regtest networks:
* Mainnet: '''"tx"'''.
* Testnet: '''"txtest"'''.
* Regtest: '''"txrt"'''.
==== Separator ====
The separator is the character '''"1"'''.
==== Data Part ====
The data part for a '''TxRef''' consists of the transaction's block height, transaction index within the block, and optionally, an outpoint index. Specific encoding details for the data are given below.
''Please note: other specifications, such as [https://w3c-ccg.github.io/did-spec/ the Decentralized Identifiers spec], have implicitly encoded the information contained within the HRP elsewhere. In this case they may choose to not include the HRP as specified here.''
==== Readability ====
To increase portability and readability, additional separator characters SHOULD be added to the '''TxRef''':
* A colon<ref>'''Why add a colon here?''' This allows it to conform better with W3C URN/URL standards.</ref> '''":"''' added after the separator character '''"1"'''.
* Hyphens<ref>'''Why hyphens within the TxRef?''' As '''TxRef'''s are short, we expect that they will be quoted via voice or written by hand. The inclusion of hyphens every 4 characters breaks up the string and means people don't lose their place so easily.</ref> '''"-"''' added after every 4 characters beyond the colon.
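As a minimal sketch of these two readability rules, the following function takes an unformatted Bech32m string and inserts the colon and hyphens. The function name <code>format_txref</code> is hypothetical, and the sketch assumes its input contains no colon or hyphens yet.

```python
def format_txref(raw: str) -> str:
    """Insert ':' after the separator and '-' after every 4 data characters.

    Assumes `raw` is an unformatted Bech32m string such as
    'tx1r29umqjxputt3p0' (hypothetical helper, not part of the spec).
    """
    # The Bech32 alphabet does not contain '1', so the last '1'
    # in the string is always the separator.
    sep = raw.rfind("1")
    if sep < 1:
        raise ValueError("missing Bech32 separator '1'")
    hrp, data = raw[:sep], raw[sep + 1:]
    # Split the data part (including the checksum) into groups of four.
    groups = [data[i:i + 4] for i in range(0, len(data), 4)]
    return f"{hrp}1:" + "-".join(groups)


print(format_txref("tx1r29umqjxputt3p0"))
```

The final group is simply shorter when the data part is not a multiple of four characters long.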
Encoding a '''TxRef''' requires 4 or 5 pieces of data: a magic code denoting which network is being used; a version number (currently always 0); the block height of the block containing the transaction; the index of the transaction within the block; and optionally, the index of the outpoint within the transaction. Only a certain number of bits are supported for each of these values; see the following table for details.
{| class="wikitable" style="text-align: center"
|-
!Field
!Description
!Possible Data Type
!# of Bits Used
!Values
|-
| style="background: #99DDFF; color: black; text-align : center;" | Magic<br>Code
|Code denoting which network is being used
|uint8
| style="background: #99DDFF; color: black; text-align : center;" | 5
|'''3''': Mainnet<br>'''4''': Mainnet with Outpoint<br>'''6''': Testnet<br>'''7''': Testnet with Outpoint<br>'''0''': Regtest<br>'''1''': Regtest with Outpoint
|-
| style="background: #DDDDDD; color: black; text-align : center;" | Version
|For Future Use
|uint8
| style="background: #DDDDDD; color: black; text-align : center;" | 1
|Must be '''0'''
|-
| style="background: #EEDD88; color: black; text-align : center;" | Block<br>Height
|The Block Height of the Tx
|uint32
| style="background: #EEDD88; color: black; text-align : center;" | 24
|Block 0 to Block 16777215
|-
| style="background: #FFAABB; color: black; text-align : center;" | Transaction<br>Index
|The index of the Tx inside the block
|uint16, uint32
| style="background: #FFAABB; color: black; text-align : center;" | 15
|Tx 0 to Tx 32767
|-
| style="background: #BBCC33; color: black; text-align : center;" | Outpoint<br>Index
|The index of the Outpoint inside the Tx
|uint16, uint32
| style="background: #BBCC33; color: black; text-align : center;" | 15
|Outpoint 0 to Outpoint 32767
|}
We want to encode a '''TxRef''' that refers to Transaction #1234 of Block #456789 on the Mainnet chain. We use this data in preparation for the Bech32 encoding algorithm:
As shown in the last column, we take the necessary bits of each binary value and copy them into nine unsigned chars, illustrated in the next table. We set only the lower five bits of each unsigned char, as the Bech32 algorithm uses only those bits.
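The bit-copying step can be sketched as follows. The function name <code>pack_txref_data</code> is hypothetical. The bit order used here (least-significant bits first, with the version bit sharing a value with the low four bits of the block height) is an assumption, chosen because it reproduces the data characters '''r29umqjxp''' of the example; it should be checked against a reference implementation before being relied on.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # Bech32 alphabet

MAX_BLOCK_HEIGHT = (1 << 24) - 1  # 24 bits
MAX_INDEX = (1 << 15) - 1         # 15 bits for tx and outpoint indexes


def pack_txref_data(magic, block_height, tx_index,
                    outpoint_index=None, version=0):
    """Pack TxRef fields into 5-bit values (hypothetical helper).

    Assumed layout: fields are emitted least-significant bits first.
    """
    if not 0 <= magic < 32:
        raise ValueError("magic code must fit in 5 bits")
    if version != 0:
        raise ValueError("version must be 0")
    if not 0 <= block_height <= MAX_BLOCK_HEIGHT:
        raise ValueError("block height out of range")
    if not 0 <= tx_index <= MAX_INDEX:
        raise ValueError("transaction index out of range")

    values = [
        magic,
        version | ((block_height & 0xF) << 1),  # 1 version bit + 4 height bits
        (block_height >> 4) & 0x1F,
        (block_height >> 9) & 0x1F,
        (block_height >> 14) & 0x1F,
        (block_height >> 19) & 0x1F,
        tx_index & 0x1F,
        (tx_index >> 5) & 0x1F,
        (tx_index >> 10) & 0x1F,
    ]
    if outpoint_index is not None:
        if not 0 <= outpoint_index <= MAX_INDEX:
            raise ValueError("outpoint index out of range")
        values += [
            outpoint_index & 0x1F,
            (outpoint_index >> 5) & 0x1F,
            (outpoint_index >> 10) & 0x1F,
        ]
    return values


values = pack_txref_data(magic=3, block_height=456789, tx_index=1234)
print("".join(CHARSET[v] for v in values))
```

Without an outpoint index the function returns nine values (45 bits); with one it returns twelve (60 bits).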
The Bech32 algorithm encodes the nine unsigned chars above, computes a checksum over them, and encodes that as well. This gives a six-character checksum (in this case, '''utt3p0'''), which is appended to form the final '''TxRef''': '''tx1:r29u-mqjx-putt-3p0'''. It is illustrated in the following table:
TxRef character indexes and descriptions
{| class="wikitable" style="text-align: top"
!style="width:2em"|Index
!style="width:2em"|0
!style="width:2em"|1
!style="width:2em"|2
!style="width:2em"|3
!style="width:2em"|4
!style="width:2em"|5
!style="width:2em"|6
!style="width:2em"|7
!style="width:2em"|8
!style="width:2em"|9
!style="width:2em"|10
!style="width:2em"|11
!style="width:2em"|12
!style="width:2em"|13
!style="width:2em"|14
!style="width:2em"|15
!style="width:2em"|16
!style="width:2em"|17
!style="width:2em"|18
!style="width:2em"|19
!style="width:2em"|20
!style="width:2em"|21
|-
|Char:
| style="background: #BBCCEE; color: black; text-align : center;" | t
| style="background: #BBCCEE; color: black; text-align : center;" | x
| style="background: #FFCCCC; color: black; text-align : center;" | 1
| style="background: #CCDDAA; color: black; text-align : center;" | :
| style="background: #EEEEBB; color: black; text-align : center;" | r
| style="background: #EEEEBB; color: black; text-align : center;" | 2
| style="background: #EEEEBB; color: black; text-align : center;" | 9
| style="background: #EEEEBB; color: black; text-align : center;" | u
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | m
| style="background: #EEEEBB; color: black; text-align : center;" | q
| style="background: #EEEEBB; color: black; text-align : center;" | j
| style="background: #EEEEBB; color: black; text-align : center;" | x
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | p
| style="background: #EEEEBB; color: black; text-align : center;" | u
| style="background: #EEEEBB; color: black; text-align : center;" | t
| style="background: #EEEEBB; color: black; text-align : center;" | t
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | 3
| style="background: #EEEEBB; color: black; text-align : center;" | p
| style="background: #EEEEBB; color: black; text-align : center;" | 0
|}
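For readers who want to see the checksum step in code, the sketch below follows the structure of the reference Python code in BIP-0350. The function names are this sketch's own, and the nine input values are those of the Mainnet example above. Comparing the printed string against the example is a useful check of any implementation.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2BC830A3  # checksum constant defined in BIP-0350
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """Compute the Bech32 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    """Expand the HRP into the values that are fed to the checksum."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, values):
    """Return the six 5-bit checksum values for a Bech32m string."""
    residue = polymod(hrp_expand(hrp) + list(values) + [0] * 6) ^ BECH32M_CONST
    return [(residue >> 5 * (5 - i)) & 31 for i in range(6)]


def encode(hrp, values):
    """Join HRP, separator, data values, and checksum (no ':' or '-')."""
    combined = list(values) + create_checksum(hrp, values)
    return hrp + "1" + "".join(CHARSET[v] for v in combined)


# The nine 5-bit values of the Mainnet example (block 456789, tx 1234).
example = [3, 10, 5, 28, 27, 0, 18, 6, 1]
print(encode("tx", example))
```

The output contains no colon or hyphens; those are added afterwards for readability.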
==== Outpoint Index ====
Some uses of '''TxRef''' may need to refer to a specific outpoint of the transaction. In the previous example, since we did not specify the outpoint index, the '''TxRef''' '''tx1:r29u-mqjx-putt-3p0''' implicitly references the first (index 0) outpoint of Transaction #1234 in Block #456789.
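The implicit default described above can be sketched as a decoder for the 5-bit data values (checksum already removed). The names <code>MAGIC_CODES</code> and <code>unpack_txref_data</code> are hypothetical, and the bit order is an assumption inferred from the example characters, not a normative statement.

```python
# Magic code -> (network, carries an outpoint index), per the magic code table.
MAGIC_CODES = {
    3: ("mainnet", False), 4: ("mainnet", True),
    6: ("testnet", False), 7: ("testnet", True),
    0: ("regtest", False), 1: ("regtest", True),
}


def unpack_txref_data(values):
    """Recover TxRef fields from 5-bit data values (hypothetical helper).

    Assumed layout: fields are stored least-significant bits first.
    """
    if not values or values[0] not in MAGIC_CODES:
        raise ValueError("unknown magic code")
    network, has_outpoint = MAGIC_CODES[values[0]]
    if len(values) != (12 if has_outpoint else 9):
        raise ValueError("unexpected data length for this magic code")

    version = values[1] & 1
    if version != 0:
        raise ValueError("unsupported version")
    block_height = ((values[1] >> 1) | (values[2] << 4) | (values[3] << 9)
                    | (values[4] << 14) | (values[5] << 19))
    tx_index = values[6] | (values[7] << 5) | (values[8] << 10)

    # Without an explicit outpoint, the TxRef refers to outpoint 0.
    outpoint_index = 0
    if has_outpoint:
        outpoint_index = values[9] | (values[10] << 5) | (values[11] << 10)
    return network, block_height, tx_index, outpoint_index


print(unpack_txref_data([3, 10, 5, 28, 27, 0, 18, 6, 1]))
```

Returning 0 for the outpoint index when the magic code carries none keeps callers from having to special-case the short form.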
If instead, for example, we want to reference the second (index 1) outpoint, we change the magic code from '''3''' to '''4''' and include the following in the data to be encoded:
After Bech32 encoding all twelve unsigned chars above, we get the checksum '''sfp2tt'''. The final '''TxRef''' is '''tx1:y29u-mqjx-ppqq-sfp2-tt''', illustrated in the following table:
TxRef character indexes and descriptions
{| class="wikitable" style="text-align: top"
!style="width:2em"|Index
!style="width:2em"|0
!style="width:2em"|1
!style="width:2em"|2
!style="width:2em"|3
!style="width:2em"|4
!style="width:2em"|5
!style="width:2em"|6
!style="width:2em"|7
!style="width:2em"|8
!style="width:2em"|9
!style="width:2em"|10
!style="width:2em"|11
!style="width:2em"|12
!style="width:2em"|13
!style="width:2em"|14
!style="width:2em"|15
!style="width:2em"|16
!style="width:2em"|17
!style="width:2em"|18
!style="width:2em"|19
!style="width:2em"|20
!style="width:2em"|21
!style="width:2em"|22
!style="width:2em"|23
!style="width:2em"|24
!style="width:2em"|25
|-
|Char:
| style="background: #BBCCEE; color: black; text-align : center;" | t
| style="background: #BBCCEE; color: black; text-align : center;" | x
| style="background: #FFCCCC; color: black; text-align : center;" | 1
| style="background: #CCDDAA; color: black; text-align : center;" | :
| style="background: #EEEEBB; color: black; text-align : center;" | y
| style="background: #EEEEBB; color: black; text-align : center;" | 2
| style="background: #EEEEBB; color: black; text-align : center;" | 9
| style="background: #EEEEBB; color: black; text-align : center;" | u
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | m
| style="background: #EEEEBB; color: black; text-align : center;" | q
| style="background: #EEEEBB; color: black; text-align : center;" | j
| style="background: #EEEEBB; color: black; text-align : center;" | x
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | p
| style="background: #EEEEBB; color: black; text-align : center;" | p
| style="background: #EEEEBB; color: black; text-align : center;" | q
| style="background: #EEEEBB; color: black; text-align : center;" | q
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | s
| style="background: #EEEEBB; color: black; text-align : center;" | f
| style="background: #EEEEBB; color: black; text-align : center;" | p
| style="background: #EEEEBB; color: black; text-align : center;" | 2
| style="background: #CCDDAA; color: black; text-align : center;" | -
| style="background: #EEEEBB; color: black; text-align : center;" | t
| style="background: #EEEEBB; color: black; text-align : center;" | t
|}
The Bech32 spec defines 32 valid characters as its "alphabet". All non-Bech32-alphabet characters present in a '''TxRef''' after the Bech32 separator character MUST be ignored/removed when parsing (except for terminating characters). We do not expect users to keep their '''TxRef'''s in good form: '''TxRef'''s may contain hyphens, colons, invisible spaces, uppercase letters, or random characters, and we expect users to copy, paste, write by hand, write in a mix of character sets, etc. Parsers SHOULD attempt to correct for these and other common errors, reporting to the user any '''TxRef'''s that violate a proper Bech32 encoding.
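A minimal sketch of this clean-up step is shown below. The function name <code>clean_txref</code> is hypothetical. The sketch treats the first '''"1"''' as the separator, which assumes the HRP contains no '''"1"''' (true for the HRPs defined in this document), and it does not attempt to handle terminating characters.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # Bech32 alphabet


def clean_txref(text: str):
    """Return (hrp, data) with stray characters removed from the data part.

    Hypothetical helper: lowercases the input, splits at the first '1',
    and drops every character after it that is not in the Bech32 alphabet.
    """
    lowered = text.strip().lower()
    sep = lowered.find("1")
    if sep < 1:
        raise ValueError("missing HRP or separator '1'")
    hrp = lowered[:sep]
    data = "".join(ch for ch in lowered[sep + 1:] if ch in CHARSET)
    return hrp, data


print(clean_txref(" TX1:R29U-MQJX-PUTT-3P0 "))
```

The cleaned data part still has to pass checksum verification. Removing stray characters only normalises the input and does not make an invalid '''TxRef''' valid.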
As of early 2021, '''TxRef''' has been in limited use for a couple of years and it is possible that there are some '''TxRef'''s in use which were created with the original specification of Bech32 before the Bech32m refinement was codified. Due to this possibility, a '''TxRef''' parser SHOULD be able to decode both Bech32m and Bech32 encoded '''TxRef'''s. In such a case, a '''TxRef''' parser SHOULD display or somehow notify the user that they are using an obsolete '''TxRef''' and that they should upgrade it to the Bech32m version. Additionally, the parser MAY also display the Bech32m version.
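One way a parser might tell the two encodings apart is to compare the checksum residue against both constants, as sketched below. The function name <code>detect_encoding</code> is hypothetical. The constants and the polymod routine follow BIP-0173 and BIP-0350, and the input is assumed to be already cleaned and lowercased.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1             # original Bech32 (BIP-0173)
BECH32M_CONST = 0x2BC830A3   # Bech32m (BIP-0350)
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """Compute the Bech32 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    """Expand the HRP into the values that are fed to the checksum."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def detect_encoding(hrp, data):
    """Return 'bech32m', 'bech32', or None for an invalid checksum.

    `data` is the cleaned, lowercase data part including the checksum.
    """
    if any(ch not in CHARSET for ch in data):
        return None
    residue = polymod(hrp_expand(hrp) + [CHARSET.index(ch) for ch in data])
    if residue == BECH32M_CONST:
        return "bech32m"
    if residue == BECH32_CONST:
        return "bech32"  # obsolete form: the parser SHOULD notify the user
    return None


print(detect_encoding("a", "lqfn3a"))
```

When the result is <code>"bech32"</code>, a parser could re-encode the same data with the Bech32m constant and offer the upgraded '''TxRef''' to the user.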
The following examples show values for various combinations on mainnet and testnet; encoding block height, transaction index, and an optional output index.
* As of early April 2021, there were 677700 blocks.
* There are roughly (365 days * 24 hours * 6 blocks / hour) = 52560 blocks every year, implying about (16777216 - 677700) / 52560 = 306 more years of addressable blocks.
* Some time before year 2327 this specification should be extended.
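The estimate above can be reproduced with a few lines of arithmetic. The constant and function names are this sketch's own, and the rate of six blocks per hour is the same approximation used in the list.

```python
BLOCKS_PER_YEAR = 365 * 24 * 6     # roughly 6 blocks per hour
ADDRESSABLE_HEIGHTS = 1 << 24      # 24-bit block height field


def years_remaining(current_height: int) -> float:
    """Approximate years until the 24-bit block height is exhausted."""
    return (ADDRESSABLE_HEIGHTS - current_height) / BLOCKS_PER_YEAR


years = years_remaining(677700)
print(BLOCKS_PER_YEAR, int(years), 2021 + int(years))
```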