# BOLT #1: Base Protocol

## Overview

This protocol assumes an underlying authenticated and ordered transport mechanism that takes care of framing individual messages.
[BOLT #8](08-transport.md) specifies the canonical transport layer used in Lightning, though it can be replaced by any transport that fulfills the above guarantees.

The default TCP port is 9735. This corresponds to hexadecimal `0x2607`: the Unicode code point for LIGHTNING.[1](#reference-1)

All data fields are unsigned big-endian unless otherwise specified.

## Table of Contents

* [Connection Handling and Multiplexing](#connection-handling-and-multiplexing)
* [Lightning Message Format](#lightning-message-format)
* [Type-Length-Value Format](#type-length-value-format)
* [Fundamental Types](#fundamental-types)
* [Setup Messages](#setup-messages)
  * [The `init` Message](#the-init-message)
  * [The `error` Message](#the-error-message)
* [Control Messages](#control-messages)
  * [The `ping` and `pong` Messages](#the-ping-and-pong-messages)
* [Acknowledgments](#acknowledgments)
* [References](#references)
* [Authors](#authors)

## Connection Handling and Multiplexing

Implementations MUST use a single connection per peer; channel messages (which include a channel ID) are multiplexed over this single connection.

## Lightning Message Format

After decryption, all Lightning messages are of the form:

1. `type`: a 2-byte big-endian field indicating the type of message
2. `payload`: a variable-length payload that comprises the remainder of the message and that conforms to a format matching the `type`

The `type` field indicates how to interpret the `payload` field.
The format for each individual type is defined by a specification in this repository.
The type follows the _it's ok to be odd_ rule, so nodes MAY send _odd_-numbered types without ascertaining that the recipient understands it.

A sending node:
- MUST NOT send an evenly-typed message not listed here without prior negotiation.


A receiving node:
- upon receiving a message of _odd_, unknown type:
  - MUST ignore the received message.
- upon receiving a message of _even_, unknown type:
  - MUST fail the channels.

The messages are grouped logically into four groups, ordered by the most significant bit that is set:

- Setup & Control (types `0`-`31`): messages related to connection setup, control, supported features, and error reporting (described below)
- Channel (types `32`-`127`): messages used to set up and tear down micropayment channels (described in [BOLT #2](02-peer-protocol.md))
- Commitment (types `128`-`255`): messages related to updating the current commitment transaction, which includes adding, revoking, and settling HTLCs as well as updating fees and exchanging signatures (described in [BOLT #2](02-peer-protocol.md))
- Routing (types `256`-`511`): messages containing node and channel announcements, as well as any active route exploration (described in [BOLT #7](07-routing-gossip.md))

The size of the message is required by the transport layer to fit into a 2-byte unsigned int; therefore, the maximum possible size is 65535 bytes.

A node:
- MUST ignore any additional data within a message beyond the length that it expects for that type.
- upon receiving a known message with insufficient length for the contents:
  - MUST fail the channels.
- that negotiates an option in this specification:
  - MUST include all the fields annotated with that option.

### Rationale

By default `SHA2` and Bitcoin public keys are both encoded as big endian, thus it would be unusual to use a different endianness for other fields.

Length is limited to 65535 bytes by the cryptographic wrapping, and messages in the protocol are never more than that length anyway.

The _it's ok to be odd_ rule allows for future optional extensions without negotiation or special coding in clients.
The "ignore additional data" rule similarly allows for future expansion.

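The message layout, the type groups, and the _it's ok to be odd_ rule described above can be illustrated with a minimal Python sketch. The function names (`split_message`, `group_of`, `action_for`) and the returned action strings are hypothetical and chosen only for illustration, and the table of known types lists only the four messages defined in this document.

```python
import struct

# Type ranges copied from the four message groups listed in this document.
GROUPS = [
    (0, 31, "setup_control"),
    (32, 127, "channel"),
    (128, 255, "commitment"),
    (256, 511, "routing"),
]

# Assumption: only the message types defined in BOLT #1 are treated as known.
KNOWN_TYPES = {16: "init", 17: "error", 18: "ping", 19: "pong"}


def split_message(msg: bytes):
    """Split a decrypted message into (type, payload)."""
    if len(msg) < 2:
        raise ValueError("message shorter than the 2-byte type field")
    if len(msg) > 65535:
        raise ValueError("message exceeds the 65535-byte transport limit")
    (msg_type,) = struct.unpack(">H", msg[:2])  # 2-byte big-endian type
    return msg_type, msg[2:]


def group_of(msg_type: int) -> str:
    """Return the logical group name for a message type."""
    for low, high, name in GROUPS:
        if low <= msg_type <= high:
            return name
    return "unassigned"


def action_for(msg_type: int) -> str:
    """Apply the odd/even rule to a received message type."""
    if msg_type in KNOWN_TYPES:
        return "handle"
    # Unknown odd types are ignored; unknown even types fail the channels.
    return "ignore" if msg_type % 2 == 1 else "fail_channels"
```

A receiver would typically call `split_message` first and then dispatch on `action_for`, so that unknown odd types never reach message-specific parsing.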

Implementations may prefer to have message data aligned on an 8-byte boundary (the largest natural alignment requirement of any type here); however, adding a 6-byte padding after the type field was considered wasteful: alignment may be achieved by decrypting the message into a buffer with 6 bytes of pre-padding.

## Type-Length-Value Format

Throughout the protocol, a TLV (Type-Length-Value) format is used to allow for the backwards-compatible addition of new fields to existing message types.

A `tlv_record` represents a single field, encoded in the form:

* [`varint`: `type`]
* [`varint`: `length`]
* [`length`: `value`]

A `tlv_stream` is a series of (possibly zero) `tlv_record`s, represented as the concatenation of the encoded `tlv_record`s. When used to extend existing messages, a `tlv_stream` is typically placed after all currently defined fields.

The `type` is a varint encoded using the bitcoin CompactSize format. It functions as a message-specific, 64-bit identifier for the `tlv_record` determining how the contents of `value` should be decoded.

The `length` is a varint encoded using the bitcoin CompactSize format signaling the size of `value` in bytes.

The `value` depends entirely on the `type`, and should be encoded or decoded according to the message-specific format determined by `type`.

### Requirements

The sending node:
- MUST order `tlv_record`s in a `tlv_stream` by monotonically-increasing `type`.
- MUST minimally encode `type` and `length`.
- SHOULD NOT use redundant, variable-length encodings in a `tlv_record`.

The receiving node:
- if zero bytes remain before parsing a `type`:
  - MUST stop parsing the `tlv_stream`.
- if a `type` or `length` is not minimally encoded:
  - MUST fail to parse the `tlv_stream`.
- if decoded `type`s are not monotonically-increasing:
  - MUST fail to parse the `tlv_stream`.
- if `type` is known:
  - MUST decode the next `length` bytes using the known encoding for `type`.
- otherwise, if `type` is unknown:
  - if `type` is even:
    - MUST fail to parse the `tlv_stream`.
  - otherwise, if `type` is odd:
    - MUST discard the next `length` bytes.

### Rationale

The primary advantage in using TLV is that a reader is able to ignore new fields that it does not understand, since each field carries the exact size of the encoded element. Without TLV, even if a node does not wish to use a particular field, the node is forced to add parsing logic for that field in order to determine the offset of any fields that follow.

The monotonicity constraint ensures that all `type`s are unique and can appear at most once. Fields that map to complex objects, e.g. vectors, maps, or structs, should do so by defining the encoding such that the object is serialized within a single `tlv_record`. The uniqueness constraint, among other things, enables the following optimizations:

- canonical ordering is defined independent of the encoded `value`s.
- canonical ordering can be known at compile-time, rather than being determined dynamically at the time of encoding.
- verifying canonical ordering requires less state and is less expensive.
- variable-size fields can reserve their expected size up front, rather than appending elements sequentially and incurring double-and-copy overhead.

The use of a varint for `type` and `length` permits a space savings for small `type`s or short `value`s. This potentially leaves more space for application data over the wire or in an onion payload.

All `type`s must appear in increasing order to create a canonical encoding of the underlying `tlv_record`s. This is crucial when computing signatures over a `tlv_stream`, as it ensures verifiers will be able to recompute the same message digest as the signer. Note that the canonical ordering over the set of fields can be enforced even if the verifier does not understand what the fields contain.

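The receiving-node rules for a `tlv_stream` can be sketched as a small Python parser. This is an illustration under stated assumptions rather than a reference implementation: `read_compact_size` and `parse_tlv_stream` are hypothetical names, the varint follows Bitcoin's CompactSize layout (prefix byte followed by a little-endian integer), "monotonically-increasing" is read as strictly increasing because the rationale requires unique types, and a `value` that runs past the end of the buffer is treated as a parse failure, which the text does not state explicitly.

```python
def read_compact_size(buf: bytes, pos: int):
    """Read a minimally encoded Bitcoin CompactSize; return (value, new_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    first = buf[pos]
    pos += 1
    if first < 0xFD:
        return first, pos
    # Prefix byte -> (payload width in bytes, smallest value allowed in that width)
    width, minimum = {
        0xFD: (2, 0xFD),
        0xFE: (4, 0x10000),
        0xFF: (8, 0x100000000),
    }[first]
    if pos + width > len(buf):
        raise ValueError("truncated varint")
    value = int.from_bytes(buf[pos:pos + width], "little")
    if value < minimum:
        raise ValueError("varint not minimally encoded")
    return value, pos + width


def parse_tlv_stream(buf: bytes, known_types):
    """Return [(type, value)] for known records, applying the odd/even rule."""
    records = []
    pos = 0
    last_type = None
    while pos < len(buf):  # zero bytes remaining: stop parsing
        rec_type, pos = read_compact_size(buf, pos)
        length, pos = read_compact_size(buf, pos)
        if last_type is not None and rec_type <= last_type:
            raise ValueError("types are not strictly increasing")
        last_type = rec_type
        if pos + length > len(buf):
            # Assumption: an overlong length is a parse failure.
            raise ValueError("value runs past the end of the stream")
        value = buf[pos:pos + length]
        pos += length
        if rec_type in known_types:
            records.append((rec_type, value))
        elif rec_type % 2 == 0:
            raise ValueError("unknown even type")
        # Unknown odd type: the value bytes are discarded.
    return records
```

For example, with `known_types={1, 5}` the stream `01 02 aa bb 03 00 05 01 cc` yields the records for types 1 and 5 and silently drops the unknown odd type 3.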

Writers should avoid using redundant, variable-length encodings in a `tlv_record` since this results in encoding the length twice and complicates computing the outer length. As an example, when writing a variable length byte array, the `value` should contain only the raw bytes and forgo an additional internal length since the `tlv_record` already carries the number of bytes that follow. On the other hand, if a `tlv_record` contains multiple, variable-length elements then this would not be considered redundant, and is needed to allow the receiver to parse individual elements from `value`.

## Fundamental Types

Various fundamental types are referred to in the message specifications:

* `byte`: an 8-bit byte
* `u16`: a 2 byte unsigned integer
* `u32`: a 4 byte unsigned integer
* `u64`: an 8 byte unsigned integer

Inside TLV records which contain a single value, leading zeros in integers can be omitted:

* `tu16`: a 0 to 2 byte unsigned integer
* `tu32`: a 0 to 4 byte unsigned integer
* `tu64`: a 0 to 8 byte unsigned integer

The following convenience types are also defined:

* `chain_hash`: a 32-byte chain identifier (see [BOLT #0](00-introduction.md#glossary-and-terminology-guide))
* `channel_id`: a 32-byte channel_id (see [BOLT #2](02-peer-protocol.md#definition-of-channel-id))
* `sha256`: a 32-byte SHA2-256 hash
* `signature`: a 64-byte bitcoin Elliptic Curve signature
* `point`: a 33-byte Elliptic Curve point (compressed encoding as per [SEC 1 standard](http://www.secg.org/sec1-v2.pdf#subsubsection.2.3.3))
* `short_channel_id`: an 8 byte value identifying a channel (see [BOLT #7](07-routing-gossip.md#definition-of-short-channel-id))

## Setup Messages

### The `init` Message

Once authentication is complete, the first message reveals the features supported or required by this node, even if this is a reconnection.

[BOLT #9](09-features.md) specifies lists of global and local features. Each feature is generally represented in `globalfeatures` or `localfeatures` by 2 bits.
The least-significant bit is numbered 0, which is _even_, and the next most significant bit is numbered 1, which is _odd_.

Both fields `globalfeatures` and `localfeatures` MUST be padded to bytes with 0s.

1. type: 16 (`init`)
2. data:
   * [`u16`:`gflen`]
   * [`gflen*byte`:`globalfeatures`]
   * [`u16`:`lflen`]
   * [`lflen*byte`:`localfeatures`]

The 2-byte `gflen` and `lflen` fields indicate the number of bytes in the immediately following field.

#### Requirements

The sending node:
- MUST send `init` as the first Lightning message for any connection.
- MUST set feature bits as defined in [BOLT #9](09-features.md).
- MUST set any undefined feature bits to 0.
- SHOULD use the minimum lengths required to represent the feature fields.

The receiving node:
- MUST wait to receive `init` before sending any other messages.
- MUST respond to known feature bits as specified in [BOLT #9](09-features.md).
- upon receiving unknown _odd_ feature bits that are non-zero:
  - MUST ignore the bit.
- upon receiving unknown _even_ feature bits that are non-zero:
  - MUST fail the connection.

#### Rationale

This semantic allows both future incompatible changes and future backward compatible changes. Bits should generally be assigned in pairs, in order that optional features may later become compulsory.

Nodes wait for receipt of the other's features to simplify error diagnosis when features are incompatible.

The feature masks are split into local features (which only affect the protocol between these two nodes) and global features (which can affect HTLCs and are thus also advertised to other nodes).

### The `error` Message

For simplicity of diagnosis, it's often useful to tell a peer that something is incorrect.

1. type: 17 (`error`)
2. data:
   * [`channel_id`:`channel_id`]
   * [`u16`:`len`]
   * [`len*byte`:`data`]

The 2-byte `len` field indicates the number of bytes in the immediately following field.

#### Requirements

The channel is referred to by `channel_id`, unless `channel_id` is 0 (i.e.
all bytes are 0), in which case it refers to all channels.

The funding node:
- for all error messages sent before (and including) the `funding_created` message:
  - MUST use `temporary_channel_id` in lieu of `channel_id`.

The fundee node:
- for all error messages sent before (and not including) the `funding_signed` message:
  - MUST use `temporary_channel_id` in lieu of `channel_id`.

A sending node:
- when sending `error`:
  - MUST fail the channel referred to by the error message.
- SHOULD send `error` for protocol violations or internal errors that make channels unusable or that make further communication unusable.
- SHOULD send `error` with the unknown `channel_id` in reply to messages of type `32`-`255` related to unknown channels.
- MAY send an empty `data` field.
- when failure was caused by an invalid signature check:
  - SHOULD include the raw, hex-encoded transaction in reply to a `funding_created`, `funding_signed`, `closing_signed`, or `commitment_signed` message.
- when `channel_id` is 0:
  - MUST fail all channels with the receiving node.
  - MUST close the connection.
- MUST set `len` equal to the length of `data`.

The receiving node:
- upon receiving `error`:
  - MUST fail the channel referred to by the error message, if that channel is with the sending node.
- if no existing channel is referred to by the message:
  - MUST ignore the message.
- MUST truncate `len` to the remainder of the packet (if it's larger).
- if `data` is not composed solely of printable ASCII characters (for reference: the printable character set includes byte values 32 through 126, inclusive):
  - SHOULD NOT print out `data` verbatim.

#### Rationale

There are unrecoverable errors that require an abort of conversations; if the connection is simply dropped, then the peer may retry the connection. It's also useful to describe protocol violations for diagnosis, as this indicates that one peer has a bug.

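The receiving side of the `error` message can be sketched in a few lines of Python. The helper names (`parse_error`, `is_printable`, `render_error_data`) are hypothetical. Rendering non-printable `data` as hex is one possible way to avoid printing it verbatim; it is an assumption of this sketch, not something the requirements prescribe.

```python
import struct

ALL_CHANNELS = bytes(32)  # channel_id of all zero bytes refers to all channels


def parse_error(payload: bytes):
    """Parse an `error` payload (after the 2-byte type) into (channel_id, data)."""
    if len(payload) < 34:
        raise ValueError("payload shorter than channel_id plus len")
    channel_id = payload[:32]
    (length,) = struct.unpack(">H", payload[32:34])
    # Slicing truncates `len` to the remainder of the packet if it is larger.
    data = payload[34:34 + length]
    return channel_id, data


def is_printable(data: bytes) -> bool:
    """True if every byte is printable ASCII (32 through 126 inclusive)."""
    return all(32 <= b <= 126 for b in data)


def render_error_data(data: bytes) -> str:
    """Return text that is safe to log (hex fallback is an assumption)."""
    return data.decode("ascii") if is_printable(data) else data.hex()
```

A caller would compare the returned `channel_id` against `ALL_CHANNELS` to decide whether a single channel or every channel with the peer has to be failed.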

It may be wise not to distinguish errors in production settings, lest it leak information; hence the optional `data` field.

## Control Messages

### The `ping` and `pong` Messages

In order to allow for the existence of long-lived TCP connections, at times it may be required that both ends keep alive the TCP connection at the application level. Such messages also allow obfuscation of traffic patterns.

1. type: 18 (`ping`)
2. data:
   * [`u16`:`num_pong_bytes`]
   * [`u16`:`byteslen`]
   * [`byteslen*byte`:`ignored`]

The `pong` message is to be sent whenever a `ping` message is received. It serves as a reply and also serves to keep the connection alive, while explicitly notifying the other end that the receiver is still active. Within the received `ping` message, the sender will specify the number of bytes to be included within the data payload of the `pong` message.

1. type: 19 (`pong`)
2. data:
   * [`u16`:`byteslen`]
   * [`byteslen*byte`:`ignored`]

#### Requirements

A node sending a `ping` message:
- SHOULD set `ignored` to 0s.
- MUST NOT set `ignored` to sensitive data such as secrets or portions of initialized memory.
- if it doesn't receive a corresponding `pong`:
  - MAY terminate the network connection,
    - and MUST NOT fail the channels in this case.
- SHOULD NOT send `ping` messages more often than once every 30 seconds.

A node sending a `pong` message:
- SHOULD set `ignored` to 0s.
- MUST NOT set `ignored` to sensitive data such as secrets or portions of initialized memory.

A node receiving a `ping` message:
- SHOULD fail the channels if it has received significantly in excess of one `ping` per 30 seconds.
- if `num_pong_bytes` is less than 65532:
  - MUST respond by sending a `pong` message, with `byteslen` equal to `num_pong_bytes`.
- otherwise (`num_pong_bytes` is **not** less than 65532):
  - MUST ignore the `ping`.

A node receiving a `pong` message:
- if `byteslen` does not correspond to any `ping`'s `num_pong_bytes` value it has sent:
  - MAY fail the channels.
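The `ping` and `pong` encodings and the 65532 cutoff can be sketched as follows. This is a minimal Python illustration with hypothetical helper names (`build_ping`, `handle_ping`). Rate limiting is omitted. Rejecting a `ping` whose `ignored` field is shorter than `byteslen` follows the general insufficient-length rule of the message format rather than anything specific to `ping`.

```python
import struct

PING, PONG = 18, 19
MAX_PING_PADDING = 65535 - 6  # room left after type, num_pong_bytes, byteslen


def build_ping(num_pong_bytes: int, pad_len: int = 0) -> bytes:
    """Encode a full `ping` message with `ignored` set to zero bytes."""
    if pad_len > MAX_PING_PADDING:
        raise ValueError("ping would exceed the 65535-byte message limit")
    return struct.pack(">HHH", PING, num_pong_bytes, pad_len) + bytes(pad_len)


def handle_ping(payload: bytes):
    """Return the `pong` message to send, or None if the ping must be ignored.

    `payload` is the ping message without its 2-byte type field.
    """
    if len(payload) < 4:
        raise ValueError("ping payload too short")
    num_pong_bytes, byteslen = struct.unpack(">HH", payload[:4])
    if len(payload) - 4 < byteslen:
        raise ValueError("ping shorter than its byteslen")
    if num_pong_bytes >= 65532:
        return None  # no reply is requested
    # `ignored` in the pong is all zeros, never leftover memory contents.
    return struct.pack(">HH", PONG, num_pong_bytes) + bytes(num_pong_bytes)
```

With `num_pong_bytes` set to 65531 the resulting `pong` is exactly 65535 bytes long (2 for the type, 2 for `byteslen`, 65531 of padding), which is the largest message the transport can carry.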
#### Rationale

The largest possible message is 65535 bytes; thus, the maximum sensible `byteslen` is 65531, in order to account for the type field (`pong`) and the `byteslen` itself. This allows a convenient cutoff for `num_pong_bytes` to indicate that no reply should be sent.

Connections between nodes within the network may be long lived, as payment channels have an indefinite lifetime. However, it's likely that no new data will be exchanged for a significant portion of a connection's lifetime. Also, on several platforms it's possible that Lightning clients will be put to sleep without prior warning. Hence, a distinct `ping` message is used, in order to probe for the liveness of the connection on the other side, as well as to keep the established connection active.

Additionally, the ability for a sender to request that the receiver send a response with a particular number of bytes enables nodes on the network to create _synthetic_ traffic. Such traffic can be used to partially defend against packet and timing analysis, as nodes can fake the traffic patterns of typical exchanges without applying any true updates to their respective channels.

When combined with the onion routing protocol defined in [BOLT #4](04-onion-routing.md), careful statistically driven synthetic traffic can serve to further bolster the privacy of participants within the network.

Limited precautions are recommended against `ping` flooding; however, some latitude is given because of network delays. Note that there are other methods of incoming traffic flooding (e.g. sending _odd_ unknown message types, or padding every message maximally).

Finally, the usage of periodic `ping` messages serves to promote frequent key rotations as specified within [BOLT #8](08-transport.md).

## Acknowledgments

[ TODO: (roasbeef); fin ]

## References

1. http://www.unicode.org/charts/PDF/U2600.pdf

## Authors

[ FIXME: Insert Author List ]

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png "License CC-BY")
This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).