[https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm ECDSA] signatures over the [https://www.secg.org/sec2-v2.pdf secp256k1 curve] for authenticating
transactions. These are [https://www.secg.org/sec1-v2.pdf standardized], but have a number of downsides
* '''Provable security''': Schnorr signatures are provably secure. In more detail, they are ''strongly unforgeable under chosen message attack (SUF-CMA)''<ref>Informally, this means that without knowledge of the secret key but given valid signatures of arbitrary messages, it is not possible to come up with further valid signatures.</ref> [https://www.di.ens.fr/~pointche/Documents/Papers/2000_joc.pdf in the random oracle model assuming the hardness of the elliptic curve discrete logarithm problem (ECDLP)] and [http://www.neven.org/papers/schnorr.pdf in the generic group model assuming variants of preimage and second preimage resistance of the used hash function]<ref>A detailed security proof in the random oracle model, which essentially restates [https://www.di.ens.fr/~pointche/Documents/Papers/2000_joc.pdf the original security proof by Pointcheval and Stern] more explicitly, can be found in [https://eprint.iacr.org/2016/191 a paper by Kiltz, Masny and Pan]. All these security proofs assume a variant of Schnorr signatures that use ''(e,s)'' instead of ''(R,s)'' (see Design above). Since we use a unique encoding of ''R'', there is an efficiently computable bijection that maps ''(R,s)'' to ''(e,s)'', which allows to convert a successful SUF-CMA attacker for the ''(e,s)'' variant to a successful SUF-CMA attacker for the ''(r,s)'' variant (and vice-versa). Furthermore, the proofs consider a variant of Schnorr signatures without key prefixing (see Design above), but it can be verified that the proofs are also correct for the variant with key prefixing. As a result, all the aforementioned security proofs apply to the variant of Schnorr signatures proposed in this document.</ref>. The [https://dl.acm.org/citation.cfm?id=2978413 best known security proof for ECDSA] relies on stronger assumptions.
* '''Non-malleability''': The SUF-CMA security of Schnorr signatures implies that they are non-malleable. On the other hand, ECDSA signatures are inherently malleable; a third party without access to the secret key can alter an existing valid signature for a given public key and message into another signature that is valid for the same key and message. This issue is discussed in [https://github.com/bitcoin/bips/blob/master/bip-0062.mediawiki BIP62] and [https://github.com/bitcoin/bips/blob/master/bip-0066.mediawiki BIP66].
* '''Linearity''': Schnorr signatures have the remarkable property that multiple parties can collaborate to produce a signature that is valid for the sum of their public keys. This is the building block for various higher-level constructions that improve efficiency and privacy, such as multisignatures and others (see Applications below).
For all these advantages, there are virtually no disadvantages, apart
from not being standardized. This document seeks to change that. As we
propose a new standard, a number of improvements not specific to Schnorr signatures can be
* '''Signature encoding''': Instead of using [https://en.wikipedia.org/wiki/X.690#DER_encoding DER]-encoding for signatures (which are variable size, and up to 72 bytes), we can use a simple fixed 64-byte format.
* '''Public key encoding''': Instead of using ''compressed'' 33-byte encodings of elliptic curve points which are common in Bitcoin today, public keys in this proposal are encoded as 32 bytes.
* '''Batch verification''': The specific formulation of ECDSA signatures that is standardized cannot be verified more efficiently in batch compared to individually, unless additional witness data is added. Changing the signature scheme offers an opportunity to avoid this.
By reusing the same curve as Bitcoin uses for ECDSA, we are able to retain existing mechanisms for choosing secret and public keys, and we avoid introducing new assumptions about elliptic curve group security.
'''Schnorr signature variant''' Elliptic Curve Schnorr signatures for message ''m'' and public key ''P'' generally involve a point ''R'', integers ''e'' and ''s'' picked by the signer, and generator ''G'' which satisfy ''e = hash(R || m)'' and ''sâ‹…G = R + eâ‹…P''. Two formulations exist, depending on whether the signer reveals ''e'' or ''R'':
# Signatures are ''(e, s)'' that satisfy ''e = hash(sâ‹…G - eâ‹…P || m)''. This supports more compact signatures, since [http://www.neven.org/papers/schnorr.pdf the hash ''e'' can be made as small as 16 bytes without sacrificing security], whereas an encoding of ''R'' inherently needs about 32 bytes. Moreover, this variant avoids minor complexity introduced by the encoding of the point ''R'' in the signature (see paragraphs "Encoding the sign of R" and "Implicit Y coordinate" further below in this subsection).
# Signatures are ''(R, s)'' that satisfy ''sâ‹…G = R + hash(R || m)â‹…P''. This supports batch verification, as there are no elliptic curve operations inside the hashes. Batch verification enables significant speedups.
[[File:bip-schnorr/speedup-batch.png|center|frame|This graph shows the ratio between the time it takes to verify ''n'' signatures individually and to verify a batch of ''n'' signatures. This ratio goes up logarithmically with the number of signatures, or in other words: the total time to verify ''n'' signatures grows with ''O(n / log n)''.]]
'''Key prefixing''' When using the verification rule above directly, it is possible for a third party to convert a signature ''(R, s)'' for key ''P'' into a signature ''(R, s + aâ‹…hash(R || m))'' for key ''P + aâ‹…G'' and the same message, for any integer ''a''. To combat this, we choose ''key prefixed''<ref>A limitation of committing to the public key (rather than to a short hash of it, or not at all) is that it removes the ability for public key recovery or verifying signatures against a short public key hash. These constructions are generally incompatible with batch verification.</ref> Schnorr signatures; changing the equation to ''sâ‹…G = R + hash(R || P || m)â‹…P''. Key prefixing also seems to be a requirement for the security proof of the MuSig multisignature scheme (see Applications below). It is not strictly necessary to do this explicitly for Bitcoin currently, as all signature hashes indirectly commit to the public keys already. However, this may change with proposals such as SIGHASH_NOINPUT ([https://github.com/bitcoin/bips/blob/master/bip-0118.mediawiki BIP 118]), or when the signature scheme is used for other purposes—especially in combination with schemes like [https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP32]'s unhardened derivation.
'''Encoding R and public key point P''' There exist several possibilities for encoding elliptic curve points:
# Encoding the full X and Y coordinates of ''P'' and ''R'', resulting in a 64-byte public key and a 96-byte signature.
# Encoding the full X coordinate and one bit of the Y coordinate to determine one of the two possible Y coordinates. This would result in 33-byte public keys and 65-byte signatures.
# Encoding only the X coordinate, resulting in 32-byte public keys and 64-byte signatures.
'''Implicit Y coordinates''' In order to support efficient verification and batch verification, the Y coordinate of ''P'' and of ''R'' cannot be ambiguous (every valid X coordinate has two possible Y coordinates). We have a choice between several options for symmetry breaking:
# Implicitly choosing the Y coordinate that is in the lower half.
# Implicitly choosing the Y coordinate that is even<ref>Since ''p'' is odd, negation modulo ''p'' will map even numbers to odd numbers and the other way around. This means that for a valid X coordinate, one of the corresponding Y coordinates will be even, and the other will be odd.</ref>.
# Implicitly choosing the Y coordinate that is a quadratic residue (has a square root modulo the field size, or "is a square" for short)<ref>A product of two numbers is a square when either both or none of the factors are squares. As ''-1'' is not a square, and the two Y coordinates corresponding to a given X coordinate are each other's negation, this means exactly one of the two must be a square.</ref>.
In the case of ''R'' the third option is slower at signing time but a bit faster to verify, as it is possible to directly compute whether the Y coordinate is a square when the points are represented in
[https://en.wikibooks.org/wiki/Cryptography/Prime_Curve/Jacobian_Coordinates Jacobian coordinates] (a common optimization to avoid modular inverses
for elliptic curve operations). The two other options require a possibly
expensive conversion to affine coordinates first. This would even be the case if the sign or oddness were explicitly coded (option 2 in the previous design choice). We therefore choose option 3.
For ''P'' the speed of signing and verification does not significantly differ between any of the three options because affine coordinates of the point have to be computed anyway. For consistency reasons we choose the same option as for ''R''. The signing algorithm ensures that the signature is valid under the correct public key by negating the secret key if necessary.
It is important not to mix up the 32-byte bip-schnorr public key format and other existing public key formats (e.g. encodings used in Bitcoin's ECDSA). Concretely, a verifier should only accept 32-byte public keys and not, for example, convert a 33-byte public key by throwing away the first byte. Otherwise, two public keys would be valid for a single signature which can result in subtle malleability issues (although this type of malleability already exists in the case of ECDSA signatures).
Implicit Y coordinates are not a reduction in security when expressed as the number of elliptic curve operations an attacker is expected to perform to compute the secret key. An attacker can normalize any given public key to a point whose Y coordinate is a square by negating the point if necessary. This is just a subtraction of field elements and not an elliptic curve operation<ref>This can be formalized by a simple reduction that reduces an attack on Schnorr signatures with implicit Y coordinates to an attack to Schnorr signatures with explicit Y coordinates. The reduction works by reencoding public keys and negating the result of the hash function, which is modeled as random oracle, whenever the challenge public key has an explicit Y coordinate that is not a square.</ref>.
'''Tagged Hashes''' Cryptographic hash functions are used for multiple purposes in the specification below and in Bitcoin in general. To make sure hashes used in one context can't be reinterpreted in another one, hash functions can be tweaked with a context-dependent tag name, in such a way that collisions across contexts can be assumed to be infeasible. Such collisions obviously can not be ruled out completely, but only for schemes using tagging with a unique name. As for other schemes collisions are at least less likely with tagging than without.
For example, without tagged hashing a bip-schnorr signature could also be valid for a signature scheme where the only difference is that the arguments to the hash function are reordered. Worse, if the bip-schnorr nonce derivation function was copied or independently created, then the nonce could be accidentally reused in the other scheme leaking the secret key.
This proposal suggests to include the tag by prefixing the hashed data with ''SHA256(tag) || SHA256(tag)''. Because this is a 64-byte long context-specific constant and the ''SHA256'' block size is also 64 bytes, optimized implementations are possible (identical to SHA256 itself, but with a modified initial state). Using SHA256 of the tag name itself is reasonably simple and efficient for implementations that don't choose to use the optimization.
'''Final scheme''' As a result, our final scheme ends up using public key ''pk'' which is the X coordinate of a point ''P'' on the curve whose Y coordinate is a square and signatures ''(r,s)'' where ''r'' is the X coordinate of a point ''R'' whose Y coordinate is a square. The signature satisfies ''sâ‹…G = R + tagged_hash(r || pk || m)â‹…P''.
** ''x(P)'' and ''y(P)'' are integers in the range ''0..p-1'' and refer to the X and Y coordinates of a point ''P'' (assuming it is not infinity).
** The constant ''G'' refers to the generator, for which ''x(G) = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798'' and ''y(G) = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8''.
** Addition of points refers to the usual [https://en.wikipedia.org/wiki/Elliptic_curve#The_group_law elliptic curve group operation].
** [https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication Multiplication (â‹…) of an integer and a point] refers to the repeated application of the group operation.
** The function ''x[i:j]'', where ''x'' is a byte array, returns a ''(j - i)''-byte array with a copy of the ''i''-th byte (inclusive) to the ''j''-th byte (exclusive) of ''x''.
** The function ''bytes(x)'', where ''x'' is an integer, returns the 32-byte encoding of ''x'', most significant byte first.
** The function ''is_square(x)'', where ''x'' is an integer, returns whether or not ''x'' is a quadratic residue modulo ''p''. Since ''p'' is prime, it is equivalent to the Legendre symbol ''(x / p) = x<sup>(p-1)/2</sup> mod p'' being equal to ''1'' (see [https://en.wikipedia.org/wiki/Euler%27s_criterion Euler's criterion])<ref>For points ''P'' on the secp256k1 curve it holds that ''x<sup>(p-1)/2</sup> ≠ 0 mod p''.</ref>.
** The function ''is_positive(P)'', where ''P'' is a point, is defined as ''not is_infinite(P) and is_square(y(P))''<ref>For points ''P'' on the secp256k1 curve it holds that ''is_positive(P) = not is_positive(-P)''.</ref>.
** The function ''lift_x(x)'', where ''x'' is an integer in range ''0..p-1'', returns the point ''P'' for which ''x(P) = x'' and ''is_positive(P)'', or fails if no such point exists<ref>Given an candidate X coordinate ''x'' in the range ''0..p-1'', there exist either exactly two or exactly zero valid Y coordinates. If no valid Y coordinate exists, then ''x'' is not a valid X coordinate either, i.e., no point ''P'' exists for which ''x(P) = x''. Given a candidate ''x'', the valid Y coordinates are the square roots of ''c = x<sup>3</sup> + 7 mod p'' and they can be computed as ''y = ±c<sup>(p+1)/4</sup> mod p'' (see [https://en.wikipedia.org/wiki/Quadratic_residue#Prime_or_prime_power_modulus Quadratic residue]) if they exist, which can be checked by squaring and comparing with ''c''. Due to [https://en.wikipedia.org/wiki/Euler%27s_criterion Euler's criterion] it then holds that ''c<sup>(p-1)/2</sup> = 1 mod p''. The same criterion applied to ''y'' results in ''y<sup>(p-1)/2</sup> mod p = ±c<sup>((p+1)/4)((p-1)/2)</sup> mod p = ±1 mod p''. Therefore ''y = +c<sup>(p+1)/4</sup> mod p'' is a quadratic residue and ''-y mod p'' is not.</ref>. The function ''lift_x(x)'' is equivalent to the following pseudocode:
** The function ''hash<sub>tag</sub>(x)'' where ''tag'' is a UTF-8 encoded tag name and ''x'' is a byte array returns the 32-byte hash ''SHA256(SHA256(tag) || SHA256(tag) || x)''.
Alternatively, the public key can be created according to [https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP32] which describes the derivation of 33-byte compressed public keys.
In order to translate such public keys into bip-schnorr compatible keys, the first byte must be dropped.
* Let ''k' = int(hash<sub>BIPSchnorrDerive</sub>(bytes(d) || m)) mod n''<ref>Note that in general, taking the output of a hash function modulo the curve order will produce an unacceptably biased result. However, for the secp256k1 curve, the order is sufficiently close to ''2<sup>256</sup>'' that this bias is not observable (''1 - n / 2<sup>256</sup>'' is around ''1.27 * 2<sup>-128</sup>'').</ref>.
'''Above deterministic derivation of ''R'' is designed specifically for this signing algorithm and may not be secure when used in other signature schemes.'''
For example, using the same derivation in the MuSig multi-signature scheme leaks the secret key (see the [https://eprint.iacr.org/2018/068 MuSig paper] for details).
Note that this is not a ''unique signature'' scheme: while this algorithm will always produce the same signature for a given message and public key, ''k'' (and hence ''R'') may be generated in other ways (such as by a CSPRNG) producing a different, but still valid, signature.
* Generate ''u-1'' random integers ''a<sub>2...u</sub>'' in the range ''1...n-1''. They are generated deterministically using a [https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator CSPRNG] seeded by a cryptographic hash of all inputs of the algorithm, i.e. ''seed = seed_hash(pk<sub>1</sub>..pk<sub>u</sub> || m<sub>1</sub>..m<sub>u</sub> || sig<sub>1</sub>..sig<sub>u</sub> )''. A safe choice is to instantiate ''seed_hash'' with SHA256 and use [https://tools.ietf.org/html/rfc8439 ChaCha20] with key ''seed'' as a CSPRNG to generate 256-bit integers, skipping integers not in the range ''1...n-1''.
* For ''i = 1 .. u'':
** Let ''P<sub>i</sub> = point(pk<sub>i</sub>)''; fail if ''point(pk<sub>i</sub>)'' fails.
If all individual signatures are valid (i.e., ''Verify'' would return success for them), ''BatchVerify'' will always return success. If at least one signature is invalid, ''BatchVerify'' will return success with at most a negligible probability.
Many techniques are known for optimizing elliptic curve implementations. Several of them apply here, but are out of scope for this document. Two are listed below however, as they are relevant to the design decisions:
'''Squareness testing''' The function ''is_square(x)'' is defined as above, but can be computed more efficiently using an [https://en.wikipedia.org/wiki/Jacobi_symbol#Calculating_the_Jacobi_symbol extended GCD algorithm].
'''Jacobian coordinates''' Elliptic Curve operations can be implemented more efficiently by using [https://en.wikibooks.org/wiki/Cryptography/Prime_Curve/Jacobian_Coordinates Jacobian coordinates]. Elliptic Curve operations implemented this way avoid many intermediate modular inverses (which are computationally expensive), and the scheme proposed in this document is in fact designed to not need any inversions at all for verification. When operating on a point ''P'' with Jacobian coordinates ''(x,y,z)'' which is not the point at infinity and for which ''x(P)'' is defined as ''x / z<sup>2</sup>'' and ''y(P)'' is defined as ''y / z<sup>3</sup>'':
There are several interesting applications beyond simple signatures.
While recent academic papers claim that they are also possible with ECDSA, consensus support for Schnorr signature verification would significantly simplify the constructions.
By means of an interactive scheme such as [https://eprint.iacr.org/2018/068 MuSig], participants can aggregate their public keys into a single public key which they can jointly sign for. This allows ''n''-of-''n'' multisignatures which, from a verifier's perspective, are no different from ordinary signatures, giving improved privacy and efficiency versus ''CHECKMULTISIG'' or other means.
Moreover, Schnorr signatures are compatible with [https://web.archive.org/web/20031003232851/http://www.research.ibm.com/security/dkg.ps distributed key generation], which enables interactive threshold signatures schemes, e.g., the schemes described by [http://cacr.uwaterloo.ca/techreports/2001/corr2001-13.ps Stinson and Strobl (2001)] or [https://web.archive.org/web/20060911151529/http://theory.lcs.mit.edu/~stasio/Papers/gjkr03.pdf Genaro, Jarecki and Krawczyk (2003)]. These protocols make it possible to realize ''k''-of-''n'' threshold signatures, which ensure that any subset of size ''k'' of the set of ''n'' signers can sign but no subset of size less than ''k'' can produce a valid Schnorr signature. However, the practicality of the existing schemes is limited: most schemes in the literature have been proven secure only for the case ''k-1 < n/2'', are not secure when used concurrently in multiple sessions, or require a reliable broadcast mechanism to be secure. Further research is necessary to improve this situation.
[https://download.wpsoftware.net/bitcoin/wizardry/mw-slides/2018-05-18-l2/slides.pdf Adaptor signatures] can be produced by a signer by offsetting his public nonce with a known point ''T = tG'', but not offsetting his secret nonce.
A correct signature (or partial signature, as individual signers' contributions to a multisignature are called) on the same message with same nonce will then be equal to the adaptor signature offset by ''t'', meaning that learning ''t'' is equivalent to learning a correct signature.
This can be used to enable atomic swaps or even [https://eprint.iacr.org/2018/472 general payment channels] in which the atomicity of disjoint transactions is ensured using the signatures themselves, rather than Bitcoin script support. The resulting transactions will appear to verifiers to be no different from ordinary single-signer transactions, except perhaps for the inclusion of locktime refund logic.
Adaptor signatures, beyond the efficiency and privacy benefits of encoding script semantics into constant-sized signatures, have additional benefits over traditional hash-based payment channels. Specifically, the secret values ''t'' may be reblinded between hops, allowing long chains of transactions to be made atomic while even the participants cannot identify which transactions are part of the chain. Also, because the secret values are chosen at signing time, rather than key generation time, existing outputs may be repurposed for different applications without recourse to the blockchain, even multiple times.
A blind signature protocol is an interactive protocol that enables a signer to sign a message at the behest of another party without learning any information about the signed message or the signature. Schnorr signatures admit a very [https://www.math.uni-frankfurt.de/~dmst/research/papers/schnorr.blind_sigs_attack.2001.pdf simple blind signature scheme] which is however insecure because it's vulnerable to [https://www.iacr.org/archive/crypto2002/24420288/24420288.pdf Wagner's attack]. A known mitigation is to let the signer abort a signing session with a certain probability, and the resulting scheme can be [https://eprint.iacr.org/2019/877 proven secure under non-standard cryptographic assumptions].
Blind Schnorr signatures could for example be used in [https://github.com/ElementsProject/scriptless-scripts/blob/master/md/partially-blind-swap.md Partially Blind Atomic Swaps], a construction to enable transferring of coins, mediated by an untrusted escrow agent, without connecting the transactors in the public blockchain transaction graph.
For development and testing purposes, we provide a [[bip-schnorr/test-vectors.csv|collection of test vectors in CSV format]] and a naive, highly inefficient, and non-constant time [[bip-schnorr/reference.py|pure Python 3.7 reference implementation of the signing and verification algorithm]].
The reference implementation is for demonstration purposes only and not to be used in production environments.
== Footnotes ==
<references />
== Acknowledgements ==
This document is the result of many discussions around Schnorr based signatures over the years, and had input from Johnson Lau, Greg Maxwell, Jonas Nick, Andrew Poelstra, Tim Ruffing, Rusty Russell, and Anthony Towns.