In this commit, we turn the package into a new Go module (version 2),
and then port over the current set of types and functions so that they
mainly alias to the more optimized and maintained dcrec variant.
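As a rough sketch, the aliasing looks something like the following (the
dcrec v4 import path and this particular handful of identifiers are
assumptions for illustration; the real module aliases a much larger
surface):

```go
package btcec

import (
	secp "github.com/decred/dcrd/dcrec/secp256k1/v4"
)

// Type aliases let existing callers keep compiling against the btcec
// names while the implementation lives in the maintained dcrec module.
type (
	PrivateKey = secp.PrivateKey
	PublicKey  = secp.PublicKey
)

// Thin wrappers delegate to dcrec rather than duplicating logic.
func S256() *secp.KoblitzCurve { return secp.S256() }

func ParsePubKey(pubKeyStr []byte) (*PublicKey, error) {
	return secp.ParsePubKey(pubKeyStr)
}
```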
Taking a look at the benchmarks, most operations other than
normalization (which IIRC is a bit slower now due to constant-time
fixes) enjoy some nice speedups:
```
benchmark                           old ns/op     new ns/op     delta
BenchmarkAddJacobian-8              464           328           -29.20%
BenchmarkAddJacobianNotZOne-8       1138          372           -67.27%
BenchmarkScalarBaseMult-8           47336         31531         -33.39%
BenchmarkScalarBaseMultLarge-8      42465         32057         -24.51%
BenchmarkScalarMult-8               123355        117579        -4.68%
BenchmarkNAF-8                      582           168           -71.12%
BenchmarkSigVerify-8                175414        120794        -31.14%
BenchmarkFieldNormalize-8           23.8          24.4          +2.39%
BenchmarkParseCompressedPubKey-8    24282         10907         -55.08%
```
The implementation has been adapted from the dcrec module in dcrd. The
bug was initially fixed in decred/dcrd@3d9cda1 while transitioning to a
constant-time algorithm. A large set of test vectors was subsequently
added in decred/dcrd@8c6b52d.
The function signature has been preserved for backwards compatibility.
This means the change to return whether the value has overflowed, along
with the corresponding test vectors, has not been backported.
This fixes #1170 and closes a previous attempt to fix the bug in #1178.
This commit optimizes the decompressPoint subroutine, used in extracting
compressed pubkeys and performing pubkey recovery. We do so by replacing
the use of big.Int.Exp with square-and-multiply exponentiation of
btcec's more optimized fieldVals, reducing the overall latency and
memory requirements of decompressPoint.
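For context, decompressPoint recovers the y-coordinate from an
x-coordinate and a parity bit by solving y^2 = x^3 + 7 (mod P). Since
P ≡ 3 (mod 4) for secp256k1, a square root of a quadratic residue v is
v^((P+1)/4) mod P. A minimal sketch of that computation on big.Ints
(decompressY is an illustrative name, not the package's API):

```go
package main

import (
	"errors"
	"math/big"
)

// p is the secp256k1 field prime P = 2^256 - 2^32 - 977.
var p, _ = new(big.Int).SetString(
	"fffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f", 16)

// decompressY computes y = (x^3 + 7)^((P+1)/4) mod P and flips it to
// the root with the requested parity. Sketch only; the optimized code
// performs the same steps on fieldVals instead of big.Ints.
func decompressY(x *big.Int, odd bool) (*big.Int, error) {
	ySq := new(big.Int).Exp(x, big.NewInt(3), p)
	ySq.Add(ySq, big.NewInt(7))
	ySq.Mod(ySq, p)

	q := new(big.Int).Add(p, big.NewInt(1))
	q.Rsh(q, 2) // Q = (P+1)/4
	y := new(big.Int).Exp(ySq, q, p)

	// The exponentiation only yields a root for quadratic residues,
	// so verify the candidate; otherwise x is not on the curve.
	check := new(big.Int).Mul(y, y)
	if check.Mod(check, p).Cmp(ySq) != 0 {
		return nil, errors.New("invalid point: x is not on the curve")
	}
	// The other root is P - y; pick the one with the requested parity.
	if (y.Bit(0) == 1) != odd {
		y.Sub(p, y)
	}
	return y, nil
}
```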
Instead of operating on individual bits of Q = (P+1)/4, the
exponentiation applies the square-and-multiply operations to full bytes
of Q. Compared to the bit-wise version, the improvement is roughly 10%.
A new pair of fieldVal methods, Sqrt and SqrtVal, is added; they apply
the square-and-multiply exponentiation using a precomputed byte slice
of the value Q.
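A sketch of that byte-wise loop, reusing the big.Int stand-ins from the
sketch above (the real Sqrt and SqrtVal run the same structure over
fieldVals with their Square and Mul methods):

```go
// sqrtVal raises val to the power encoded by the precomputed big-endian
// bytes of Q = (P+1)/4: eight squarings per byte, plus a multiply for
// each set bit. Walking whole bytes avoids calling big.Int.Bit on Q for
// every one of its 256 bits.
func sqrtVal(val, p *big.Int, qBytes []byte) *big.Int {
	r := big.NewInt(1)
	for _, b := range qBytes {
		for bit := 7; bit >= 0; bit-- {
			r.Mul(r, r) // square
			r.Mod(r, p)
			if b>>uint(bit)&1 == 1 {
				r.Mul(r, val) // multiply when this bit of Q is set
				r.Mod(r, p)
			}
		}
	}
	return r
}
```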
Comparison of big.Int sqrt against square-and-multiply sqrt over bytes
of Q:
```
benchmark                           old ns/op     new ns/op     delta
BenchmarkParseCompressedPubKey-8    35545         23119         -34.96%

benchmark                           old allocs    new allocs    delta
BenchmarkParseCompressedPubKey-8    35            6             -82.86%

benchmark                           old bytes     new bytes     delta
BenchmarkParseCompressedPubKey-8    2777          256           -90.78%
```
This modifies the normalize function of the internal field value to
both optimize it and address an issue where the reduction could lead to
an incorrect result for a small range of values. It also adds tests to
ensure the behavior is correct.
The following benchmarks show the relative speedups resulting from the
optimization on my system. In particular, the changes result in
approximately a 14% speedup in Normalize, which ultimately translates
to a 2% speedup in signature verification.
```
benchmark                       old ns/op     new ns/op     delta
BenchmarkAddJacobian            1364          1289          -5.50%
BenchmarkAddJacobianNotZOne     3150          3091          -1.87%
BenchmarkScalarBaseMult         134117        132816        -0.97%
BenchmarkScalarBaseMultLarge    135067        132966        -1.56%
BenchmarkScalarMult             411218        402217        -2.19%
BenchmarkSigVerify              671585        657833        -2.05%
BenchmarkFieldNormalize         36.0          31.0          -13.89%
```
As noted in issue #706, the existing code had an issue where the
normalized result was > P when the first and second words of the field
representation being normalized were both greater than or equal to the
first and second words of P. Although this condition is rare in
practice, it must be handled properly.
This resolves the issue by comparing the low words in the final
reduction step against the normalized low-order prime bits to ensure
the final subtraction occurs correctly any time the value is > P. This
approach retains the constant-time property as well.
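The general shape of such a constant-time final reduction can be
sketched as follows (simplified to four 64-bit limbs; btcec's field
actually uses ten 26-bit words, so this illustrates the masking
technique rather than the committed code):

```go
package main

import "math/bits"

// pLimbs is the secp256k1 prime P = 2^256 - 2^32 - 977 in little-endian
// 64-bit limbs.
var pLimbs = [4]uint64{
	0xFFFFFFFEFFFFFC2F, 0xFFFFFFFFFFFFFFFF,
	0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
}

// reduceOnce subtracts P exactly when v >= P without branching on the
// value: the subtraction always runs, and a mask derived from the final
// borrow selects between the original and reduced limbs.
func reduceOnce(v *[4]uint64) {
	var diff [4]uint64
	var borrow uint64
	for i := range v {
		diff[i], borrow = bits.Sub64(v[i], pLimbs[i], borrow)
	}
	// borrow == 0 means v >= P: mask is then all ones, selecting the
	// reduced limbs; otherwise mask is zero and v is left unchanged.
	mask := borrow - 1
	for i := range v {
		v[i] = (v[i] &^ mask) | (diff[i] & mask)
	}
}
```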
This commit contains various modifications for code and comment
consistency in the btcec package:
- Call out references at the top and reference them by their identifier in
the other comments
- Remove a TODO that no longer applies
- Add comments to the fields in the KoblitzCurve struct and reorder them
slightly
- Wrap comments to 80 columns
- Clean up code that far exceeded column 80 (typically only function
  declarations do this)
- Extend block comments to use as much of the 80 cols as available
- Add a bit more explanation in a couple of places
- Update copyright year on secp256k1.go
- Fix a couple of typos in the comments