mirror of https://github.com/bitcoin/bips.git (synced 2025-01-19 05:45:07 +01:00)
BIP 158: Correct statements about false positive rate.
parent f4948ddb4f
commit b88d5b6b5c
@@ -65,11 +65,10 @@ For each block, compact filters are derived containing sets of items associated
 with the block (eg. addresses sent to, outpoints spent, etc.). A set of such
 data objects is compressed into a probabilistic structure called a
 ''Golomb-coded set'' (GCS), which matches all items in the set with probability
-1, and matches other items with probability <code>2^(-P)</code> for some
-integer parameter <code>P</code>. We also introduce parameter <code>M</code>
-which allows filter to uniquely tune the range that items are hashed onto
-before compressing. Each defined filter also selects distinct parameters for P
-and M.
+1, and matches other items with probability <code>1/M</code> for some
+integer parameter <code>M</code>. The encoding is also parameterized by
+<code>P</code>, the bit length of the remainder code. Each filter defined
+specifies values for <code>P</code> and <code>M</code>.

 At a high level, a GCS is constructed from a set of <code>N</code> items by:
 # hashing all items to 64-bit integers in the range <code>[0, N * M)</code>
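The hashing step and the <code>1/M</code> match probability in this hunk can be illustrated with a rough sketch. It is not the BIP's reference code: it assumes a keyed BLAKE2b digest truncated to 64 bits as a stand-in for the hash (the BIP specifies SipHash, which Python's standard library does not expose), and the helper names <code>hash_to_range</code>, <code>build_hashed_set</code> and <code>false_positive_rate</code> are hypothetical. The multiply-and-shift step is one way to map a 64-bit value onto <code>[0, N * M)</code>.

```python
import hashlib


def hash_to_range(item: bytes, f: int, key: bytes) -> int:
    """Map an item onto [0, f) using a 64-bit keyed hash and multiply-shift."""
    # Assumption: keyed BLAKE2b (8-byte digest) stands in for SipHash here.
    digest = hashlib.blake2b(item, digest_size=8, key=key).digest()
    h = int.from_bytes(digest, "little")
    # Because h < 2**64, (h * f) >> 64 always lies in [0, f).
    return (h * f) >> 64


def build_hashed_set(items, m: int, key: bytes):
    """Hash N items onto the range [0, N * M) and return (set, N * M)."""
    f = len(items) * m
    return {hash_to_range(item, f, key) for item in items}, f


def false_positive_rate(n: int, m: int, trials: int, key: bytes = bytes(16)) -> float:
    """Estimate how often a non-member's hash collides with a member's hash."""
    members = [b"member-%d" % i for i in range(n)]
    hashed, f = build_hashed_set(members, m, key)
    hits = sum(
        hash_to_range(b"other-%d" % i, f, key) in hashed for i in range(trials)
    )
    return hits / trials
```

Every member matches by construction, while a non-member lands on one of at most <code>N</code> occupied values out of <code>N * M</code>, so the estimate returned by <code>false_positive_rate(1000, 64, 20000)</code> should sit near <code>1/64</code>.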
@@ -88,8 +87,8 @@ one is able to select both Parameters independently, then more optimal values
 can be
 selected<ref>https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845</ref>.
 Set membership queries against the hash outputs will have a false positive rate
-of <code>2^(-P)</code>. To avoid integer overflow, the
-number of items <code>N</code> MUST be <2^32 and <code>M</code> MUST be <2^32.
+of <code>M</code>. To avoid integer overflow, the number of items <code>N</code>
+MUST be <2^32 and <code>M</code> MUST be <2^32.

 The items are first passed through the pseudorandom function ''SipHash'', which
 takes a 128-bit key <code>k</code> and a variable-sized byte vector and produces
@@ -189,9 +188,10 @@ golomb_decode(stream, P: uint) -> uint64:

 ==== Set Construction ====

-A GCS is constructed from three parameters:
+A GCS is constructed from four parameters:
 * <code>L</code>, a vector of <code>N</code> raw items
-* <code>P</code>, which determines the false positive rate
+* <code>P</code>, the bit parameter of the Golomb-Rice coding
+* <code>M</code>, the target false positive rate
 * <code>k</code>, the 128-bit key used to randomize the SipHash outputs

 The result is a byte vector with a minimum size of <code>N * (P + 1)</code>
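The <code>N * (P + 1)</code> lower bound in the last context line can be checked with a small sketch. It assumes the bound counts bits (the quoted line is cut off by the hunk boundary) and that each delta is coded as a unary quotient followed by a <code>P</code>-bit remainder; the function name <code>golomb_rice_bit_length</code> is hypothetical.

```python
def golomb_rice_bit_length(values, p: int) -> int:
    """Count the bits needed to Golomb-Rice code the deltas of sorted values."""
    total_bits = 0
    last = 0
    for value in sorted(values):
        delta = value - last
        quotient = delta >> p
        # Unary quotient: `quotient` one-bits plus a terminating zero-bit,
        # followed by the low `p` bits of the delta as the remainder.
        total_bits += quotient + 1 + p
        last = value
    return total_bits
```

Each item costs at least one terminating bit plus <code>P</code> remainder bits, so the total cannot fall below <code>N * (P + 1)</code>; the bound is met exactly only when every quotient is zero.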