`cargo bench` sets `cfg(test)`, causing us to hit some test-only
code in the router when benchmarking, throwing off our benchmarks
substantially. Here we swap from the `unstable` feature to a more
clearly internal feature (`_bench_unstable`) and also checking for
it when enabling test-only code.
ProbabilisticScorer uses successful and unsuccessful payments to gain
more certainty of a channel's liquidity balance. Decay this knowledge
over time to indicate decreasing certainty about the liquidity balance.
Add a Score implementation based on "Optimally Reliable & Cheap Payment
Flows on the Lightning Network" by Rene Pickhardt and Stefan Richter[1].
Given the uncertainty of channel liquidity balances, probability
distributions are defined based on knowledge learned from successful and
unsuccessful attempts. Then the negative log of the success probability
is used to determine the cost of routing a specific HTLC amount through
a channel.
[1]: https://arxiv.org/abs/2107.05322
A channel's capacity may be inferred or learned and is used to make
routing decisions, including as a parameter to channel scoring. Define
an EffectiveCapacity for this purpose. Score::channel_penalty_msat takes
the effective capacity (less in-flight HTLCs for the same payment), and
never None. Thus, for hops given in an invoice, the effective capacity
is now considered (near) infinite if over a private channel or based on
learned information if over a public channel.
If a Score implementations needs the effective capacity when updating a
channel's score, i.e. in payment_path_failed or payment_path_successful,
it can access the channel's EffectiveCapacity via the NetworkGraph by
first looking up the channel and then specifying which direction is
desired using ChannelInfo::as_directed.
Scorers may have different performance characteristics after seeing
failed and successful paths. Seed the scorer with some random data
before executing the benchmark in order to exercise such behavior.
Passing first_hops to get_route increases the coverage of the benchmark
test. For scorers needing the sending node, it allows for using a single
scorer in the benchmark rather than re-initializing on each iteration.
As a consequence, the scorer can be seeded with success and failure
data.
Refactor generate_routes and generate_mpp_routes into a single utility
for benchmarking. The utility is parameterized with features in order to
test both single path and multi-path routing. Additionally, it is
parameterized with a Score to be used with other scorers.
Because we time out channel info that is older than two weeks now,
we should also reject new channel info that is older than two
weeks, in addition to rejecting future channel info.
We define "stale" as "haven't heard an updated channel_update in
two weeks", as described in BOLT 7.
We also filter out stale channels at write-time for `std` users, as
we can look up the current time.
An unchecked shift of more than 64 bits on u64 values causes a shift
overflow panic. This may happen if a channel is penalized only once and
(1) is not successfully routed through and (2) after 64 or more half
life decays. Use a checked shift to prevent this from happening.
If a payment failed to route through a channel, a penalty is applied to
the channel in the future when finding a route. This penalty decays over
time. Immediately decay the penalty by one half life when a payment is
successfully routed through the channel.
Expand the Score trait with a payment_path_successful function for
scoring successful payment paths. Called by InvoicePayer's EventHandler
implementation when processing PaymentPathSuccessful events. May be used
by Score implementations to revert any channel penalties that were
applied by calls to payment_path_failed.
`scoring::Time` exists in part to make testing the passage of time
in `Scorer` practical. To allow no-std users to provide a time
source it was exposed as a trait as well. However, it seems
somewhat unlikely that a no-std user is going to have a use for
providing their own time source (otherwise they wouldn't be a
no-std user), and likely they won't have a graph in memory either.
`scoring::Time` as currently written is also exceptionally hard to
write C bindings for - the C bindings trait mappings relies on the
ability to construct trait implementations at runtime with function
pointers (i.e. `dyn Trait`s). `scoring::Time`, on the other hand,
is a supertrait of `core::ops::Sub` which requires a `sub` method
which takes a type parameter and returns a type parameter. Both of
which aren't practical in bindings, especially given the
`Sub::Output` associated type is not bound by any trait bounds at
all (implying we cannot simply map the `sub` function to return an
opaque trait object).
Thus, for simplicity, we here simply seal `scoring::Time` and make
it effectively-private, ensuring the bindings don't need to bother
with it.
Ultimately we likely need to wrap the locked `Score` in a struct
that exposes writeable somehow, but because all traits have to be
fully concretized for C bindings we'll still need `Writeable` on
all `Score` in order to expose `Writeable` on the locked score.
Otherwise, we'll only have a `LockedScore` with a `Score` visible
that only has the `Score` methods, never the original type.
Even if our gossip hasn't changed, we should be willing to
re-broadcast it to our peers. All our peers may have been
disconnected the last time we broadcasted it.
Traits in top-level modules is somewhat confusing - generally
top-level modules are just organizational modules and don't contain
things themselves, instead placing traits and structs in
sub-modules. Further, its incredibly awkward to have a `scorer`
sub-module, but only have a single struct in it, with the relevant
trait it is the only implementation of somewhere else. Not having
`Score` in the `scorer` sub-module is further confusing because
it's the only module anywhere that references scoring at all.
Sending HTLCs which are any greater than a very small fraction of the
channel size tend to fail at a much higher rate. Thus, by default
we start applying a penalty at only 1/8th the channel size and
increase it linearly as the amount reaches the channel's capacity,
20 msat per 1024th of the channel capacity.
This should allow `Score` implementations to make substantially
better decisions, including of the form "willing to pay X to avoid
routing over this channel which may have a high failure rate".
In order to test Scorer hermetically, sleeps must be avoided. Add a
SinceEpoch abstraction for manually advancing time. Implement the Time
trait for SinceEpoch so that it can be used with ScorerUsingTime in
tests.
The bindings don't currently support passing `Vec`s of objects
which it mappes as "opaque types". This is because it will require
clones to convert its own list of references to Rust's list of
objects.
In the near future we should resolve this limitation, allowing us
to revert this (and make `find_route`'s method signature similarly
cleaner), but for now we must avoid `Vec<OpaqueType>`.
Scorer should be serialized to retain penalty data between restarts.
Implement (de)serialization for Scorer by serializing last failure times
as duration since the UNIX epoch. For no-std, the zero-Duration is used.
Scorer uses time to determine how much to penalize a channel after a
failure occurs. Parameterizing it by time cleans up the code such that
no-std support is in a single AlwaysPresent struct, which implements the
Time trait. Time is implemented for std::time::Instant when std is
available.
This parameterization also allows for deterministic testing since a
clock could be devised to advance forward as needed.
Move channel failure penalty logic into a ChannelFailure abstraction.
This encapsulates the logic for accumulating penalties and decaying them
over time. It also is responsible for the no-std behavior. This cleans
up Scorer and will make it easier to serialize it.