When using the normal default constructors, we should have some
fee maximum to ensure our default behavior is safe. Here we pick
1% + 50 sats so we're always willing to pay reasonable (even
reasonably high) fees, but nothing too wild.
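As a rough sketch, the default cap described above could be computed
like this (the helper name is hypothetical, not LDK's API):

```rust
// Hypothetical helper illustrating the default cap described above:
// 1% of the amount being sent, plus 50 sats (50_000 msat).
fn default_max_total_routing_fee_msat(final_value_msat: u64) -> u64 {
    final_value_msat / 100 + 50_000
}
```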
We exclude any candidate hop if using it would cause the aggregated
path routing fees to exceed `max_total_routing_fee_msat`. Moreover,
we return an error if the aggregated fees over all paths of the
selected route would surpass `max_total_routing_fee_msat`.
Currently, users have no means to upper-bound the total fees accruing
when finding a route. Here, we add a corresponding field to
`RouteParameters` which will be used to limit the candidate set during
path finding in the following commits.
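Abbreviated, the shape of the change is roughly the following (only
the new field is shown; `Option<u64>` is an assumption for
illustration):

```rust
// Abbreviated sketch of `RouteParameters` with the new optional cap.
pub struct RouteParameters {
    // ... existing fields (payment parameters, amount to send, etc.) ...

    /// The maximum total fees, in millisatoshi, that may accrue during
    /// route finding. `None` means no limit is enforced.
    pub max_total_routing_fee_msat: Option<u64>,
}
```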
In bindings, we can't use unbounded generic types, and thus have to
rip out the `ScoreParams` and replace them with static
`ProbabilisticScoringFeeParameters` universally. To make this easier,
we use `Default::default()` everywhere, which allows the type to
change out from under the test without the test needing to change.
Our "what is the success probability of paying over a channel with
the given liquidity bounds" calculation currently assumes the
probability of where the liquidity lies in a channel is constant
across the entire capacity of a channel. This is obviously a
somewhat dubious assumption given most nodes don't materially
rebalance and flows within the network often push liquidity
"towards the edges".
Here we add an option to consider this when scoring channels during
routefinding. Specifically, if a new `linear_success_probability`
flag is unset on `ProbabilisticScoringFeeParameters`, rather than
assuming a PDF of `1` (across the channel's capacity scaled from 0
to 1), we use `(x - 0.5)^2`.
This assumes liquidity is likely to be near the edges, which
matches experimental results. Further, calculating the CDF (i.e.
integral) between arbitrary points on the PDF is trivial, which we
do as our main scoring function.
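A minimal sketch of that calculation (the function name and the
scaling of liquidity bounds onto [0, 1] are assumptions for
illustration, not the scorer's exact code):

```rust
/// Probability that a payment of `amount` succeeds over a channel whose
/// liquidity is known to lie in [`min`, `max`], with all values scaled so
/// the channel's capacity maps onto [0.0, 1.0], under a PDF proportional
/// to (x - 0.5)^2.
fn nonlinear_success_probability(min: f64, amount: f64, max: f64) -> f64 {
    // Antiderivative of (x - 0.5)^2; the normalization constant cancels
    // in the ratio below, so it can be dropped.
    let cdf = |x: f64| {
        let y = x - 0.5;
        y * y * y / 3.0
    };
    // P(liquidity >= amount | min <= liquidity <= max)
    (cdf(max) - cdf(amount)) / (cdf(max) - cdf(min))
}
```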
While this (finally) introduces floats into our scoring, it's not
practical to exponentiate using fixed precision. Benchmarks show
this is a performance regression, though not a huge one, and it is
more than made up for by the increase in payment success rates.
When we started considering the in-flight amounts when scoring, we
took the approach of considering the in-flight amount as an
effective reduction in the channel's total capacity. When we were
scoring using a flat success probability PDF that was fine;
however, in the next commit we'll move to a highly nonlinear one,
which makes this a pretty confusing heuristic.
Here, instead, we move to considering the in-flight amount as
simply an extension of the amount we're trying to send over the
channel, which is equivalent for the flat success probability PDF,
but makes much more sense in a nonlinear world.
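Schematically, the change amounts to the following (hypothetical
helpers, not the scorer's actual code):

```rust
// Old approach: treat in-flight HTLCs as a reduction of the channel's
// effective capacity.
fn effective_capacity_msat(capacity_msat: u64, in_flight_msat: u64) -> u64 {
    capacity_msat.saturating_sub(in_flight_msat)
}

// New approach: treat in-flight HTLCs as an extension of the amount
// we're trying to send over the channel.
fn effective_amount_msat(amount_msat: u64, in_flight_msat: u64) -> u64 {
    amount_msat.saturating_add(in_flight_msat)
}
```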
If we are examining a channel for which we have no information at
all, we traditionally assume the HTLC success probability is
proportional to the channel's capacity. While this may be the case,
it is not the case that a tiny payment over a huge channel is
guaranteed to succeed, as we assume. Rather, the probability of
such success is likely closer to 50% than 100%.
Here we try to capture this by simply scaling the success
probability for channels where we have no information down
linearly. We pick 75% as the upper bound rather arbitrarily - while
50% may be more accurate, it's possible it would lead to an
over-reliance on channels which we have paid through in the past,
which aren't necessarily always the best candidates.
Note that we only do this scaling for the historical bucket
tracker, as there we can be confident we've never seen a successful
HTLC completion on the given channel. If we were to apply the same
scaling to the simple liquidity bounds based scoring we'd penalize
channels we've never tried over those we've only ever failed to pay
over, which is obviously not a good outcome.
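As a rough illustration (constant and function names are
hypothetical), the linear scaling amounts to:

```rust
// Hypothetical sketch: cap the no-information success probability at 75%
// rather than letting it approach 100% for tiny payments over huge
// channels.
const NO_INFO_SUCCESS_PROBABILITY_CAP: f64 = 0.75;

fn scaled_no_info_success_probability(capacity_based_probability: f64) -> f64 {
    // Scale linearly so 1.0 becomes 0.75, 0.5 becomes 0.375, etc.
    capacity_based_probability * NO_INFO_SUCCESS_PROBABILITY_CAP
}
```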
Our "what is the success probability of paying over a channel with
the given liquidity bounds" calculation is reused in a few places,
and is a key assumption across our main score calculation and the
historical bucket score calculations.
Here we break it out into a function to make it easier to
experiment with different success probability calculations.
Note that this drops the +1 in the numerator in the liquidity
scorer, which was added to compensate for the +1 in the divisor
(which exists to avoid divide-by-zero), making the new math slightly
less correct, but not by any material amount.
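A simplified sketch of the before/after flat-PDF calculation this
note refers to (function names are illustrative):

```rust
// Before: a +1 in the numerator compensated for the +1 in the divisor.
fn old_flat_success_probability(min_msat: u64, amount_msat: u64, max_msat: u64) -> f64 {
    (max_msat - amount_msat + 1) as f64 / (max_msat - min_msat + 1) as f64
}

// After: the shared helper keeps only the divisor's +1 (to avoid
// divide-by-zero), which is slightly less correct but not materially so.
fn flat_success_probability(min_msat: u64, amount_msat: u64, max_msat: u64) -> f64 {
    (max_msat - amount_msat) as f64 / (max_msat - min_msat + 1) as f64
}
```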
When sending preflight probes, we want to exclude last hops that are
possibly announced. To this end, we here include a new field in
`RouteHop` that will be `true` when we either definitely know the
hop to be announced, or when there exist public channels between the
hop's counterparties that this hop might refer to (i.e., be an alias
for).
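Abbreviated, the new field looks something like this (other fields
elided; the exact field name is assumed from this commit's intent):

```rust
pub struct RouteHop {
    // ... existing fields (node pubkey, short_channel_id, fees, CLTV, etc.) ...

    /// `true` if this hop is definitely announced, or if there exist public
    /// channels between its counterparties which this hop might be an
    /// alias for.
    pub maybe_announced_channel: bool,
}
```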
Scoring buckets are stored as fixed-point integers, with a 5-bit
fractional part (i.e. a value of 1.0 is stored as "32"). Now that
we also have 32 buckets, this leads to the codebase having many
references to 32 which could reasonably be confused for each other.
Thus, we add a constant here for the value 1.0 in our fixed-point
scheme.
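That is, roughly (the constant name is illustrative):

```rust
// 1.0 in the scoring buckets' fixed-point scheme: with a 5-bit fractional
// part, 1.0 is stored as 32 (and e.g. 0.5 as 16).
const BUCKET_FIXED_POINT_ONE: u16 = 32;
```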
`historical_estimated_channel_liquidity_probabilities` previously
decayed to `Some(([0; 8], [0; 8]))`. This was thought to be useful
in that it allowed identification of cases where data was previously
available but is now decayed away vs cases where data was never
available. However, with the introduction of
`historical_estimated_payment_success_probability` (which uses the
existing scoring routines so will decay to `None`) this is
unnecessarily confusing.
Given data which has decayed to zero will also not be used anyway,
there's little reason to keep the old behavior, and we now decay to
`None`.
We also take this opportunity to split the overloaded
`get_decayed_buckets`, removing unnecessary code during scoring.
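Conceptually, the accessor now behaves like this sketch (signature
simplified and hypothetical):

```rust
// Return `None` once all history has fully decayed away, rather than
// returning zeroed buckets.
fn buckets_or_none(min_buckets: &[u16], max_buckets: &[u16])
    -> Option<(Vec<u16>, Vec<u16>)>
{
    if min_buckets.iter().all(|b| *b == 0) && max_buckets.iter().all(|b| *b == 0) {
        None
    } else {
        Some((min_buckets.to_vec(), max_buckets.to_vec()))
    }
}
```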
Points in the 0th minimum bucket either indicate we sent a payment
which is < 1/16,384th of the channel's capacity or, more likely,
we failed to send a payment. In either case, averaging the success
probability across the full range of upper-bounds doesn't make a
whole lot of sense - if we've never managed to send a "real"
payment over a channel, we should be considering it quite poor.
To address this, we special-case the 0th minimum bucket and only
look at the largest-offset max bucket when calculating the success
probability.
The lower-bound scoring history buckets generally never get used -
if we try to send a payment and it fails, we don't learn a new
lower bound for the liquidity of a channel, and if we successfully
send a payment we only learn a lower bound that applied *before* we
sent the payment, not after it completed.
If we assume channels have some "steady-state" liquidity, then
tracking our liquidity estimates *after* a payment doesn't really
make sense - we're not super likely to make a second payment across
the same channel immediately (or, if we are, we can use our
un-decayed liquidity estimates for that). By the time we do go to
use the same channel again, we'd assume that it's back at its
"steady-state" and the impacts of our payment have been lost.
To combat both of these effects, here we "subtract" the impact of
any just-successful payments from our liquidity estimates prior to
updating the historical buckets.
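A rough, purely illustrative sketch of the idea (not the scorer's
actual update logic; names are hypothetical):

```rust
struct LiquidityBounds { min_msat: u64, max_msat: u64 }

// Before recording a just-successful payment in the historical buckets,
// "un-apply" the amount that payment removed from the channel, so the
// buckets track the channel's assumed steady-state liquidity rather than
// the immediately-post-payment state.
fn bounds_for_history(post_payment: LiquidityBounds, just_sent_msat: u64) -> LiquidityBounds {
    LiquidityBounds {
        min_msat: post_payment.min_msat.saturating_add(just_sent_msat),
        max_msat: post_payment.max_msat.saturating_add(just_sent_msat),
    }
}
```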
Currently we store our historical estimates of channel liquidity in
eight evenly-sized buckets, each representing a full octile of the
channel's total capacity. This lacks precision, especially at the
edges of channels where liquidity is expected to lie.
To mitigate this, we'd originally checked whether a payment lies
within a bucket by comparing it to a sliding scale of 64ths of the
channel's capacity. This allowed us to assign penalties to payments
that fall anywhere above the bottom 64th or below the top 64th of a
channel.
However, this still lacks material precision - on a 1 BTC channel
we could only consider failures for HTLCs above 1.5 million sats.
With today's lightning usage often including 1-100 sat payments in
tips, this is a rather significant lack of precision.
Here we rip out the existing buckets and replace them with 32
*unequal* sized buckets. This allows us to focus our precision at
the edges of a channel (where the liquidity is likely to lie, and
where precision helps the most).
We set the size of the edge buckets to 1/16,384th of the channel,
with the size increasing exponentially until it approaches the
inner buckets. For backwards compatibility, the buckets divide
evenly into octiles, allowing us to convert the existing buckets
into the new ones cleanly.
This allows us to consider HTLCs down to 6,000 sats for 1 BTC
channels. In order to avoid failing to penalize channels which have
always failed, we drop the sliding scale for comparisons and simply
check if the payment is above the minimum bucket we're analyzing and
below *or in* the maximum one. This generates somewhat more
pessimistic scores, but fixes the lower bound where we suddenly
assign a 0% failure probability.
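To make the edge-bucket arithmetic concrete (a back-of-the-envelope
sketch, not the scorer's actual bucket table):

```rust
// The smallest (edge) buckets each cover 1/16,384th of the channel's
// capacity.
fn min_bucket_size_sats(capacity_sats: u64) -> u64 {
    capacity_sats / 16_384
}

fn main() {
    // For a 1 BTC (100,000,000 sat) channel this is ~6,100 sats, which is
    // why HTLCs down to roughly 6,000 sats can now be distinguished.
    assert_eq!(min_bucket_size_sats(100_000_000), 6_103);
}
```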
While this does represent a regression in routing performance, in
some cases the impact of not having to examine as many nodes
dominates, leading to a performance increase.
On a Xeon E3-1220 v5, the `large_mpp_routes` benchmark shows a 15%
performance increase, while the more stable benchmarks show an 8%
and 15% performance regression.
- Split `Score` (from `LockableScore`) into `ScoreLookUp` to handle read
  operations and `ScoreUpdate` to handle write operations
- Change all structs that implemented `Score` to implement `ScoreLookUp`
  and/or `ScoreUpdate`
- Change `Mutex`es to `RwLock`s to allow multiple concurrent readers
- Change `LockableScore` to `Deref` in `ScorerAccountingForInFlightHtlcs`,
  as we only need to read
- Add `ScoreLookUp` and `ScoreUpdate` docs
- Remove the reference (`&'a`) and `Sized` bound from `Score` in
  `ScorerAccountingForInFlightHtlcs`, as `Score` implements `Deref`
- Split `MultiThreadedScoreLock` into `MultiThreadedScoreLockWrite` and
  `MultiThreadedScoreLockRead`.
After splitting `LockableScore`, we split `MultiThreadedScoreLock` in
the same way, splitting the single lock into two structs, one for
reads and one for writes (a rough sketch of the resulting shape
follows).
`MultiThreadedScoreLock` is used in `c_bindings`.
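The overall shape of the split, sketched with simplified signatures
(not the exact API):

```rust
use std::sync::RwLock;

// Read-only scoring operations, used while finding routes.
trait ScoreLookUp {
    fn channel_penalty_msat(&self, short_channel_id: u64, amount_msat: u64) -> u64;
}

// Mutating operations, used to feed payment results back into the scorer.
trait ScoreUpdate {
    fn payment_path_failed(&mut self, short_channel_id: u64);
    fn payment_path_successful(&mut self);
}

// With reads separated from writes, a shared scorer can sit behind an
// RwLock and many readers can score candidate channels concurrently.
struct MultiThreadedLockableScore<S: ScoreLookUp + ScoreUpdate> {
    score: RwLock<S>,
}
```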
This previously led to a debug panic in the router because we wouldn't
account for the blinded path fee when calculating the
first_hop<>intro_node hop's available liquidity, and would construct an
invalid path that forwarded more over said hop than was actually
available.
This also led to us hitting unreachable code; see the
direct_to_matching_intro_nodes test description.
Because a `UtxoLookup` implementation is likely to need a reference
to the `PeerManager` which contains a reference to the
`P2PGossipSync`, it is likely to be impossible to get a mutable
reference to the `P2PGossipSync` by the time we want to add a
`UtxoLookup` without a ton of boilerplate and trait wrapping.
Instead, we simply place the `UtxoLookup` in a `RwLock`, allowing
us to modify it without a mutable self reference.
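Abbreviated, the change looks roughly like this (generics and other
fields elided; method name assumed):

```rust
use std::ops::Deref;
use std::sync::RwLock;

struct P2PGossipSync<U: Deref> {
    // Keeping the lookup behind a RwLock lets it be installed later through
    // `&self`, avoiding the need for a mutable reference to the gossip sync.
    utxo_lookup: RwLock<Option<U>>,
    // ... other fields elided ...
}

impl<U: Deref> P2PGossipSync<U> {
    fn add_utxo_lookup(&self, utxo_lookup: Option<U>) {
        *self.utxo_lookup.write().unwrap() = utxo_lookup;
    }
}
```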
The lifetime-bound updates to tests required in this commit are
entirely unclear to me, but they do allow the tests to continue
building, so they somehow make rustc happier.
In 3f32f60ae7 we exposed the
historical success probability buckets directly, with a long method
doc explaining how to use them. While this is great for logging
exactly what the internal model thinks, it's also helpful to let
users know directly what the internal model thinks the success
probability is, allowing them to compare route success probabilities.
Here we do so, but only for the historical tracking buckets.
When we attempt to score a channel which has a very low success
probability, we may have a log well above our cut-off of two. For the
liquidity penalties this works fine: we bound it by
`NEGATIVE_LOG10_UPPER_BOUND` and take the `min` of the two scores. For
the amount liquidity penalty we didn't do any `min`ing at all.
The fix is to `min` the log itself first and then reuse the `min`'d
log in both calculations.
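In spirit, the fix is (names simplified; the real code works in
fixed-point log space):

```rust
// Clamp the negative log10 once, then reuse the clamped value in both the
// liquidity penalty and the amount liquidity penalty.
const NEGATIVE_LOG10_UPPER_BOUND: f64 = 2.0;

fn clamped_negative_log10(success_probability: f64) -> f64 {
    (-success_probability.log10()).min(NEGATIVE_LOG10_UPPER_BOUND)
}
```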
Currently we let an `htlc_amount >= channel_capacity` pass through
from `penalty_msat` to
`calculate_success_probability_times_billion`, but only if it's only
marginally bigger (less than 65/64ths). This is fine, as
`calculate_success_probability_times_billion` handles bogus values
just fine (it will always return a zero probability in such cases).
However, this is risky, and in fact breaks in the coming commits, so
instead we check it before ever calling through to the historical
bucket probability calculations.
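The guard amounts to something like this (illustrative helper):

```rust
// Skip the historical bucket calculations entirely when the amount can't
// possibly fit in the channel; callers apply the maximum penalty instead.
fn can_use_historical_buckets(amount_msat: u64, capacity_msat: u64) -> bool {
    amount_msat < capacity_msat
}
```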