When we broadcast a node announcement, the features we support are really a
combination of all the various features our different handlers support. This
commit captures this concept by OR'ing our NodeFeatures across both our channel
and routing message handlers.
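For illustration, the combination is a simple bitwise OR of feature bit
vectors (a minimal sketch with a stand-in type, not LDK's actual
`NodeFeatures` API); the same union applies to `InitFeatures` in the
following change:

```rust
use core::ops::BitOr;

// Simplified stand-in for LDK's feature bit vectors, for illustration only.
#[derive(Clone, Copy)]
struct Features(u64);

impl BitOr for Features {
    type Output = Features;
    fn bitor(self, rhs: Features) -> Features { Features(self.0 | rhs.0) }
}

// The features we advertise are the union of what each handler supports.
fn combined_features(channel_handler: Features, routing_handler: Features) -> Features {
    channel_handler | routing_handler
}
```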
When we go to send an Init message to new peers, the features we
support are really a combination of all the various features our
different handlers support. This commit captures this concept by
OR'ing our InitFeatures across both our Channel and Routing
handlers.
Note that this also disables setting the `initial_routing_sync`
flag in init messages, as was intended in
e742894492, per the comment added on
`clear_initial_routing_sync`, though this should not be a behavior
change in practice as nodes which support gossip queries ignore the
initial routing sync flag.
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as it's really more about whether to
retry than about where a payment failed. Now is as good a time as any
to rename it.
Added two methods, `process_path_inflight_htlcs` and
`remove_path_inflight_htlcs`, that update the `payment_cache` map with
information about paths that may have failed, succeeded, or been given
up on.
Introduced `AccountForInflightHtlcs`, which will wrap our user-provided
scorer. We move the `S: Score` type parameterization from the `Router`
to `find_route`, so we can use our newly introduced
`AccountForInflightHtlcs`.
`AccountForInflightHtlcs` keeps track of a map of inflight HTLCs, keyed
by short channel id and direction, giving us the value that is
currently being used up.
This map will in turn be populated prior to calling `find_route`, where
we'll use `create_inflight_map` to generate a current map of all
inflight HTLCs based on what was stored in `payment_cache`.
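A rough sketch of the wrapping idea, with a simplified stand-in for the
`Score` trait (the real signatures differ, e.g. direction is derived
from the source/target node ids there):

```rust
use std::collections::HashMap;

// Simplified stand-in for LDK's Score trait, for illustration only.
trait Score {
    fn channel_penalty_msat(
        &self, scid: u64, direction: bool, amount_msat: u64, capacity_msat: u64,
    ) -> u64;
}

// Wraps the user-provided scorer with a map of (short channel id, direction)
// to the msat value currently locked up in inflight HTLCs.
struct AccountForInflightHtlcs<'a, S: Score> {
    scorer: &'a S,
    inflight_htlcs: HashMap<(u64, bool), u64>,
}

impl<'a, S: Score> Score for AccountForInflightHtlcs<'a, S> {
    fn channel_penalty_msat(
        &self, scid: u64, direction: bool, amount_msat: u64, capacity_msat: u64,
    ) -> u64 {
        // Capacity consumed by our own inflight HTLCs is treated as
        // unavailable, so the wrapped scorer sees only the remaining headroom.
        let used = self.inflight_htlcs.get(&(scid, direction)).copied().unwrap_or(0);
        self.scorer.channel_penalty_msat(
            scid, direction, amount_msat, capacity_msat.saturating_sub(used))
    }
}
```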
If we receive a ChannelAnnouncement message but we already have the
channel, there's no reason to do a chain lookup. Instead of
immediately calling the user-provided `chain::Access` when handling
a ChannelAnnouncement, we first check if we have the corresponding
channel in the graph.
Note that if we do have the corresponding channel but it was not
previously checked against the blockchain, we should still check
with the `chain::Access` and update if necessary.
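The control flow amounts to roughly the following (stand-in types; not
LDK's actual fields):

```rust
use std::collections::HashMap;

// Minimal stand-ins, for illustration only.
struct ChannelInfo { chain_checked: bool }
struct Graph { channels: HashMap<u64, ChannelInfo> }

fn needs_chain_lookup(graph: &Graph, short_channel_id: u64) -> bool {
    match graph.channels.get(&short_channel_id) {
        // Already have the channel and it was checked on-chain: skip the
        // (potentially expensive) `chain::Access` UTXO lookup entirely.
        Some(chan) if chan.chain_checked => false,
        // Known but never chain-checked, or entirely unknown: do the lookup.
        _ => true,
    }
}
```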
The `bitcoin_key_1` and `bitcoin_key_2` fields in
`channel_announcement` messages are sorted according to node_ids
rather than the keys themselves; however, the on-chain funding
script is sorted according to the bitcoin keys themselves. Thus,
with some probability, we end up checking that the on-chain script
matches the wrong script and rejecting the channel announcement.
The correct solution is to use our existing channel funding script
generation function, which ensures we always match what we generate.
This was found in testing the Java bindings, where a test checks
that returning the generated funding script in `chain::Access`
results in the constructed channel ending up in our network graph.
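A sketch of the sorted 2-of-2 construction using rust-bitcoin (crate
paths are version-dependent; LDK's existing helper is the source of
truth):

```rust
use bitcoin::blockdata::opcodes::all::{OP_CHECKMULTISIG, OP_PUSHNUM_2};
use bitcoin::blockdata::script::{Builder, Script};
use bitcoin::secp256k1::PublicKey;

// Order by the *bitcoin* keys' serialized bytes (not by node_id), build the
// 2-of-2 redeem script, then wrap it as P2WSH.
fn funding_script(key_a: &PublicKey, key_b: &PublicKey) -> Script {
    let (first, second) = if key_a.serialize() < key_b.serialize() {
        (key_a, key_b)
    } else {
        (key_b, key_a)
    };
    Builder::new()
        .push_opcode(OP_PUSHNUM_2)
        .push_slice(&first.serialize())
        .push_slice(&second.serialize())
        .push_opcode(OP_PUSHNUM_2)
        .push_opcode(OP_CHECKMULTISIG)
        .into_script()
        .to_v0_p2wsh()
}
```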
Instead of backfilling gossip by buffering (up to) ten messages at
a time, only buffer one message at a time, as the peers' outbound
socket buffer drains. This moves the outbound backfill messages out
of `PeerHandler` and into the operating system buffer, where it
arguably belongs.
Not buffering causes us to walk the gossip B-Trees somewhat more
often, but avoids allocating vecs for the responses. While it's
probably (without having benchmarked it) a net performance loss, it
simplifies buffer tracking and leaves us with more room to play
with the buffer sizing constants as we add onion message forwarding,
which is an important win.
Note that because we change how often we check if we're out of
messages to send before pinging, we slightly change how many
messages are exchanged at once, impacting the
`test_do_attempt_write_data` constants.
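The new buffering strategy amounts to roughly the following (names are
illustrative, not `PeerHandler`'s internals):

```rust
use std::collections::VecDeque;

// Illustrative stand-in for per-peer state.
struct Peer {
    pending_outbound_buffer: VecDeque<Vec<u8>>,
    gossip_backfill: Vec<Vec<u8>>, // pre-encoded messages, standing in for B-Tree walks
}

// Rather than queueing up to ten backfill messages at a time, top up a single
// message, and only once everything previously queued has hit the OS buffer.
fn maybe_queue_backfill(peer: &mut Peer) {
    if peer.pending_outbound_buffer.is_empty() {
        if let Some(msg) = peer.gossip_backfill.pop() {
            peer.pending_outbound_buffer.push_back(msg);
        }
    }
}
```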
This makes our `ProbabilisticScorer` field names more consistent.
As we add more types of penalties, referring to a penalty as only
the "amount penalty" no longer makes sense - we now have several
amount multiplier penalties.
There's not much reason to not have a per-hop-per-amount penalty in
the `ProbabilisticScorer` to go along with the per-hop penalty to
let it scale up to larger amounts, so we add one here.
Notably, we use a divisor of 2^30 instead of 2^20 (like the
equivalent liquidity penalty) as it allows for more flexibility,
and there's not really any reason to worry about us not being able
to create high enough penalties.
Closes #1616
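The penalty math, roughly (field names assumed for illustration):

```rust
// The multiplier applies per 2^30 msat sent, i.e.
// penalty = base + amount_msat * multiplier / 2^30.
fn total_base_penalty_msat(
    amount_msat: u64, base_penalty_msat: u64, amount_multiplier_msat: u64,
) -> u64 {
    base_penalty_msat.saturating_add(
        amount_multiplier_msat.saturating_mul(amount_msat) >> 30)
}
```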
Saturating a channel beyond 1/4 of its capacity seems like a more
reasonable threshold for avoiding a path than 1/2, especially given
we should still be willing to send a payment with a lower
saturation limit if it comes to that.
This requires an (obvious) change to some router tests, but also
requires a change to the `fake_network_test`, opting to simply
remove some over-limit test code there - `fake_network_test` was
our first ever functional test, and while it worked great to ensure
LDK worked at all on day one, we now have a rather large breadth
of functional tests, and a broad "does it work at all" test is no
longer all that useful.
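The tightened check amounts to roughly the following sketch (names
illustrative):

```rust
// A candidate path may now contribute at most 1/4 of a channel's effective
// capacity before we consider the channel saturated and look elsewhere.
fn saturates_channel(path_contribution_msat: u64, effective_capacity_msat: u64) -> bool {
    path_contribution_msat > effective_capacity_msat / 4
}
```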
Currently, after we've selected a number of candidate paths, we
construct a route from a random set of paths repeatedly, and then
select the route with the lowest total cost. In the vast majority
of cases this ends up doing a bunch of additional work in order to
select the path(s) with the total lowest cost, with some vague
attempt at randomization that doesn't actually work.
Instead, here, we simply sort available paths by `cost / amount`
and select the top paths. This ends up in practice having the same
end result with substantially less complexity. In some rare cases
it gets a better result, which also would have been achieved
through more random trials. This implies there may in such cases be
a potential privacy loss, but not a substantial one, given our path
selection is ultimately mostly deterministic in many cases (or, if
it is not, then privacy is achieved through randomization at the
scorer level).
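The sort can be done without floating point by cross-multiplying the
ratios, as in this sketch:

```rust
struct PathCandidate { cost_msat: u64, value_msat: u64 }

// Sort candidate paths by cost per msat delivered, cheapest first:
// a.cost/a.value < b.cost/b.value iff a.cost*b.value < b.cost*a.value
// (widened to u128 to avoid overflow).
fn sort_by_cost_per_amount(paths: &mut Vec<PathCandidate>) {
    paths.sort_unstable_by(|a, b| {
        (a.cost_msat as u128 * b.value_msat as u128)
            .cmp(&(b.cost_msat as u128 * a.value_msat as u128))
    });
}
```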
If we end up "paying" for an `htlc_minimum_msat` with fees, we
increment `already_collected_value_msat` by more than the amount
of the path that we collected (whose `value_contribution_msat` is
higher than the total payment amount, despite having been reduced
down to the payment amount). Roughly speaking, if we only need
5,000 msat more but a hop's `htlc_minimum_msat` forces 10,000 msat
down the path, only 5,000 msat of it should count toward our
collection target.
This throws off our total value collection target, and in the
coming commit(s) it would also throw off our path selection
calculations.
In general we should avoid taking paths that we are confident will
not work as much as possible, but we should be willing to try each
payment at least once, even if it's over a channel that failed
recently. A penalty of one full bitcoin for such a channel seems
reasonable - lightning fees are unlikely to ever reach that point,
so such channels will be scored much worse than any other potential
path, while still being below `u64::max_value()`.
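For scale (constant name hypothetical):

```rust
// One full bitcoin in millisatoshis - far beyond plausible lightning fees,
// yet comfortably below u64::max_value() (~1.8e19 msat).
const ONE_BITCOIN_MSAT: u64 = 100_000_000 * 1_000; // 1e8 sats * 1_000 msat/sat
```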
When we consider sending an HTLC over a given channel impossible
due to our current knowledge of the channel's liquidity, we
currently always assign a penalty of `u64::max_value()`. However,
because we now refuse to retry a payment along the same path in
the router itself, we can now make this value configurable. This
allows users to have a relatively high knowledge decay interval
without the side-effect of refusing to try the only available path
in cases where a channel is intermittently available.
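A hypothetical shape for the knob (the exact parameter name and default
are assumptions here):

```rust
// A finite, tunable penalty applied when our liquidity estimate says the
// HTLC cannot possibly fit, replacing the old hard-coded u64::max_value().
struct ScoringParameters {
    considered_impossible_penalty_msat: u64,
}

impl Default for ScoringParameters {
    fn default() -> Self {
        // Illustrative default: one full bitcoin, in msat.
        ScoringParameters { considered_impossible_penalty_msat: 100_000_000_000 }
    }
}
```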
When an HTLC fails, we currently rely on the scorer learning the
failed channel and assigning an infinite (`u64::max_value()`)
penalty to the channel so as to avoid retrying over the exact same
path (if there's only one available path). This is common when
trying to pay a mobile client behind an LSP if the mobile client is
currently offline.
This leads to the scorer being overly conservative in some cases -
returning `u64::max_value()` when a given path hasn't been tried
for a given payment may not be the best decision, even if that
channel failed 50 minutes ago.
By tracking channels which failed on a payment part level and
explicitly refusing to route over them we can relax the
requirements on the scorer, allowing it to make different decisions
on how to treat channels that failed relatively recently without
causing payments to retry the same path forever.
This does have the drawback that it could allow two separate parts
of a payment to traverse the same path even though that path just
failed; however, this should only occur if the payment is going to
fail anyway, at least as long as the scorer is properly learning.
Closes #1241, superseding #1252.
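The per-payment filtering reduces to a set lookup, sketched here with
short channel ids:

```rust
use std::collections::HashSet;

// Channels that already failed for *this* payment are excluded outright, so
// the scorer need not return u64::max_value() to prevent retries.
fn usable_channels(candidates: Vec<u64>, previously_failed: &HashSet<u64>) -> Vec<u64> {
    candidates.into_iter()
        .filter(|scid| !previously_failed.contains(scid))
        .collect()
}
```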
ReadOnlyNetworkGraph uses BTreeMap to store its nodes and channels, but
these data structures are not supported by C bindings. Expose look-up
functions on these maps in lieu of such support.
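The accessor shape, sketched with stand-in types (LDK's actual method
names may differ):

```rust
use std::collections::BTreeMap;

// Minimal stand-ins, for illustration only.
struct ChannelInfo;
struct ReadOnlyNetworkGraph {
    channels: BTreeMap<u64, ChannelInfo>,
}

impl ReadOnlyNetworkGraph {
    // A per-key lookup means bindings never need to touch the BTreeMap itself.
    fn channel(&self, short_channel_id: u64) -> Option<&ChannelInfo> {
        self.channels.get(&short_channel_id)
    }
}
```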
In order to avoid failing to find paths due to the new channel
saturation limit, if we fail to find enough paths, we simply
disable the saturation limit for further path finding iterations.
Because we can now increase the maximum sent over a given channel
during routefinding, we may now generate redundant paths for the
same payment. Because this is wasteful in the network, we add an
additional pass during routefinding to merge redundant paths.
Note that two tests which previously attempted to send exactly the
available liquidity over a channel which charged an absolute fee
need updating - in those cases the router will first collect a path
that is saturation-limited, then attempt to collect a second path
without a saturation limit while still honoring the existing
utilized capacity on the channel, causing failure as the absolute
fee must be included.
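The merge pass might look roughly like this (illustrative types):

```rust
struct Path { hops: Vec<u64>, value_msat: u64 }

// Paths traversing the exact same hops are merged into one, summing their
// values; such duplicates can appear once the saturation limit is lifted.
fn merge_redundant_paths(paths: Vec<Path>) -> Vec<Path> {
    let mut merged: Vec<Path> = Vec::new();
    for path in paths {
        if let Some(existing) = merged.iter_mut().find(|p| p.hops == path.hops) {
            existing.value_msat += path.value_msat;
        } else {
            merged.push(path);
        }
    }
    merged
}
```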
Currently we only opt to split a payment into an MPP if we have
completely and totally used a channel's available capacity (up to
the announced htlc_max or on-chain capacity, whichever is lower).
This is obviously destined to fail as channels are unlikely to have
their full capacity available.
Here we do the minimum viable fix by simply limiting channels to
only using up to a configurable power-of-1/2. We default this new
configuration knob to 1 (1/2 of the channel) so as to avoid a
substantial change, but in the future we may consider changing this
to 2 (1/4) or even 3 (1/8).
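The knob maps to a simple shift, as in this sketch (names illustrative):

```rust
// power_of_half = 1 limits a channel to 1/2 of its capacity, 2 to 1/4,
// 3 to 1/8, and so on.
fn max_channel_contribution_msat(capacity_msat: u64, power_of_half: u8) -> u64 {
    capacity_msat >> power_of_half
}
```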
Because we serialize `Instant`s using wallclock time in
`ProbabilisticScorer`, if time goes backwards across restarts we
may end up with `Instant`s in the future, which causes rustc prior
to 1.60 to panic when calculating durations. Here we simply avoid
this by setting the time to `now` if we get a time in the future.
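The clamp itself is a one-liner on deserialization, sketched here:

```rust
use std::time::SystemTime;

// Clamp a deserialized wallclock timestamp so it is never in the future;
// duration math on future times panicked on rustc prior to 1.60.
fn clamp_to_now(serialized: SystemTime) -> SystemTime {
    let now = SystemTime::now();
    if serialized > now { now } else { serialized }
}
```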
When we send payment probes, we generate the [`PaymentHash`] based on a
probing cookie secret and a random [`PaymentId`]. This allows us to
discern probes from real payments, without keeping additional state.
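A sketch of such a stateless derivation (LDK's exact construction may
differ):

```rust
use bitcoin::hashes::{Hash, HashEngine};
use bitcoin::hashes::sha256::Hash as Sha256;

// Hashing a node-local secret together with the random payment id yields a
// payment hash we can later recognize as one of our own probes, while
// keeping no per-probe state.
fn probe_payment_hash(probing_cookie_secret: &[u8; 32], payment_id: &[u8; 32]) -> [u8; 32] {
    let mut engine = Sha256::engine();
    engine.input(probing_cookie_secret);
    engine.input(payment_id);
    Sha256::from_engine(engine).into_inner()
}
```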
Using this field just for MPP doesn't make sense when it could
intuitively also be used to indicate single-path payments. We therefore
rename `max_mpp_path_count` to `max_path_count` and make sure that a
value of 1 ensures MPP is not even tried.
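For illustration (stand-in struct; LDK's real type carries many more
fields):

```rust
struct PaymentParameters { max_path_count: u8 }

fn single_path_only() -> PaymentParameters {
    // A value of 1 now cleanly expresses "never split": MPP is not even tried.
    PaymentParameters { max_path_count: 1 }
}
```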
A user might want to explicitly penalize or prioritize a particular
node. We now allow them to do so by specifying a manual penalty
override for a given node that is then returned by the scorer.
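A sketch of the override lookup (types and names illustrative):

```rust
use std::collections::HashMap;

type NodeId = [u8; 33];

struct ManualPenalties {
    overrides: HashMap<NodeId, u64>,
}

impl ManualPenalties {
    // An explicit per-node override wins over whatever the scorer learned.
    fn penalty_for(&self, node: &NodeId, learned_penalty_msat: u64) -> u64 {
        self.overrides.get(node).copied().unwrap_or(learned_penalty_msat)
    }
}
```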
Because downstream languages are often garbage-collected, having
the user directly allocate a `ReadOnlyNetworkGraph` and pass a
reference to it to `find_route` often results in holding a read
lock long in excess of the `find_route` call. Worse, some languages
(like JavaScript) tend to only garbage collect when other code is
not running, possibly leading to deadlocks.
Currently, channel balances may be rather easily discovered through
probing. This however poses a privacy risk, since the analysis of
balance changes over adjacent channels could in the worst case
empower an adversary to mount an end-to-end deanonymization attack,
i.e., track who paid whom.
The penalty added here is applied so we prefer nodes with a smaller
`htlc_maximum_msat`, which makes balance discovery attacks harder to
execute. As this improves privacy network-wide, we treat such nodes
preferentially and hence create an incentive to restrict
`htlc_maximum_msat`.
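One plausible shape for such a penalty - the threshold here is purely
illustrative:

```rust
// Penalize channels whose advertised htlc_maximum_msat is large relative to
// their capacity, as those are the easiest targets for balance probing.
fn anti_probing_penalty_msat(htlc_maximum_msat: u64, capacity_msat: u64, penalty_msat: u64) -> u64 {
    if htlc_maximum_msat >= capacity_msat / 2 { penalty_msat } else { 0 }
}
```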
Users may want to - for whatever reason - prevent payments from
being routed over certain nodes. This change therefore allows adding
`NodeId`s to a list of banned nodes, which will then be avoided
during path finding.
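The ban list reduces to a set membership check during pathfinding,
sketched here:

```rust
use std::collections::HashSet;

type NodeId = [u8; 33];

// A ban list the pathfinder consults before considering any hop.
struct BannedNodes(HashSet<NodeId>);

impl BannedNodes {
    fn is_routable(&self, node: &NodeId) -> bool {
        !self.0.contains(node)
    }
}
```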
As we prepare to expose an API to update a channel's ChannelConfig,
we'll also want to expose this struct to consumers such that they
have insight into the current ChannelConfig applied to each channel.
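Hypothetical usage once exposed (stand-in structs; LDK's real types
carry more fields):

```rust
// Stand-ins, for illustration only.
struct ChannelConfig { forwarding_fee_proportional_millionths: u32 }
struct ChannelDetails { channel_id: [u8; 32], config: Option<ChannelConfig> }

fn log_channel_configs(channels: &[ChannelDetails]) {
    for details in channels {
        if let Some(config) = &details.config {
            println!("channel {:?}...: {} ppm", &details.channel_id[..4],
                config.forwarding_fee_proportional_millionths);
        }
    }
}
```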
Provide a wrapper struct for 32-byte node aliases, which implements
Display for printing. Support the UTF-8 character encoding, but replace
control characters and terminate at the first null character. Fall back
to ASCII if the byte sequence is an invalid encoding.
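A minimal sketch of those display rules:

```rust
use core::fmt::{self, Write};

pub struct NodeAlias(pub [u8; 32]);

impl fmt::Display for NodeAlias {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Terminate at the first null byte.
        let end = self.0.iter().position(|&b| b == 0).unwrap_or(self.0.len());
        match core::str::from_utf8(&self.0[..end]) {
            Ok(s) => {
                // Valid UTF-8: print it, replacing control characters.
                for c in s.chars() {
                    f.write_char(if c.is_control() { '\u{FFFD}' } else { c })?;
                }
            }
            Err(_) => {
                // Invalid encoding: fall back to the printable-ASCII subset.
                for &b in self.0[..end].iter().filter(|b| b.is_ascii() && !b.is_ascii_control()) {
                    f.write_char(b as char)?;
                }
            }
        }
        Ok(())
    }
}
```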
Instead of implementing EventHandler for P2PGossipSync, implement it on
NetworkGraph. This allows RapidGossipSync to handle events, too, by
delegating to its NetworkGraph.
P2PGossipSync logs before delegating to NetworkGraph in its
EventHandler. In order to share this handling with RapidGossipSync,
NetworkGraph needs to take a logger so that it can implement
EventHandler instead.
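The delegation shape, sketched with simplified stand-ins (real LDK
types and event variants differ):

```rust
enum Event { PaymentPathFailed { short_channel_id: Option<u64> } }

trait EventHandler { fn handle_event(&self, event: &Event); }

struct NetworkGraph;
impl NetworkGraph {
    fn channel_failed(&self, _scid: u64) { /* downgrade or remove the channel */ }
}

// The graph itself handles events...
impl EventHandler for NetworkGraph {
    fn handle_event(&self, event: &Event) {
        let Event::PaymentPathFailed { short_channel_id } = event;
        if let Some(scid) = short_channel_id { self.channel_failed(*scid); }
    }
}

// ...so RapidGossipSync (like P2PGossipSync) can simply delegate to it.
struct RapidGossipSync { graph: NetworkGraph }
impl EventHandler for RapidGossipSync {
    fn handle_event(&self, event: &Event) { self.graph.handle_event(event) }
}
```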