We previously assumed that the final node's payload would be ~93 bytes, and had
code to ensure that the filler encoded after that payload is not all 0s. Now
with custom TLVs and metadata supported, the final node's payload may take up
the entire onion packet, so we can't assume that there are 64 bytes of filler
to check.
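As a rough sketch of the resulting check (the constant, function name,
and 64-byte window are illustrative assumptions based on the description
above, not the actual code), the filler validation now has to be
conditional on the payload leaving room for it:

```rust
// Illustrative sketch only: names and sizes are assumptions, not the
// actual implementation.
const ONION_DATA_LEN: usize = 1300;

fn check_filler_nonzero(final_payload_len: usize, packet_data: &[u8]) {
    assert_eq!(packet_data.len(), ONION_DATA_LEN);
    // With custom TLVs and metadata, the final payload may consume the
    // whole packet, leaving no filler bytes to check at all.
    let filler_len = ONION_DATA_LEN.saturating_sub(final_payload_len);
    if filler_len >= 64 {
        let filler = &packet_data[final_payload_len..final_payload_len + 64];
        assert!(filler.iter().any(|b| *b != 0), "filler must not be all zeros");
    }
}
```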
`RouteGraphNode` currently recalculates scores in its `Ord`
implementation, wasting time while sorting the main Dijkstra's
heap.
Further, some time ago, when implementing the `htlc_maximum_msat`
amount reduction while walking the graph, we added
`PathBuildingHop::was_processed`, looking up the source node in
`dist` each time we popped an element off the binary heap.
As a result, we now have a reference to our `PathBuildingHop` when
processing a best-node's channels, leading to several fields in
`RouteGraphNode` being entirely redundant.
Here we drop those fields, but add a pre-calculated score field,
as well as force a suboptimal `RouteGraphNode` layout, retaining
its existing 64 byte size.
Without the suboptimal layout, performance is very mixed, but with
it performance is mostly improved, by around 10% in most tests.
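A minimal sketch of the idea (field names and sizes are illustrative,
not LDK's actual layout): cache the score once when the node is pushed,
and have `Ord` compare the cached value rather than recomputing it on
every heap sift.

```rust
use std::cmp::Ordering;

#[derive(PartialEq, Eq)]
struct RouteGraphNode {
    node_id: u64, // stand-in for the real NodeId
    score: u64,   // pre-calculated once, when the node is pushed
}

impl Ord for RouteGraphNode {
    fn cmp(&self, other: &Self) -> Ordering {
        // Compare the cached score; reversed so std's max-heap
        // BinaryHeap behaves as the min-heap Dijkstra's needs.
        other.score.cmp(&self.score)
    }
}

impl PartialOrd for RouteGraphNode {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```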
Given `PathBuildingHop` is now an even multiple of cache lines, we
can pick which fields "fall off" the cache line we have visible
when dealing with hops, which we do here.
We'd previously aggressively cached elements in the
`PathBuildingHop` struct (and its sub-structs), which resulted in a
rather bloated size. This implied cache misses as we read from and
write to multiple cache lines during processing of a single
channel.
Here, we reduce caching in `DirectedChannelInfo`, fitting the
`(NodeId, PathBuildingHop)` tuple in exactly 128 bytes. While this
should fit in two cache lines, it sadly does not generally lie
within exactly two cache lines, as glibc returns large buffers from `malloc`
which are very well aligned, plus 16 bytes (for its own allocation
tracking). Thus, we try to avoid reading from the last 16 bytes of
a `PathBuildingHop`, but luckily that isn't super hard.
Note that here we make accessing
`DirectedChannelInfo::effective_capacity` somewhat slower, but
that's okay as it's only ever done once per `DirectedChannelInfo`
anyway.
While our routing benchmarks are quite noisy, this appears to
result in between a 5% and 15% performance improvement in the
probabilistic scoring benchmarks.
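One way to keep such a size invariant from silently regressing is a
compile-time assertion; a hedged sketch with stand-in types (the real
`NodeId` and `PathBuildingHop` are private to the router):

```rust
#[allow(dead_code)]
struct NodeId([u8; 33]); // stand-in: a serialized public key
#[allow(dead_code)]
struct PathBuildingHop {
    _state: [u64; 11], // stand-in for the hop's real 88 bytes of state
}

// Fails to compile if the hot tuple ever outgrows two cache lines.
const _: () = assert!(core::mem::size_of::<(NodeId, PathBuildingHop)>() == 128);
```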
This avoids bloating `CandidateRouteHop` with a full 33-byte
node_id (and avoids repeated public key serialization when we do
multiple pathfinding passes).
`TestRouter` tries to make scoring calls that mimic what an actual
router would do, but the changes in f0ecc3ec73
failed to make scoring calls for private route hints or when we take
a public hop as the last hop.
This fixes those regressions, though no tests currently depend on
this behavior.
Rather than calling `CandidateRouteHop::FirstHop::node_id` just
`node_id`, we should call it `payer_node_id` to provide more
context.
We also take this opportunity to make it a reference, avoiding
bloating `CandidateRouteHop`.
These are used in the performance-critical routing and scoring
operations, which may happen outside of our crate. Thus, we really
need to allow downstream crates to inline these accessors into
their code, which we do here.
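For example (sketched here with one illustrative accessor), a small
`#[inline]` hint is what lets a non-generic getter be inlined across
the crate boundary without requiring LTO:

```rust
pub struct DirectedChannelInfo {
    htlc_maximum_msat: u64,
    // ... other fields elided
}

impl DirectedChannelInfo {
    /// Returns the channel's htlc_maximum_msat.
    #[inline] // allows downstream crates to inline this hot accessor
    pub fn htlc_maximum_msat(&self) -> u64 {
        self.htlc_maximum_msat
    }
}
```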
Short channel "ID"s are not globally unique when they come from a
BOLT 11 route hint or a first hop (which can be an outbound SCID
alias). In those cases, it's rather confusing that we have a
`short_channel_id` method which mixes them all together, and even
more confusing that we have a `CandidateHopId` which does not, in
fact, return a unique identifier.
In our routing logic this is mostly fine - the cost of a collision
isn't super high and we should still do just fine finding a route.
However, the same can't be said for downstream users, who may or
may not rely on the apparent uniqueness guarantee.
Thus, here, we privatise the SCID and id accessors.
f0ecc3ec73 introduced a regression in
the `remembers_historical_failures` test, and disabled it by simply
removing the `#[test]` annotation. This fixes the test and marks it
as a test again.
The only remaining step is to use the update_add blinding point in decoding
inbound onion payloads.
Error handling will be completed in upcoming commits.
Previously, we used the auto-download feature of the
`electrsd`/`bitcoind` crates. While convenient, they unnecessarily
introduced a lot of dependencies (`zip`, `zstd`, `time`, etc.) to our
test environment which needed pinning for the MSRV of 1.63.
Here, we introduce a new `no_download` config flag to the
`lightning-transaction-sync` crate allowing us to disable this
auto-download feature in CI, where we now opt to download the
corresponding binaries manually. We keep auto-download enabled by
default as a convenience for running tests locally, though.
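A sketch of how such a cfg gate might look (the item names and env var
here are assumptions for illustration, not necessarily the crates'
real API surface):

```rust
#[cfg(not(no_download))]
fn bitcoind_exe_path() -> String {
    // Default: let the `bitcoind` crate download and manage the binary.
    bitcoind::downloaded_exe_path().expect("auto-downloaded bitcoind")
}

#[cfg(no_download)]
fn bitcoind_exe_path() -> String {
    // CI: binaries were fetched manually; point at them via an env var.
    std::env::var("BITCOIND_EXE").expect("BITCOIND_EXE must be set")
}
```

In CI, the flag would then be enabled via something like
`RUSTFLAGS="--cfg no_download"`.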
When enqueuing a message for a node that is not yet connected as a
peer, BufferedAwaitingConnection should be returned. However, it was
only returned when the first message was enqueued; any message
enqueued afterwards, but before a connection was established,
incorrectly returned Buffered.
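A simplified sketch of the fixed behavior (stand-in types; the real
enums live in the onion message module):

```rust
enum SendSuccess { Buffered, BufferedAwaitingConnection }

enum OnionMessageRecipient {
    ConnectedPeer(Vec<Vec<u8>>),
    PendingConnection(Vec<Vec<u8>>),
}

impl OnionMessageRecipient {
    fn enqueue(&mut self, message: Vec<u8>) -> SendSuccess {
        match self {
            // Every enqueue while the connection is pending now reports
            // BufferedAwaitingConnection, not just the first one.
            OnionMessageRecipient::PendingConnection(queue) => {
                queue.push(message);
                SendSuccess::BufferedAwaitingConnection
            },
            OnionMessageRecipient::ConnectedPeer(queue) => {
                queue.push(message);
                SendSuccess::Buffered
            },
        }
    }
}
```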
Previously, unfunded channels were stored outside of
`PeerState::channel_by_id`, and thus if we found no channel in
`PeerState::channel_by_id`, `close_channel_internal` called
`force_close_channel_with_peer` to hunt for unfunded channels.
However, that is no longer the case, so the call is redundant, and
we can simply return an error instead.
Because a `Funded` `Channel` cannot possibly be pre-funding, the
logic in `ChannelManager::close_channel_internal` to handle
pre-funding channels is in the wrong place.
Rather than being handled inside the `Funded` branch, it should be
in an `else` following it, handling either of the two
`ChannelPhases` outside of `Funded`.
Sadly, because of a `loop {}` previously used for control flow, the
existing code would loop forever, which is fixed here.
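In heavily simplified form, the corrected shape of
`close_channel_internal` looks something like this (stand-in types,
omitting all the real state handling, and folding in the previous
commit's early-return for a missing channel):

```rust
enum ChannelPhase {
    Funded(()),
    UnfundedOutboundV1(()),
    UnfundedInboundV1(()),
}

fn close_channel_internal(phase: Option<ChannelPhase>) -> Result<(), ()> {
    match phase {
        // A Funded channel can never be pre-funding, so only the
        // cooperative shutdown path lives in this branch.
        Some(ChannelPhase::Funded(_chan)) => { /* begin shutdown */ },
        // Both unfunded phases land here, with no `loop {}` to get
        // stuck in.
        Some(_) => { /* abandon the unfunded channel */ },
        // No channel found: return an error rather than hunting
        // through the since-removed side maps.
        None => return Err(()),
    }
    Ok(())
}
```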
Since we no longer use `ChannelContext::get_funding_created_msg`,
it can be moved back into `UnfundedOutboundV1` channels only,
where it realistically belongs.
Previously, channels were stored in different maps in `PeerState`
based on whether the funding had been set, keeping the keys across
the maps consistent (pre-funding temporary_channel_ids vs
funding-outpoint-based channel_ids). However, channels are now
stored in a single `channel_by_id` map, making that point moot.
Instead, here, we convert the `ChannelPhase` state transition
boundary to "once we have a `ChannelMonitor`", which makes more
sense now, and was actually the original proposed boundary.
This also requires calling `signer_maybe_unblocked` on a pre-funded
outbound channel, but that nicely also lets us limit the scope of
`FundingCreated` message generation, which we do in the next
commit.
`FundingCreated` and `FundingSent` were mostly named after the
respective `funding_created` and `funding_signed` wire messages. They
include the signature for the initial commitment transaction when
opening a channel. With dual funding, these messages are no longer used,
and instead we rely on the existing `commitment_signed` to exchange
those signatures.
Add tests for onion message buffering, checking that messages are
cleared upon disconnection and timed out after MAX_TIMER_TICKS. Also,
check that ConnectionNeeded events are generated.
lightning-background-processor processes events provided by the
PeerManager's OnionMessageHandler indicating that a connection is needed. If a
connection is not established in a reasonable amount of time, drop any
buffered onion messages by calling timer_tick_occurred.
OnionMessageHandler implementations now also implement EventsProvider.
Update lightning-background-processor to also process any events the
PeerManager's OnionMessageHandler provides.
OnionMessenger buffers onion messages for nodes that are pending a
connection. To mitigate DoS concerns, add a timer_tick_occurred method to
OnionMessageHandler so that buffered messages can be dropped. This will
be called in lightning-background-processor every 10 seconds.
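A rough sketch of the buffering/expiry mechanics (names mirror the
description above, but the types and tick count are simplified
stand-ins):

```rust
use std::collections::HashMap;

const MAX_TIMER_TICKS: usize = 2; // illustrative value

struct PendingMessages { ticks: usize, msgs: Vec<Vec<u8>> }

struct MessageBuffers { by_node: HashMap<u64, PendingMessages> }

impl MessageBuffers {
    // Called by lightning-background-processor roughly every 10 seconds:
    // drop buffers for nodes that never completed a connection.
    fn timer_tick_occurred(&mut self) {
        self.by_node.retain(|_, pending| {
            pending.ticks += 1;
            pending.ticks < MAX_TIMER_TICKS
        });
    }
}
```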
An OnionMessageHandler may buffer messages that can't be sent because
the recipient is not a peer. Have the trait extend EventsProvider so
that implementations can generate an Event::ConnectionNeeded for any
nodes that fall into this category. Also, implement EventsProvider
for OnionMessenger and IgnoringMessageHandler.
A MessageRouter may be unable to find a complete path to an onion
message's destination. This could be because no such path exists or any
nodes on a potential path don't support onion messages. Add an event
that indicates a connection with a node is needed in order to send the
message.
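A consumer might react roughly like this (a stand-in enum mirroring
the new variant; the real one lives in the crate's events module):

```rust
enum Event {
    ConnectionNeeded { node_id: [u8; 33], addresses: Vec<std::net::SocketAddr> },
    // ... other variants elided
}

fn handle_event(event: Event) {
    match event {
        Event::ConnectionNeeded { node_id: _node_id, addresses } => {
            for _addr in addresses {
                // Dial the address and, on success, register the
                // connection with the PeerManager so buffered onion
                // messages can be flushed before they time out.
            }
        },
    }
}
```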
When there isn't a direct connection with the Destination of an
OnionMessage, look up socket addresses from the NetworkGraph. This is
used to signal to OnionMessenger that a direct connection is needed to
send the message.