This required adapting `onion_utils::decode_next_hop` to work for both payments
and onion messages.
Currently we just print out the path_id of any onion messages we receive. In
the future, these received onion messages will be redirected to their
respective handlers: e.g. an invoice_request will go to an InvoiceHandler,
custom onion messages will go to a custom handler, etc.
This adds several utilities in service of adding
OnionMessenger::send_onion_message, which can send to either an unblinded
pubkey or a blinded route. Sending custom TLVs and sending an onion message
containing a reply path are not yet supported.
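As a rough sketch of that shape (the type and variant names here are
illustrative assumptions, not necessarily the final API), the destination can
be modeled as an enum so a single send path handles both cases:

    use bitcoin::secp256k1::PublicKey;

    /// Placeholder for a recipient-provided blinded route (illustrative).
    pub struct BlindedRoute { /* introduction point, blinded hops, ... */ }

    /// Where to send an onion message (sketch, not the final API).
    pub enum Destination {
        /// Send directly to an unblinded node public key.
        Node(PublicKey),
        /// Send via a blinded route chosen by an anonymous recipient.
        BlindedRoute(BlindedRoute),
    }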
We also need to split the construct_keys_callback macro into two macros to
avoid an unused assignment warning.
This method will help us avoid retrieving our node secret, something we want to
get rid of entirely. It will be used in upcoming commits when decoding the
onion message packet, and in future PRs to help us get rid of
KeysInterface::get_node_secret usages across the codebase.
We need to add a new Packet struct because onion message packet hop_data fields
can be of variable length, whereas regular payment packets are always 1366
bytes.
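Concretely, the layout difference looks roughly like this (the structs are a
sketch; per BOLT 4, 1 version byte + 33 pubkey bytes + 1300 hop-data bytes +
32 hmac bytes = 1366):

    use bitcoin::secp256k1::PublicKey;

    /// Payment onion packets are always exactly 1366 bytes on the wire:
    /// 1 (version) + 33 (public key) + 1300 (hop data) + 32 (hmac).
    pub struct OnionPacket {
        version: u8,
        public_key: PublicKey,
        hop_data: [u8; 1300],
        hmac: [u8; 32],
    }

    /// Onion message packets carry variable-length hop data, so the new
    /// struct needs a Vec where payment packets use a fixed-size array.
    pub struct Packet {
        version: u8,
        public_key: PublicKey,
        hop_data: Vec<u8>,
        hmac: [u8; 32],
    }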
Co-authored-by: Valentine Wallace <vwallace@protonmail.com>
Co-authored-by: Jeffrey Czyz <jkczyz@gmail.com>
It is the proportion of the channel value to configure as the
`their_channel_reserve_satoshis` for both outbound and inbound channels.
It determines the minimum balance that the other node must maintain on
their side at all times.
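For example, assuming the proportion is expressed in millionths of the
channel value (as similar LDK config fields are; the helper name here is
hypothetical):

    /// Hypothetical helper: derive the counterparty's required reserve
    /// from a proportion expressed in millionths of the channel value.
    fn their_channel_reserve_satoshis(channel_value_sats: u64, proportional_millionths: u32) -> u64 {
        channel_value_sats * proportional_millionths as u64 / 1_000_000
    }

    fn main() {
        // A 1_000_000-sat channel at 1% (10_000 millionths) requires the
        // counterparty to keep at least 10_000 sats on their side.
        assert_eq!(their_channel_reserve_satoshis(1_000_000, 10_000), 10_000);
    }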
Currently `decode_update_add_htlc_onion` returns the `channel_state`
lock to ensure that `internal_update_add_htlc` holds a single
`channel_state` lock for the entire function execution. This is
unnecessary, and since we are moving the channel storage to the
`per_peer_state`, this no longer achieves the goal it was intended for.
We therefore avoid returning the `channel_state` from
`decode_update_add_htlc_onion`, and just retake the lock in
`internal_update_add_htlc` instead.
Blinded routes can be provided as destinations for onion messages, when the
recipient prefers to remain anonymous.
We also add supporting utilities for constructing blinded path keys, and
control TLV structs representing blinded payloads prior to being
encoded/encrypted. These utilities and structs will be reused in upcoming
commits for sending and receiving/forwarding onion messages.
Finally, add utilities for reading the padding from an onion message's
encrypted TLVs without an intermediate Vec.
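Conceptually, the padding can be drained with a small reader loop rather
than buffering it first (a minimal sketch, not the actual reader types used):

    use std::io::{self, Read};

    /// Read and discard all remaining bytes (the padding) in fixed-size
    /// chunks, avoiding an intermediate Vec allocation.
    fn skip_padding<R: Read>(reader: &mut R) -> io::Result<()> {
        let mut buf = [0u8; 32];
        while reader.read(&mut buf)? != 0 {}
        Ok(())
    }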
In 4703d4e725 we changed
PeerManager::socket_disconnected to no longer forbid calling it for
sockets which the PeerManager itself decided to disconnect.
However, we forgot to remove the scary warnings on the
`new_{inbound,outbound}_connection` functions which warned of the
old behavior.
We do so here.
We add `HTLCHandlingFailedConditions` to express the failure parameters
that will be enforced by a new macro, `expect_pending_htlcs_forwardable_conditions`.
Saturating a channel beyond 1/4 of its capacity seems like a more
reasonable threshold for avoiding a path than 1/2, especially given
we should still be willing to send a payment with a lower
saturation limit if it comes to that.
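In other words (illustrative helper, not the router's actual code):

    /// A candidate channel is considered saturated for a payment if
    /// routing `amount_msat` over it would use more than 1/4 of its
    /// capacity.
    fn saturates_channel(amount_msat: u64, channel_capacity_msat: u64) -> bool {
        amount_msat > channel_capacity_msat / 4
    }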
This new threshold requires an (obvious) change to some router tests, but also
requires a change to the `fake_network_test`, opting to simply
remove some over-limit test code there - `fake_network_test` was
our first ever functional test, and while it worked great to ensure
LDK worked at all on day one, we now have a rather large breadth
of functional tests, and a broad "does it work at all" test is no
longer all that useful.
When an HTLC fails, we currently rely on the scorer learning the
failed channel and assigning an infinite (`u64::max_value()`)
penalty to the channel so as to avoid retrying over the exact same
path (if there's only one available path). This is common when
trying to pay a mobile client behind an LSP if the mobile client is
currently offline.
This leads to the scorer being overly conservative in some cases -
returning `u64::max_value()` when a given path hasn't been tried
for a given payment may not be the best decision, even if that
channel failed 50 minutes ago.
By tracking channels which failed on a payment part level and
explicitly refusing to route over them we can relax the
requirements on the scorer, allowing it to make different decisions
on how to treat channels that failed relatively recently without
causing payments to retry the same path forever.
This does have the drawback that it could allow two separate parts
of a payment to traverse the same path even though that path just
failed, however this should only occur if the payment is going to
fail anyway, at least as long as the scorer is properly learning.
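Conceptually (a minimal sketch with illustrative names; the real tracking
lives in the payment/retry state):

    use std::collections::HashSet;

    /// Channels which already failed for this payment are tracked
    /// per-payment and excluded outright during pathfinding, instead of
    /// relying on the scorer returning u64::max_value() for them.
    struct PaymentAttempt {
        previously_failed_channels: HashSet<u64>, // short channel ids
    }

    impl PaymentAttempt {
        fn channel_usable(&self, short_channel_id: u64) -> bool {
            !self.previously_failed_channels.contains(&short_channel_id)
        }
    }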
Closes #1241, superseding #1252.
In the next commit we add lockorder testing based on the line each
mutex was created on rather than the particular mutex instance.
This causes some additional test failures because of lockorder
inversions for the same mutex across different tests, which is
fixed here.
In order to avoid failing to find paths due to the new channel
saturation limit, if we fail to find enough paths, we simply
disable the saturation limit for further path finding iterations.
Because we can now increase the maximum sent over a given channel
during routefinding, we may now generate redundant paths for the
same payment. Because this is wasteful in the network, we add an
additional pass during routefinding to merge redundant paths.
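The merge pass can be thought of as follows (a simplified sketch; here a
path is just its hop list plus an amount):

    use std::collections::HashMap;

    /// After relaxing the saturation limit we may build two paths over
    /// the identical hop sequence; merge their amounts into one path so
    /// we do not send redundant HTLCs over the network.
    fn merge_redundant_paths(paths: Vec<(Vec<u64>, u64)>) -> Vec<(Vec<u64>, u64)> {
        let mut merged: HashMap<Vec<u64>, u64> = HashMap::new();
        for (hops, amount_msat) in paths {
            *merged.entry(hops).or_insert(0) += amount_msat;
        }
        merged.into_iter().collect()
    }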
Note that two tests which previously attempted to send exactly the
available liquidity over a channel which charged an absolute fee
need updating - in those cases the router will first collect a path
that is saturation-limited, then attempt to collect a second path
without a saturation limit while still honoring the existing
utilized capacity on the channel, causing failure as the absolute
fee must be included.
As the map values are no longer only `channel_id`s, but also
`counterparty_node_id`s, the map is renamed to better correspond to
what's actually stored in the map.
When we send payment probes, we generate the [`PaymentHash`] based on a
probing cookie secret and a random [`PaymentId`]. This allows us to
discern probes from real payments, without keeping additional state.
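The derivation is conceptually an HMAC over the payment id keyed by the
secret; roughly (a sketch using the bitcoin_hashes HMAC, with the exact
inputs an assumption):

    use bitcoin::hashes::{Hash, HashEngine};
    use bitcoin::hashes::hmac::{Hmac, HmacEngine};
    use bitcoin::hashes::sha256::Sha256;

    /// Derive a probe's payment hash from a node-local secret and a
    /// random payment id. On failure we can recompute this and recognize
    /// the HTLC as one of our own probes, with no extra state stored.
    fn probe_payment_hash(probing_cookie_secret: &[u8; 32], payment_id: &[u8; 32]) -> [u8; 32] {
        let mut engine = HmacEngine::<Sha256>::new(probing_cookie_secret);
        engine.input(payment_id);
        Hmac::<Sha256>::from_engine(engine).into_inner()
    }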
This fixes an insta-panic in `ChannelMonitor` deserialization where
we always `unwrap` a previous value to determine the default value
of a later field. However, because we always ran the `unwrap`
before the previous field was read, we'd always panic.
The fix is rather simple - use a `OptionDeserWrapper` for
`default_value` fields and only fill in the default value if no
value was read while walking the TLV stream.
The only complexity comes from our desire to support
`read_tlv_field` calls that use an explicit field rather than an
`Option` of some sort, which requires a statement that can
assign both an `OptionDeserWrapper<T>` variable and a `T` variable.
We settle on `x = t.into()` and implement `From<T> for
OptionDeserWrapper<T>` which works, though it requires users to
specify types explicitly due to Rust determining expression types
prior to macro execution, completely guessing with no knowledge for
integer expressions (see
https://github.com/rust-lang/rust/issues/91369).
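The core of that workaround looks like the following (simplified from the
actual ser macros):

    /// A wrapper that starts as None and is only filled in with the
    /// field's default if the TLV stream never provided a value.
    pub struct OptionDeserWrapper<T>(pub Option<T>);

    impl<T> From<T> for OptionDeserWrapper<T> {
        fn from(t: T) -> Self { OptionDeserWrapper(Some(t)) }
    }

    // With this impl, the macro's `x = t.into()` statement can assign
    // either an OptionDeserWrapper<T> (while walking the TLV stream) or a
    // plain T (for explicit, non-Option fields), at the cost of explicit
    // type annotations for integer expressions.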
In c02b6a3807 we moved the
`payment_preimage` copy from inside the macro - which only runs if we
are spending an output we know is an HTLC output - to doing it for
any script that matches our expected length. This can panic if an
inbound channel is created with a bogus funding transaction that
has a witness program of the HTLC-Success/-Offered length but whose
second-to-last witness element is not 32 bytes.
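The guard needed to avoid the panic is conceptually (illustrative, not the
exact monitor code):

    /// Only copy a preimage out of a witness if the second-to-last
    /// element actually is 32 bytes; a bogus funding script can otherwise
    /// give us a witness where a blind copy panics.
    fn preimage_from_witness(witness: &[Vec<u8>]) -> Option<[u8; 32]> {
        let elem = witness.get(witness.len().checked_sub(2)?)?;
        if elem.len() != 32 { return None; }
        let mut preimage = [0u8; 32];
        preimage.copy_from_slice(elem);
        Some(preimage)
    }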
Luckily this panic is relatively simple for downstream users to
work around - if an invalid-length-copy panic occurs, simply remove
the ChannelMonitor from the bogus channel on startup and run
without it. Because the channel must be funded by a bogus script in
order to reach this panic, the channel will already have closed by
the time the funding transaction is spent, and there can be no
local funds in such a channel, so removing the `ChannelMonitor`
wholesale is completely safe.
In order to test this we have to disable an in-line assertion that
checks that our transactions match expected scripts which we do by
checking for the specific bogus script that we now use in
`test_invalid_funding_tx`.
Thanks to Eugene Siegel for reporting this issue.
Because downstream languages are often garbage-collected, having
the user directly allocate a `ReadOnlyNetworkGraph` and pass a
reference to it to `find_route` often results in holding the read
lock well beyond the duration of the `find_route` call. Worse, some languages
(like JavaScript) tend to only garbage collect when other code is
not running, possibly leading to deadlocks.
When we receive a `channel_reestablish` with a `data_loss_protect`
that proves we're running with a stale state, instead of
force-closing the channel, we immediately panic. This lines up with
our refusal to run if we find a `ChannelMonitor` which is stale
compared to our `ChannelManager` during `ChannelManager`
deserialization. Ultimately both are an indication of the same
thing - that the API requirements on `chain::Watch` were violated.
In the "running with outdated state but ChannelMonitor(s) and
ChannelManager lined up" case specifically its likely we're running
off of an old backup, in which case connecting to peers with
channels still live is explicitly dangerous. That said, because
this could be an operator error that is correctable, panicking
instead of force-closing may allow for normal operation again in
the future (cc #1207).
In any case, we provide instructions in the panic message for how
to force-close channels prior to peer connection, as well as a note
on how to broadcast the latest state if users are willing to take
the risk.
Note that this is still somewhat unsafe until we resolve #1563.
If a user restores from a backup that they know is stale, they'd
like to force-close all of their channels (or at least the ones
they know are stale) *without* broadcasting the latest state,
asking their peers to do so instead. This simply adds methods to do
so, renaming the existing `force_close_channel` and
`force_close_all_channels` methods to disambiguate further.
When we see a counterparty's revoked commitment transaction on-chain
we shouldn't queue up the HTLCs present in it for resolution until
we have spent the HTLC outputs in some kind of claim transaction.
In order to do so, we first have to change the
`fail_unbroadcast_htlcs!()` call to provide it with the HTLCs which
are present in the (revoked) commitment transaction which was
broadcast. However, this is not sufficient - because all of those
HTLCs had their `HTLCSource` removed when the commitment
transaction was revoked, we also have to update
`fail_unbroadcast_htlcs` to check the payment hash and amount when
the `HTLCSource` is `None`.
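The fallback match is conceptually (an illustrative sketch, not the macro
itself):

    struct HtlcOutput { payment_hash: [u8; 32], amount_msat: u64 }

    /// With no HTLCSource available for HTLCs in a revoked commitment
    /// transaction, identify an HTLC by its payment hash and amount
    /// rather than by its source.
    fn in_broadcast_tx(htlc: &HtlcOutput, broadcast_htlcs: &[HtlcOutput]) -> bool {
        broadcast_htlcs.iter().any(|b|
            b.payment_hash == htlc.payment_hash && b.amount_msat == htlc.amount_msat)
    }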
Somewhat surprisingly, several tests actually explicitly tested for
the old behavior, which required amending to pass with the new
changes.
Finally, this adds a debug assertion when writing `ChannelMonitor`s
to ensure `HTLCSource`s do not leak.
This is mostly motivated by the fact that payments may happen while the
latest `ChannelUpdate` indicating our new `ChannelConfig` is still
propagating throughout the network. By temporarily allowing the previous
config, we can help reduce payment failures across the network.
We do this to prevent payment failures while the `ChannelUpdate` for the
new `ChannelConfig` still propagates throughout the network. In a follow
up commit, we'll honor forwarding HTLCs that were constructed based on
either the previous or current `ChannelConfig`.
To handle expiration (when we should stop allowing the previous config),
we rely on the ChannelManager's `timer_tick_occurred` method. After
enough ticks, the previous config is cleared from memory, and only the
current config applies moving forward.
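A minimal sketch of the expiry mechanism (field and method shapes here are
assumptions):

    /// Keep the previous config alongside a tick countdown; once enough
    /// `timer_tick_occurred` calls have elapsed, drop it so only the
    /// current config is honored.
    struct ChannelConfigs<C> {
        current: C,
        prev_with_ticks_remaining: Option<(C, u8)>,
    }

    impl<C> ChannelConfigs<C> {
        fn timer_tick_occurred(&mut self) {
            self.prev_with_ticks_remaining = match self.prev_with_ticks_remaining.take() {
                Some((cfg, ticks)) if ticks > 1 => Some((cfg, ticks - 1)),
                _ => None,
            };
        }
    }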
A new `update_channel_config` method is exposed on the `ChannelManager`
to update the `ChannelConfig` for a set of channels atomically. New
`ChannelUpdate` events are generated for each eligible channel.
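Expected usage looks roughly like the following (a sketch assuming a
`channel_manager`, the peer's `counterparty_node_id`, and the target
`channel_ids` are already in scope; consult the method docs for the exact
signature):

    // Bump the proportional forwarding fee on a set of a peer's channels
    // in one call; a fresh ChannelUpdate is then broadcast per channel.
    let mut config = ChannelConfig::default();
    config.forwarding_fee_proportional_millionths = 10_000; // 1%
    channel_manager.update_channel_config(&counterparty_node_id, &channel_ids, &config)?;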
Note that as currently implemented, a buggy and/or
auto-policy-management client could spam the network with updates as
there is no rate-limiting in place. This could already be done with
`broadcast_node_announcement`, though users are less inclined to update
that as frequently since its data is mostly static.