When `ChannelMessageHandler` implementations wish to return an
`InitFeatures` which contains all the known flags that are relevant
to channel handling, but not gossip handling, they currently need
to do so by manually constructing an `InitFeatures` with all known
flags and then clearing the ones they don't want.
Instead of spreading this logic out across the codebase, this
consolidates such construction to one place in features.rs.
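As a hedged sketch, such a consolidated constructor might look
roughly like the following; the `InitFeatures` layout and the
specific flag bits shown are illustrative stand-ins rather than
LDK's actual features.rs internals.

    // Illustrative stand-in for LDK's feature types, not the real
    // features.rs implementation.
    #[derive(Clone, Default)]
    pub struct InitFeatures {
        flags: Vec<u8>,
    }

    impl InitFeatures {
        /// Hypothetical consolidated constructor: sets the known flags
        /// relevant to channel handling while leaving gossip-related
        /// flags (e.g. gossip_queries) unset, so callers no longer
        /// build the full known set and clear bits by hand.
        pub fn known_channel_features() -> Self {
            let mut features = InitFeatures::default();
            features.set_bit(13); // optional option_static_remotekey
            features.set_bit(15); // optional payment_secret
            // ...remaining channel-relevant known flags go here...
            features
        }

        /// Sets the given (0-indexed) feature bit.
        fn set_bit(&mut self, bit: usize) {
            let byte = bit / 8;
            if self.flags.len() <= byte {
                self.flags.resize(byte + 1, 0);
            }
            self.flags[byte] |= 1 << (bit % 8);
        }
    }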
Like we now do for `NodeFeatures`, this converts to asking our
registered `ChannelMessageHandler` for our `InitFeatures` instead
of hard-coding them to the global LDK known set.
This allows handlers to set different feature bits based on what
our configuration actually supports rather than what LDK supports
in aggregate.
Some `NodeFeatures` will, in the future, represent features which
are not enabled by the `ChannelManager`, but by other message
handlers. Thus, it doesn't make sense to determine the
node feature bits in the `ChannelManager`.
The simplest fix for this is to change to generating the
node_announcement in `PeerManager`, asking all the connected
handlers which feature bits they support and simply OR'ing them
together. While this may not be sufficient in the future as it
doesn't consider feature bit dependencies, support for those could
be handled at the feature level in the future.
This commit moves the `broadcast_node_announcement` function to
`PeerHandler` but does not yet implement feature OR'ing.
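A rough sketch of the OR'ing described above, using stand-in types;
the trait and method name (`provided_node_features`) are assumptions
for illustration rather than LDK's exact API.

    // Stand-in NodeFeatures; only the OR operation matters here.
    #[derive(Clone, Default)]
    struct NodeFeatures { flags: Vec<u8> }

    impl NodeFeatures {
        /// Sets every flag that is set in `other`, i.e. a bitwise OR.
        fn or(mut self, other: &NodeFeatures) -> NodeFeatures {
            if self.flags.len() < other.flags.len() {
                self.flags.resize(other.flags.len(), 0);
            }
            for (byte, o) in self.flags.iter_mut().zip(other.flags.iter()) {
                *byte |= *o;
            }
            self
        }
    }

    /// Hypothetical handler interface providing per-handler feature bits.
    trait ProvidesNodeFeatures {
        fn provided_node_features(&self) -> NodeFeatures;
    }

    /// PeerManager-side helper: ask each registered handler for its bits
    /// and OR them together before building the node_announcement.
    fn node_announcement_features(handlers: &[&dyn ProvidesNodeFeatures]) -> NodeFeatures {
        handlers.iter().fold(NodeFeatures::default(), |acc, h| {
            acc.or(&h.provided_node_features())
        })
    }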
When we connect to a new peer, immediately send them any
channel_announcement and channel_update messages for any public
channels we have with other peers. This allows us to stop sending
those messages on a timer when they have not changed and ensures
we are sending messages when we have peers connected, rather than
broadcasting at startup when we have no peers connected.
When we fail to forward a probe HTLC at all and immediately fail it
(e.g. due to the first hop channel closing) we'd previously
spuriously generate only a `PaymentPathFailed` event. This violates
the expected API, as users expect a `ProbeFailed` event instead.
This fixes the oversight by ensuring we generate the correct event.
Thanks to @jkczyz for pointing out this issue.
`fail_holding_cell_htlcs` calls through to
`fail_htlc_backwards_internal` for HTLCs that need to be
failed-backwards but opts to generate its own payment failure
events for `HTLCSource::OutboundRoute` HTLCs. There is no reason
for that as `fail_htlc_backwards_internal` will also happily
generate (now-)equivalent events for `HTLCSource::OutboundRoute`
HTLCs.
Thus, we can drop the redundant code and always call
`fail_htlc_backwards_internal` for each HTLC in
`fail_holding_cell_htlcs`.
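A simplified sketch of the resulting flow, with stand-in types; the
real `ChannelManager` methods carry considerably more context
(channel IDs, failure data, etc.) than shown here.

    // Stand-in types; the real HTLCSource/HTLCFailReason carry much more data.
    enum HTLCSource {
        OutboundRoute { /* path, session_priv, payment_id, ... */ },
        PreviousHopData { /* short_channel_id, htlc_id, ... */ },
    }
    struct HTLCFailReason;

    /// Fails a single HTLC backwards, generating the appropriate events
    /// for *any* `HTLCSource`, including PaymentPathFailed-style events
    /// for `HTLCSource::OutboundRoute` HTLCs.
    fn fail_htlc_backwards_internal(_source: HTLCSource, _payment_hash: [u8; 32], _reason: HTLCFailReason) {
        // ...event generation elided...
    }

    /// With the redundant OutboundRoute special-casing removed, this is
    /// just a loop delegating to the internal helper.
    fn fail_holding_cell_htlcs(htlcs: Vec<(HTLCSource, [u8; 32])>) {
        for (source, payment_hash) in htlcs {
            fail_htlc_backwards_internal(source, payment_hash, HTLCFailReason);
        }
    }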
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as it's really about whether the
payment should be retried rather than where it failed. Now is as
good a time as any to rename it.
When our counterparty is the payment destination and we receive
an `HTLCFailReason::Reason` in `fail_htlc_backwards_internal` we
currently always set `rejected_by_dest` in the `PaymentPathFailed`
event, implying the HTLC should *not* be retried.
There are a number of cases where we use `HTLCFailReason::Reason`,
but most should reasonably be treated as retryable even if our
counterparty was the destination (i.e. `!rejected_by_dest`):
* If an HTLC times out on-chain, this doesn't imply that the
payment is no longer retryable, though the peer may well be
offline so retrying may not be very useful,
* If a commitment transaction "containing" a dust HTLC is
confirmed on-chain, this definitely does not imply the payment
is no longer retryable
* If the channel we intended to relay over was closed (or
force-closed) we should retry over another path,
* If the channel we intended to relay over did not have enough
capacity we should retry over another path,
* If we received an update_fail_malformed_htlc message from our
peer, we likely should *not* retry, however this should be
exceedingly rare and nearly never occurs in practice
Thus, this commit simply disables the behavior here, opting to
treat all `HTLCFailReason::Reason` errors as retryable.
Note that prior to 93e645daf4 this
change would not have made sense as it would have resulted in us
retrying the payment over the same channel in some cases, however
we now "blame" our own channel and will avoid it when routing for
the same payment.
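A hedged sketch of the resulting decision with heavily simplified
stand-in types; in reality this lives inside
`fail_htlc_backwards_internal` and involves decoding onion error
packets, which is omitted here.

    // Simplified stand-ins; real handling decodes the onion error packet
    // to learn which hop actually generated the error.
    enum HTLCFailReason {
        /// A failure code and data we generated locally (on-chain timeout,
        /// dust HTLC in a confirmed commitment, closed channel, ...).
        Reason { failure_code: u16 },
        /// An encrypted error packet relayed back from a downstream node.
        LightningError { err_packet: Vec<u8> },
    }

    /// Whether `PaymentPathFailed` should claim the destination rejected
    /// the payment (i.e. that it should not be retried).
    fn rejected_by_dest(reason: &HTLCFailReason, error_originated_at_destination: bool) -> bool {
        match reason {
            // Locally-generated failures are now always treated as
            // retryable, even if our counterparty was the destination.
            HTLCFailReason::Reason { .. } => false,
            // Only an error actually originating at the destination marks
            // the payment as rejected by it (origin lookup not shown).
            HTLCFailReason::LightningError { .. } => error_originated_at_destination,
        }
    }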
We've seen a bit of user confusion about the requirements for event
handling, largely because the idempotency and consistency
requirements weren't super clearly phrased. While we're at it, we
also consolidate some documentation out of the event handling
function onto the trait itself.
Fixes #1675.
If we receive a channel_update for one of our private channels, we
will not log the message at the usual TRACE log level as the
message falls into the gossip range. However, for our own channels
they aren't *just* gossip, as we store that info and it changes
how we generate invoices. Thus, we add a log in `ChannelManager`
here at the DEBUG log level.
This allows users who don't wish to block a full thread to receive
persistence events.
The `Future` added here is really just a trivial list of callbacks,
but from that we can build a (somewhat inefficient)
std::future::Future implementation and can (at least once a mapping
for Box<dyn Trait> is added) include the future in no-std bindings
as well.
Fixes #1595
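A minimal sketch of the idea (not LDK's actual `Future` type):
completion is just a list of callbacks, and a `std::future::Future`
implementation is layered on top by registering each poll's waker as
one more callback.

    use std::future::Future as StdFuture;
    use std::pin::Pin;
    use std::sync::{Arc, Mutex};
    use std::task::{Context, Poll, Waker};

    #[derive(Default)]
    struct FutureState {
        complete: bool,
        callbacks: Vec<Box<dyn FnOnce() + Send>>,
    }

    /// The "trivial list of callbacks" future.
    #[derive(Clone, Default)]
    pub struct NotifierFuture {
        state: Arc<Mutex<FutureState>>,
    }

    impl NotifierFuture {
        /// Registers a callback which fires immediately if we're already
        /// complete, or when `complete()` is later called.
        pub fn register_callback(&self, cb: Box<dyn FnOnce() + Send>) {
            let mut state = self.state.lock().unwrap();
            if state.complete {
                drop(state);
                cb();
            } else {
                state.callbacks.push(cb);
            }
        }

        /// Marks the future complete and fires all registered callbacks.
        pub fn complete(&self) {
            let callbacks = {
                let mut state = self.state.lock().unwrap();
                state.complete = true;
                std::mem::take(&mut state.callbacks)
            };
            for cb in callbacks { cb(); }
        }
    }

    impl StdFuture for NotifierFuture {
        type Output = ();
        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
            let mut state = self.state.lock().unwrap();
            if state.complete {
                Poll::Ready(())
            } else {
                // Somewhat inefficient: every poll pushes another waker
                // callback, but it keeps the core type to a callback list.
                let waker: Waker = cx.waker().clone();
                state.callbacks.push(Box::new(move || waker.wake()));
                Poll::Pending
            }
        }
    }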
It was always somewhat strange to have a bunch of notification
logic in `channelmanager`, and with the next commit adding a bunch
more, it's moved here first.
This required adapting `onion_utils::decode_next_hop` to work for both payments
and onion messages.
Currently we just print out the path_id of any onion messages we receive. In
the future, these received onion messages will be redirected to their
respective handlers: i.e. an invoice_request will go to an InvoiceHandler,
custom onion messages will go to a custom handler, etc.
It is the proportion of the channel value to configure as the
`their_channel_reserve_satoshis` for both outbound and inbound
channels. It determines the minimum balance that the other node has
to maintain on their side at all times.
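For illustration, the reserve could be derived from the channel
value roughly as follows; the field name and the millionths unit are
assumptions made for this sketch, not necessarily the exact
configuration knob.

    /// Hypothetical config knob expressing the reserve as a proportion.
    struct ChannelHandshakeConfig {
        /// Portion of the channel value, in millionths, that the
        /// counterparty must keep on their side at all times.
        their_channel_reserve_proportional_millionths: u32,
    }

    /// The `their_channel_reserve_satoshis` derived from the channel value.
    fn counterparty_reserve_satoshis(config: &ChannelHandshakeConfig, channel_value_satoshis: u64) -> u64 {
        (channel_value_satoshis as u128
            * config.their_channel_reserve_proportional_millionths as u128
            / 1_000_000) as u64
    }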
Currently `decode_update_add_htlc_onion` returns the `channel_state`
lock to ensure that `internal_update_add_htlc` holds a single
`channel_state` lock during the entire function execution. This is
unnecessary, and since we are moving the channel storage to the
`per_peer_state`, this no longer achieves the goal it was intended for.
We therefore avoid returning the `channel_state` from
`decode_update_add_htlc_onion`, and just retake the lock in
`internal_update_add_htlc` instead.
When an HTLC fails, we currently rely on the scorer learning the
failed channel and assigning an infinite (`u64::max_value()`)
penalty to the channel so as to avoid retrying over the exact same
path (if there's only one available path). This is common when
trying to pay a mobile client behind an LSP if the mobile client is
currently offline.
This leads to the scorer being overly conservative in some cases -
returning `u64::max_value()` when a given path hasn't been tried
for a given payment may not be the best decision, even if that
channel failed 50 minutes ago.
By tracking channels which failed on a payment part level and
explicitly refusing to route over them we can relax the
requirements on the scorer, allowing it to make different decisions
on how to treat channels that failed relatively recently without
causing payments to retry the same path forever.
This does have the drawback that it could allow two separate parts
of a payment to traverse the same path even though that path just
failed, however this should only occur if the payment is going to
fail anyway, at least as long as the scorer is properly learning.
Closes #1241, superseding #1252.
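A minimal sketch of the idea with hypothetical types: the router
explicitly skips short channel IDs which already failed for this
payment, rather than relying on the scorer returning
`u64::max_value()` for them.

    /// Per-payment routing parameters (hypothetical, simplified).
    struct PaymentParams {
        /// Short channel IDs which failed on a previous attempt of this payment.
        previously_failed_channels: Vec<u64>,
    }

    struct CandidateHop { short_channel_id: u64 }

    /// Filters the candidate hops considered during pathfinding.
    fn usable_candidates<'a>(
        params: &'a PaymentParams,
        candidates: &'a [CandidateHop],
    ) -> impl Iterator<Item = &'a CandidateHop> {
        candidates.iter().filter(move |c| {
            // Refuse to route over a channel which already failed for this
            // payment, independent of the scorer's penalty for it.
            !params.previously_failed_channels.contains(&c.short_channel_id)
        })
    }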
As the map values are no longer only `channel_id`s, but also
`counterparty_node_id`s, the map is renamed to better correspond to
what's actually stored in it.
When we send payment probes, we generate the [`PaymentHash`] based on a
probing cookie secret and a random [`PaymentId`]. This allows us to
discern probes from real payments, without keeping additional state.
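A sketch of that stateless derivation using HMAC-SHA256 (via the
`hmac` and `sha2` crates); the exact construction LDK uses is an
assumption here.

    use hmac::{Hmac, Mac};
    use sha2::Sha256;

    type HmacSha256 = Hmac<Sha256>;

    /// Derives the probe's payment hash from the node-wide probing cookie
    /// secret and the probe's random payment ID.
    fn probe_payment_hash(probing_cookie_secret: &[u8; 32], payment_id: &[u8; 32]) -> [u8; 32] {
        let mut mac = HmacSha256::new_from_slice(probing_cookie_secret)
            .expect("HMAC accepts any key length");
        mac.update(payment_id);
        mac.finalize().into_bytes().into()
    }

    /// Since the hash is fully determined by (secret, payment_id), a probe
    /// can be recognized later without storing any per-probe state. A real
    /// implementation should compare in constant time.
    fn payment_is_probe(probing_cookie_secret: &[u8; 32], payment_id: &[u8; 32], payment_hash: &[u8; 32]) -> bool {
        &probe_payment_hash(probing_cookie_secret, payment_id) == payment_hash
    }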
This fixes an insta-panic in `ChannelMonitor` deserialization where
we always `unwrap` a previous value to determine the default value
of a later field. However, because we always ran the `unwrap`
before the previous field had been read, we'd always panic.
The fix is rather simple - use an `OptionDeserWrapper` for
`default_value` fields and only fill in the default value if no
value was read while walking the TLV stream.
The only complexity comes from our desire to support
`read_tlv_field` calls that use an explicit field rather than an
`Option` of some sort, which requires some statement which can
assign both an `OptionDeserWrapper<T>` variable and a `T` variable.
We settle on `x = t.into()` and implement `From<T> for
OptionDeserWrapper<T>` which works, though it requires users to
specify types explicitly due to Rust determining expression types
prior to macro execution, completely guessing with no knowledge for
integer expressions (see
https://github.com/rust-lang/rust/issues/91369).
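A condensed sketch of the wrapper and `From` trick described above;
the real macro plumbing in LDK has more moving parts than shown.

    /// Wraps a possibly-unread field while walking the TLV stream.
    struct OptionDeserWrapper<T>(Option<T>);

    /// Lets `x = t.into()` assign both a plain `T` and the wrapper.
    impl<T> From<T> for OptionDeserWrapper<T> {
        fn from(t: T) -> Self { OptionDeserWrapper(Some(t)) }
    }

    impl<T> OptionDeserWrapper<T> {
        /// After the TLV stream is fully read, fill in the default only if
        /// the field was never read, avoiding the old eager `unwrap`.
        fn or_default_with(self, default: impl FnOnce() -> T) -> T {
            self.0.unwrap_or_else(default)
        }
    }

    fn main() {
        // The type must be spelled out explicitly: Rust guesses integer
        // expression types without help from the macro-generated code
        // (see rust-lang/rust#91369).
        let wrapped: OptionDeserWrapper<u64> = 42u64.into();
        assert_eq!(wrapped.or_default_with(|| 0), 42);
    }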
Because downstream languages are often garbage-collected, having
the user directly allocate a `ReadOnlyNetworkGraph` and pass a
reference to it to `find_route` often results in holding a read
lock far longer than the `find_route` call itself. Worse, some languages
(like JavaScript) tend to only garbage collect when other code is
not running, possibly leading to deadlocks.
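One way out, sketched here with stand-in types under the assumption
that the fix is to pass the lock-owning graph into `find_route`, is
to take the read lock inside the call so it is only held for the
call's duration.

    use std::sync::RwLock;

    // Stand-in types; the real NetworkGraph and find_route carry many
    // more parameters.
    struct GraphData;
    struct NetworkGraph { graph: RwLock<GraphData> }
    struct Route;

    /// Takes the graph itself (which owns the lock) rather than a
    /// caller-held ReadOnlyNetworkGraph, so the read lock only lives for
    /// this call even in garbage-collected downstream languages.
    fn find_route(network_graph: &NetworkGraph) -> Route {
        let _read_only = network_graph.graph.read().unwrap();
        // ...pathfinding against the locked graph elided...
        Route
    }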
When we receive a `channel_reestablish` with a `data_loss_protect`
that proves we're running with a stale state, instead of
force-closing the channel, we immediately panic. This lines up with
our refusal to run if we find a `ChannelMonitor` which is stale
compared to our `ChannelManager` during `ChannelManager`
deserialization. Ultimately both are an indication of the same
thing - that the API requirements on `chain::Watch` were violated.
In the "running with outdated state but ChannelMonitor(s) and
ChannelManager lined up" case specifically it's likely we're running
off of an old backup, in which case connecting to peers with
channels still live is explicitly dangerous. That said, because
this could be an operator error that is correctable, panicking
instead of force-closing may allow for normal operation again in
the future (cc #1207).
In any case, we provide instructions in the panic message for how
to force-close channels prior to peer connection, as well as a note
on how to broadcast the latest state if users are willing to take
the risk.
Note that this is still somewhat unsafe until we resolve #1563.
If a user restores from a backup that they know is stale, they'd
like to force-close all of their channels (or at least the ones
they know are stale) *without* broadcasting the latest state,
asking their peers to do so instead. This simply adds methods to do
so, renaming the existing `force_close_channel` and
`force_close_all_channels` methods to disambiguate further.
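Hypothetical signatures illustrating the split; the method names and
exact parameters LDK settled on may differ, and the bodies are
stubbed out.

    #[derive(Debug)]
    struct APIError;
    struct ChannelManager;

    impl ChannelManager {
        /// Force-closes the channel and broadcasts our latest local
        /// commitment transaction (the old `force_close_channel` behavior).
        pub fn force_close_broadcasting_latest_txn(&self, _channel_id: &[u8; 32]) -> Result<(), APIError> {
            unimplemented!("stub for illustration")
        }

        /// Force-closes the channel *without* broadcasting our latest
        /// state, for use after restoring a known-stale backup where
        /// broadcasting would be unsafe; the counterparty is expected to
        /// broadcast their (newer) state instead.
        pub fn force_close_without_broadcasting_txn(&self, _channel_id: &[u8; 32]) -> Result<(), APIError> {
            unimplemented!("stub for illustration")
        }
    }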
This is mostly motivated by the fact that payments may happen while the
latest `ChannelUpdate` indicating our new `ChannelConfig` is still
propagating throughout the network. By temporarily allowing the previous
config, we can help reduce payment failures across the network.