Previously, all inbound channels defaulted to a `user_channel_id` of 0,
which made it impossible to tell them apart on that basis. Here, we
simply randomize the identifier, enabling the use of `user_channel_id`
as a true identifier for channels (assuming a similarly unique value is
chosen for outbound channels and passed to `create_channel()`).
Refactor `process_pending_htlc_forwards` to ensure that both branches
that fail `pending_forwards` are placed next to each other for improved
readability.
As the `short_to_chan_info` map has been removed from the
`channel_state`, there are no longer any consistency guarantees between
the `by_id` and `short_to_chan_info` maps. This commit ensures that we
don't force-unwrap channels whose `channel_id` has been queried from
the `short_to_chan_info` map.
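A hedged sketch of the resulting lookup pattern (map types simplified, names hypothetical): since the two maps are no longer updated under one lock, an entry found in `short_to_chan_info` may already be gone from `by_id`, so the second lookup must be fallible rather than unwrapped.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the two maps; value types are illustrative only.
fn channel_for_scid(
    short_to_chan_info: &HashMap<u64, [u8; 32]>,
    by_id: &HashMap<[u8; 32], String>,
    scid: u64,
) -> Option<String> {
    let chan_id = short_to_chan_info.get(&scid)?;
    // Previously this could have been `by_id.get(chan_id).unwrap()`; without
    // a shared lock the entry may have raced away, so return None instead.
    by_id.get(chan_id).cloned()
}
```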
As the `short_to_chan_info` map has been moved out of the
`channel_state` into a standalone lock, several macros no longer need
the `channel_state` passed into them.
When the `abandon_payment` flow was added there was some concern
that upgrading users may not migrate to the new flow, causing
memory leaks in the pending-payment tracking.
While this is true, now that we're relying on the
pending_outbound_payments map for `send_payment` idempotency, the
risk of removing a payment prematurely goes up from "spurious
retry failure" to "sending a duplicative payment", which is much
worse.
Thus, we simply remove the automated payment timeout here,
explicitly requiring that users call `abandon_payment` when they
give up retrying a payment.
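A hedged sketch of the new user obligation (the trait and retry loop are stand-ins, not LDK's API): with no automated timeout, a payment the user has given up on must be released via `abandon_payment` explicitly.

```rust
// Stand-in types for illustration only.
struct PaymentId([u8; 32]);

trait PaymentSender {
    fn retry_payment(&self, id: &PaymentId) -> Result<(), ()>;
    fn abandon_payment(&self, id: PaymentId);
}

fn give_up_after<S: PaymentSender>(sender: &S, id: PaymentId, max_retries: u32) {
    for _ in 0..max_retries {
        if sender.retry_payment(&id).is_ok() {
            return;
        }
    }
    // Nothing will time this payment out for us anymore: free the
    // pending-payment entry explicitly once we stop retrying.
    sender.abandon_payment(id);
}
```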
Previously, once a fulfilled outbound payment completed and all
associated HTLCs were resolved, we'd immediately remove the payment
entry from the `pending_outbound_payments` map.
Now that we're using the `pending_outbound_payments` map for send
idempotency, this presents a race condition - if the user makes a
redundant `send_payment` call at the same time that the original
payment's last HTLC is resolved, the user would reasonably expect
the `send_payment` call to fail due to our idempotency guarantees.
However, because the `pending_outbound_payments` entry is being
removed, if it completes first the `send_payment` call will
succeed even though the user has not had a chance to see the
corresponding `Event::PaymentSent`.
Instead, here, we delay removal of `Fulfilled`
`pending_outbound_payments` entries until several timer ticks have
passed without any corresponding event or HTLC pending.
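A minimal sketch of the delayed-removal mechanism (field and constant names are assumptions): a `Fulfilled` entry survives until it has seen several consecutive timer ticks with no pending HTLCs.

```rust
use std::collections::HashMap;

// Illustrative constant; the actual number of ticks is a design choice.
const IDEMPOTENCY_TIMEOUT_TICKS: u8 = 7;

struct FulfilledPayment {
    pending_htlcs: usize,
    timer_ticks_without_htlcs: u8,
}

fn timer_tick(pending_outbound_payments: &mut HashMap<[u8; 32], FulfilledPayment>) {
    pending_outbound_payments.retain(|_, payment| {
        if payment.pending_htlcs == 0 {
            payment.timer_ticks_without_htlcs += 1;
            // Keep the entry until enough idle ticks have passed, preserving
            // idempotency for racing `send_payment` calls.
            payment.timer_ticks_without_htlcs < IDEMPOTENCY_TIMEOUT_TICKS
        } else {
            payment.timer_ticks_without_htlcs = 0;
            true
        }
    });
}
```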
In c986e52ce8, an `MppId` was added
to `HTLCSource` objects as a way of correlating HTLCs which belong
to the same payment when the `ChannelManager` sees an HTLC
succeed/fail. This allows it to have awareness of the state of all
HTLCs in a payment when it generates the ultimate user-facing
payment success/failure events. This was used in the same PR to
avoid generating duplicative success/failure events for a single
payment.
Because the field was only used as an internal token to correlate
HTLCs, and retries were not supported, it was generated randomly by
calling the `KeysInterface`'s 32-byte random-fetching function.
This also provided a backwards-compatibility story as the existing
HTLC randomization key was re-used for older clients.
In 28eea12bbe `MppId` was renamed to
the current `PaymentId`, which was then used to expose the
`retry_payment` interface, allowing users to send new HTLCs which
are considered a part of an existing payment.
At no point has the payment-sending API seriously considered
idempotency, a major drawback which leaves the API unsafe in most
deployments. Luckily, there is a simple solution - because the
`PaymentId` must be unique, and because payment information for a
given payment is held for several blocks after a payment
completes/fails, it represents an obvious idempotency token.
Here we simply require the user provide the `PaymentId` directly in
`send_payment`, allowing them to use whatever token they may
already have for a payment's idempotency token.
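A hedged sketch of the resulting idempotency check (types and signatures simplified, not LDK's exact API): the caller-chosen `PaymentId` doubles as the idempotency token, so a second `send_payment` with the same id fails while the first is still tracked.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct PaymentId(pub [u8; 32]);

#[derive(Debug)]
enum PaymentSendFailure {
    DuplicatePayment,
}

struct OutboundPayments {
    pending: HashMap<PaymentId, ()>,
}

impl OutboundPayments {
    fn send_payment(&mut self, payment_id: PaymentId) -> Result<(), PaymentSendFailure> {
        if self.pending.contains_key(&payment_id) {
            // The same token was used twice: fail the second call rather
            // than risk sending a duplicate payment.
            return Err(PaymentSendFailure::DuplicatePayment);
        }
        self.pending.insert(payment_id, ());
        // ... actual HTLC construction and routing would happen here ...
        Ok(())
    }
}
```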
If the initial ChannelMonitor persistence is done asynchronously
but does not complete before the node restarts (with a
ChannelManager persistence), we'll start back up with a channel
present but no corresponding ChannelMonitor.
Because the Channel is pending-monitor-update and has not yet
broadcasted its initial funding transaction or sent channel_ready,
this is not a violation of our API contract nor a safety violation.
However, the previous code would refuse to deserialize the
ChannelManager, treating it as an API contract violation.
The solution is to test for this case explicitly and drop the
channel entirely as if the peer disconnected before we received
the funding_signed for outbound channels or before sending the
channel_ready for inbound channels.
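A sketch of the shape of that check, with hypothetical field names: on reload, a channel with no persisted monitor is dropped only if it never became live.

```rust
// Hypothetical fields for illustration; not the real `Channel` struct.
struct Channel {
    funding_tx_broadcast: bool,
    channel_ready_sent: bool,
}

fn keep_on_reload(chan: &Channel, monitor_exists: bool) -> bool {
    if monitor_exists {
        return true;
    }
    // No monitor on disk: safe to drop only if we never broadcast funding
    // nor sent channel_ready, i.e. the channel never became operational.
    if !chan.funding_tx_broadcast && !chan.channel_ready_sent {
        false
    } else {
        // A live channel without a monitor would be a real safety violation.
        panic!("ChannelMonitor missing for an operational channel");
    }
}
```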
If we receive a monitor event from a forwarded-to channel which
contains a preimage for an HTLC, we have to propagate that preimage
back to the forwarded-from channel monitor. However, once we have
that update, we're running in a relatively unsafe state - we have
the preimage in memory, but if we were to crash, the forwarded-to
channel monitor will not regenerate the update with the preimage
for us. If we haven't managed to write the monitor update to the
forwarded-from channel by that point, we've lost the preimage, and,
thus, money!
When a `chain::Watch` `ChannelMonitor` update method is called, the
user has three options:
(a) persist the monitor update immediately and return success,
(b) fail to persist the monitor update immediately and return
failure,
(c) return a flag indicating the monitor update is in progress and
will complete in the future.
(c) is rather harmless, and in some deployments should be expected
to be the return value for all monitor update calls, but currently
requires returning `Err(ChannelMonitorUpdateErr::TemporaryFailure)`
which isn't very descriptive and sounds scarier than it is.
Instead, here, we change the return type to a single enum (rather
than a `Result`) and rename `TemporaryFailure` to
`UpdateInProgress`.
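The commit text names the renamed variant `UpdateInProgress`; the other variant names below are assumptions used to sketch the three options as a single enum.

```rust
// Sketch only; variant names other than `UpdateInProgress` are assumptions.
enum ChannelMonitorUpdateStatus {
    /// (a) The update was persisted immediately and the call succeeded.
    Completed,
    /// (b) Persistence failed immediately.
    PermanentFailure,
    /// (c) Persistence is in progress and will complete later; harmless, and
    /// in some deployments the expected return for every update call.
    UpdateInProgress,
}

fn is_async(status: &ChannelMonitorUpdateStatus) -> bool {
    matches!(status, ChannelMonitorUpdateStatus::UpdateInProgress)
}
```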
Prior to this commit, the `forward_htlcs` lock was only held at the
same time as the `channel_state` lock during the write process of
the `ChannelManager`. This commit removes that lock order
dependency by taking the `channel_state` lock temporarily during
the write process.
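A minimal sketch of the scoping change, with stand-in state (not the real `ChannelManager` fields): each lock is taken in its own short scope, so the two are never held simultaneously and no fixed lock order is imposed.

```rust
use std::sync::Mutex;

// Stand-in state for illustration only.
fn write_state(channel_state: &Mutex<Vec<u64>>, forward_htlcs: &Mutex<Vec<u64>>) {
    {
        // Take the `channel_state` lock only as long as needed...
        let channels = channel_state.lock().unwrap();
        let _ = channels.len();
    } // ...and release it before touching `forward_htlcs`.
    let forwards = forward_htlcs.lock().unwrap();
    let _ = forwards.len();
}
```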
As we are eventually removing the `channel_state` lock, this commit
moves the `forward_htlcs` map out of the `channel_state` lock, to ease
that process.
See doc updates for more info on the edge case this prevents, and
there isn't really a strong reason why we would need to broadcast
the latest state immediately. Specifically, in the case of HTLC
claims (the most important reason to ensure we have state on chain
if it cannot be persisted), we will still force-close if there are
HTLCs which need claiming and are going to expire.
Surprisingly, there were no tests which failed as a result of this
change, but a new one has been added.
As we move towards specifying supported/required feature bits in
the module(s) where they are supported, the global `known` feature
set constructors no longer make sense.
Here we stop relying on the `known` method in the channel modules.
Historically, LDK has considered the "set of known/supported
feature bits" to be an LDK-level thing. Increasingly this doesn't
make sense - different message handlers may provide or require
different feature sets.
In a previous PR, we began this transition, with the feature bits
sent to peers now being sourced from the attached message handler.
This commit makes further progress by moving the concept of which
feature bits are supported by our ChannelManager into
channelmanager.rs itself, via the new `provided_*_features`
methods, rather than in features.rs via the `known_channel_features`
and `known` methods.
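A hedged sketch of the new shape (feature set heavily simplified): each handler exposes `provided_*_features` constructors for exactly the bits it supports, instead of features.rs offering one global `known` set.

```rust
// Simplified stand-ins for illustration; not the real feature types.
#[derive(Default, Debug)]
struct NodeFeatures {
    static_remote_key: bool,
    payment_secret: bool,
}

struct ChannelManager;

impl ChannelManager {
    fn provided_node_features(&self) -> NodeFeatures {
        NodeFeatures {
            // Only bits the channel-handling code actually implements; a
            // gossip handler would advertise its own bits separately.
            static_remote_key: true,
            payment_secret: true,
        }
    }
}
```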
As we remove the concept of a global "known/supported" feature set
in LDK, we should also remove the concept of a global "required"
feature set. This does so by moving the checks for specific
required features into handlers.
Specifically, it allows the handler `peer_connected` method to
return an `Err` if the peer should be disconnected. Only one such
required feature bit is currently set - `static_remote_key`, which
is required in `ChannelManager`.
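A sketch of the new contract with simplified types (the real method receives the peer's `Init` message): returning `Err` from `peer_connected` signals that the peer should be disconnected.

```rust
// Simplified stand-ins for illustration.
struct InitFeatures {
    static_remote_key: bool,
}

struct ChannelManager;

impl ChannelManager {
    fn peer_connected(&self, their_features: &InitFeatures) -> Result<(), ()> {
        if !their_features.static_remote_key {
            // `static_remote_key` is required by `ChannelManager`; an Err
            // here tells the caller to disconnect the peer.
            return Err(());
        }
        Ok(())
    }
}
```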
When ChannelMessageHandler implementations wish to return a NodeFeatures which
contain all the known flags that are relevant to channel handling, but not
gossip handling, they currently need to do so by manually constructing a
NodeFeatures with all known flags and then clearing the ones they don't want.
Instead of spreading this logic across the codebase, this consolidates such
construction into one place in features.rs.
When `ChannelMessageHandler` implementations wish to return an
`InitFeatures` which contain all the known flags that are relevant
to channel handling, but not gossip handling, they currently need
to do so by manually constructing an InitFeatures with all known
flags and then clearing the ones they don't want.
Instead of spreading this logic out across the codebase, this
consolidates such construction to one place in features.rs.
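A hedged sketch of the consolidated constructor (flags reduced to two booleans for illustration): features.rs builds the all-known set once and clears the gossip-only bits in a single place.

```rust
// Heavily simplified for illustration; real features are bit vectors.
struct InitFeatures {
    gossip_queries: bool,
    static_remote_key: bool,
}

impl InitFeatures {
    fn known() -> Self {
        InitFeatures { gossip_queries: true, static_remote_key: true }
    }
    // All known flags relevant to channel handling, with gossip bits cleared.
    fn known_channel_features() -> Self {
        let mut features = Self::known();
        features.gossip_queries = false;
        features
    }
}
```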
Like we now do for `NodeFeatures`, this converts to asking our
registered `ChannelMessageHandler` for our `InitFeatures` instead
of hard-coding them to the global LDK known set.
This allows handlers to set different feature bits based on what
our configuration actually supports rather than what LDK supports
in aggregate.
Some `NodeFeatures` will, in the future, represent features which
are not enabled by the `ChannelManager` but by other message
handlers. Thus, it doesn't make sense to determine the node feature
bits in the `ChannelManager`.
The simplest fix for this is to change to generating the
node_announcement in `PeerManager`, asking all the connected
handlers which feature bits they support and simply OR'ing them
together. While this may not be sufficient in the future as it
doesn't consider feature bit dependencies, support for those could
be handled at the feature level in the future.
This commit moves the `broadcast_node_announcement` function to
`PeerHandler` but does not yet implement feature OR'ing.
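A minimal sketch of the OR'ing (feature bits modeled as a plain `u64`): `PeerManager` combines whatever each attached handler reports.

```rust
use std::ops::BitOr;

// Feature bits reduced to a u64 for illustration.
#[derive(Clone, Copy)]
struct NodeFeatures(u64);

impl BitOr for NodeFeatures {
    type Output = NodeFeatures;
    fn bitor(self, rhs: NodeFeatures) -> NodeFeatures {
        NodeFeatures(self.0 | rhs.0)
    }
}

fn node_announcement_features(
    chan_handler_features: NodeFeatures,
    route_handler_features: NodeFeatures,
) -> NodeFeatures {
    // Feature-bit dependencies are not considered here; per the commit
    // text, those could be handled at the feature level later.
    chan_handler_features | route_handler_features
}
```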
When we connect to a new peer, immediately send them any
channel_announcement and channel_update messages for any public
channels we have with other peers. This allows us to stop sending
those messages on a timer when they have not changed and ensures
we are sending messages when we have peers connected, rather than
broadcasting at startup when we have no peers connected.
When we fail to forward a probe HTLC at all and immediately fail it
(e.g. due to the first hop channel closing) we'd previously
spuriously generate only a `PaymentPathFailed` event. This violates
the expected API, as users expect a `ProbeFailed` event instead.
This fixes the oversight by ensuring we generate the correct event.
Thanks to @jkczyz for pointing out this issue.
`fail_holding_cell_htlcs` calls through to
`fail_htlc_backwards_internal` for HTLCs that need to be
failed-backwards but opts to generate its own payment failure
events for `HTLCSource::OutboundRoute` HTLCs. There is no reason
for that as `fail_htlc_backwards_internal` will also happily
generate (now-)equivalent events for `HTLCSource::OutboundRoute`
HTLCs.
Thus, we can drop the redundant code and always call
`fail_htlc_backwards_internal` for each HTLC in
`fail_holding_cell_htlcs`.
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as it's really about whether the
payment should be retried rather than where it failed. Now is as
good a time as any to rename it.
When our counterparty is the payment destination and we receive
an `HTLCFailReason::Reason` in `fail_htlc_backwards_internal` we
currently always set `rejected_by_dest` in the `PaymentPathFailed`
event, implying the HTLC should *not* be retried.
There are a number of cases where we use `HTLCFailReason::Reason`,
but most should reasonably be treated as retryable even if our
counterparty was the destination (i.e. `!rejected_by_dest`):
* If an HTLC times out on-chain, this doesn't imply that the
payment is no longer retryable, though the peer may well be
offline so retrying may not be very useful,
* If a commitment transaction "containing" a dust HTLC is
confirmed on-chain, this definitely does not imply the payment
is no longer retryable
* If the channel we intended to relay over was closed (or
force-closed) we should retry over another path,
* If the channel we intended to relay over did not have enough
capacity we should retry over another path,
* If we received an `update_fail_malformed_htlc` message from our
peer, we likely should *not* retry, however this should be
exceedingly rare, and appears to nearly never occur in practice
Thus, this commit simply disables the behavior here, opting to
treat all `HTLCFailReason::Reason` errors as retryable.
Note that prior to 93e645daf4 this
change would not have made sense as it would have resulted in us
retrying the payment over the same channel in some cases, however
we now "blame" our own channel and will avoid it when routing for
the same payment.
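A hedged sketch of the resulting rule (the enum is heavily simplified from the real `HTLCFailReason`): locally-generated `Reason` failures never set the not-retryable flag anymore.

```rust
// Heavily simplified from the real `HTLCFailReason` for illustration.
enum HTLCFailReason {
    // An error relayed back to us inside the onion, attributable to a hop.
    LightningError { rejected_by_dest: bool },
    // A locally-generated failure: on-chain timeout, closed channel,
    // insufficient capacity, update_fail_malformed_htlc, etc.
    Reason { failure_code: u16 },
}

fn rejected_by_dest(reason: &HTLCFailReason) -> bool {
    match reason {
        HTLCFailReason::LightningError { rejected_by_dest } => *rejected_by_dest,
        // Previously this arm could return true when our counterparty was
        // the destination; now every `Reason` failure is treated as retryable.
        HTLCFailReason::Reason { .. } => false,
    }
}
```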
We've seen a bit of user confusion about the requirements for event
handling, largely because the idempotency and consistency
requirements weren't super clearly phrased. While we're at it, we
also consolidate some documentation out of the event handling
function onto the trait itself.
Fixes #1675.
If we receive a channel_update for one of our private channels, we
will not log the message at the usual TRACE log level as the
message falls into the gossip range. However, updates for our own
channels aren't *just* gossip, as we store that info and it changes
how we generate invoices. Thus, we add a log in `ChannelManager`
here at the DEBUG log level.
This allows users who don't wish to block a full thread to receive
persistence events.
The `Future` added here is really just a trivial list of callbacks,
but from that we can build a (somewhat inefficient)
`std::future::Future` implementation and can (at least once a
mapping for `Box<dyn Trait>` is added) include the future in no-std
bindings as well.
Fixes #1595
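A hedged sketch of such a callback-list future (names and shape are assumptions, not LDK's actual types): completion either fires callbacks registered earlier or marks the state so later registrations fire immediately, which is also enough to back a `std::future::Future` adapter.

```rust
use std::sync::{Arc, Mutex};

// Illustrative only; not LDK's actual `Future` implementation.
#[derive(Default)]
struct FutureState {
    complete: bool,
    callbacks: Vec<Box<dyn FnOnce() + Send>>,
}

#[derive(Clone, Default)]
struct NotifierFuture(Arc<Mutex<FutureState>>);

impl NotifierFuture {
    // Register a callback to run on completion, without blocking a thread.
    fn register_callback(&self, cb: Box<dyn FnOnce() + Send>) {
        let mut state = self.0.lock().unwrap();
        if state.complete {
            drop(state);
            cb(); // Already complete: fire immediately.
        } else {
            state.callbacks.push(cb);
        }
    }

    // Mark complete and drain the callback list outside the lock.
    fn complete(&self) {
        let callbacks: Vec<_> = {
            let mut state = self.0.lock().unwrap();
            state.complete = true;
            state.callbacks.drain(..).collect()
        };
        for cb in callbacks {
            cb();
        }
    }
}
```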