If, after forwarding a payment to our counterparty, we restart with
a ChannelMonitor update having been persisted, but the
corresponding ChannelManager update was not persisted, we'll still
have the forwarded HTLC in the `forward_htlcs` map on start. This
will cause us to generate a (spurious) `PendingHTLCsForwardable`
event. However, when we go to forward said HTLC, we'll notice the
channel has been closed and leave it up to the `ChannelMontior` to
finalize the HTLC.
This is all fine today - we won't lose any funds, we'll just
generate an excess forwardable event and then fail to forward.
However, in the future when we allow for forward-time channel
changes this could break. Thus, its worth adding tests for this
behavior today, and, while we're at it, removing the spurious
forwardable HTLCs event.
In many complexity-reduced implementations of chain syncing using
esplora `transactions_confirmed` may be called redundantly for
transactions which were already confirmed. To ensure this is
idempotent we add two new `ConnectionStyle`s in our tests which
(a) call `transactions_confirmed` twice for each call, ensuring
simple idempotency is ensured and (b) call `transactions_confirmed`
once for each historical block every time we're connecting a new
block, ensuring we're fully idempotent even if every call is
repeated constantly.
In order to actually behave correctly this requires a simple
already-confirmed check in `ChannelMonitor`, which is included.
When we process a `channel_reestablish` message we free the HTLC
update holding cell as things may have changed while we were
disconnected. However, some time ago, to handle freeing from the
holding cell when a monitor update completes, we added a holding
cell freeing check in `get_and_clear_pending_msg_events`. This
leaves the in-`channel_reestablish` holding cell clear redundant,
as doing it immediately or is `get_and_clear_pending_msg_events` is
not a user-visible difference.
Thus, we remove the redundant code here, substantially simplifying
`handle_chan_restoration_locked` while we're at it.
We increase the `user_channel_id` type from `u64` to `u128`. In order to
maintain backwards compatibility, we have to de-/serialize it as two
separate `u64`s in `Event` as well as in the `Channel` itself.
It was pointed out that its quite confusing that
`AllFailedRetrySafe` does not allow you to call `retry_payment`,
though the documentation on it does specify this. Instead, we
simply rename it to `AllFailedResendSafe` to indicate that the
action that is safe to take is *resending*, not *retrying*.
In c986e52ce8, an `MppId` was added
to `HTLCSource` objects as a way of correlating HTLCs which belong
to the same payment when the `ChannelManager` sees an HTLC
succeed/fail. This allows it to have awareness of the state of all
HTLCs in a payment when it generates the ultimate user-facing
payment success/failure events. This was used in the same PR to
avoid generating duplicative success/failure events for a single
payment.
Because the field was only used as an internal token to correlate
HTLCs, and retries were not supported, it was generated randomly by
calling the `KeysInterface`'s 32-byte random-fetching function.
This also provided a backwards-compatibility story as the existing
HTLC randomization key was re-used for older clients.
In 28eea12bbe `MppId` was renamed to
the current `PaymentId` which was then used expose the
`retry_payment` interface, allowing users to send new HTLCs which
are considered a part of an existing payment.
At no point has the payment-sending API seriously considered
idempotency, a major drawback which leaves the API unsafe in most
deployments. Luckily, there is a simple solution - because the
`PaymentId` must be unique, and because payment information for a
given payment is held for several blocks after a payment
completes/fails, it represents an obvious idempotency token.
Here we simply require the user provide the `PaymentId` directly in
`send_payment`, allowing them to use whatever token they may
already have for a payment's idempotency token.
When a `chain::Watch` `ChannelMonitor` update method is called, the
user has three options:
(a) persist the monitor update immediately and return success,
(b) fail to persist the monitor update immediately and return
failure,
(c) return a flag indicating the monitor update is in progress and
will complete in the future.
(c) is rather harmless, and in some deployments should be expected
to be the return value for all monitor update calls, but currently
requires returning `Err(ChannelMonitorUpdateErr::TemporaryFailure)`
which isn't very descriptive and sounds scarier than it is.
Instead, here, we change the return type used to be a single enum
(rather than a Result) and rename `TemporaryFailure`
`UpdateInProgress`.
As we move towards specify supported/required feature bits in the
module(s) where they are supported, the global `known` feature set
constructors no longer make sense.
Here we stop relying on the `known` method in the
functional_test_utils module.
In the next commit we'll enforce counterparty `InitFeatures`
matching our required set in `ChannelManager`, implying they must
be set for many tests where they previously did not need to be (as
they were enforced in `PeerManager`, which is not used in
functional tests).
Some `NodeFeatures` will, in the future, represent features which
are not enabled by the `ChannelManager`, but by other message
handlers handlers. Thus, it doesn't make sense to determine the
node feature bits in the `ChannelManager`.
The simplest fix for this is to change to generating the
node_announcement in `PeerManager`, asking all the connected
handlers which feature bits they support and simply OR'ing them
together. While this may not be sufficient in the future as it
doesn't consider feature bit dependencies, support for those could
be handled at the feature level in the future.
This commit moves the `broadcast_node_announcement` function to
`PeerHandler` but does not yet implement feature OR'ing.
When we connect to a new peer, immediately send them any
channel_announcement and channel_update messages for any public
channels we have with other peers. This allows us to stop sending
those messages on a timer when they have not changed and ensures
we are sending messages when we have peers connected, rather than
broadcasting at startup when we have no peers connected.
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as its really more about retry
than where a payment failed. Now is as good a time as any to
rename it.
This uses the various new tracking added in the prior commits to
expose a new `Balance` type - `CounterpartyRevokedOutputClaimable`.
Some nontrivial work is required, however, as we now have to track
HTLC outputs as spendable in a transaction that comes *after* an
HTLC-Success/HTLC-Timeout transaction, which we previously didn't
need to do. Thus, we have to check if an
`onchain_events_awaiting_threshold_conf` event spends a commitment
transaction's HTLC output while walking events. Further, because
we now need to track HTLC outputs after the
HTLC-Success/HTLC-Timeout confirms, and because we have to track
the counterparty's `to_self` output as a contentious output which
could be claimed by either party, we have to examine the
`OnchainTxHandler`'s set of outputs to spend when determining if
certain outputs are still spendable.
Two new tests are added which test various different transaction
formats, and hopefully provide good test coverage of the various
revoked output paths.
Instead of backfilling gossip by buffering (up to) ten messages at
a time, only buffer one message at a time, as the peers' outbound
socket buffer drains. This moves the outbound backfill messages out
of `PeerHandler` and into the operating system buffer, where it
arguably belongs.
Not buffering causes us to walk the gossip B-Trees somewhat more
often, but avoids allocating vecs for the responses. While its
probably (without having benchmarked it) a net performance loss, it
simplifies buffer tracking and leaves us with more room to play
with the buffer sizing constants as we add onion message forwarding
which is an important win.
Note that because we change how often we check if we're out of
messages to send before pinging, we slightly change how many
messages are exchanged at once, impacting the
`test_do_attempt_write_data` constants.
We add `HTLCHandlingFailedConditions` to express the failure parameters,
that will be enforced by a new macro, `expect_pending_htlcs_forwardable_conditions`.
When an HTLC fails, we currently rely on the scorer learning the
failed channel and assigning an infinite (`u64::max_value()`)
penalty to the channel so as to avoid retrying over the exact same
path (if there's only one available path). This is common when
trying to pay a mobile client behind an LSP if the mobile client is
currently offline.
This leads to the scorer being overly conservative in some cases -
returning `u64::max_value()` when a given path hasn't been tried
for a given payment may not be the best decision, even if that
channel failed 50 minutes ago.
By tracking channels which failed on a payment part level and
explicitly refusing to route over them we can relax the
requirements on the scorer, allowing it to make different decisions
on how to treat channels that failed relatively recently without
causing payments to retry the same path forever.
This does have the drawback that it could allow two separate part
of a payment to traverse the same path even though that path just
failed, however this should only occur if the payment is going to
fail anyway, at least as long as the scorer is properly learning.
Closes#1241, superseding #1252.
In the next commit we add lockorder testing based on the line each
mutex was created on rather than the particular mutex instance.
This causes some additional test failure because of lockorder
inversions for the same mutex across different tests, which is
fixed here.
This is mostly motivated by the fact that payments may happen while the
latest `ChannelUpdate` indicating our new `ChannelConfig` is still
propagating throughout the network. By temporarily allowing the previous
config, we can help reduce payment failures across the network.
ChannelConfig now has its static fields removed. We introduce a new
LegacyChannelConfig struct that maintains the serialization as
previously defined by ChannelConfig to remain backwards compatible with
clients running 0.0.107 and earlier.
In the near future, we plan to allow users to update their
`ChannelConfig` after the initial channel handshake. In order to reuse
the same struct and expose it to users, we opt to move out all static
fields that cannot be updated after the initial channel handshake.