In many complexity-reduced implementations of chain syncing using
esplora `transactions_confirmed` may be called redundantly for
transactions which were already confirmed. To ensure this is
idempotent we add two new `ConnectionStyle`s in our tests which
(a) call `transactions_confirmed` twice for each call, ensuring
simple idempotency is ensured and (b) call `transactions_confirmed`
once for each historical block every time we're connecting a new
block, ensuring we're fully idempotent even if every call is
repeated constantly.
In order to actually behave correctly this requires a simple
already-confirmed check in `ChannelMonitor`, which is included.
In the next commit we'll add some checks that redundant
transactions aren't confirmed in different blocks, which would
cause test_htlc_ignore_latest_remote_commitment to fail. Here we
fix it to avoid the issue.
LND nodes have very broken fee estimators, causing them to suggest
feerates that don't even meet a current mempool minimum feerate
when fees go up over the course of hours. This can cause us to
reject their feerate estimates as they're not high enough, even
though their new feerate is higher than what we had already (which
is the feerate we'll use to broadcast a closing transaction). This
implies we force-close the channel and broadcast something with a
feerate lower than our counterparty was offering.
Here we simply accept such feerates as they are better than what we
had. We really should also close the channel, but only after we
get their signature on the new feerate. That should happen by
checking channel feerates every time we see a new block so is
orthogonal to this code.
Ultimately the fix is anchor outputs plus package-based relay in
Bitcoin Core, however we're still quite some ways from that, so
worth needlessly closing channels for now.
It was pointed out that its quite confusing that
`AllFailedRetrySafe` does not allow you to call `retry_payment`,
though the documentation on it does specify this. Instead, we
simply rename it to `AllFailedResendSafe` to indicate that the
action that is safe to take is *resending*, not *retrying*.
These claims will never be valid as a previous claim has already
confirmed. If a previous claim is reorged out of the chain, a new claim
will be generated bypassing the new behavior.
While this doesn't change much for our existing transaction-based
claims, as broadcasting an already confirmed transaction acts as a NOP,
it prevents us from yielding redundant event-based claims, which will be
introduced as part of the anchors patchset.
When the `abandon_payment` flow was added there was some concern
that upgrading users may not migrate to the new flow, causing
memory leaks in the pending-payment tracking.
While this is true, now that we're relying on the
pending_outbound_payments map for `send_payment` idempotency, the
risk of removing a payment prematurely goes up from "spurious
retry failure" to "sending a duplicative payment", which is much
worse.
Thus, we simply remove the automated payment timeout here,
explicitly requiring that users call `abandon_payment` when they
give up retrying a payment.
In c986e52ce8, an `MppId` was added
to `HTLCSource` objects as a way of correlating HTLCs which belong
to the same payment when the `ChannelManager` sees an HTLC
succeed/fail. This allows it to have awareness of the state of all
HTLCs in a payment when it generates the ultimate user-facing
payment success/failure events. This was used in the same PR to
avoid generating duplicative success/failure events for a single
payment.
Because the field was only used as an internal token to correlate
HTLCs, and retries were not supported, it was generated randomly by
calling the `KeysInterface`'s 32-byte random-fetching function.
This also provided a backwards-compatibility story as the existing
HTLC randomization key was re-used for older clients.
In 28eea12bbe `MppId` was renamed to
the current `PaymentId` which was then used expose the
`retry_payment` interface, allowing users to send new HTLCs which
are considered a part of an existing payment.
At no point has the payment-sending API seriously considered
idempotency, a major drawback which leaves the API unsafe in most
deployments. Luckily, there is a simple solution - because the
`PaymentId` must be unique, and because payment information for a
given payment is held for several blocks after a payment
completes/fails, it represents an obvious idempotency token.
Here we simply require the user provide the `PaymentId` directly in
`send_payment`, allowing them to use whatever token they may
already have for a payment's idempotency token.
When a `chain::Watch` `ChannelMonitor` update method is called, the
user has three options:
(a) persist the monitor update immediately and return success,
(b) fail to persist the monitor update immediately and return
failure,
(c) return a flag indicating the monitor update is in progress and
will complete in the future.
(c) is rather harmless, and in some deployments should be expected
to be the return value for all monitor update calls, but currently
requires returning `Err(ChannelMonitorUpdateErr::TemporaryFailure)`
which isn't very descriptive and sounds scarier than it is.
Instead, here, we change the return type used to be a single enum
(rather than a Result) and rename `TemporaryFailure`
`UpdateInProgress`.
There's no need to broadcast our local commitment transaction if we've
already seen a confirmed one as it'll be immediately rejected as a
duplicate/conflict.
This will also help prevent dispatching spurious events for bumping
commitment and HTLC transactions through anchor outputs (once
implemented in future work) and the dispatch for said events follows the
same flow as our usual commitment broadcast.
In the next commit we'll enforce counterparty `InitFeatures`
matching our required set in `ChannelManager`, implying they must
be set for many tests where they previously did not need to be (as
they were enforced in `PeerManager`, which is not used in
functional tests).
When we connect to a new peer, immediately send them any
channel_announcement and channel_update messages for any public
channels we have with other peers. This allows us to stop sending
those messages on a timer when they have not changed and ensures
we are sending messages when we have peers connected, rather than
broadcasting at startup when we have no peers connected.
`fail_holding_cell_htlcs` calls through to
`fail_htlc_backwards_internal` for HTLCs that need to be
failed-backwards but opts to generate its own payment failure
events for `HTLCSource:;OutboundRoute` HTLCs. There is no reason
for that as `fail_htlc_backwards_internal` will also happily
generate (now-)equivalent events for `HTLCSource::OutboundRoute`
HTLCs.
Thus, we can drop the redundant code and always call
`fail_htlc_backwards_internal` for each HTLC in
`fail_holding_cell_htlcs`.
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as its really more about retry
than where a payment failed. Now is as good a time as any to
rename it.
When our counterparty is the payment destination and we receive
an `HTLCFailReason::Reason` in `fail_htlc_backwards_internal` we
currently always set `rejected_by_dest` in the `PaymentPathFailed`
event, implying the HTLC should *not* be retried.
There are a number of cases where we use `HTLCFailReason::Reason`,
but most should reasonably be treated as retryable even if our
counterparty was the destination (i.e. `!rejected_by_dest`):
* If an HTLC times out on-chain, this doesn't imply that the
payment is no longer retryable, though the peer may well be
offline so retrying may not be very useful,
* If a commitment transaction "containing" a dust HTLC is
confirmed on-chain, this definitely does not imply the payment
is no longer retryable
* If the channel we intended to relay over was closed (or
force-closed) we should retry over another path,
* If the channel we intended to relay over did not have enough
capacity we should retry over another path,
* If we received a update_fail_malformed_htlc message from our
peer, we likely should *not* retry, however this should be
exceedingly rare, and appears to nearly never appear in practice
Thus, this commit simply disables the behavior here, opting to
treat all `HTLCFailReason::Reason` errors as retryable.
Note that prior to 93e645daf4 this
change would not have made sense as it would have resulted in us
retrying the payment over the same channel in some cases, however
we now "blame" our own channel and will avoid it when routing for
the same payment.
It is proportion of the channel value to configure as the
`their_channel_reserve_satoshis` for both outbound and inbound channels.
It decides the minimum balance that the other node has to maintain on their
side, at all times.
Saturating a channel beyond 1/4 of its capacity seems like a more
reasonable threshold for avoiding a path than 1/2, especially given
we should still be willing to send a payment with a lower
saturation limit if it comes to that.
This requires an (obvious) change to some router tests, but also
requires a change to the `fake_network_test`, opting to simply
remove some over-limit test code there - `fake_network_test` was
our first ever functional test, and while it worked great to ensure
LDK worked at all on day one, we now have a rather large breadth
of functional tests, and a broad "does it work at all" test is no
longer all that useful.
In the next commit we add lockorder testing based on the line each
mutex was created on rather than the particular mutex instance.
This causes some additional test failure because of lockorder
inversions for the same mutex across different tests, which is
fixed here.
In order to avoid failing to find paths due to the new channel
saturation limit, if we fail to find enough paths, we simply
disable the saturation limit for further path finding iterations.
Because we can now increase the maximum sent over a given channel
during routefinding, we may now generate redundant paths for the
same payment. Because this is wasteful in the network, we add an
additional pass during routefinding to merge redundant paths.
Note that two tests which previously attempted to send exactly the
available liquidity over a channel which charged an absolute fee
need updating - in those cases the router will first collect a path
that is saturation-limited, then attempt to collect a second path
without a saturation limit while stil honoring the existing
utilized capacity on the channel, causing failure as the absolute
fee must be included.
In c02b6a3807 we moved the
`payment_preimage` copy from inside the macro which only runs if we
are spending an output we know is an HTLC output to doing it for
any script that matches our expected length. This can panic if an
inbound channel is created with a bogus funding transaction that
has a witness program of the HTLC-Success/-Offered length but which
does not have a second-to-last witness element which is 32 bytes.
Luckily this panic is relatively simple for downstream users to
work around - if an invalid-length-copy panic occurs, simply remove
the ChannelMonitor from the bogus channel on startup and run
without it. Because the channel must be funded by a bogus script in
order to reach this panic, the channel will already have closed by
the time the funding transaction is spent, and there can be no
local funds in such a channel, so removing the `ChannelMonitor`
wholesale is completely safe.
In order to test this we have to disable an in-line assertion that
checks that our transactions match expected scripts which we do by
checking for the specific bogus script that we now use in
`test_invalid_funding_tx`.
Thanks to Eugene Siegel for reporting this issue.
Because downstream languages are often garbage-collected, having
the user directly allocate a `ReadOnlyNetworkGraph` and pass a
reference to it to `find_route` often results in holding a read
lock long in excess of the `find_route` call. Worse, some languages
(like JavaScript) tend to only garbage collect when other code is
not running, possibly leading to deadlocks.