Monitor update failure can no longer lead to a `ChannelUnavailable`
error, but more common cases such as the peer being disconnected
always could, so those should be documented more clearly.
Currently, users have no means to upper-bound the total fees accruing
when finding a route. Here, we add a corresponding field to
`RouteParameters` which will be used to limit the candidate set during
path finding in the following commits.
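As a sketch, the extended struct might look as follows (the field
name and docs here are illustrative of the approach, not necessarily
the final API):

    pub struct RouteParameters {
        pub payment_params: PaymentParameters,
        pub final_value_msat: u64,
        /// The maximum total fees, in millisatoshi, that may accrue
        /// during route finding. Candidate paths pushing the total
        /// beyond this limit are discarded.
        pub max_total_routing_fee_msat: Option<u64>,
    }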
`ChannelManager::finish_force_close_channel` exists to do cleanups
which must happen without the `per_peer_state` mutex held. However,
because it lacked lock assertions, several changes snuck in
recently which resulted in it running with peer-state locks held,
risking a deadlock if some HTLCs need to be failed.
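A sketch of the kind of assertion added, assuming the
`LockTestExt::held_by_thread` helper from LDK's debug-sync machinery:

    fn finish_force_close_channel(&self, shutdown_res: ShutdownResult) {
        // Failing HTLCs backwards may need to take per-peer locks, so
        // entering with one held risks a deadlock. Assert (in debug
        // builds) that this thread holds no such lock.
        debug_assert_ne!(self.per_peer_state.held_by_thread(),
            LockHeldState::HeldByThread);
        // ... fail HTLCs backwards, emit events, etc. ...
    }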
We've run into this several times in the wild, likely due to
https://github.com/ElementsProject/lightning/issues/6200 wherein a node on the
path will error with 0x1000 but not provide a channel update (a spec
violation).
Previously, we would blame the inbound edge even though the buggy peer wanted
us to blame the outbound edge. Since this issue seems to be recurring and our
blaming the inbound edge is causing us to punish innocent channels, trust the
peer that the outbound edge is the one to blame.
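A self-contained sketch of the resulting attribution logic (names and
exact plumbing are illustrative):

    /// The UPDATE bit per BOLT 4; failures carrying it should include
    /// a `channel_update` for the erring node's outbound edge.
    const UPDATE: u16 = 0x1000;

    fn scid_to_penalize(failure_code: u16, update_included: bool,
        inbound_scid: u64, outbound_scid: u64) -> u64
    {
        if failure_code & UPDATE != 0 && !update_included {
            // The erring node meant to fault its outbound channel but
            // (buggily) omitted the update. Trust its intent rather
            // than punishing the innocent inbound edge.
            outbound_scid
        } else {
            inbound_scid
        }
    }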
Previously this value would be incorrectly set to true because we wouldn't
account for blinded hops when determining if we were processing the last hop's
failure packet.
Our ultimate goal with this field is to set
`PaymentPathFailed::payment_failed_permanently`, so use this name
rather than flipping a bool back and forth across methods.
While there is no great way to handle a true failure to persist a
`ChannelMonitorUpdate`, it is confusing for users for there to be
no error variant at all on an I/O operation.
Thus, here we re-add the error variant removed over the past
handful of commits, but rather than handle it in a truly unsafe
way, we simply panic, optimizing for maximum mutex poisoning to
ensure any future operations fail and return immediately.
In the future, we may consider changing the handling of this to
instead set some "disconnect all peers and fail all operations"
bool to give the user a better chance to shutdown in a semi-orderly
fashion, but there's only so much that can be done in lightning if
we truly cannot persist new updates.
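A sketch of the resulting handling, assuming the re-added variant is
named `UnrecoverableError`:

    match persist_status {
        // The update was persisted synchronously; proceed as normal.
        ChannelMonitorUpdateStatus::Completed => {},
        // Async persistence; the channel is paused until the user
        // signals completion.
        ChannelMonitorUpdateStatus::InProgress => {},
        ChannelMonitorUpdateStatus::UnrecoverableError => {
            // Panic while our mutexes are held, poisoning them so that
            // future operations fail and return immediately.
            panic!("Failed to persist a ChannelMonitorUpdate; this is unrecoverable");
        },
    }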
Now that `handle_new_monitor_update` can no longer return an error,
it similarly no longer needs any code to handle errors. Here we
remove that code, substantially reducing macro variants.
Since we now (almost) support async monitor update persistence, the
documentation on `ChannelMonitorUpdateStatus` can be updated to no
longer suggest users must keep a local copy that persists before
returning. However, because there are still a few remaining issues,
we note that async support is currently beta and explicitly warn of
the potential for funds loss.
Fixes #1684
The `MonitorEvent::CommitmentTxConfirmed` has always been a result
of us force-closing the channel, not the counterparty doing so.
Thus, it was always a bit of a misnomer. Worse, it carried over
into the channel's `ClosureReason` in the event API.
Here we simply rename it and use the proper `ClosureReason`.
When a `ChannelMonitorUpdate` fails to apply, it generally means
we cannot reach our storage backend. This is, in general, a
critical issue, but one which is often only transient.
Sadly, users see the failure variant and return it on any I/O
error, resulting in channel force-closures due to transient issues.
Users generally don't expect force-closes, and luckily with async
`ChannelMonitorUpdate`s supported we don't take any risk by
"delaying" the `ChannelMonitorUpdate` indefinitely.
Thus, here we drop the `PermanentFailure` variant entirely, making
all failures instead be "the update is in progress, but won't ever
complete", which is equivalent if we do not close the channel
automatically.
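For persister implementations the resulting guidance looks roughly
like the following sketch (parameters elided; `write_monitor_update`
stands in for whatever backend write the user performs):

    fn update_persisted_channel(&self /* , ... */) -> ChannelMonitorUpdateStatus {
        match self.write_monitor_update(/* ... */) {
            // Durably persisted; the channel may progress.
            Ok(()) => ChannelMonitorUpdateStatus::Completed,
            // Transient I/O failure: report the update as in-progress,
            // retry the write in the background, and later signal
            // completion (via ChainMonitor::channel_monitor_updated)
            // to resume the channel. No force-close in the interim.
            Err(_) => ChannelMonitorUpdateStatus::InProgress,
        }
    }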
Two of the payment tests currently rely on failing to persist
`ChannelMonitorUpdate`s as their method of failing payments before
they even get out the door.
In the coming commits we'll drop the persist failure error codes,
so here we rewrite these tests to rely on trying to send more than
is available in a channel.
In bindings, we can't use unbounded generic types, and thus have to
rip out the `ScoreParams` and replace them with static
`ProbabilisticScoringFeeParams` universally. To make this easier,
using `Default::default()` everywhere allows the type to change out
from under the test without the test needing to change.
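For instance, a scoring call in a test can supply the parameters by
inference rather than by name (call shown schematically):

    // `&Default::default()` lets inference pick whichever parameter
    // type the scorer currently expects, so swapping `ScoreParams`
    // out for `ProbabilisticScoringFeeParams` doesn't require
    // touching the test.
    let penalty = scorer.channel_penalty_msat(&candidate, usage, &Default::default());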
If the last hop was provided by route hint we assume it's not an announced channel.
If, furthermore, only a single route hint is provided, we refrain from probing
all the way through to the end and instead probe up to the second-to-last channel.
Ideally we'd do this not based on the above assumption but
rather by checking inclusion in our network graph. However, we don't
have access to our graph in `ChannelManager`.
We add a `ChannelManager::send_preflight_probes` method that can be used
to send pre-flight probes given some [`RouteParameters`]. Additionally,
we add convenience methods for sending spontaneous pre-flight probes
and for sending pre-flight probes for a given invoice.
As pre-flight probes might take up some of the available liquidity, we
here avoid using channels whose available liquidity is less than the
required amount times
`UserConfig::preflight_probing_liquidity_limit_multiplier` to send
pre-flight probes.
This commit is more or less a carbon copy of the pre-flight
probing code recently added to LDK Node.
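A usage sketch (signatures abbreviated; the optional second argument
overrides the config's liquidity limit multiplier):

    match channel_manager.send_preflight_probes(route_params, None) {
        Ok(probes) => println!("Sent {} pre-flight probes", probes.len()),
        Err(e) => eprintln!("Failed to send pre-flight probes: {:?}", e),
    }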
When sending preflight probes, we want to exclude last hops unless
they are possibly announced. To this end, we here include a new field in
`RouteHop` that will be `true` when we either definitely know the hop to
be announced, or when there exist public channels between the hop's
counterparties that this hop might refer to (i.e., be an alias for).
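A sketch of how the probing code can use such a flag (surrounding
logic illustrative):

    // Refrain from probing through likely-unannounced last hops and
    // instead probe up to the second-to-last channel.
    while let Some(last_hop) = path.hops.last() {
        if last_hop.maybe_announced_channel {
            break;
        }
        path.hops.pop();
    }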
Our `Trusted*` wrappers in `chan_utils` expose additional inner
fields by reference. However, because they were not explicitly
marked as returning a reference with the wrapped struct's
lifetime, rustc considered them to return a reference with the
wrapper struct's lifetime.
This is unnecessarily restrictive, and resulted in the addition of
a clone in 9850c5814a which we remove here.
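A minimal illustration of the fix (types simplified):

    struct Keys { revocation_key: [u8; 33] }
    struct Commitment { keys: Keys }
    struct TrustedCommitment<'a> { inner: &'a Commitment }

    impl<'a> TrustedCommitment<'a> {
        // With the lifetime elided, the returned reference would be
        // tied to `&self` (the wrapper). Naming `'a` ties it to the
        // wrapped struct instead, so it can outlive the wrapper and
        // callers need not clone.
        fn keys(&self) -> &'a Keys {
            &self.inner.keys
        }
    }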
Previously, we'd leave the payment secret field empty when sending
probes, which resulted in them being rejected
with `(PERM|invalid_onion_payload)` by Eclair nodes.
In order to mitigate the issue, we just set a random payment secret.
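A sketch of the mitigation, assuming an LDK-style `EntropySource` is
at hand:

    // Give probes a random payment secret so Eclair's onion payload
    // validation accepts them.
    let payment_secret = PaymentSecret(entropy_source.get_secure_random_bytes());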
`ChannelId` was weirdly listed in the re-export section of the docs and
reachable via multiple paths. Here we opt to make the `channel_id`
module private and leave only the `ChannelId` struct itself exposed.
When we handle an inbound `channel_reestablish` from our peers it
generally doesn't change any state and thus doesn't need a
`ChannelManager` persistence. Here we avoid said persistence where
possible.
If we receive a `ChannelUpdate` message which is invalid, it can
cause us to force-close the channel, which should result in a
`ChannelManager` persistence, though it's not critical to do so.
When a peer connects and we send some `channel_reestablish`
messages or create a `per_peer_state` entry there's really no
reason to need to persist the `ChannelManager`. None of the
possible actions we take immediately result in a change to the
persisted contents of a `ChannelManager`, only the peer's later
`channel_reestablish` message does.
Many channel-related messages don't actually change the channel
state in a way that changes the persisted channel. For example,
an `update_add_htlc` or `update_fail_htlc` message simply adds the
change to a queue, changing the channel state when we receive a
`commitment_signed` message.
In these cases there's really no reason to wake the background
processor at all, as there's no response message and no state
update. However, note that if we close the channel we should
persist the `ChannelManager`. If we send an error message without
closing the channel, we should wake the background processor
without persisting.
Here we move to the appropriate `NotifyOption` on some of the
simpler channel message handlers.
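As a sketch, such a handler might look like the following (the guard
and variant names follow the notifier pattern in `ChannelManager`;
details are illustrative):

    fn handle_update_add_htlc(&self, counterparty_node_id: &PublicKey, msg: &msgs::UpdateAddHTLC) {
        // This message only queues the HTLC; persisted state doesn't
        // change until `commitment_signed`, and no response goes out,
        // so neither persist nor wake the background processor. Paths
        // which close the channel would return NotifyOption::DoPersist
        // instead, and error messages sent without a close would
        // return NotifyOption::SkipPersistHandleEvents.
        let _guard = PersistenceNotifierGuard::optionally_notify(self, || {
            self.internal_update_add_htlc(counterparty_node_id, msg);
            NotifyOption::SkipPersistNoEvents
        });
    }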