As we move towards async monitor updates being a supported
use-case, we shouldn't call an async monitor update "failed", but
rather "in progress". This simply updates the internal channel.rs
enum name to reflect the new thinking.
If the initial ChannelMonitor persistence is done asynchronously
but does not complete before the node restarts (with a
ChannelManager persistence), we'll start back up with a channel
present but no corresponding ChannelMonitor.
Because the Channel is pending-monitor-update and has not yet
broadcast its initial funding transaction or sent channel_ready,
this is not a violation of our API contract nor a safety violation.
However, the previous code would refuse to deserialize the
ChannelManager treating it as an API contract violation.
The solution is to test for this case explicitly and drop the
channel entirely as if the peer disconnected before we received
the funding_signed for outbound channels or before sending the
channel_ready for inbound channels.
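As a sketch under assumed names (the real logic lives in
`ChannelManager`'s deserialization path), the check amounts to:

```rust
// Hypothetical sketch: a channel whose ChannelMonitor is missing at
// deserialization is only safe to forget if funds cannot yet have moved.
struct ChannelState {
    funding_tx_broadcast: bool, // outbound: have we broadcast the funding tx?
    channel_ready_sent: bool,   // inbound: have we sent channel_ready?
}

impl ChannelState {
    fn safe_to_drop_without_monitor(&self) -> bool {
        // Dropping here is equivalent to the peer disconnecting before
        // funding_signed (outbound) or before channel_ready (inbound).
        !self.funding_tx_broadcast && !self.channel_ready_sent
    }
}
```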
When a `chain::Watch` `ChannelMonitor` update method is called, the
user has three options:
(a) persist the monitor update immediately and return success,
(b) fail to persist the monitor update immediately and return
failure,
(c) return a flag indicating the monitor update is in progress and
will complete in the future.
(c) is rather harmless, and in some deployments should be expected
to be the return value for all monitor update calls, but currently
requires returning `Err(ChannelMonitorUpdateErr::TemporaryFailure)`
which isn't very descriptive and sounds scarier than it is.
Instead, here, we change the return type to a single enum (rather
than a Result) and rename `TemporaryFailure` to `UpdateInProgress`.
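As a sketch, the new return shape might look like the following;
only the `UpdateInProgress` variant name comes from the change
described above, while the enum name and other variant names are
assumptions:

```rust
/// Returned directly (rather than wrapped in a Result) by the
/// chain::Watch monitor update methods.
pub enum ChannelMonitorUpdateStatus {
    /// Case (a): the update was persisted immediately.
    Completed,
    /// Case (c): persistence is in flight and will complete later; replaces
    /// the old `Err(ChannelMonitorUpdateErr::TemporaryFailure)`.
    UpdateInProgress,
    /// Case (b): the update could not be persisted at all.
    PermanentFailure,
}
```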
See the doc updates for more info on the edge case this prevents;
there isn't really a strong reason why we would need to broadcast
the latest state immediately. Specifically, in the case of HTLC
claims (the most important reason to ensure we have state on chain
if it cannot be persisted), we will still force-close if there are
HTLCs which need claiming and are going to expire.
Surprisingly, there were no tests which failed as a result of this
change, but a new one has been added.
In the next commit we'll enforce counterparty `InitFeatures`
matching our required set in `ChannelManager`, implying they must
be set for many tests where they previously did not need to be (as
they were enforced in `PeerManager`, which is not used in
functional tests).
When we connect to a new peer, immediately send them any
channel_announcement and channel_update messages for any public
channels we have with other peers. This allows us to stop sending
those messages on a timer when they have not changed and ensures
we are sending messages when we have peers connected, rather than
broadcasting at startup when we have no peers connected.
The `rejected_by_dest` field of the `PaymentPathFailed` event has
always been a bit of a misnomer, as it's really more about whether
to retry than about where a payment failed. Now is as good a time
as any to rename it.
If a user restores from a backup that they know is stale, they'd
like to force-close all of their channels (or at least the ones
they know are stale) *without* broadcasting the latest state,
asking their peers to do so instead. This simply adds methods to do
so, renaming the existing `force_close_channel` and
`force_close_all_channels` methods to disambiguate further.
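Usage might look like the sketch below; the method name is an
assumption based on the description, and `ChannelManager`'s
generics are elided:

```rust
// Hypothetical: after restoring from a known-stale backup, close without
// publishing our own stale commitment transaction, asking the peer to
// broadcast theirs instead.
fn close_stale_channels(manager: &ChannelManager, stale_ids: &[[u8; 32]]) {
    for channel_id in stale_ids {
        // Errors (e.g. unknown channel) are ignored in this sketch.
        let _ = manager.force_close_without_broadcasting_txn(channel_id);
    }
}
```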
As with the previous commit, `commit_upfront_shutdown_pubkey` is another
static field that cannot change after the initial channel handshake. We
therefore move it out from its existing place in `ChannelConfig`.
NetGraphMsgHandler implements RoutingMessageHandler to handle gossip
messages defined in BOLT 7 and maintains a view of the network by
updating NetworkGraph. Rename it to P2PGossipSync, which better
describes its purpose, and to contrast with RapidGossipSync.
`ChannelManager::fail_htlc_backwards`' bool return value is quite
confusing - just because it returns false doesn't mean the payment
wasn't (already) failed. Worse, in some race cases around shutdown
where a payment was claimed before an unclean shutdown and then
retried on startup, `fail_htlc_backwards` could return true even
though a duplicate copy of the same payment was claimed, but the
claim event had not yet been seen by the user.
While it's possible to use it correctly, it's somewhat confusing to
have a return value at all, and it definitely lends itself to misuse.
Instead, we should push users towards a model where they don't care
if `fail_htlc_backwards` succeeds - either they've locally marked
the payment as failed (prior to seeing any `PaymentReceived`
events) and will fail any attempts to pay it, or they have not and
the payment is still receivable until its timeout time is reached.
We can revisit this decision based on user feedback, but will need
to very carefully document the potential failure modes here if we
do.
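Under that model, receiver-side handling might look roughly like
this sketch (the helper functions are hypothetical):

```rust
// Hypothetical sketch: decide whether to fail or claim without consulting
// any return value from fail_htlc_backwards.
fn on_payment_received(manager: &ChannelManager, payment_hash: PaymentHash) {
    if payment_marked_failed_locally(&payment_hash) {
        // A race with a duplicate claim surfaces via the claim event,
        // not via a boolean here.
        manager.fail_htlc_backwards(&payment_hash);
    } else {
        claim_payment(manager, &payment_hash);
    }
}
```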
This update also includes a minor refactor. The return type of
`pending_monitor_events` has been changed to a `Vec` of tuples,
pairing each funding `OutPoint` with the `Vec` of `MonitorEvent`s
generated for that channel, as sketched below.
We've also renamed `source/sink_channel_id` to `prev/next_channel_id`
for clarity.
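In sketch form, with placeholder types standing in for the real
`OutPoint` and `MonitorEvent`:

```rust
struct OutPoint { txid: [u8; 32], index: u16 }
enum MonitorEvent { /* variants elided */ }

trait Watch {
    // Each monitor's events are grouped under its funding outpoint.
    fn pending_monitor_events(&self) -> Vec<(OutPoint, Vec<MonitorEvent>)>;
}
```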
The spec actually requires we never send `announcement_signatures`
(and, thus, `channel_announcement`s) until after six confirmations.
However, we would happily have sent them prior to that as long as
we had exchanged `funding_locked` messages with our counterparty.
Thanks to re-broadcasting, this issue is largely harmless; however,
it could
have some negative interactions with less-robust peers. Much more
importantly, this represents an important step towards supporting
0-conf channels, where `funding_locked` messages may be exchanged
before we even have an SCID to construct the messages with.
Because there is no ACK mechanism for `announcement_signatures` we
rely on existing channel updates to stop rebroadcasting them - if
we sent a `commitment_signed` after an `announcement_signatures`
and later receive a `revoke_and_ack`, we know our counterparty also
received our `announcement_signatures`. This may resolve some rare
edge-cases where we send a `funding_locked` which our counterparty
receives, but lose connection before the `announcement_signatures`
(usually the very next message) arrives.
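One way to track that implicit ACK is a small per-channel state
machine, sketched below (names illustrative, not necessarily the
real internals):

```rust
enum AnnouncementSigsState {
    /// Not yet sent, or it must be re-sent on the next reconnection.
    NotSent,
    /// announcement_signatures sent, but no commitment_signed sent since.
    MessageSent,
    /// A commitment_signed was sent after our announcement_signatures;
    /// we're awaiting the counterparty's revoke_and_ack.
    Committed,
    /// The revoke_and_ack proves the counterparty received our
    /// announcement_signatures, so re-sending can stop.
    PeerReceived,
}
```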
Sadly, because the set of places where an `announcement_signatures`
may now be generated closely mirrors where `funding_locked` messages
may be generated, yet the two are now handled separately, there is a
substantial amount of code motion: we must provide the relevant
parameters about current block information and ensure we can return
new `announcement_signatures` messages from those call sites.
If we have not yet sent `funding_locked` only because of a pending
channel monitor update, we shouldn't consider a channel
`is_usable`. This has a number of downstream effects, including
not attempting to route payments through the channel, not sending
private `channel_update` messages to our counterparty, and not
sending `channel_announcement` messages even if our counterparty
has already signed for it.
We further gate generation of `node_announcement`s on `is_usable`,
preventing generation of those or `announcement_signatures` until
we've sent our `funding_locked`.
Finally, `during_funding_monitor_fail` is updated to test a case
where we see the funding transaction lock in but have a pending
monitor update failure, then receive `funding_locked` from our
counterparty and ensure we don't generate the above messages until
after the monitor update completes.
A single PaymentSent event is generated when a payment is fulfilled.
This occurs when the preimage is revealed on the first claimed HTLC.
For subsequent HTLCs, the event is not generated.
In order to score channels involved in a successful payment, the
scorer must be notified of each successful path involved in the payment.
Add a PaymentPathSuccessful event for this purpose. Generate it whenever
a part is removed from a pending outbound payment. This avoids duplicate
events when reconnecting to a peer.
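Consumption might look like the following sketch (the event fields
and scorer method name are assumptions):

```rust
fn handle_event(event: Event, scorer: &mut impl Score) {
    if let Event::PaymentPathSuccessful { path, .. } = event {
        // One event per path means multi-part payments update the scorer
        // once per route, with no duplicates generated on peer reconnection.
        scorer.payment_path_successful(&path);
    }
}
```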
I realized on my own node that I don't have any visibility into how
long a monitor or manager persistence call takes, potentially
blocking other operations. This makes it much clearer by adding
a relevant log_trace!() print immediately before and immediately
after persistence.
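The pattern is just a pair of trace lines bracketing the blocking
call, roughly (a sketch, not the exact log text):

```rust
log_trace!(logger, "Persisting ChannelManager...");
persister.persist_manager(&channel_manager)?;
log_trace!(logger, "Done persisting ChannelManager.");
```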
The payment_hash may not uniquely identify the payment if it has been
reused. Include the payment_id in PaymentSent events so it can be
correlated with the send_payment call.
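A consumer can then key its outbound-payment bookkeeping on the id
rather than the hash, as in this sketch (the exact event fields are
assumed):

```rust
fn on_event(event: Event) {
    if let Event::PaymentSent { payment_id, payment_hash, .. } = event {
        // payment_hash alone is ambiguous if a hash was reused across
        // payments; payment_id points back at the send_payment call.
        mark_payment_complete(payment_id, payment_hash);
    }
}
```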
In the next commit, we'll be originating monitor updates both from
the ChainMonitor and from the ChannelManager, making simple
sequential update IDs impossible.
Further, the existing async monitor update API was somewhat hard to
work with - instead of being able to generate monitor_updated
callbacks whenever a persistence process finishes, you had to
ensure you only did so once all previous updates had also been
persisted.
Here we eat the complexity for the user by moving to an opaque
type for monitor updates, tracking which updates are in-flight for
the user and only generating monitor-persisted events once all
pending updates have been committed.
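The bookkeeping amounts to tracking a set of opaque in-flight
update ids per monitor, sketched here with simplified stand-in
types:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct OutPoint { txid: [u8; 32], index: u16 }
/// Opaque to the user; only equality matters, never ordering.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct MonitorUpdateId(u64);

#[derive(Default)]
struct InFlightUpdates(HashMap<OutPoint, HashSet<MonitorUpdateId>>);

impl InFlightUpdates {
    /// Returns true once no updates remain in flight for this monitor,
    /// i.e. the single monitor-persisted event may now be generated.
    fn completed(&mut self, funding_txo: OutPoint, id: MonitorUpdateId) -> bool {
        let set = self.0.entry(funding_txo).or_default();
        set.remove(&id);
        set.is_empty()
    }
}
```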
In the next commit we'll need ChainMonitor to "see" when a monitor
persistence completes, which means `monitor_updated` needs to move
to `ChainMonitor`. The simplest way to then communicate that
information to `ChannelManager` is via `MonitorEvent`s, which seems
to line up ok, even if they're now constructed by multiple
different places.
As ChainMonitor will need to see those errors in a coming PR,
we need to return errors via Persister so that our ChainMonitor
chain::Watch implementation sees them.
test_simple_monitor_permanent_update_fail and
test_simple_monitor_temporary_update_fail both have a mode where
they use either chain::Watch or persister to return errors.
As we won't be doing any returns directly from the chain::Watch
wrapper in a coming commit, the chain::Watch-return form of the
test will no longer make sense.
Exposing a `RwLock<HashMap<>>` directly was always a bit strange,
and in upcoming changes we'd like to change the internal
data structure in `ChainMonitor`.
Further, the use of `RwLock` and `HashMap` meant we weren't able
to expose the ChannelMonitors themselves to users in bindings,
leaving a bindings/rust API gap.
Thus, we take this opportunity to expose ChannelMonitors directly
via a wrapper, hiding the internals of `ChainMonitor` behind
getters. We also update tests to use the new API.
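The getter-based surface might look like this sketch (method and
type names are assumptions based on the description):

```rust
struct OutPoint;
struct ChainMonitor { /* internals hidden behind getters */ }
/// Borrow-guard over a single monitor; holds the internal lock while alive,
/// without exposing the lock or map types themselves.
struct LockedChannelMonitor<'a> { _marker: std::marker::PhantomData<&'a ()> }

impl ChainMonitor {
    /// Enumerate the funding outpoints of all watched monitors.
    fn list_monitors(&self) -> Vec<OutPoint> { unimplemented!() }
    /// Fetch a guard for a single monitor, if we are watching it.
    fn get_monitor(&self, _funding_txo: OutPoint) -> Option<LockedChannelMonitor<'_>> {
        unimplemented!()
    }
}
```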
The interface for get_route will change to take a scorer. Using
get_route_and_payment_hash whenever possible allows for keeping the
scorer inside get_route_and_payment_hash rather than at every call site.
Replace get_route with get_route_and_payment_hash wherever possible.
Additionally, update get_route_and_payment_hash to use the known invoice
features and the sending node's logger.