Because of the merge between the peer reconnection and channel
monitor updating channel restoration code, we now sometimes generate
(somewhat spurious) announcement signatures when restoring after a
channel monitor update. This should not result in a fuzzing failure.
This makes chanmon_consistency fail on IgnoreError error events and
on messages left over to be sent to a just-disconnected peer, which
should have been drained. Neither should ever appear, so we consider
them fuzzer failure cases.
The channel restoration code in channel monitor updating and peer
reconnection do incredibly similar things, and there is little
reason to keep them separate. Sadly, because they require holding a
lock while referencing elements inside that lock, it is not
practical to make them utility functions, so instead we introduce a
two-step macro here which will eventually be used for both. Because
we still support pre-NLL Rust, the macro has to come in two parts -
one which runs with the channel_state lock held, and one which runs
after it is released.
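As a rough, self-contained sketch of that two-step shape (the
struct, macro names, and fields below are hypothetical, not the
actual ChannelManager macros), the first part runs while the lock
guard is alive and hands back owned data, which the second part
consumes once the guard has been dropped:

```rust
use std::sync::Mutex;

struct Channel { pending_msg: Option<String> }
struct Manager { channel_state: Mutex<Channel>, pending_events: Mutex<Vec<String>> }

// Step one: runs while the channel_state lock is held and returns owned data.
macro_rules! restore_channel_locked {
    ($chan: expr) => {{ $chan.pending_msg.take() }}
}

// Step two: runs after the lock guard has been dropped, consuming that data
// without borrowing anything still guarded by the lock.
macro_rules! post_restore_channel {
    ($mgr: expr, $locked_res: expr) => {{
        if let Some(msg) = $locked_res {
            $mgr.pending_events.lock().unwrap().push(msg);
        }
    }}
}

fn main() {
    let mgr = Manager {
        channel_state: Mutex::new(Channel { pending_msg: Some("announcement_sigs".to_owned()) }),
        pending_events: Mutex::new(Vec::new()),
    };
    let locked_res = {
        let mut chan = mgr.channel_state.lock().unwrap();
        restore_channel_locked!(chan)
    }; // the channel_state lock is dropped here, as pre-NLL borrowck requires
    post_restore_channel!(mgr, locked_res);
    assert_eq!(mgr.pending_events.lock().unwrap().len(), 1);
}
```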
Our fuzz tests previously only printed the log output of the first
fuzz test case to fail. This commit changes that (with lots of
auto-generated updates) to ensure we print all log outputs.
We currently generate duplicative PaymentFailed/PaymentSent events
in two cases:
a) If we receive an update_fulfill_htlc message, followed by a
disconnect, then a resend of the same update_fulfill_htlc
message, we will generate a PaymentSent event for each message.
b) When a Channel is closed, any outbound HTLCs which were relayed
through it are simply dropped when the Channel is. From there,
the ChannelManager relies on the ChannelMonitor having a copy of
the relevant fail-/claim-back data and processes the HTLC
fail/claim when the ChannelMonitor tells it to.
If, due to an on-chain event, an HTLC is failed/claimed, and
then we serialize the ChannelManager, but do not re-serialize
the relevant ChannelMonitor, we may end up getting a duplicative
event.
In order to provide the expected consistency, we add explicit
tracking of pending outbound payments using their unique
session_priv field which is generated when the payment is sent.
Then, before generating PaymentFailed/PaymentSent events, we check
that the session_priv for the payment is still pending.
This fixes #209.
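A minimal sketch of that tracking, assuming a plain set keyed by the
session_priv bytes (the types and method names here are
illustrative, not the actual ChannelManager fields):

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Stand-in for the payment's session_priv (a SecretKey in practice).
type SessionPriv = [u8; 32];

struct Manager {
    pending_outbound_payments: Mutex<HashSet<SessionPriv>>,
    events: Mutex<Vec<&'static str>>,
}

impl Manager {
    fn send_payment(&self, session_priv: SessionPriv) {
        // Record the payment as pending when it is first sent.
        self.pending_outbound_payments.lock().unwrap().insert(session_priv);
    }

    fn claim_or_fail(&self, session_priv: SessionPriv, claimed: bool) {
        // Only generate an event if the payment was still pending; a
        // duplicate claim/fail (e.g. a redundant update_fulfill_htlc after a
        // reconnect, or a replayed on-chain resolution) removes nothing and
        // therefore produces no second event.
        if self.pending_outbound_payments.lock().unwrap().remove(&session_priv) {
            self.events.lock().unwrap().push(if claimed { "PaymentSent" } else { "PaymentFailed" });
        }
    }
}

fn main() {
    let mgr = Manager {
        pending_outbound_payments: Mutex::new(HashSet::new()),
        events: Mutex::new(Vec::new()),
    };
    mgr.send_payment([7; 32]);
    mgr.claim_or_fail([7; 32], true); // generates one PaymentSent
    mgr.claim_or_fail([7; 32], true); // duplicate: no second event
    assert_eq!(mgr.events.lock().unwrap().len(), 1);
}
```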
To avoid the caller's data struct having to store HTLC-related
information when a revokeable output is claimed on top of a
commitment or second-stage HTLC transaction, we split
`keysinterface::sign_justice_transaction` into two new halves,
`keysinterface::sign_justice_revoked_output` and
`keysinterface::sign_justice_revoked_htlc`.
Further, this split offers more flexibility for signer policy, as a
revoked commitment output may be of far greater value than the HTLC
outputs.
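Roughly, the split has the following shape; the parameter lists
below are simplified assumptions, not the real keysinterface
signatures, with only the HTLC half needing HTLC data at all:

```rust
// Simplified, hypothetical signatures for illustration only.
trait JusticeSigner {
    /// Sign a justice-transaction input spending the revoked to_local
    /// output directly off the counterparty's commitment transaction.
    fn sign_justice_revoked_output(&self, justice_tx: &[u8], input: usize,
                                   amount: u64) -> Result<Vec<u8>, ()>;

    /// Sign a justice-transaction input spending the revoked output of a
    /// second-stage HTLC-Success/HTLC-Timeout transaction; only this
    /// variant needs the HTLC-related data.
    fn sign_justice_revoked_htlc(&self, justice_tx: &[u8], input: usize,
                                 amount: u64, htlc_cltv_expiry: u32) -> Result<Vec<u8>, ()>;
}
```

A policy-enforcing signer can then, for example, apply a stricter
value check in the first method than in the second.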
When an event caused us to set the persist flag in a
PersistenceNotifier between wait calls, we would still wait,
potentially failing to persist a ChannelManager when we should.
Worse, for wait_timeout, this caused us to always wait the full
timeout and then always return true, indicating a persistence was
needed. Instead, we now check the persist flag before waiting,
returning immediately if it is set.
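A minimal sketch of that check, assuming a flag-plus-Condvar
notifier along the lines described (field and method names are
illustrative):

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

struct PersistenceNotifier {
    persistence_lock: Mutex<bool>,
    condvar: Condvar,
}

impl PersistenceNotifier {
    fn notify(&self) {
        *self.persistence_lock.lock().unwrap() = true;
        self.condvar.notify_all();
    }

    /// Returns true if a persistence is needed.
    fn wait_timeout(&self, max_wait: Duration) -> bool {
        let mut guard = self.persistence_lock.lock().unwrap();
        if *guard {
            // The flag was set between wait calls: return immediately
            // instead of sleeping the full timeout.
            *guard = false;
            return true;
        }
        let (mut guard, _timed_out) =
            self.condvar.wait_timeout(guard, max_wait).unwrap();
        let needs_persist = *guard;
        *guard = false;
        needs_persist
    }
}
```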
Currently, when a user calls `ChannelManager::timer_tick_occurred`
we always set the persister's update flag to true. This results in
a ChannelManager persistence after each timer tick, even when
nothing happened.
Instead, we add a new flag to `PersistenceNotifierGuard` to
indicate whether we should skip setting the update flag.
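Sketched with hypothetical types, the guard might carry that flag
and only mark the manager dirty on drop when it is set:

```rust
use std::sync::{Condvar, Mutex};

struct Notifier { needs_persist: Mutex<bool>, condvar: Condvar }

struct PersistenceNotifierGuard<'a> {
    notifier: &'a Notifier,
    // When false (e.g. for a timer tick in which nothing happened), dropping
    // the guard does not mark the ChannelManager as needing persistence.
    should_persist: bool,
}

impl<'a> Drop for PersistenceNotifierGuard<'a> {
    fn drop(&mut self) {
        if self.should_persist {
            *self.notifier.needs_persist.lock().unwrap() = true;
            self.notifier.condvar.notify_all();
        }
    }
}
```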
Currently, we only send an update_channel message after
disconnecting a peer and waiting some time. We do not send a
follow-up when the peer has been reconnected for some time.
This changes that behavior to make the disconnect and reconnect
channel updates symmetric, and also simplifies the state machine
somewhat to make it clearer.
Finally, it serializes the current announcement state so that we
usually know when we need to send a new update_channel.
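One way to picture the serialized per-channel state (the variant
names below are a guess at the shape, not necessarily the exact
enum):

```rust
// Hypothetical approximation of the announcement state tracked per channel
// and serialized so we know whether an update_channel is (still) needed.
enum ChannelUpdateStatus {
    /// Announced as enabled; nothing to do while the peer stays connected.
    Enabled,
    /// Peer just disconnected; send a disabling update_channel after a few
    /// timer ticks unless they reconnect first.
    DisabledStaged,
    /// Peer just reconnected; symmetrically, send an enabling update_channel
    /// after a few timer ticks if they stay connected.
    EnabledStaged,
    /// Announced as disabled.
    Disabled,
}
```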
Early sample testing showed multiple users hitting
EWOULDBLOCK/EAGAIN waiting for an initial response from Bitcoin
Core while it was doing some long operation (e.g. UTXO cache
flushing). Instead of only waiting 5 seconds for each attempt, we
now wait a full two minutes, but only for the first header
response, not each byte.
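A sketch of that timeout policy over a plain TcpStream (the real
HTTP client differs; the two-minute figure is from the description
above, and the short follow-on timeout is an assumption):

```rust
use std::io::{BufRead, BufReader};
use std::net::TcpStream;
use std::time::Duration;

fn read_response_headers(stream: TcpStream) -> std::io::Result<Vec<String>> {
    let mut reader = BufReader::new(stream);
    let mut headers = Vec::new();

    // Allow up to two minutes for Bitcoin Core to start responding at all
    // (it may be busy with a long operation such as flushing its UTXO cache).
    reader.get_ref().set_read_timeout(Some(Duration::from_secs(120)))?;
    let mut status_line = String::new();
    reader.read_line(&mut status_line)?;
    headers.push(status_line);

    // Once the first header line has arrived, fall back to a short per-read
    // timeout for the remainder of the response.
    reader.get_ref().set_read_timeout(Some(Duration::from_secs(5)))?;
    loop {
        let mut line = String::new();
        reader.read_line(&mut line)?;
        if line.trim_end().is_empty() { break; }
        headers.push(line);
    }
    Ok(headers)
}
```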
Our enforced requirement for HTLC acceptance is that we have at
least HTLC_FAIL_BACK_BUFFER blocks before the HTLC expires. When we
receive an HTLC, the HTLC would be "already expired" if its
`cltv_expiry` is current-block + 1 (i.e. the next block could
broadcast the commitment transaction and time out the HTLC). From
there, we want an extra HTLC_FAIL_BACK_BUFFER blocks, plus an extra
block or two to account for any differences in the view of the
current height before sending or while the HTLC is transiting the
network.
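As a worked sketch of that arithmetic (the parameter handling here
is illustrative, not the exact ChannelManager check):

```rust
/// `height_slack` stands in for the "extra block or two" of tolerance for
/// differing views of the current height.
fn htlc_cltv_acceptable(cltv_expiry: u32, current_height: u32,
                        fail_back_buffer: u32, height_slack: u32) -> bool {
    // current_height + 1 is the first block that could confirm the commitment
    // transaction and time the HTLC out, so the expiry must sit at least
    // fail_back_buffer + height_slack blocks beyond that point.
    cltv_expiry > current_height + 1 + fail_back_buffer + height_slack
}
```

For example, with a (hypothetical) fail-back buffer of 72 blocks and
two blocks of slack, an HTLC received at height 700000 is only
accepted if its `cltv_expiry` exceeds 700075.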
This increases the CLTV_CLAIM_BUFFER constant to 18, much better
capturing how long it takes to go on chain to claim payments.
This is also more in line with other clients, and the spec, which
sets the default CLTV delay in invoices to 18.
As a side effect, we have to increase MIN_CLTV_EXPIRY_DELTA, as
otherwise we are subject to an attack where someone can hold an
HTLC being forwarded long enough that we *also* close the channel
on which we received the HTLC.
In #797, we stopped enforcing that read/sent node_announcements
had their addresses sorted. While this is fine in practice, we
should still make a best effort to sort them to comply with the
spec's forward-compatibility requirements, which we do here in the
ChannelManager.
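A best-effort sort might look like the following; the NetAddress
variants and descriptor_type helper are illustrative stand-ins, with
the type bytes taken from BOLT 7:

```rust
enum NetAddress {
    IPv4 { addr: [u8; 4], port: u16 },
    IPv6 { addr: [u8; 16], port: u16 },
}

impl NetAddress {
    /// BOLT 7 address descriptor type (1 = IPv4, 2 = IPv6, ...).
    fn descriptor_type(&self) -> u8 {
        match self {
            NetAddress::IPv4 { .. } => 1,
            NetAddress::IPv6 { .. } => 2,
        }
    }
}

/// Sort the addresses by ascending descriptor type before building the
/// node_announcement, as the spec expects for forward compatibility.
fn sort_addresses(addresses: &mut Vec<NetAddress>) {
    addresses.sort_by_key(|a| a.descriptor_type());
}
```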