Long ago, we used the `no_connection_possible` to signal that a
peer has some unknown feature set or some other condition prevents
us from ever connecting to the given peer. In that case we'd
automatically force-close all channels with the given peer. This
was somewhat surprising to users so we removed the automatic
force-close, leaving the flag serving no LDK-internal purpose.
Distilling the concept of "can we connect to this peer again in the
future" to a simple flag turns out to be ripe with edge cases, so
users actually using the flag to force-close channels would likely
cause surprising behavior.
Thus, there's really not a lot of reason to keep the flag,
especially given its untested and likely to be broken in subtle
ways anyway.
This fixes new errors in `full_stack_target` pointed out by
Chaincode's generous fuzzing infrastructure. Specifically, there's
no reason to check the error message in the
`funding_transaction_generated` return value - it can only return
a failure if the channel has closed since the funding transaction
was generated (which is fine) or if the signer refuses to sign
(which can't happen in fuzzing).
In general, we should be checking if a `Peer` has `their_features`
set as the "is this peer connected and have they finished the
handshake" flag as it indicates an `Init` message was received.
While none of these appear to be reachable bugs, there were a
number of places where we checked other flags for this purpose,
which may lead to sending messages before `Init` in the future.
Here we clean these cases up to always use the correct check (via
the new util method).
If we have a peer that sends a non-`Init` first message, we'll call
`peer_disconnected` without ever having called `peer_connected`
(which has to wait until we have an `Init` message). This is a
violation of our API guarantees, though should generally not be an
issue.
Because this bug was repeated in a few places, we also take this
opportunity to DRY up the logic which checks the peer state before
calling `peer_disconnected`.
Found by the new `ChannelManager` assertions and the
`full_stack_target` fuzzer.
The new in-`ChannelManager` retries logic does retries as two
separate steps, under two separate locks - first it calculates
the amount that needs to be retried, then it actually sends it.
Because the first step doesn't udpate the amount, a second thread
may come along and calculate the same amount and end up retrying
duplicatively.
Because we generally shouldn't ever be processing retries at the
same time, the fix is trivial - simply take a lock at the top of
the retry loop and hold it until we're done.
In anticipation of the next commit(s) adding threaded tests, we
need to ensure our lockorder checks work fine with multiple
threads. Sadly, currently we have tests in the form
`assert!(mutex.try_lock().is_ok())` to assert that a given mutex is
not locked by the caller to a function.
The fix is rather simple given we already track mutexes locked by a
thread in our `debug_sync` logic - simply replace the check with a
new extension trait which (for test builds) checks the locked state
by only looking at what was locked by the current thread.
We're no longer supporting manual retries since
ChannelManager::send_payment_with_retry can be parameterized by a retry
strategy
This commit also updates all docs related to retry_payment and abandon_payment.
Since these docs frequently overlap with changes in preceding commits where we
start abandoning payments on behalf of the user, all the docs are updated in
one go.
Documentation for CustomMessageHandler wasn't clear how it is related to
PeerManager and contained some grammatical and factual errors. Re-write
the docs and link to the lightning_custom_message crate.
When a peer disconnects but still has channels, the peer's `peer_state`
entry in the `per_peer_state` is not removed by the `peer_disconnected`
function. If the channels with that peer is later closed while still
being disconnected (i.e. force closed), we therefore need to remove the
peer from `peer_state` separately.
To remove the peers separately, we push such peers to a separate HashSet
that holds peers awaiting removal, and remove the peers on a timer to
limit the negative effects on parallelism as much as possible.
Updates multiple instances of the `ChannelManager` docs related to the
previous change that moved the storage of the channels to the
`per_peer_state`. This docs update corrects some grammar errors and
incorrect information, as well as clarifies documentation that was
confusing.