`enqueue_message` simply adds the message to the outbound queue; it
still needs to be written to the socket with `do_attempt_write_data`.
However, since we immediately return an error, causing the socket to
be closed, the message never actually gets sent.
If there's a thread currently handling `PeerManager` events, the
next thread which attempts to handle events will block on the first
and then handle events after the first completes (later threads
will return immediately to avoid blocking more than one thread).
This works fine as long as the user has a spare thread to leave
blocked, but if they don't (e.g. are running with a single-threaded
tokio runtime) this can lead to a full deadlock.
Instead, here, we never block waiting on another event-processing
thread; we return immediately after signaling that the first thread
should start over once it's complete, ensuring all events are
handled.
While this could lead to starvation as we cause one thread to go
around and around and around again, the risk of that should be
relatively low as event handling should be pretty quick, and it's
certainly better than deadlocking.
Fixes https://github.com/lightningdevkit/rapid-gossip-sync-server/issues/32
Atomic lock simplification suggestion from @andrei-21
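A minimal sketch of the scheme with illustrative names (not LDK's actual fields; a real implementation would also cap the counter to avoid overflow): callers either become the processing thread or bump a counter asking it to go around again, and nobody ever blocks.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Count of requested event-processing runs; 0 means no thread is
// currently processing.
static PENDING_EVENT_RUNS: AtomicUsize = AtomicUsize::new(0);

fn process_events(handle_all_pending_events: impl Fn()) {
    // If another thread is already processing (count > 0), our
    // increment asks it to start over once it's done, and we return
    // immediately instead of blocking.
    if PENDING_EVENT_RUNS.fetch_add(1, Ordering::AcqRel) > 0 {
        return;
    }
    let mut observed = 1;
    loop {
        handle_all_pending_events();
        // Drop the runs we've just satisfied; if more requests arrived
        // while we were working, the count stays non-zero and we loop.
        observed = PENDING_EVENT_RUNS.fetch_sub(observed, Ordering::AcqRel) - observed;
        if observed == 0 {
            break;
        }
    }
}
```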
`NetworkGraph`'s `PartialEq` impl before this commit was deadlock-prone.
Similarly to `ChannelMonitor`'s `PartialEq` impl, we use position in
memory for a total lockorder. This relies on the assumption that the
objects cannot move within memory while the inner locks are held.
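A minimal sketch of address-ordered locking of this kind (the types here are illustrative): whichever object sits lower in memory is locked first, giving every thread the same total lock order. The `&self`/`&other` borrows prevent either object from moving while the locks are held.

```rust
use std::sync::Mutex;

struct Graph { channels: Mutex<Vec<u64>> }

impl PartialEq for Graph {
    fn eq(&self, other: &Self) -> bool {
        // Comparing an object with itself would otherwise deadlock on
        // the second lock() call.
        if std::ptr::eq(self, other) { return true; }
        // Lock in address order so concurrent `a == b` and `b == a`
        // calls cannot deadlock against each other.
        let (first, second) = if (self as *const Self) < (other as *const Self) {
            (self, other)
        } else {
            (other, self)
        };
        let a = first.channels.lock().unwrap();
        let b = second.channels.lock().unwrap();
        // Equality is symmetric, so comparing in swapped order is fine.
        *a == *b
    }
}
```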
When calculating the amount available to send for the next HTLC, if
we over-count we may create routes which are not actually usable.
Historically this has been an issue, which we resolve over a few
commits.
Here we consider the number of in-flight HTLCs which we are allowed
to push towards a counterparty at once, setting the available
balance to zero if we cannot push any further HTLCs.
We also add some testing when sending to ensure that send failures
are accounted for in our balance calculations.
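A sketch of the clamp this adds, under assumed names (`next_outbound_htlc_limit_msat` and its parameters are illustrative, not necessarily LDK's actual signatures):

```rust
// If we've already pushed as many HTLCs as the counterparty accepts
// in flight, the next-HTLC availability is zero regardless of how
// much balance we hold.
fn next_outbound_htlc_limit_msat(
    outbound_capacity_msat: u64,
    outbound_htlc_count: u16,
    counterparty_max_accepted_htlcs: u16,
) -> u64 {
    if outbound_htlc_count >= counterparty_max_accepted_htlcs {
        0
    } else {
        outbound_capacity_msat
    }
}
```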
When calculating the amount available to send for the next HTLC, if
we over-count we may create routes which are not actually usable.
Historically this has been an issue, which we resolve over a few
commits.
Here we include the cost of the commitment transaction fee in our
calculation, subtracting the commitment tx fee cost from the
available as we do in `send_payment`.
We also add some testing when sending to ensure that send failures
are accounted for in our balance calculations.
This commit is based on original work by
Gleb Naumenko <naumenko.gs@gmail.com> and modified by
Matt Corallo <git@bluematt.me>.
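A rough sketch of the fee deduction described above, using BOLT 3's pre-anchors commitment weights; the function and constant names are illustrative:

```rust
// BOLT 3 pre-anchors commitment transaction weights (weight units).
const COMMITMENT_TX_BASE_WEIGHT: u64 = 724;
const COMMITMENT_TX_WEIGHT_PER_HTLC: u64 = 172;

fn available_for_next_htlc_msat(
    balance_msat: u64, feerate_per_kw: u64, pending_htlcs: u64,
) -> u64 {
    // Weight of our commitment tx with one more (non-dust) HTLC added.
    let weight = COMMITMENT_TX_BASE_WEIGHT
        + (pending_htlcs + 1) * COMMITMENT_TX_WEIGHT_PER_HTLC;
    // feerate_per_kw is sats per 1000 weight units, i.e. msats per
    // weight unit, so the fee in msats is simply weight * feerate.
    let commit_tx_fee_msat = weight * feerate_per_kw;
    balance_msat.saturating_sub(commit_tx_fee_msat)
}
```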
In the coming commits we redo our next-HTLC-available logic, which
requires some minor changes to tests that relied on calculating
routes which were not usable.
Here we do a minor prefactor to simplify a test so that it no longer
requires changes later in the series.
While it's nice to be able to push an HTLC which spends balance that
has been removed in our local commitment transaction but is still
awaiting an RAA from our peer for final removal, it's by no means a
critical feature. Because peers should really be sending RAAs
quickly after we send a commitment, this should be an exceedingly
rare case, and we already don't expose this as available balance
when routing, so this isn't even made available when sending, only
when forwarding.
Note that `test_pending_claimed_htlc_no_balance_underflow` is
removed, as it tested a case which was only possible because of this
behavior and is now no longer possible.
Also get rid of some trailing whitespace because my text editor likes to do
that.
We'll next add a new variant for max_htlc provided in route hints, which will
be treated differently in scoring.
See https://github.com/lightning/bolts/pull/803
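A sketch of where such a variant might sit, modeled on an effective-capacity enum; all variant names here are illustrative:

```rust
// Illustrative shape of an effective-capacity enum with the new variant.
enum EffectiveCapacity {
    ExactLiquidity { liquidity_msat: u64 },
    MaximumHTLC { amount_msat: u64 },
    Total { capacity_msat: u64, htlc_maximum_msat: u64 },
    // New: max_htlc taken from a route hint, which scoring can treat
    // differently from a channel-update max_htlc.
    HintMaxHTLC { amount_msat: u64 },
    Infinite,
    Unknown,
}
```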
This protects the justice claim of a counterparty's revoked output.
Otherwise, if all the revoked output claims are batched in a single
transaction, low-feerate HTLC transactions can delay our honest
justice claim transaction until BREAKDOWN_TIMEOUT expires.
Previously, we would panic when failing to construct onion messages in
certain circumstances. Here we opt to always return an error rather
than panicking if something goes wrong during OM packet construction.
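A minimal sketch of the error-instead-of-panic shape; all names here are illustrative rather than LDK's actual API:

```rust
#[derive(Debug)]
enum SendError { PacketConstructionFailed }

// Stand-in for the fallible packet-construction step.
fn construct_packet(payload: &[u8]) -> Result<Vec<u8>, ()> {
    if payload.len() > 32_768 { Err(()) } else { Ok(payload.to_vec()) }
}

fn send_onion_message(payload: &[u8]) -> Result<(), SendError> {
    // Previously this would have been an `expect(...)`; now the
    // failure is surfaced to the caller instead of panicking.
    let packet = construct_packet(payload)
        .map_err(|_| SendError::PacketConstructionFailed)?;
    let _ = packet; // enqueue to the first hop here
    Ok(())
}
```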
Rather than using the std benchmark framework (which is unmaintained
and unlikely to see further development), we swap to criterion,
which at least gets us a variable number of test runs so our
benchmarks don't take forever.
We also fix the RGS benchmark to pass now that the file in use is
stale compared to today's date.
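For illustration, a minimal criterion harness of the sort this moves to; `route_payment` is a hypothetical stand-in for the real route-finding call:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Placeholder for the routine under test.
fn route_payment(amount_msat: u64) -> u64 { amount_msat / 2 }

fn bench_route_finding(c: &mut Criterion) {
    // criterion picks the iteration count itself, unlike the fixed
    // behavior of the old std bench framework.
    c.bench_function("find_route_100k_sats", |b| {
        b.iter(|| route_payment(black_box(100_000_000)))
    });
}

criterion_group!(benches, bench_route_finding);
criterion_main!(benches);
```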
When benchmarking our router, we previously only ever tested with
amounts under 1,000 sats, which is an incredibly small amount.
While this ensures we have the maximal number of available channels
to consider, it prevents our scorer from being exercised across
its range. Further, we only score the immediate path we are
expecting to send over, and not randomly but rather based on the
amount sent.
Here we try to make the benchmarks a bit more realistic by adding
a new benchmark which attempts to send around 100K sats, which is
a reasonable amount to send over a channel today. We also convert
the scoring data to be randomized based on the seed as well as
attempt to (possibly) find a new route for a much larger value and
score based on that. This potentially allows us to score multiple
potential paths between the source and destination as the large
route-find may return an MPP result.
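A small sketch of seed-derived randomization in this spirit: deterministic per seed, so runs stay reproducible while the scored amounts vary (the xorshift step is purely illustrative):

```rust
fn score_amount_msat(seed: u64, path_idx: u64) -> u64 {
    // Mix the seed with the path index, then xorshift for cheap,
    // reproducible pseudo-randomness.
    let mut x = seed ^ path_idx.wrapping_mul(0x9E37_79B9_7F4A_7C15);
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    // Amounts from ~1k up to ~1M sats (values in msat).
    1_000_000 + x % 1_000_000_000
}
```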
There are a few route tests which do the same thing as the benchmarks,
as they're also a good test. However, they didn't share code, which
is somewhat wasteful, so we fix that here.
This PR aims to create a "stateless" scorer. Instead of passing
in fee params at construction-time, we want to parametrize the
scorer with an associated "parameter" type, which is then
passed to the router function itself, and allows passing
different parameters per route-finding call.
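A sketch of that shape, with illustrative names: the scorer gains an associated parameter type which callers pass per route-finding call rather than at construction:

```rust
trait ScoreLookUp {
    type ScoreParams;
    fn channel_penalty_msat(
        &self, short_channel_id: u64, amount_msat: u64, params: &Self::ScoreParams,
    ) -> u64;
}

// Liquidity state only; no fee parameters stored inside.
struct ProbabilisticScorer;

struct ProbabilisticScoringFeeParameters {
    base_penalty_msat: u64,
    liquidity_penalty_multiplier_msat: u64,
}

impl ScoreLookUp for ProbabilisticScorer {
    type ScoreParams = ProbabilisticScoringFeeParameters;
    fn channel_penalty_msat(
        &self, _short_channel_id: u64, amount_msat: u64, params: &Self::ScoreParams,
    ) -> u64 {
        // Combine the scorer's stored liquidity state with the
        // per-call parameters (grossly simplified for the sketch).
        params.base_penalty_msat
            + amount_msat / 1024 * params.liquidity_penalty_multiplier_msat / 1_000_000
    }
}
```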
Custom message handlers may need to set feature bits that are unknown to
LDK. Provide Features::set_required_custom_bit and
Features::set_optional_custom_bit to allow for this.
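A usage sketch of the new setters, assuming (per BOLT 9 convention) that required bits must be even, optional bits odd, and that out-of-range or wrong-parity bits return `Err`:

```rust
use lightning::ln::features::InitFeatures;

fn main() {
    let mut features = InitFeatures::empty();
    // Custom bits live in the high range unknown to LDK.
    features.set_required_custom_bit(256).unwrap();  // even => required
    features.set_optional_custom_bit(1259).unwrap(); // odd  => optional
}
```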
Each message handler provides the features it supports. A custom
message handler may support features unknown to LDK, so peers'
features should be checked against the handlers' features rather than
only the features known by LDK.
Additionally, fail the connection if the peer requires features unknown
to the handler. The peer should already fail the connection in the
latter case.
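A sketch of the required-bits check with illustrative types: the peer's required (even) bits are tested against the union of all handlers' supported features rather than only the features LDK knows:

```rust
// Illustrative feature-bits type; bit 0 is the least-significant bit
// of the last byte, per BOLT 9.
struct Features { bytes: Vec<u8> }

impl Features {
    /// True if `self` sets any required (even-position) bit which
    /// `supported` does not also set.
    fn requires_unknown_bits_from(&self, supported: &Features) -> bool {
        // Even absolute bit positions are even positions within each
        // byte, so a per-byte mask of 0b0101_0101 selects them.
        const REQUIRED_BITS: u8 = 0b0101_0101;
        self.bytes.iter().rev().enumerate().any(|(i, &byte)| {
            let supported_byte =
                supported.bytes.iter().rev().nth(i).copied().unwrap_or(0);
            (byte & !supported_byte & REQUIRED_BITS) != 0
        })
    }
}
```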
The purpose of this payload is to ensure we retry restored packages on a
`ChannelMonitor` that was upgraded from a version which lacked such
retry logic. We can verify this works by checking whether a restored
package has a `height_timer` of `None` upon deserializing the monitor
payload.
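A sketch of that check, with illustrative struct and field names:

```rust
// Illustrative package type; the real `PackageTemplate` fields differ.
struct PackageTemplate { height_timer: Option<u32> }

fn restored_packages_need_retry(packages: &[PackageTemplate]) -> bool {
    // Monitors serialized before the retry logic existed never wrote a
    // height_timer, so `None` marks a package restored from an old
    // version whose claims should be retried.
    packages.iter().any(|pkg| pkg.height_timer.is_none())
}
```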
In the previous commit, we added a helper that constructs blocks
whenever tests demand blocks be connected. This helper moved towards
having all connected blocks have a version of 0x2000_0000 (also known as
NO_SOFT_FORK_SIGNALLING). However, previously, it was possible for some
blocks to be connected with a slightly different version: 0x0200_0000,
resulting in different block hashes.
This block hash divergence prompted a failure in this test when
`ConnectStyle::HighlyRedundantTransactionsFirstSkippingBlocks` is used
for `nodes[0]`, since this block connection style reconfirms
transactions redundantly and the serialized monitor payload kept a
reference to the hash of the block with version 0x0200_0000, when it
should be expecting one with version 0x2000_0000.
`rust-bitcoin v0.30.0` introduces concrete variants for data members of
block `Header`s. To avoid having to update these across every use, we
introduce new helpers to create dummy blocks and headers, such that the
update process is a bit more straightforward.
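A sketch of such a helper, assuming rust-bitcoin v0.30's types (the constant field values, and the helper's exact signature, are illustrative):

```rust
use bitcoin::blockdata::block::{Header, Version};
use bitcoin::hash_types::{BlockHash, TxMerkleNode};
use bitcoin::hashes::Hash;
use bitcoin::pow::CompactTarget;

fn create_dummy_header(prev_blockhash: BlockHash, time: u32) -> Header {
    Header {
        version: Version::NO_SOFT_FORK_SIGNALLING, // 0x2000_0000
        prev_blockhash,
        merkle_root: TxMerkleNode::all_zeros(),
        time,
        // Arbitrary fixed values: tests only need deterministic hashes.
        bits: CompactTarget::from_consensus(42),
        nonce: 42,
    }
}
```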
`rust-bitcoin v0.30.0` made some changes in this area that no longer
allow us to work with the previously exposed `U256` type. While `Work`
and `Target` (they're inverses of each other) essentially represent the
same concept, it makes more sense from their API's perspective to only
expose difficulty transitions and adjustments on `Target`s.
This makes it much clearer at sites generating such events that they
will be lost on restart, reducing the risk of bugs creeping in due to
lost monitor updates.
In d4810087c1 we added logic to apply `ChannelMonitorUpdate`s which
were part of a channel closure asynchronously, via a background
queue, to address some startup issues. When we did that, we persisted
those updates to ensure we replayed them when starting next time.
However, there was no reason to: if we persisted and then restarted
even without those monitor updates, we'd find a monitor without a
channel, which we'd tell to broadcast the latest commitment
transaction to force-close.
Since adding that logic, we've used the same background queue for
several purposes.