Commit graph

1073 commits

Elias Rohrer
8419d13d4c
Add list_channels_by_counterparty method
While we already provide a `list_channels` method, it can return
quite a large `Vec<ChannelDetails>`. Here, we provide the means to query
our channels by `counterparty_node_id` and DRY up the code.
2023-03-07 19:50:01 +01:00
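
For context, a minimal, self-contained sketch of the DRY pattern this message describes; the types and the filter helper below are illustrative stand-ins, not LDK's actual internals:

```rust
// Both list methods share one filtered-iteration helper instead of
// duplicating the per-channel logic.
#[derive(Clone, Debug)]
struct ChannelDetails {
    counterparty_node_id: [u8; 33],
    // ... many more fields in the real type ...
}

struct ToyManager {
    channels: Vec<ChannelDetails>,
}

impl ToyManager {
    fn list_channels_with_filter<F: Fn(&ChannelDetails) -> bool>(&self, filter: F) -> Vec<ChannelDetails> {
        self.channels.iter().filter(|c| filter(*c)).cloned().collect()
    }

    fn list_channels(&self) -> Vec<ChannelDetails> {
        self.list_channels_with_filter(|_| true)
    }

    fn list_channels_by_counterparty(&self, id: &[u8; 33]) -> Vec<ChannelDetails> {
        self.list_channels_with_filter(|c| c.counterparty_node_id == *id)
    }
}

fn main() {
    let m = ToyManager { channels: vec![ChannelDetails { counterparty_node_id: [2; 33] }] };
    assert_eq!(m.list_channels().len(), 1);
    assert_eq!(m.list_channels_by_counterparty(&[2; 33]).len(), 1);
    assert_eq!(m.list_channels_by_counterparty(&[3; 33]).len(), 0);
}
```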
Matt Corallo
c5cc1ede26
Merge pull request #2028 from TheBlueMatt/2023-02-macros-for-wilmer
Reduce macro usage in tests
2023-03-06 16:56:08 +00:00
Matt Corallo
fac5373687
Merge pull request #2068 from jkczyz/2023-03-doc-fixes
Doc and build warning fixes
2023-03-03 22:19:59 +00:00
Jeffrey Czyz
1d1323a3d0
Fix build warnings 2023-03-03 14:23:18 -06:00
Matt Corallo
41bcd68a40
Merge pull request #2066 from TheBlueMatt/2023-02-no-enum-refs
Pass `FailureCode` to `fail_htlc_backwards` by ownership
2023-03-03 19:44:40 +00:00
Matt Corallo
0987b32bed Pass FailureCode to fail_htlc_backwards by ownership
`FailureCode` is a trivial enum with no body, so we shouldn't be
passing it by reference. It's sufficiently strange that the Java
bindings aren't happy with it, which is fine, we should just fix it
here.
2023-03-03 17:20:58 +00:00
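
As an illustration of the change (the variant names are modeled on LDK's `FailureCode`, but the function body is a toy):

```rust
// A fieldless, Copy enum has nothing to borrow, so passing it by value is
// both cheaper and simpler for bindings generators than passing a reference.
#[derive(Clone, Copy, Debug)]
enum FailureCode {
    TemporaryNodeFailure,
    RequiredNodeFeatureMissing,
    IncorrectOrUnknownPaymentDetails,
}

// Before: fn fail_htlc_backwards(failure_code: &FailureCode)
// After:  take it by ownership - it's a one-byte Copy value.
fn fail_htlc_backwards(failure_code: FailureCode) {
    println!("failing HTLC back with {:?}", failure_code);
}

fn main() {
    let code = FailureCode::TemporaryNodeFailure;
    fail_htlc_backwards(code);
    // `code` is still usable here because the enum is Copy.
    fail_htlc_backwards(code);
}
```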
Matt Corallo
0ad1f4c943 Track claimed outbound HTLCs in ChannelMonitors
When we receive an update_fulfill_htlc message, we immediately try
to "claim" the HTLC against the HTLCSource. If there is one, this
works great, we immediately generate a `ChannelMonitorUpdate` for
the corresponding inbound HTLC and persist that before we ever get
to processing our counterparty's `commitment_signed` and persisting
the corresponding `ChannelMonitorUpdate`.

However, if there isn't one (and this is the first successful HTLC
for a payment we sent), we immediately generate a `PaymentSent`
event and queue it up for the user. Then, a millisecond later, we
receive the `commitment_signed` from our peer, removing the HTLC
from the latest local commitment transaction as a side-effect of
the `ChannelMonitorUpdate` applied.

If the user has processed the `PaymentSent` event by that point,
great, we're done. However, if they have not, and we crash prior to
persisting the `ChannelManager`, on startup we get confused about
the state of the payment. We'll force-close the channel for being
stale, and see an HTLC which was removed and is no longer present
in the latest commitment transaction (which we're broadcasting).
Because we claim corresponding inbound HTLCs before updating a
`ChannelMonitor`, we assume such HTLCs have failed - attempting to
fail after having claimed should be a noop. However, in the
sent-payment case we now generate a `PaymentFailed` event for the
user, allowing an HTLC to complete without giving the user a
preimage.

Here we address this issue by storing the payment preimages for
claimed outbound HTLCs in the `ChannelMonitor`, in addition to the
existing inbound HTLC preimages already stored there. This allows
us to fix the specific issue described by checking for a preimage
and switching the type of event generated in response. In addition,
it reduces the risk of future confusion by ensuring we don't fail
HTLCs which were claimed but not fully committed to before a crash.

It does not, however, fully fix the issue here - because the
preimages are removed after the HTLC has been fully removed from
available commitment transactions, if we are substantially delayed
in persisting the `ChannelManager` (from the time we receive the
`update_fulfill_htlc` until after a full commitment-signed dance
completes) we may still hit this issue. The full fix for this issue
is to delay the persistence of the `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed. This avoids the
issue entirely, ensuring we process the event before updating the
`ChannelMonitor`, the same as we ensure the upstream HTLC has been
claimed before updating the `ChannelMonitor` for forwarded
payments.

The full solution will be implemented in later work, however this
change still makes sense at that point as well - if we were to
delay the initial `commitment_signed` `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed (which likely
requires a database update on the users' end), we'd hold our
`commitment_signed` + `revoke_and_ack` response for two DB writes
(i.e. `fsync()` calls), making our commitment transaction
processing a full `fsync` slower. By making this change first, we
can instead delay the `ChannelMonitorUpdate` from the
counterparty's final `revoke_and_ack` message until the event has
been processed, giving us a full network roundtrip to do so and
avoiding delaying our response as long as an `fsync` is faster than
a network roundtrip.
2023-03-03 17:19:03 +00:00
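
A toy model of the fix (not LDK's real data structures), showing how a recorded outbound preimage switches the replayed event type:

```rust
use std::collections::HashMap;

type PaymentPreimage = [u8; 32];
type HtlcId = u64;

// The monitor remembers preimages for claimed *outbound* HTLCs alongside
// the inbound ones it already stored, so a post-crash replay can tell
// "claimed but not yet fully committed" apart from "failed".
struct ToyMonitor {
    // Already stored: preimages for inbound HTLCs we can claim on-chain.
    inbound_preimages: HashMap<HtlcId, PaymentPreimage>,
    // The addition: preimages our counterparty gave us via
    // update_fulfill_htlc for HTLCs we sent out.
    counterparty_fulfilled_htlcs: HashMap<HtlcId, PaymentPreimage>,
}

enum ReplayEvent {
    PaymentSent(PaymentPreimage),
    PaymentFailed,
}

impl ToyMonitor {
    // Called on startup for an outbound HTLC that is missing from the
    // latest commitment transaction we are broadcasting.
    fn resolve_missing_outbound_htlc(&self, htlc: HtlcId) -> ReplayEvent {
        match self.counterparty_fulfilled_htlcs.get(&htlc) {
            // We saw update_fulfill_htlc before the crash: surface success
            // and hand the user the preimage they are owed.
            Some(preimage) => ReplayEvent::PaymentSent(*preimage),
            // No recorded preimage: treat it as failed, as before.
            None => ReplayEvent::PaymentFailed,
        }
    }
}

fn main() {
    let mut monitor = ToyMonitor {
        inbound_preimages: HashMap::new(),
        counterparty_fulfilled_htlcs: HashMap::new(),
    };
    monitor.counterparty_fulfilled_htlcs.insert(7, [0x42; 32]);
    match monitor.resolve_missing_outbound_htlc(7) {
        ReplayEvent::PaymentSent(p) => println!("PaymentSent, preimage {:02x?}...", &p[..4]),
        ReplayEvent::PaymentFailed => println!("PaymentFailed"),
    }
}
```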
Jeffrey Czyz
fbe9f47e12
Reference Router in ChannelManager docs 2023-03-03 09:48:25 -06:00
Matt Corallo
35bb0f4676 Replace get_err_msg macro with a function
The `get_err_msg!()` macro has no reason to be a macro so here we
move its logic to a function and leave the macro in place to avoid
touching every line of code in the tests.

This reduces the `--profile=test --lib` `-Zunpretty=expanded` code
size from 322,183 LoC to 321,985 LoC.
2023-03-01 18:29:27 +00:00
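
A generic sketch of the macro-to-function pattern (the real `get_err_msg` operates on test nodes; this toy only shows the shape):

```rust
// The logic moves into a plain function, which rustc compiles once, and the
// old macro becomes a thin shim so test call sites don't need touching.
fn get_err_msg(errors: &[String], peer: &str) -> String {
    errors.iter().find(|e| e.contains(peer)).cloned().unwrap_or_default()
}

macro_rules! get_err_msg {
    ($errors: expr, $peer: expr) => {
        get_err_msg(&$errors, $peer)
    };
}

fn main() {
    let errs = vec!["peer A: bad signature".to_string()];
    // Existing call sites keep compiling unchanged:
    let msg = get_err_msg!(errs, "peer A");
    assert_eq!(msg, "peer A: bad signature");
}
```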
Matt Corallo
90b2f5e1aa Replace get_revoke_commit_msgs macro with a function
The `get_revoke_commit_msgs!()` macro has no reason to be a macro
so here we move its logic to a function and leave the macro in
place to avoid touching every line of code in the tests.

This reduces the `--profile=test --lib` `-Zunpretty=expanded` code
size from 324,763 LoC to 322,183 LoC.
2023-03-01 18:29:12 +00:00
Wilmer Paulino
8311581fe1
Merge pull request #2006 from TheBlueMatt/2023-02-no-recursive-read-locks
Refuse recursive read locks
2023-02-28 00:24:16 -08:00
Matt Corallo
065dc6e689 Make sure individual mutexes are constructed on different lines
Our lockdep logic (on Windows) identifies a mutex based on which
line it was constructed on. Thus, if we have two mutexes
constructed on the same line it will generate false positives.
2023-02-28 01:06:35 +00:00
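
A sketch of why the construction line matters, assuming a checker that keys each mutex by its `#[track_caller]` construction site (file and line, ignoring column):

```rust
use std::panic::Location;

// If each mutex's identity is its construction site, two mutexes built on
// one line are indistinguishable, and any ordering between them looks like
// a lockorder violation.
struct TrackedMutex<T> {
    _inner: std::sync::Mutex<T>,
    construction_site: &'static Location<'static>,
}

impl<T> TrackedMutex<T> {
    #[track_caller]
    fn new(t: T) -> Self {
        TrackedMutex { _inner: std::sync::Mutex::new(t), construction_site: Location::caller() }
    }
}

fn main() {
    // The checker keys on (file, line) only:
    let key = |loc: &'static Location<'static>| (loc.file(), loc.line());

    // Same line => same identity => false-positive violations:
    let (a, b) = (TrackedMutex::new(0u8), TrackedMutex::new(0u8));
    assert_eq!(key(a.construction_site), key(b.construction_site));

    // Different lines => distinct identities, as the commit requires:
    let c = TrackedMutex::new(0u8);
    let d = TrackedMutex::new(0u8);
    assert_ne!(key(c.construction_site), key(d.construction_site));
}
```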
Matt Corallo
f082ad40b5 Disallow taking two instances of the same mutex at the same time
Taking two instances of the same mutex may be totally fine, but it
requires a total lockorder that we cannot (trivially) check. Thus,
it's generally unsafe to do if we can avoid it.

To discourage doing this, here we default to panicking on such locks
in our lockorder tests, with a separate lock function added that is
clearly labeled "unsafe" to allow doing so when we can guarantee a
total lockorder.

This requires adapting a number of sites to the new API, including
fixing a bug this turned up in `ChannelMonitor`'s `PartialEq` where
no lockorder was guaranteed.
2023-02-28 01:06:35 +00:00
Matt Corallo
2c8a26c6d2 Don't take per_peer_state read locks recursively in monitor updating
When handling a `ChannelMonitor` update via the new
`handle_new_monitor_update` macro, we always call the macro with
the `per_peer_state` read lock held and have the macro drop the
per-peer state lock. Then, when handling the resulting updates, we
may take the `per_peer_state` read lock again in another function.

In a coming commit, recursive read locks will be disallowed, so we
have to drop the `per_peer_state` read lock before calling
additional functions in `handle_new_monitor_update`, which we do
here.
2023-02-28 01:06:35 +00:00
Matt Corallo
38374dde42 Expect callers to hold read locks before channel_monitor_updated
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.

Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.

Sadly, a user ended up hitting this deadlock in production in the
form of a call to `get_and_clear_pending_msg_events` which holds
the `ChannelManager::total_consistency_lock` before calling
`process_pending_monitor_events` and eventually
`channel_monitor_updated`, which tries to take the same read lock
again.

Luckily, the fix is trivial: simply remove the redundant read lock
in `channel_monitor_updated`.

Fixes #2000
2023-02-28 01:06:35 +00:00
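
A self-contained illustration of the hazard using std's `RwLock` (the outcome is platform-dependent by design; a fair, writer-priority lock like the in-tree `FairRwLock` blocks deterministically):

```rust
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

fn main() {
    let lock = Arc::new(RwLock::new(()));
    let first_read = lock.read().unwrap(); // thread A: holds a read lock

    let writer_lock = Arc::clone(&lock);
    let writer = thread::spawn(move || {
        // Thread B: blocks until all readers are gone.
        let _w = writer_lock.write().unwrap();
    });
    thread::sleep(Duration::from_millis(50)); // let the writer queue up

    // Thread A takes the *same* read lock again. On a fair, writer-priority
    // lock this blocks behind B - which is itself waiting on `first_read` -
    // i.e. a deadlock. Using try_read keeps this example from hanging:
    match lock.try_read() {
        Ok(_again) => println!("recursive read OK (reader-priority platform)"),
        Err(_) => println!("recursive read would block: deadlock on fair locks"),
    }

    drop(first_read);
    writer.join().unwrap();
}
```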
Matt Corallo
1d4bc1e0fb Hold the total_consistency_lock while in outbound_payment fns
We previously avoided holding the `total_consistency_lock` while
doing crypto operations to build onions. However, now that we've
abstracted out the outbound payment logic into a utility module,
ensuring the state is consistent at all times is now abstracted
away from code authors and reviewers, making it likely to break.

Further, because we now call `send_payment_along_path` both with,
and without, the `total_consistency_lock`, and because recursive
read locks may deadlock, it would now be quite difficult to figure
out which paths through `outbound_payment` need the lock and which
don't.

While it may slow writes somewhat, it's not really worth trying to
figure out this mess; instead we just hold the
`total_consistency_lock` before going into `outbound_payment`
functions.
2023-02-28 01:06:35 +00:00
Matt Corallo
f03b7cd448 Remove the final_cltv_expiry_delta in RouteParameters entirely
fbc08477e8 purported to "move" the
`final_cltv_expiry_delta` field to `PaymentParameters` from
`RouteParameters`. However, for naive backwards-compatibility
reasons it left the existing one in place and only added a new,
redundant field in `PaymentParameters`.

It turns out there's really no reason for this - if we take a more
critical eye towards backwards compatibility we can figure out the
correct value in every `PaymentParameters` while deserializing.

We do this here - making `PaymentParameters` a `ReadableArgs`
taking a "default" `cltv_expiry_delta` when it goes to read. This
allows existing `RouteParameters` objects to pass the read
`final_cltv_expiry_delta` field in to be used if the new field
wasn't present.
2023-02-27 22:33:21 +00:00
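
A stripped-down analogue of the approach (LDK's real trait is `ReadableArgs`; the wire format below is invented purely for illustration):

```rust
use std::io::Read;

struct PaymentParameters {
    final_cltv_expiry_delta: u32,
    // ...
}

// The type reads with an argument supplying the legacy value to use when
// the new field is absent from the serialized data.
trait ReadableArgs<A>: Sized {
    fn read<R: Read>(reader: &mut R, args: A) -> Result<Self, ()>;
}

impl ReadableArgs<u32> for PaymentParameters {
    fn read<R: Read>(reader: &mut R, default_final_cltv_expiry_delta: u32) -> Result<Self, ()> {
        // Pretend the serialized form optionally carries the new field,
        // flagged by a leading presence byte:
        let mut flag = [0u8; 1];
        reader.read_exact(&mut flag).map_err(|_| ())?;
        let stored = if flag[0] == 1 {
            let mut v = [0u8; 4];
            reader.read_exact(&mut v).map_err(|_| ())?;
            Some(u32::from_be_bytes(v))
        } else {
            None
        };
        // Fall back to the value read from the legacy RouteParameters field:
        Ok(PaymentParameters {
            final_cltv_expiry_delta: stored.unwrap_or(default_final_cltv_expiry_delta),
        })
    }
}

fn main() {
    // Old serialization: new field absent; the legacy value fills the gap.
    let mut old = &[0u8][..];
    let params = PaymentParameters::read(&mut old, 42).unwrap();
    assert_eq!(params.final_cltv_expiry_delta, 42);

    // New serialization: the field is present and takes precedence.
    let mut new = &[1u8, 0, 0, 0, 144][..];
    let params = PaymentParameters::read(&mut new, 42).unwrap();
    assert_eq!(params.final_cltv_expiry_delta, 144);
}
```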
Matt Corallo
9364154022 Require a non-0 number of non-empty paths when deserializing routes
When we read a `Route` (or a list of `RouteHop`s), we should never
have zero paths or zero `RouteHop`s in a path. As such, it's fine to
simply reject these at deserialization-time. Technically this could
mean something we can generate fails to round-trip serialization,
but that seems okay here.
2023-02-27 22:31:11 +00:00
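
A toy version of the rule (types and error are stand-ins for LDK's deserializer):

```rust
// Structurally-impossible routes are rejected at read time rather than
// being surfaced to the rest of the code.
struct RouteHop; // stand-in for the real per-hop data
struct Route { paths: Vec<Vec<RouteHop>> }

#[derive(Debug)]
struct DecodeError;

fn read_route(paths: Vec<Vec<RouteHop>>) -> Result<Route, DecodeError> {
    // A Route must have at least one path, and no path may be empty.
    if paths.is_empty() || paths.iter().any(|p| p.is_empty()) {
        return Err(DecodeError);
    }
    Ok(Route { paths })
}

fn main() {
    assert!(read_route(vec![]).is_err());              // zero paths
    assert!(read_route(vec![vec![]]).is_err());        // one empty path
    assert!(read_route(vec![vec![RouteHop]]).is_ok()); // minimal valid route
}
```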
Matt Corallo
a0c65c32a5
Merge pull request #2043 from valentinewallace/2023-02-initial-send-path-fails
`PaymentPathFailed`: add initial send error details
2023-02-27 18:10:27 +00:00
Matt Corallo
16b3c720a6
Merge pull request #2025 from TheBlueMatt/2023-02-no-pub-genesis-hashes
Remove genesis block hash from public API
2023-02-27 17:50:22 +00:00
Valentine Wallace
8d686d83cb
Implement writeable for APIError 2023-02-25 16:13:42 -05:00
Valentine Wallace
70c7161dbe
Rename OptionDeserWrapper->RequiredWrapper
Makes more sense and sets us up for the UpgradableRequired version of it
2023-02-24 20:00:14 -05:00
Valentine Wallace
0e88538b78
Disambiguate ignorable and ignorable_option
Would rather not rename ignorable to ignorable_required, so rename both of them
to upgradable_*
2023-02-24 14:21:11 -05:00
Matt Corallo
d7c818a3ad Rename BestBlock::from_genesis to from_network for clarity 2023-02-24 00:22:58 +00:00
Matt Corallo
2c3e12e309 Remove genesis block hash from public API
Forcing users to pass a genesis block hash has ended up being
error-prone largely due to byte-swapping questions for bindings
users. Further, our API is currently inconsistent - in
`ChannelManager` we take a `bitcoin::Network` but in `NetworkGraph`
we take the genesis block hash.

Luckily `NetworkGraph` is the only remaining place where we require
users pass the genesis block hash, so swapping it for a `Network`
is a simple change.
2023-02-24 00:22:58 +00:00
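
A sketch of the API direction, assuming the era's `bitcoin` crate (~0.29); `ToyGraph` stands in for `NetworkGraph` and is not LDK's actual constructor:

```rust
// The genesis hash is fully determined by the Network, so a constructor
// can derive it instead of asking callers to supply a hash (and get its
// byte order right).
use bitcoin::blockdata::constants::genesis_block;
use bitcoin::{BlockHash, Network};

struct ToyGraph {
    genesis_hash: BlockHash,
}

impl ToyGraph {
    // Before: fn new(genesis_hash: BlockHash) - error-prone for callers.
    // After: derive the hash from the network.
    fn new(network: Network) -> Self {
        ToyGraph { genesis_hash: genesis_block(network).header.block_hash() }
    }
}

fn main() {
    let graph = ToyGraph::new(Network::Bitcoin);
    println!("mainnet genesis: {}", graph.genesis_hash);
}
```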
Matt Corallo
46fd7035b3
Merge pull request #2014 from valentinewallace/2023-02-rework-partial-pmt-fail
Rework auto-retry send errors
2023-02-23 21:54:16 +00:00
Valentine Wallace
d471d9746c
Rework auto retry send errors
Prior to this, we returned PaymentSendFailure from auto retry send payment
methods. This implied that we might return a PartialFailure from them, which
has never been the case. So it makes sense to rework the errors to be a better
fit for the methods.

We're taking error handling in a totally different direction now to make it
more asynchronous, see send_payment_internal for more information.
2023-02-23 15:50:23 -05:00
Valentine Wallace
5e4f0bcff0
Fix InvalidRoute error to be ChannelUnavailable
InvalidRoute is reserved for malformed routes, not routes where a channel or
its peer is unavailable
2023-02-22 23:05:43 -05:00
Matt Corallo
96c8507fbf
Merge pull request #1897 from TheBlueMatt/2022-11-monitor-updates-always-async
Always process `ChannelMonitorUpdate`s asynchronously
2023-02-22 19:12:31 +00:00
Matt Corallo
685b08d8c1 Use the new monitor persistence flow for funding_created handling
Building on the previous commits, this finishes our transition to
doing all message-sending in the monitor update completion
pipeline, unifying our immediate- and async- `ChannelMonitor`
update and persistence flows.
2023-02-22 17:34:46 +00:00
Matt Corallo
c35253d6a8 Use new monitor persistence flow in funding_signed handling
In the previous commit, we moved all our `ChannelMonitorUpdate`
pipelines to use a new async path via the
`handle_new_monitor_update` macro. This avoids having two message
sending pathways and simply sends messages in the "monitor update
completed" flow, which is shared between sync and async monitor
updates.

Here we reuse the new macro for handling `funding_signed` messages
when doing an initial `ChannelMonitor` persistence. This provides
a similar benefit, simplifying the code a trivial amount, but
importantly allows us to fully remove the original
`handle_monitor_update_res` macro.
2023-02-22 17:34:46 +00:00
Matt Corallo
4e002dcf5c Always process ChannelMonitorUpdates asynchronously
We currently have two codepaths on most channel update functions -
most methods return a set of messages to send a peer iff the
`ChannelMonitorUpdate` succeeds, but if it does not we push the
messages back into the `Channel` and then pull them back out when
the `ChannelMonitorUpdate` completes and send them then. This adds
a substantial amount of complexity in very critical codepaths.

Instead, here we swap all our channel update codepaths to
immediately set the channel-update-required flag and only return a
`ChannelMonitorUpdate` to the `ChannelManager`. Internally in the
`Channel` we store a queue of `ChannelMonitorUpdate`s, which will
become critical in future work to surface pending
`ChannelMonitorUpdate`s to users at startup so they can complete.

This leaves some redundant work in `Channel` to be cleaned up
later. Specifically, we still generate the messages which we will
now ignore and regenerate later.

This commit updates the `ChannelMonitorUpdate` pipeline across all
the places we generate them.
2023-02-22 17:34:46 +00:00
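
A toy model of the new shape (not LDK's actual types): update methods queue a monitor update, and messages are only released from the shared completion path:

```rust
struct MonitorUpdate { update_id: u64 }

struct ToyChannel {
    latest_update_id: u64,
    pending_monitor_updates: Vec<MonitorUpdate>,
    held_messages: Vec<String>,
}

impl ToyChannel {
    // Old shape: returned Vec<Message> iff the monitor update succeeded.
    // New shape: always queue; return only the update for the manager.
    fn channel_update(&mut self, msgs: Vec<String>) -> &MonitorUpdate {
        self.held_messages.extend(msgs);
        self.latest_update_id += 1;
        self.pending_monitor_updates.push(MonitorUpdate { update_id: self.latest_update_id });
        self.pending_monitor_updates.last().unwrap()
    }

    // The one place messages come out, shared by sync and async persistence,
    // once persistence has completed up through `completed_id`.
    fn monitor_update_completed(&mut self, completed_id: u64) -> Vec<String> {
        self.pending_monitor_updates.retain(|u| u.update_id > completed_id);
        if self.pending_monitor_updates.is_empty() {
            std::mem::take(&mut self.held_messages)
        } else {
            Vec::new()
        }
    }
}

fn main() {
    let mut chan = ToyChannel {
        latest_update_id: 0, pending_monitor_updates: vec![], held_messages: vec![],
    };
    let update_id = chan.channel_update(vec!["revoke_and_ack".into()]).update_id;
    // Not yet persisted: messages stay held.
    assert!(chan.monitor_update_completed(update_id - 1).is_empty());
    // Persisted: messages are released.
    assert_eq!(chan.monitor_update_completed(update_id), vec!["revoke_and_ack".to_string()]);
}
```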
Matt Corallo
46c6fb7f91 Move TODO from handle_monitor_update_res into Channel
The TODO mentioned in `handle_monitor_update_res` about how we
might forget about HTLCs in case of permanent monitor update
failure still applies in spite of all our changes. If a channel is
drop'd in general, monitor-pending updates may be lost if the
monitor update failed to persist.

This was always the case, and is ultimately the general form of
the specific TODO, so we simply leave comments there.
Matt Corallo
9802afa53b Handle MonitorUpdateCompletionActions after monitor update sync
In a previous PR, we added a `MonitorUpdateCompletionAction` enum
which described actions to take after a `ChannelMonitorUpdate`
persistence completes. At the time, it was only used to execute
actions in-line, however in the next commit we'll start (correctly)
leaving the existing actions until after monitor updates complete.
2023-02-22 00:51:13 +00:00
Matt Corallo
d5fb804a32 Limit the number of pending un-funded inbound channels
Because we store some (not large, but not zero) state per-peer,
it's useful to limit the number of peers we have connected, at
least with some buffer.

Much more importantly, each channel has a relatively large cost,
especially around the `ChannelMonitor`s we have to build for each.

Thus, here, we limit the number of channels per-peer which aren't
(yet) on-chain, as well as limit the number of (inbound) peers
which don't have a (funded-on-chain) channel.

Fixes #1889
2023-02-21 22:01:47 +00:00
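
An illustrative admission check for such limits; the constants and per-peer state here are hypothetical stand-ins, not LDK's actual values or structures:

```rust
use std::collections::HashMap;

const MAX_UNFUNDED_CHANNELS_PER_PEER: usize = 4;
const MAX_UNFUNDED_INBOUND_PEERS: usize = 50;

struct PeerState {
    unfunded_channels: usize,
    has_funded_channel: bool,
    inbound: bool,
}

fn accept_inbound_channel(peers: &HashMap<[u8; 33], PeerState>, from: &[u8; 33]) -> bool {
    // Count inbound peers that cost us state but have no on-chain channel:
    let unfunded_inbound_peers = peers
        .values()
        .filter(|p| p.inbound && !p.has_funded_channel)
        .count();
    match peers.get(from) {
        // Existing peer: cap its number of not-yet-funded channels.
        Some(peer) => peer.unfunded_channels < MAX_UNFUNDED_CHANNELS_PER_PEER,
        // New inbound peer: only admit if we have room for more peers
        // without funded channels.
        None => unfunded_inbound_peers < MAX_UNFUNDED_INBOUND_PEERS,
    }
}

fn main() {
    let peers: HashMap<[u8; 33], PeerState> = HashMap::new();
    assert!(accept_inbound_channel(&peers, &[2; 33]));
}
```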
Matt Corallo
4155f54716 Add an inbound flag to the peer_connected message handlers
It's useful for the message handlers to know if a peer is inbound
for DoS decision-making reasons.
2023-02-21 22:00:42 +00:00
Matt Corallo
e954ee8256
Merge pull request #2035 from TheBlueMatt/2023-02-fix-no-con-discon
Fix (and DRY) the conditionals before calling peer_disconnected
2023-02-21 21:28:05 +00:00
Matt Corallo
be6f263825 Remove the peer_disconnected no_connection_possible flag
Long ago, we used the `no_connection_possible` flag to signal that a
peer had some unknown feature set or some other condition prevented
us from ever connecting to the given peer. In that case we'd
automatically force-close all channels with the given peer. This
was somewhat surprising to users so we removed the automatic
force-close, leaving the flag serving no LDK-internal purpose.

Distilling the concept of "can we connect to this peer again in the
future" to a simple flag turns out to be rife with edge cases, so
users actually using the flag to force-close channels would likely
cause surprising behavior.

Thus, there's really not a lot of reason to keep the flag,
especially given it's untested and likely to be broken in subtle
ways anyway.
2023-02-21 19:17:06 +00:00
Matt Corallo
0f07d5c0b0 Add a further debug_assert that disconnecting peers are connected 2023-02-21 18:54:52 +00:00
Matt Corallo
10e06331f3 Correct funding_transaction_generated err msg and fix fuzz check
This fixes new errors in `full_stack_target` pointed out by
Chaincode's generous fuzzing infrastructure. Specifically, there's
no reason to check the error message in the
`funding_transaction_generated` return value - it can only return
a failure if the channel has closed since the funding transaction
was generated (which is fine) or if the signer refuses to sign
(which can't happen in fuzzing).
2023-02-21 18:54:52 +00:00
Valentine Wallace
9f41bd7f64
Pass pending_events into pay_internal
Useful for generating Payment(Path)Failed events in this method
2023-02-19 18:01:01 -05:00
Valentine Wallace
a2489b126f
Deduplicate PendingHTLCsForwardable events when queueing 2023-02-17 17:41:21 -05:00
Valentine Wallace
5ea433f71f
Check for abandon-able payments on startup 2023-02-17 17:14:43 -05:00
Matt Corallo
ba07622d05 Add a new monitor update result handling macro
Over the next few commits, this macro will replace the
`handle_monitor_update_res` macro. It takes a different approach -
instead of receiving the message(s) that need to be re-sent after
the monitor update completes and pushing them back into the
channel, we'll not get the messages from the channel at all until
we're ready for them.

This will unify our message sending into only actually fetching +
sending messages in the common monitor-update-completed code,
rather than both there *and* in the functions that call `Channel`
when new messages are originated.
2023-02-17 19:09:28 +00:00
Matt Corallo
56f979be3e Track actions to execute after a ChannelMonitor is updated
When a `ChannelMonitor` update completes, we may need to take some
further action, such as exposing an `Event` to the user, initiating
another `ChannelMonitor` update, etc. This commit adds the basic
structure to track such actions and serialize them as required.

Note that while this does introduce a new map which is written as
an even value which users cannot opt out of, the map is only filled
in when users use the asynchronous `ChannelMonitor` updates. As
these are still considered beta, breaking downgrades for such users
is considered acceptable here.
2023-02-17 19:09:28 +00:00
Matt Corallo
5be29c6add Fix the err msg provided when a send fails due to peer disconnected
We haven't had a `MonitorUpdateInProgress` check in
`Channel::is_live` for some time.
2023-02-17 19:09:28 +00:00
Matt Corallo
d0b8f455fe
Merge pull request #2009 from TheBlueMatt/2023-02-no-racey-retries
Fix (and test) threaded payment retries
2023-02-16 23:41:09 +00:00
Matt Corallo
d986329734 Fix (and test) threaded payment retries
The new in-`ChannelManager` retries logic does retries as two
separate steps, under two separate locks - first it calculates
the amount that needs to be retried, then it actually sends it.
Because the first step doesn't update the amount, a second thread
may come along and calculate the same amount and end up retrying
duplicatively.

Because we generally shouldn't ever be processing retries at the
same time, the fix is trivial - simply take a lock at the top of
the retry loop and hold it until we're done.
2023-02-16 21:35:25 +00:00
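
A toy model of the race and the fix (state is simplified to a single retryable amount; LDK's real bookkeeping is per-payment):

```rust
use std::sync::Mutex;

struct OutboundPayments {
    // The fix: one lock held across the whole compute-then-send sequence.
    retry_lock: Mutex<()>,
    pending_retry_msat: Mutex<u64>,
}

impl OutboundPayments {
    fn retry_payments(&self, send: impl Fn(u64)) {
        let _retries = self.retry_lock.lock().unwrap();
        // Step 1 (its own lock scope): compute what needs retrying. Note
        // that this step does not mark the amount as in-flight.
        let amt = *self.pending_retry_msat.lock().unwrap();
        if amt == 0 { return; }
        // Step 2 (a separate lock scope): actually send. Without
        // `retry_lock`, a second thread interleaving between the steps
        // would compute the same `amt` and retry it duplicatively.
        send(amt);
        *self.pending_retry_msat.lock().unwrap() = 0;
    }
}

fn main() {
    let payments = OutboundPayments {
        retry_lock: Mutex::new(()),
        pending_retry_msat: Mutex::new(10_000),
    };
    payments.retry_payments(|amt| println!("retrying {} msat", amt));
    payments.retry_payments(|amt| println!("retrying {} msat", amt)); // no-op now
}
```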
Matt Corallo
6090d9e6a8 Test if a given mutex is locked by the current thread in tests
In anticipation of the next commit(s) adding threaded tests, we
need to ensure our lockorder checks work fine with multiple
threads. Sadly, currently we have tests in the form
`assert!(mutex.try_lock().is_ok())` to assert that a given mutex is
not locked by the caller to a function.

The fix is rather simple given we already track mutexes locked by a
thread in our `debug_sync` logic - simply replace the check with a
new extension trait which (for test builds) checks the locked state
by only looking at what was locked by the current thread.
2023-02-16 21:35:23 +00:00
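
A toy analogue of the extension-trait approach, tracking locks per-thread by mutex address: `assert!(m.try_lock().is_ok())` wrongly fails when another thread holds `m`, while checking this thread's own held set does not.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::sync::{Mutex, MutexGuard};

thread_local! {
    // Addresses of mutexes held by *this* thread.
    static LOCKS_HELD: RefCell<HashSet<usize>> = RefCell::new(HashSet::new());
}

struct TestMutex<T>(Mutex<T>);

struct TestGuard<'a, T> { addr: usize, _guard: MutexGuard<'a, T> }

impl<'a, T> Drop for TestGuard<'a, T> {
    fn drop(&mut self) {
        LOCKS_HELD.with(|h| { h.borrow_mut().remove(&self.addr); });
    }
}

impl<T> TestMutex<T> {
    fn lock(&self) -> TestGuard<T> {
        let addr = self as *const _ as usize;
        let guard = self.0.lock().unwrap();
        LOCKS_HELD.with(|h| { h.borrow_mut().insert(addr); });
        TestGuard { addr, _guard: guard }
    }
}

// The extension trait: answers "does *this thread* hold this mutex?"
trait LockTestExt {
    fn held_by_thread(&self) -> bool;
}

impl<T> LockTestExt for TestMutex<T> {
    fn held_by_thread(&self) -> bool {
        let addr = self as *const _ as usize;
        LOCKS_HELD.with(|h| h.borrow().contains(&addr))
    }
}

fn main() {
    let m = TestMutex(Mutex::new(0u32));
    assert!(!m.held_by_thread());
    let g = m.lock();
    assert!(m.held_by_thread()); // held by us, unlike a try_lock-based check
    drop(g);
    assert!(!m.held_by_thread());
}
```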
Valentine Wallace
5c6d8a7cb8
Remove retry_payments method
We're no longer supporting manual retries, since
`ChannelManager::send_payment_with_retry` can be parameterized by a retry
strategy.

This commit also updates all docs related to retry_payment and abandon_payment.
Since these docs frequently overlap with changes in preceding commits where we
start abandoning payments on behalf of the user, all the docs are updated in
one go.
2023-02-15 17:59:39 -05:00