Commit graph

4809 commits

Author SHA1 Message Date
Matt Corallo
75527dbdf1 Remove unused compat block in provide_latest_holder_commitment_tx 2023-03-03 17:19:03 +00:00
Matt Corallo
0ad1f4c943 Track claimed outbound HTLCs in ChannelMonitors
When we receive an update_fulfill_htlc message, we immediately try
to "claim" the HTLC against the HTLCSource. If there is one, this
works great, we immediately generate a `ChannelMonitorUpdate` for
the corresponding inbound HTLC and persist that before we ever get
to processing our counterparty's `commitment_signed` and persisting
the corresponding `ChannelMonitorUpdate`.

However, if there isn't one (and this is the first successful HTLC
for a payment we sent), we immediately generate a `PaymentSent`
event and queue it up for the user. Then, a millisecond later, we
receive the `commitment_signed` from our peer, removing the HTLC
from the latest local commitment transaction as a side-effect of
the `ChannelMonitorUpdate` applied.

If the user has processed the `PaymentSent` event by that point,
great, we're done. However, if they have not, and we crash prior to
persisting the `ChannelManager`, on startup we get confused about
the state of the payment. We'll force-close the channel for being
stale, and see an HTLC which was removed and is no longer present
in the latest commitment transaction (which we're broadcasting).
Because we claim corresponding inbound HTLCs before updating a
`ChannelMonitor`, we assume such HTLCs have failed - attempting to
fail after having claimed should be a noop. However, in the
sent-payment case we now generate a `PaymentFailed` event for the
user, allowing an HTLC to complete without giving the user a
preimage.

Here we address this issue by storing the payment preimages for
claimed outbound HTLCs in the `ChannelMonitor`, in addition to the
existing inbound HTLC preimages already stored there. This allows
us to fix the specific issue described by checking for a preimage
and switching the type of event generated in response. In addition,
it reduces the risk of future confusion by ensuring we don't fail
HTLCs which were claimed but not fully committed to before a crash.

It does not, however, full fix the issue here - because the
preimages are removed after the HTLC has been fully removed from
available commitment transactions if we are substantially delayed
in persisting the `ChannelManager` from the time we receive the
`update_fulfill_htlc` until after a full commitment signed dance
completes we may still hit this issue. The full fix for this issue
is to delay the persistence of the `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed. This avoids the
issue entirely, ensuring we process the event before updating the
`ChannelMonitor`, the same as we ensure the upstream HTLC has been
claimed before updating the `ChannelMonitor` for forwarded
payments.

The full solution will be implemented in a later work, however this
change still makes sense at that point as well - if we were to
delay the initial `commitment_signed` `ChannelMonitorUpdate` util
after the `PaymentSent` event has been processed (which likely
requires a database update on the users' end), we'd hold our
`commitment_signed` + `revoke_and_ack` response for two DB writes
(i.e. `fsync()` calls), making our commitment transaction
processing a full `fsync` slower. By making this change first, we
can instead delay the `ChannelMonitorUpdate` from the
counterparty's final `revoke_and_ack` message until the event has
been processed, giving us a full network roundtrip to do so and
avoiding delaying our response as long as an `fsync` is faster than
a network roundtrip.
2023-03-03 17:19:03 +00:00
Matt Corallo
33c36d007e Avoid removing stale preimages when hashes collide in fuzzing 2023-03-02 23:42:40 +00:00
Wilmer Paulino
6ddf69c93b
Merge pull request #2064 from TheBlueMatt/2023-03-debug-futures
Make waking after a future completes propagates to the next future
2023-03-02 11:49:54 -08:00
Jeffrey Czyz
f1145158fe
Merge pull request #2057 from johncantrell97/rpc-error-code
Surface bitcoind rpc error information
2023-03-02 08:26:29 -06:00
Matt Corallo
cd03cb6477 Make waking after a future completes propagates to the next future
In our `wakers`, if we first `notify` a future, which is then
`poll`ed complete, and then `notify` the same waker again before a
new future is fetched, that new future will be marked as
non-complete initially and wait for a third `notify`.

The fix is luckily rather trivial, when we `notify` a future, if it
is completed immediately, simply wipe the future state so that we
look at the pending-notify flag when we generate the next future.
2023-03-02 07:50:16 +00:00
Matt Corallo
04ce6de4eb
Merge pull request #2061 from TheBlueMatt/2023-02-114-upstream-bindings
`C-not exported` tags for 0.0.114
2023-03-01 18:57:12 +00:00
Matt Corallo
40fcfafef2 Tag types used for the TLV macros with (C-not exported)
Obviously bindings users can't use the Rust TLV-implementation
macros, so there's no reason to export typsed used exclusively by
them.
2023-03-01 00:21:28 +00:00
Matt Corallo
08cb01e6ef Mark IndexedMap types as (C-not exported)
While we could try to expose the type explicitly, we already have
alternative accessors for bindings, and mapping `Hash`, `Ord` and
the other requirements for `IndexedMap` would be a good chunk of
additional work.
2023-03-01 00:21:23 +00:00
Wilmer Paulino
0b1a64f12d
Merge pull request #2046 from TheBlueMatt/2023-02-rgs-robust-and-log
Do not fail to apply RGS updates for removed channels
2023-02-28 11:36:04 -08:00
John Cantrell
a7af24bf3c Surface bitcoind rpc error code
Users of the RpcClient had no way to access the error code
returned by bitcoind's rpc.  We embed a new RpcError struct
as the inner error for the returned io::Error. Users can access
both the code and the message using this inner struct.
2023-02-28 14:03:28 -05:00
Matt Corallo
d2f5dc01d1 Add some basic logging to Rapid Gossip Sync processing 2023-02-28 17:56:16 +00:00
Matt Corallo
df20bcf188 Make log macros more usable outside of the lightning crate
... by using explicit paths rather than requiring imports.
2023-02-28 17:56:16 +00:00
Matt Corallo
b8eecd1fd8 Do not fail to apply RGS updates for removed channels
If we receive a Rapid Gossip Sync update for channels where we are
missing the existing channel data, we should ignore the missing
channel. This can happen in a number of cases, whether because we
received updated channel information via an onion error from an
HTLC failure or because we've partially synced the graph from a
peer over the standard lightning P2P protocol.
2023-02-28 17:56:16 +00:00
Wilmer Paulino
8311581fe1
Merge pull request #2006 from TheBlueMatt/2023-02-no-recursive-read-locks
Refuse recursive read locks
2023-02-28 00:24:16 -08:00
Matt Corallo
b8bea74d6a
Merge pull request #2015 from TheBlueMatt/2023-02-no-dumb-redundant-fields 2023-02-28 07:55:56 +00:00
Matt Corallo
065dc6e689 Make sure individual mutexes are constructed on different lines
Our lockdep logic (on Windows) identifies a mutex based on which
line it was constructed on. Thus, if we have two mutexes
constructed on the same line it will generate false positives.
2023-02-28 01:06:35 +00:00
Matt Corallo
22662efbc4 Export RUST_BACKTRACE=1 in --feature backtrace CI test
as this test often fails on windows which is hard to debug locally
for most contributors.
2023-02-28 01:06:35 +00:00
Matt Corallo
f082ad40b5 Disallow taking two instances of the same mutex at the same time
Taking two instances of the same mutex may be totally fine, but it
requires a total lockorder that we cannot (trivially) check. Thus,
its generally unsafe to do if we can avoid it.

To discourage doing this, here we default to panicing on such locks
in our lockorder tests, with a separate lock function added that is
clearly labeled "unsafe" to allow doing so when we can guarantee a
total lockorder.

This requires adapting a number of sites to the new API, including
fixing a bug this turned up in `ChannelMonitor`'s `PartialEq` where
no lockorder was guaranteed.
2023-02-28 01:06:35 +00:00
Matt Corallo
9c08fbd435 Refuse recursive read locks in lockorder testing
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.

Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.

Thus, we shouldn't have any special handling for allowing recursive
read locks, so we simply remove it here.
2023-02-28 01:06:35 +00:00
Matt Corallo
2c8a26c6d2 Don't per_peer_state read locks recursively in monitor updating
When handling a `ChannelMonitor` update via the new
`handle_new_monitor_update` macro, we always call the macro with
the `per_peer_state` read lock held and have the macro drop the
per-peer state lock. Then, when handling the resulting updates, we
may take the `per_peer_state` read lock again in another function.

In a coming commit, recursive read locks will be disallowed, so we
have to drop the `per_peer_state` read lock before calling
additional functions in `handle_new_monitor_update`, which we do
here.
2023-02-28 01:06:35 +00:00
Matt Corallo
38374dde42 Expect callers to hold read locks before channel_monitor_updated
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.

Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.

Sadly, a user ended up hitting this deadlock in production in the
form of a call to `get_and_clear_pending_msg_events` which holds
the `ChannelManager::total_consistency_lock` before calling
`process_pending_monitor_events` and eventually
`channel_monitor_updated`, which tries to take the same read lock
again.

Luckily, the fix is trivial, simply remove the redundand read lock
in `channel_monitor_updated`.

Fixes #2000
2023-02-28 01:06:35 +00:00
Matt Corallo
1d4bc1e0fb Hold the total_consistency_lock while in outbound_payment fns
We previously avoided holding the `total_consistency_lock` while
doing crypto operations to build onions. However, now that we've
abstracted out the outbound payment logic into a utility module,
ensuring the state is consistent at all times is now abstracted
away from code authors and reviewers, making it likely to break.

Further, because we now call `send_payment_along_path` both with,
and without, the `total_consistency_lock`, and because recursive
read locks may deadlock, it would now be quite difficult to figure
out which paths through `outbound_payment` need the lock and which
don't.

While it may slow writes somewhat, it's not really worth trying to
figure out this mess, instead we just hold the
`total_consistency_lock` before going into `outbound_payment`
functions.
2023-02-28 01:06:35 +00:00
Matt Corallo
f03b7cd448 Remove the final_cltv_expiry_delta in RouteParameters entirely
fbc08477e8 purported to "move" the
`final_cltv_expiry_delta` field to `PaymentParamters` from
`RouteParameters`. However, for naive backwards-compatibility
reasons it left the existing on in place and only added a new,
redundant field in `PaymentParameters`.

It turns out there's really no reason for this - if we take a more
critical eye towards backwards compatibility we can figure out the
correct value in every `PaymentParameters` while deserializing.

We do this here - making `PaymentParameters` a `ReadableArgs`
taking a "default" `cltv_expiry_delta` when it goes to read. This
allows existing `RouteParameters` objects to pass the read
`final_cltv_expiry_delta` field in to be used if the new field
wasn't present.
2023-02-27 22:33:21 +00:00
Matt Corallo
6aa5ebb1aa Support ReadableArgs types across in the TLV struct serialization
This adds `required` support for trait-wrapped reading (e.g. for
objects read via `ReadableArgs`) as well as support for the
trait-wrapped reading syntax across the TLV struct/enum
serialization macros.
2023-02-27 22:31:11 +00:00
Matt Corallo
9364154022 Require a non-0 number of non-empty paths when deserializing routes
When we read a `Route` (or a list of `RouteHop`s), we should never
have zero paths or zero `RouteHop`s in a path. As such, its fine to
simply reject these at deserialization-time. Technically this could
lead to something which we can generate not round-trip'ing
serialization, but that seems okay here.
2023-02-27 22:31:11 +00:00
Matt Corallo
b5e5435c4e
Merge pull request #2055 from TheBlueMatt/2023-02-fning-msrv
Remove the lightning-transaction-sync MSRV
2023-02-27 22:30:14 +00:00
Matt Corallo
64a7567928 Move lightning-transaction-sync out of the main workspace
Because `lightning-transaction-sync` does not have an MSRV (and
because its dev-dependencies are huge), we can't build it by
default when devs run `cargo test`, so it is moved out of the
top-level workspace.
2023-02-27 19:01:29 +00:00
Matt Corallo
0fc5a3ae0f Remove the lightning-transaction-sync MSRV
Apparently it has dependencies that don't track an MSRV at all, so
we can't practically enforce one in CI.
2023-02-27 18:28:40 +00:00
Matt Corallo
bb0f696466 Fix silent rebase conflict in previous PR 2023-02-27 18:28:40 +00:00
Matt Corallo
a0c65c32a5
Merge pull request #2043 from valentinewallace/2023-02-initial-send-path-fails
`PaymentPathFailed`: add initial send error details
2023-02-27 18:10:27 +00:00
Matt Corallo
596ef3bbea
Merge pull request #2033 from tnull/2023-02-make-esplora-sync-test-more-robust
Make `test_esplora_syncs` more robust
2023-02-27 17:55:43 +00:00
Matt Corallo
16b3c720a6
Merge pull request #2025 from TheBlueMatt/2023-02-no-pub-genesis-hashes
Remove genesis block hash from public API
2023-02-27 17:50:22 +00:00
Valentine Wallace
f2f90d5fb0
Fix PaymentPathFailed generation and scid on initial send 2023-02-25 16:13:43 -05:00
Valentine Wallace
1dcb3ecb6c
Change PaymentPathFailed's optional network update to a Failure enum
This let us capture the errors when we fail without committing to an HTLC vs
failing via update_fail.
2023-02-25 16:13:42 -05:00
Valentine Wallace
8d686d83cb
Implement writeable for APIError 2023-02-25 16:13:42 -05:00
Valentine Wallace
34f8c39630
ser_macros: Document behavior of upgradable_* variants 2023-02-25 16:13:42 -05:00
Valentine Wallace
5d0ee867ea
Fix upgradable_required fields to actually be required in lower level macros
When using lower level macros such as read_tlv_stream, upgradable_required
fields have been treated as regular options. This is incorrect, they should
either be upgradable_options or treated as required fields.
2023-02-25 16:13:39 -05:00
Matt Corallo
f71daed02d
Merge pull request #1977 from jkczyz/2023-01-offers-fuzz
BOLT 12 deserialization fuzzers
2023-02-25 18:03:51 +00:00
Valentine Wallace
70c7161dbe
Rename OptionDeserWrapper->RequiredWrapper
Makes more sense and sets us up for the UpgradableRequired version of it
2023-02-24 20:00:14 -05:00
Valentine Wallace
0e88538b78
Disambiguate ignorable and ignorable_option
Would rather not rename ignorable to ignorable_required, so rename both of them
to upgradable_*
2023-02-24 14:21:11 -05:00
Valentine Wallace
52551a9fc8
Support deserializing an Option-al MaybeReadable
Prior to this change, our impl_writeable_tlv_based macros only supported
deserializing a MaybeReadable if it's non-Optional.
2023-02-24 14:21:11 -05:00
Valentine Wallace
2037a241f4
Remove all_paths_failed from PaymentPathFailed
This field was previous useful in manual retries for users to know when all
paths of a payment have failed and it is safe to retry. Now that we support
automatic retries in ChannelManager and no longer support manual retries, the
field is no longer useful.

For backwards compat, we now always write false for this field. If we didn't do
this, previous versions would default this field's value to true, which can be
problematic because some clients have relied on the field to indicate when a
full payment retry is safe.
2023-02-24 14:21:08 -05:00
wpaulino
22d1bab7ac
Merge pull request #2050 from douglaz/import-lightning-io-futures
Only import lightning::io if futures are enabled
2023-02-24 11:03:32 -08:00
Allan Douglas R. de Oliveira
19ef2bb6ac Only import lightning::io if futures are enabled 2023-02-24 02:56:11 +00:00
Elias Rohrer
e878f75bef
Make test_esplora_syncs more robust
The test generally works and the parts in question *should* never fail.
However, they did, so we give the test server instances some leeway.
2023-02-23 20:05:16 -06:00
Jeffrey Czyz
9c2a3d090b
Fix amount overflow in Invoice building
An overflow can occur when multiplying the offer amount by the requested
quantity when no amount is given in the request. Return an error instead
of overflowing.
2023-02-23 18:25:50 -06:00
Jeffrey Czyz
32ed69a2bd
Fix amount overflow in Offer parsing and building
An overflow can occur when multiplying the offer amount by the requested
quantity when checking if the given amount is enough. Return an error
instead of overflowing.
2023-02-23 18:25:50 -06:00
Jeffrey Czyz
3d41df025d
Fuzz test for bech32 decoding
Fuzz testing bech32 decoding along with deserializing the underlying
message can result in overly exhaustive searches. Instead, the message
deserializations are now fuzzed separately. Add fuzzing for bech32
decoding.
2023-02-23 18:25:49 -06:00
Jeffrey Czyz
56a01de61d
Expose Bech32Encode trait for fuzzing
In order to fuzz test Bech32Encode parsing independent of the underlying
message deserialization, the trait needs to be exposed. Conditionally
expose it only for fuzzing.
2023-02-23 18:25:49 -06:00