Commit graph

15 commits

Matt Corallo
3acf7e2c9d Drop the dummy no-std Condvar which never sleeps
In `no-std`, we exposed `wait` functions which rely on a dummy
`Condvar` which never actually sleeps. This is somewhat nonsensical,
not to mention confusing to users. Instead, we simply remove the
`wait` methods in `no-std` builds.
2023-04-03 16:49:54 +00:00
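
A minimal sketch of the resulting shape, using a hypothetical `Notifier` type rather than LDK's actual wait machinery (and using `std` types throughout purely for brevity): the blocking `wait` simply does not exist unless `std` is enabled.

```rust
use std::sync::{Condvar, Mutex};

// Hypothetical stand-in for the real wait machinery.
pub struct Notifier {
    pending: Mutex<bool>,
    condvar: Condvar,
}

impl Notifier {
    pub fn new() -> Self {
        Notifier { pending: Mutex::new(false), condvar: Condvar::new() }
    }

    /// Wake any thread blocked in `wait`.
    pub fn notify(&self) {
        *self.pending.lock().unwrap() = true;
        self.condvar.notify_all();
    }

    /// Block until `notify` is called. Gated on `std` so `no-std` builds
    /// have no `wait` method at all, rather than one backed by a dummy
    /// `Condvar` that returns immediately.
    #[cfg(feature = "std")]
    pub fn wait(&self) {
        let mut pending = self.pending.lock().unwrap();
        while !*pending {
            pending = self.condvar.wait(pending).unwrap();
        }
        *pending = false;
    }
}
```
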
Matt Corallo
a1b5a1bba3 Add CondVar::wait_{timeout_,}while to debug_sync
These are useful, but we previously couldn't use them due to our
MSRV. Now that we can, we should use them, so we expose them via
our normal debug_sync wrappers.
2023-04-03 16:49:54 +00:00
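
For reference, a pass-through wrapper in the spirit of the above might look like the sketch below; the real `debug_sync` wrappers also do lockorder bookkeeping, which is omitted here, and the struct layout is an assumption.

```rust
use std::sync::{Condvar as StdCondvar, MutexGuard, WaitTimeoutResult};
use std::time::Duration;

pub struct Condvar {
    inner: StdCondvar,
}

impl Condvar {
    pub fn new() -> Self {
        Condvar { inner: StdCondvar::new() }
    }

    /// Forwards to `std`'s `Condvar::wait_while` (stable since Rust 1.42).
    pub fn wait_while<'a, T, F: FnMut(&mut T) -> bool>(
        &self, guard: MutexGuard<'a, T>, condition: F,
    ) -> MutexGuard<'a, T> {
        self.inner.wait_while(guard, condition).unwrap()
    }

    /// Forwards to `std`'s `Condvar::wait_timeout_while`.
    pub fn wait_timeout_while<'a, T, F: FnMut(&mut T) -> bool>(
        &self, guard: MutexGuard<'a, T>, dur: Duration, condition: F,
    ) -> (MutexGuard<'a, T>, WaitTimeoutResult) {
        self.inner.wait_timeout_while(guard, dur, condition).unwrap()
    }
}
```
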
Wilmer Paulino
2166c8a2c4
Ignore lockorder violation on same callsite with different construction
As long as the lock order on such locks is still valid, we should allow
them regardless of whether they were constructed at the same location or
not. Note that we can only really enforce this if we require one lock
call per line, or if we have access to symbol columns (as we do on Linux
and macOS). We opt for a smaller patch by relying on the latter.

This was previously triggered by some recent test changes to
`test_manager_serialize_deserialize_inconsistent_monitor`. When the
test ends and the nodes are dropped, causing us to persist each of
them, we'd detect
a possible lockorder violation deadlock across three different `Mutex`
instances that are held at the same location when serializing our
`per_peer_states` in `ChannelManager::write`.

The presumed lockorder violation happens because the first `Mutex` held
shares the same construction location with the third one, while the
second `Mutex` has a different construction location. When we hold the
second one, we consider the first as a dependency, and then consider the
second as a dependency when holding the third, causing a circular
dependency (since the third shares the same construction location as the
first).

As the inline comment notes, though, this isn't a lockorder violation
that could actually result in a deadlock, since we are under a
dependent write lock which no one else can have access to.
2023-03-28 17:27:47 -07:00
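
A hedged illustration of why column information helps (this uses `std::panic::Location` for simplicity; the actual implementation resolves backtrace symbols): keying a lock's identity by file, line and column keeps two locks constructed on one line distinct.

```rust
use std::panic::Location;
use std::sync::Mutex;

/// Identity of a lock's construction site, including the column so two
/// constructions on the same line are still told apart.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct ConstructionSite {
    file: &'static str,
    line: u32,
    column: u32,
}

pub struct TrackedMutex<T> {
    site: ConstructionSite,
    inner: Mutex<T>,
}

impl<T> TrackedMutex<T> {
    #[track_caller]
    pub fn new(value: T) -> Self {
        let loc = Location::caller();
        TrackedMutex {
            site: ConstructionSite { file: loc.file(), line: loc.line(), column: loc.column() },
            inner: Mutex::new(value),
        }
    }

    /// Where this lock was constructed; a lockorder checker would key its
    /// dependency graph on this rather than on the line number alone.
    pub fn construction_site(&self) -> ConstructionSite {
        self.site
    }

    pub fn lock(&self) -> std::sync::MutexGuard<'_, T> {
        self.inner.lock().unwrap()
    }
}

// Two locks built on one line get different `column` values:
// let (a, b) = (TrackedMutex::new(0u8), TrackedMutex::new(0u8));
```
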
Matt Corallo
fac5373687
Merge pull request #2068 from jkczyz/2023-03-doc-fixes
Doc and build warning fixes
2023-03-03 22:19:59 +00:00
Jeffrey Czyz
1d1323a3d0
Fix build warnings
2023-03-03 14:23:18 -06:00
Matt Corallo
0ad1f4c943 Track claimed outbound HTLCs in ChannelMonitors
When we receive an update_fulfill_htlc message, we immediately try
to "claim" the HTLC against the HTLCSource. If there is one, this
works great, we immediately generate a `ChannelMonitorUpdate` for
the corresponding inbound HTLC and persist that before we ever get
to processing our counterparty's `commitment_signed` and persisting
the corresponding `ChannelMonitorUpdate`.

However, if there isn't one (and this is the first successful HTLC
for a payment we sent), we immediately generate a `PaymentSent`
event and queue it up for the user. Then, a millisecond later, we
receive the `commitment_signed` from our peer, removing the HTLC
from the latest local commitment transaction as a side-effect of
the `ChannelMonitorUpdate` applied.

If the user has processed the `PaymentSent` event by that point,
great, we're done. However, if they have not, and we crash prior to
persisting the `ChannelManager`, on startup we get confused about
the state of the payment. We'll force-close the channel for being
stale, and see an HTLC which was removed and is no longer present
in the latest commitment transaction (which we're broadcasting).
Because we claim corresponding inbound HTLCs before updating a
`ChannelMonitor`, we assume such HTLCs have failed - attempting to
fail after having claimed should be a noop. However, in the
sent-payment case we now generate a `PaymentFailed` event for the
user, allowing an HTLC to complete without giving the user a
preimage.

Here we address this issue by storing the payment preimages for
claimed outbound HTLCs in the `ChannelMonitor`, in addition to the
existing inbound HTLC preimages already stored there. This allows
us to fix the specific issue described by checking for a preimage
and switching the type of event generated in response. In addition,
it reduces the risk of future confusion by ensuring we don't fail
HTLCs which were claimed but not fully committed to before a crash.

It does not, however, fully fix the issue here: because the
preimages are removed after the HTLC has been fully removed from
available commitment transactions, we may still hit this issue if we
are substantially delayed in persisting the `ChannelManager` from the
time we receive the `update_fulfill_htlc` until after a full
commitment-signed dance completes. The full fix for this issue
is to delay the persistence of the `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed. This avoids the
issue entirely, ensuring we process the event before updating the
`ChannelMonitor`, the same as we ensure the upstream HTLC has been
claimed before updating the `ChannelMonitor` for forwarded
payments.

The full solution will be implemented in later work; however, this
change still makes sense at that point as well - if we were to
delay the initial `commitment_signed` `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed (which likely
requires a database update on the users' end), we'd hold our
`commitment_signed` + `revoke_and_ack` response for two DB writes
(i.e. `fsync()` calls), making our commitment transaction
processing a full `fsync` slower. By making this change first, we
can instead delay the `ChannelMonitorUpdate` from the
counterparty's final `revoke_and_ack` message until the event has
been processed, giving us a full network roundtrip to do so and
avoiding delaying our response as long as an `fsync` is faster than
a network roundtrip.
2023-03-03 17:19:03 +00:00
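
A simplified sketch of the resulting startup check (types and names here are stand-ins, not the actual `ChannelMonitor` API): an HTLC missing from the latest commitment transaction is only treated as failed if no preimage was stored for it.

```rust
use std::collections::HashMap;

type HtlcId = u64;
type PaymentPreimage = [u8; 32];

pub enum ResolutionEvent {
    Sent { preimage: PaymentPreimage },
    Failed,
}

/// Decide how to surface an outbound HTLC that is no longer present in the
/// latest commitment transaction.
pub fn resolve_missing_htlc(
    claimed_outbound_preimages: &HashMap<HtlcId, PaymentPreimage>,
    htlc_id: HtlcId,
) -> ResolutionEvent {
    match claimed_outbound_preimages.get(&htlc_id) {
        // The monitor recorded a preimage for this outbound HTLC, so the
        // payment actually succeeded even though the HTLC is gone.
        Some(preimage) => ResolutionEvent::Sent { preimage: *preimage },
        // No preimage stored: the HTLC really was failed back.
        None => ResolutionEvent::Failed,
    }
}
```
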
Matt Corallo
065dc6e689 Make sure individual mutexes are constructed on different lines
Our lockdep logic (on Windows) identifies a mutex based on which
line it was constructed on. Thus, if we have two mutexes
constructed on the same line it will generate false positives.
2023-02-28 01:06:35 +00:00
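
A hedged illustration of the workaround (the function below is purely for demonstration, not code from the tree): splitting the constructions across lines gives each mutex a distinct identity to the line-based tracking.

```rust
use std::sync::Mutex;

pub fn build_state() -> (Mutex<u64>, Mutex<u64>) {
    // Problematic for line-based lockdep tracking: both mutexes would be
    // identified by the same construction line.
    // let state = (Mutex::new(0), Mutex::new(0));

    // Preferred: one construction per line, so each gets its own identity.
    let first = Mutex::new(0);
    let second = Mutex::new(0);
    (first, second)
}
```
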
Matt Corallo
f082ad40b5 Disallow taking two instances of the same mutex at the same time
Taking two instances of the same mutex may be totally fine, but it
requires a total lockorder that we cannot (trivially) check. Thus,
it's generally unsafe to do and we should avoid it where we can.

To discourage doing this, here we default to panicking on such locks
in our lockorder tests, with a separate lock function added that is
clearly labeled "unsafe" to allow doing so when we can guarantee a
total lockorder.

This requires adapting a number of sites to the new API, including
fixing a bug this turned up in `ChannelMonitor`'s `PartialEq` where
no lockorder was guaranteed.
2023-02-28 01:06:35 +00:00
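
A sketch of the API shape only (the wrapper type is hypothetical and the actual check is elided): the normal `lock` is where the lockorder tests would panic on a second instance of the same mutex, while a loudly-named variant opts out for call sites that can guarantee a total order themselves.

```rust
use std::sync::{Mutex, MutexGuard};

pub struct CheckedMutex<T> {
    inner: Mutex<T>,
}

impl<T> CheckedMutex<T> {
    pub fn new(value: T) -> Self {
        CheckedMutex { inner: Mutex::new(value) }
    }

    /// Normal path. In lockorder-test builds this is where taking a second
    /// instance of the same mutex on one thread would panic.
    pub fn lock(&self) -> MutexGuard<'_, T> {
        self.inner.lock().unwrap()
    }

    /// Escape hatch for callers that guarantee a total lockorder across
    /// instances themselves (e.g. always locking in a fixed key order),
    /// named loudly so its use stands out in review.
    pub fn unsafe_well_ordered_double_lock_self(&self) -> MutexGuard<'_, T> {
        self.inner.lock().unwrap()
    }
}
```
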
Matt Corallo
9c08fbd435 Refuse recursive read locks in lockorder testing
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.

Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.

Thus, we shouldn't have any special handling for allowing recursive
read locks, so we simply remove it here.
2023-02-28 01:06:35 +00:00
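
A short sketch of the problematic interleaving (not code anyone should run): on a writer-priority `RwLock`, or our `FairRwLock`, the second `read` below can queue behind another thread's pending `write`, which is itself blocked on the first `read`, so neither thread ever makes progress.

```rust
use std::sync::RwLock;

pub fn recursive_read(lock: &RwLock<u32>) {
    let first = lock.read().unwrap();
    // If another thread requests a write lock at this point, a fair or
    // writer-priority implementation queues the next read behind it...
    let second = lock.read().unwrap();
    // ...so this thread waits on the writer while the writer waits on
    // `first`: a deadlock.
    let _ = (*first, *second);
}
```
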
Matt Corallo
6090d9e6a8 Test if a given mutex is locked by the current thread in tests
In anticipation of the next commit(s) adding threaded tests, we
need to ensure our lockorder checks work fine with multiple
threads. Sadly, currently we have tests in the form
`assert!(mutex.try_lock().is_ok())` to assert that a given mutex is
not locked by the caller of a function. With other threads running,
however, `try_lock` can fail simply because some other thread holds
the lock, telling us nothing about the current thread.

The fix is rather simple given we already track mutexes locked by a
thread in our `debug_sync` logic - simply replace the check with a
new extension trait which (for test builds) checks the locked state
by only looking at what was locked by the current thread.
2023-02-16 21:35:23 +00:00
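
A rough shape of the idea (the trait and method names below are illustrative guesses, and the body is a stand-in): an extension trait whose test implementation consults the per-thread set of held locks, so asserting "not held by the current thread" no longer misfires when some other thread holds the lock.

```rust
use std::sync::Mutex;

pub trait LockTestExt {
    /// Whether the *current thread* holds this lock.
    fn held_by_thread(&self) -> bool;
}

impl<T> LockTestExt for Mutex<T> {
    fn held_by_thread(&self) -> bool {
        // Stand-in: the real debug_sync version would look this lock up in
        // its thread-local tracking state rather than probing `try_lock`,
        // which fails whenever *any* thread holds the lock.
        false
    }
}

// Test assertions then become, roughly:
//     assert!(!mutex.held_by_thread());
// instead of:
//     assert!(mutex.try_lock().is_ok());
```
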
Matt Corallo
9422370dd2 Move fairrwlock to the sync module
2023-02-14 23:33:02 +00:00
Matt Corallo
ab46f6b988 Make debug_sync regex more robust
On Windows the symbol names sometimes appear to be truncated,
which causes the symbol name to not include the `::new` at the end.
This causes the regex to mis-match and track the wrong location
for the mutex construction, leading to bogus lockorder violations.

For example, in testing the following symbol name appeared on
Windows, without the function name itself:

`lightning::debug_sync::RwLock<std::collections::hash::map::HashMap<lightning::chain::transaction::OutPoint,lightning::chain::chainmonitor::MonitorHolder<lightning::util::enforcing_trait_impls::EnforcingSigner>,std::collections::hash::map::RandomState> >::`
2023-01-10 06:48:04 +00:00
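
A toy demonstration of the matching problem using the `regex` crate (the patterns below are made up, not the ones in `debug_sync`): a pattern that insists on a trailing `::new` fails on a truncated symbol, while one that makes the suffix optional still matches.

```rust
use regex::Regex;

fn main() {
    // Strict pattern: requires the symbol to end in `::new`.
    let strict = Regex::new(r"^lightning::debug_sync::.*::new$").unwrap();
    // More robust pattern: the `::new` suffix is optional.
    let robust = Regex::new(r"^lightning::debug_sync::.*?(::new)?$").unwrap();

    // Symbol as it can appear on Windows, truncated before the function name.
    let truncated = "lightning::debug_sync::RwLock<lightning::chain::transaction::OutPoint>::";

    assert!(!strict.is_match(truncated));
    assert!(robust.is_match(truncated));
    println!("the relaxed pattern still matches the truncated symbol");
}
```
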
Matt Corallo
230331f3e8 Move tests from debug_sync to a new submodule
This will allow us to change the module regex match in debug_sync
to make it more robust.
2023-01-10 06:48:04 +00:00
Matt Corallo
558bfa3fb3 Move debug_sync to the new sync folder
2023-01-10 06:31:13 +00:00
Matt Corallo
f66f720fa5 Move no-std sync implementations to a folder to clean up
2023-01-10 06:26:46 +00:00