Mirror of https://github.com/lightningnetwork/lnd.git (synced 2025-01-18 21:35:24 +01:00)
eae9dd07f0
This PR is a follow-up to a [follow up](https://github.com/lightningnetwork/lnd/pull/7938) of an [initial concurrency issue](https://github.com/lightningnetwork/lnd/pull/7856) fixed in the peer goroutine. In #7938, we noticed that the introduction of `p.startReady` can cause `Disconnect` to block, because `Disconnect` cannot proceed until `p.startReady` has been closed. `Disconnect` is also called from `InboundPeerConnected` (the case of concurrent peer connections, where we need to remove one of the connections) while the main server mutex is held. If `p.Start` blocks for any reason, this leads to a deadlock: we can't disconnect until we've finished starting, and we can't finish starting because we need the disconnect caller to exit, as it holds the mutex. In this commit, we make the call to `prunePersistentPeerConnection` async. That call eventually wants to grab the server mutex, which triggers the circular wait described above. The main lesson here is that nothing called from `p.Start` may block on the main server mutex. This is more or less a stopgap to resolve the issue initially introduced in v0.16.4. Assuming we want to move forward with this fix, we should reexamine `p.startReady` altogether, and also revisit the attempt to refactor this section of the code to eliminate the mega mutex in the server in favor of a dedicated event loop. |
||
---|---|---|
brontide_test.go | ||
brontide.go | ||
interfaces.go | ||
log.go | ||
musig_chan_closer.go | ||
test_utils.go |
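
The circular wait described in the commit message above can be reproduced in a few lines. The following is a minimal sketch, not the actual lnd code: the `server` and `peer` types, the `async` flag, the `holding` channel, and the `scenario` helper are all hypothetical and exist only for illustration. Only the names `Start`, `Disconnect`, `InboundPeerConnected`, `startReady`, and `prunePersistentPeerConnection` are taken from the commit message, and their bodies here are simplified assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// server is a hypothetical stand-in for the real server and its "mega mutex".
type server struct {
	mu         sync.Mutex
	persistent map[string]bool
}

// peer is a hypothetical stand-in for the real peer type.
type peer struct {
	id         string
	startReady chan struct{} // closed once Start has returned
	quit       chan struct{}
	prune      func(id string)
}

// prunePersistentPeerConnection needs the server mutex.
func (s *server) prunePersistentPeerConnection(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.persistent, id)
}

// InboundPeerConnected calls Disconnect while holding the server mutex.
// The holding channel is illustration-only: it lets the caller order events.
func (s *server) InboundPeerConnected(old *peer, holding chan<- struct{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	close(holding)
	old.Disconnect()
}

// Start closes startReady only once it returns. With async=false it calls
// prune inline, so it blocks for as long as someone else holds the mutex.
func (p *peer) Start(async bool) {
	defer close(p.startReady)
	if async {
		go p.prune(p.id) // the fix: never wait on the server mutex in Start
		return
	}
	p.prune(p.id)
}

// Disconnect cannot proceed until Start has finished.
func (p *peer) Disconnect() {
	<-p.startReady
	close(p.quit)
}

// scenario reports whether InboundPeerConnected completes within a timeout
// when Start runs while the server mutex is already held.
func scenario(async bool) bool {
	s := &server{persistent: map[string]bool{"alice": true}}
	p := &peer{
		id:         "alice",
		startReady: make(chan struct{}),
		quit:       make(chan struct{}),
		prune:      s.prunePersistentPeerConnection,
	}

	holding := make(chan struct{})
	done := make(chan struct{})
	go func() {
		s.InboundPeerConnected(p, holding)
		close(done)
	}()

	<-holding // the mutex is now held by InboundPeerConnected
	go p.Start(async)

	select {
	case <-done:
		return true
	case <-time.After(200 * time.Millisecond):
		return false // circular wait: the blocked goroutines are leaked
	}
}

func main() {
	fmt.Println("sync prune completes: ", scenario(false))
	fmt.Println("async prune completes:", scenario(true))
}
```

In the synchronous case, `Start` waits for the mutex, `Disconnect` waits for `Start`, and the mutex holder waits for `Disconnect`, so none of them can make progress. Moving the prune call into its own goroutine breaks the cycle: `Start` returns immediately, and the prune runs once the mutex is released.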