Commit graph

222 commits

Author SHA1 Message Date
Carla Kirk-Cohen
e4f90ec593
lnwire: add blinding point to update_add_htlc TLVs
Add blinding points to update_add_htlc. This TLV will be set for
nodes that are relaying payments in blinded routes that are _not_
the introduction node.
2024-03-27 09:38:50 -04:00
Keagan McClelland
659ad459bb brontide: placate linter 2024-03-08 15:54:25 -08:00
Keagan McClelland
db39a905cb multi: make NewChanIDFromOutpoint accept value instead of pointer 2024-03-08 15:47:55 -08:00
Keagan McClelland
fd1cd315ce multi: don't leak underlying pointer to LightningChannel.ChannelPoint() 2024-03-08 15:27:19 -08:00
Keagan McClelland
f5f0286e34 peer+lnwire: move LinkUpdater to lnwire
The purpose of this interface is to distinguish messages that are
bound to a particular channel_id from those that are not. There is
no reason this ought to be a property of the peer package as it is
entirely determined by the contents of the message itself. Therefore
we move it to the lnwire package so it can be used without having
import cycles elsewhere in the codebase.
2024-03-06 11:59:19 -08:00
ffranr
cd566eb097
multi: fix fmt.Errorf error wrapping
Refactor fmt.Errorf usage to correctly wrap errors instead of using
non-wrapping format verbs.
2024-02-27 11:13:40 +00:00
Elle Mouton
785790abed
peer/lnwallet: persist shutdown info on send
In this commit, we start persisting shutdown info when we send the
Shutdown message. When starting up a link, we also first check if we
have previously persisted Shutdown info and if we have, we start the
link in shutdown mode meaning that it will not accept any new outgoing
HTLC additions and it will queue the shutdown message after any pending
CommitSig has been sent.
2024-02-21 11:58:18 +02:00
Elle Mouton
8c064b86f1
peer: call DisableAdds before link.OnCommitOnce
This commit moves calls to link.DisableAdds to outside link.OnCommitOnce
call backs. This is done so that if a user requests shutdown, then we
can immediately block any new outgoing HTLCs. It's only the sending of
the Shutdown message that needs to wait until after any pending
CommitSig message is sent.
2024-02-21 11:35:11 +02:00
Elle Mouton
972f57e9a7
peer+htlcswitch: update Enable/DisableAdds API
In this commit, the `ChannelUpdateHandler`'s `EnableAdds` and
`DisableAdds` methods are adjusted to return booleans instead of errors.
This is done becuase currently, any error returned by these methods is
treated by just logging the error since today all it means is that the
proposed update has already been done. And so all we do today is log the
error. But in future, if these methods are updated to return actual
errors that need to be handled, then we might forget to handle them
correctly at the various call sights. So we instead change the signature
of the function to just return a boolean. In future, if we do need to
return any error, we will have to go inspect every call sight in any
case to fix compliation & then we can be sure we are handling the errors
correctly.
2024-02-21 11:35:11 +02:00
Elle Mouton
71753af8ee
multi: fix various typos 2024-02-21 11:35:10 +02:00
Keagan McClelland
317aaf9940 peer: fix issue where we echo remote Shutdown 2024-01-30 12:15:59 -05:00
Olaoluwa Osuntokun
1caca81ba1
Merge pull request #8167 from ProofOfKeags/bugfix/htlc-flush-shutdown
Bugfix/htlc flush shutdown
2024-01-23 18:49:47 -08:00
Keagan McClelland
69ef1b2b3b docs: update release notes 2024-01-23 14:31:57 -08:00
Olaoluwa Osuntokun
185119f5c3
peer: refactor main event loop for ping handler
The error was never used as the init couldn't return an error, so we do
away with that. We also modify the main event loop dispatch to more
closely match other areas of the codebase.
2024-01-22 18:20:30 -08:00
Olaoluwa Osuntokun
b5cbeb4ad7
peer: make PingManager disconnect call async
In this commit, we make all calls to disconnect after a ping/pong
violation is detected in the `PingManager` async. We do this to avoid
circular waiting that may occur if the disconnect call back ends up
waiting on the peer goroutine to be torn down. If this happens, then the
peer goroutine will be blocked on the ping manager fully tearing down,
which is blocked on the peer disconnect succeeding.

This is a similar class of issue we've delt with recently as pertains to
the peer and the server: sync all back execution must not lead to
a circular waiting loop.

Fixes #8379
2024-01-22 18:20:23 -08:00
Keagan McClelland
ec55831229 htlcswitch+peer: remove ShutdownIfChannelClean 2024-01-22 16:08:59 -08:00
Keagan McClelland
64fda6ca65 htlcswitch: implement flush and commit lifecycle hooks for channelLink 2024-01-22 16:08:55 -08:00
Keagan McClelland
5ab69aedc7 peer: remove tryLinkShutdown due to redundance
We don't need to try a link shutdown when the chan closer is fetched
since, by this commit, the only callsite manages the shutdown semantics.
After removing the call to tryLinkShutdown, we no longer need the
function at all.
2024-01-22 12:19:58 -08:00
Keagan McClelland
025e569f07 peer: fix local close requests to shutdown via link lifecycle hooks
In order to handle shutdown requests when there are still HTLCs on
the link, we have to manage the shutdown process via the link's
lifecycle hooks. This means we can't use the simple `tryLinkShutdown`
anymore and instead queue a `Shutdown` message at the next opportunity
to do so -- when we send our next `CommitSig`
2024-01-22 12:19:58 -08:00
Keagan McClelland
442f1dd677 peer: handle close messages using link lifecycle hooks 2024-01-22 12:19:58 -08:00
Keagan McClelland
f81e7ada4c peer: rewrite handleCloseMsg in terms of new ChanCloser methods 2024-01-22 12:19:58 -08:00
Keagan McClelland
9b2d1018f2 htlcswitch+peer: add flush api and lifecycle hooks to ChannelUpdateHandler
We also add dummy implementations to channelLink and various mocks.
2024-01-22 12:19:58 -08:00
ziggie
3530254ff4
peer: add unit test.
Add a unit test for the removal of a pending channel.
2024-01-22 13:01:42 +00:00
ziggie
a5d2541292
funding: initialize remove channel. 2024-01-22 12:58:52 +00:00
Elle Mouton
f12cc12da5
server: register a taproot tower client 2024-01-19 15:33:07 +02:00
Keagan McClelland
717facc202 feature: make gossip queries compulsory 2024-01-11 13:57:56 -08:00
bitcoin-lightning
b72fc9529e docs: fix typos 2023-12-21 15:21:35 +00:00
Elle
4fa483f1bc
Merge pull request #7702 from ellemouton/towerClientMux
wtclient: Tower Client Multiplexer
2023-12-05 12:27:05 +02:00
Elle Mouton
e800aacff4
wtclient+server: unexport and rename TowerClient
Rename and unexport the `TowerClient` struct to `client` and rename the
`TowerClientManager` interface to `ClientManager`.
2023-11-28 11:01:51 +02:00
Elle Mouton
fcfdf699e3
multi: move BackupState and RegisterChannel to Manager
This commit moves over the last two methods, `RegisterChannel` and
`BackupState` from the `Client` to the `Manager` interface. With this
change, we no longer need to pass around the individual clients around
and now only need to pass the manager around.

To do this change, all the goroutines that handle channel closes,
closable sessions needed to be moved to the Manager and so a large part
of this commit is just moving this code from the TowerClient to the
Manager.
2023-11-28 10:59:40 +02:00
yyforyongyu
85f4b13632
multi: enhance logging around channel reestablishment 2023-11-28 14:06:53 +08:00
Matt Morehouse
f0ae5b2456
peer: add test for startup race on writeMessage
The test reliably detects
https://github.com/lightningnetwork/lnd/issues/8184.
2023-11-16 17:33:44 -06:00
Matt Morehouse
08fff28504
peer: enable mockMessageConn to detect data races
We use unsynchronized counters to trigger a report under the race
detector if multiple reads or writes happen concurrently.
2023-11-16 17:32:19 -06:00
Matt Morehouse
c4e0daa274
peer: add missing Close() method to mockMessageConn 2023-11-16 17:31:18 -06:00
Eugene Siegel
7556402ea4
peer: send reestablish, shutdown messages before starting writeHandler
This is to avoid a potential race on WriteMessage and Flush internals.
Because there is no locking on WriteMessage and Flush, if we allow
writeMessage calls in Start after the writeHandler has started,
the writeMessage calls may call WriteMessage/Flush at the same time
that writeMessage calls from the writeHandler does. Since there is
no locking, internals like b.nextHeaderSend can race and cause
panics.
2023-11-16 12:07:10 -05:00
Elle
ed179e3e7d
Merge pull request #8180 from carlaKC/8128-brontideflake
peer/test: fix race in TestHandleNewPendingChannel
2023-11-16 10:10:24 +02:00
Carla Kirk-Cohen
a8a86b290c
peer/test: setup clean peer for each TestHandleNewPendingChannel case
Running test cases in parallel with shared state means that they
can intermingle if they're run at the same time and set up incorrect
values when we assert on the number of channels we have. Moving the
peer setup into the parallel test run fixes this because no single
test case can interfere with the other.
2023-11-14 15:45:19 -05:00
Elle
6122452509
Merge pull request #8117 from ellemouton/removeDBFromChannelDBSchemas
channeldb+refactor: remove `kvdb.Backend` from channel DB schemas
2023-11-09 07:47:11 +02:00
Eugene Siegel
955a345153
peer: send messages in Start via writeMessage to bypass writeHandler
Without this the following could happen:

* InboundPeerConnected is called while we already have an inbound
connection with the peer. This calls removePeer which calls Disconnect.
* If the peer is starting up in Start, it may be sending messages
synchronously via SendMessage(true, ...). This eventually calls the
writeMessage function which will exit if disconnect is set to 1.
* Since Disconnect was called, disconnect will be 1 and writeMessage
will exit, causing writeHandler to exit.
* If there is more than 1 message being sent, later messages will
queue in queueHandler but be unable to get into sendQueue as the
writeHandler goroutine has exited.
* The synchronous sends will be waiting on the errChan indefinitely
and startReady will never get closed meaning Disconnect will never
proceed.

The end result is that the server's mutex will be held until shutdown.

Avoid this by using writeMessage to bypass the writeHandler goroutine.
2023-11-08 16:36:02 -05:00
Elle Mouton
84cdcd6847
multi: move DB schemas to channeldb/models
This commit moves the ChannelEdgePolicy, ChannelEdgeInfo,
ChanelAuthProof and CachedEdgePolicy structs to the `channeldb/models`
package.
2023-11-08 14:50:35 +02:00
Eugene Siegel
dc42b160a0
multi: skip InitRemoteMusigNonces if we've already called it
Prior to this commit, taproot channels had a bug:

- If a disconnect happened before peer.AddNewChannel was called,
  then the subsequent reconnect would call peer.AddNewChannel and
  attempt the ChannelReestablish dance.

- peer.AddNewChannel would call NewLightningChannel with
  populated nonce ChannelOpts. This in turn would call
  InitRemoteMusigNonces which would create a new musig pair session
  and set the channel's pendingVerificationNonce to nil.

- During the reestablish dance, ProcessChanSyncMsg would be called.
  This would also call InitRemoteMusigNonces, except it would fail
  since pendingVerificationNonce was set to nil in the previous
  invocation.

To fix this, we add a new functional option to signal to the init logic
that it doesn't need to call InitRemoteMusigNonces in   in
ProcessChanSyncMsg.
2023-10-31 10:10:35 -07:00
Keagan McClelland
012cc6af8c peer: eliminate unnecessary log spam from received pong msgs 2023-10-19 09:26:50 -07:00
Keagan McClelland
99226e37ef peer: Add machinery to track the state and validity of remote pongs
This commit refactors some of the bookkeeping around the ping logic
inside of the Brontide. If the pong response is noncompliant with
the spec or if it times out, we disconnect from the peer.
2023-10-19 09:26:45 -07:00
Keagan McClelland
1eab7826d2 peer: abstract out ping payload generation from the pingHandler
This change makes the generation of the ping payload a no-arg
closure parameter, relieving the pingHandler of having to
directly monitor the chain state. This makes use of the
BestBlockView that was introduced in earlier commits.
2023-10-19 09:22:07 -07:00
Keagan McClelland
ac265812fc peer+lnd: make BestBlockView available to Brontide 2023-10-19 09:22:07 -07:00
Oliver Gugger
82838e4a62
Merge pull request #8090 from ziggie1984/add-additional-log-chan-closure
Add more information when a co-op close is failing.
2023-10-17 10:27:21 +00:00
ziggie
e4dd911778
multi: clarify co-op closure failures. 2023-10-13 20:35:08 +02:00
yyforyongyu
839b6271e5
peer: fix unit test flake 2023-10-13 13:50:07 +08:00
Olaoluwa Osuntokun
eae9dd07f0
peer: launch persistent peer pruning in background goroutine
This PR is a follow up, to a [follow
up](https://github.com/lightningnetwork/lnd/pull/7938) of an [initial
concurrency issue](https://github.com/lightningnetwork/lnd/pull/7856)
fixed in the peer goroutine.

In #7938, we noticed that the introduction of `p.startReady` can cause
`Disconnect` to block. This happens as `Disconnect` cannot be called
until `p.startReady` has been closed. `Disconnect` is also called from
`InboundPeerConnected` (the case of concurrent peers, so we need to
remove one of the connections) while the main server mutex is held. If
`p.Start` blocks for any reason, then this leads to the deadlock as: we
can't disconnect until we've finished starting, and we can't finish
starting as we need the disconnect caller to exit as it has the mutex.

In this commit, we now make the call to `prunePersistentPeerConnection`
async. The call to `prunePersistentPeerConnection` eventually wants to
grab the server mutex, which triggers the circular waiting scenario
above.

The main learning here is that no calls to the main server mutex path
can block from `p.Start`. This is more or less a stop gap to resolve the
issue initially introduced in v0.16.4. Assuming we want to move forward
with this fix, we should reexamine `p.startReady` all together, and also
revisit attempt to refactor this section of the code to eliminate the
mega mutex in the server in favor of a dedicated event loop.
2023-09-27 21:24:50 -05:00
Olaoluwa Osuntokun
01c64712a3
multi: ensure link is always torn down due to db failures, add exponential back off for sql-kvdb failures (#7927)
* lnwallet: fix log output msg

The log message is off by one.

* htlcswitch: fail channel when revoking it fails.

When the revocation of a channel state fails after receiving a new
CommitmentSigned msg we have to fail the channel otherwise we
continue with an unclean state.

* docs: update release-docs

* htlcswitch: tear down connection if revocation processing fails

If we couldn't revoke due to a DB error, then we want to also tear down
the connection, as we don't want the other party to continue to send
updates. That may lead to de-sync'd state an eventual force close.
Otherwise, the database might be able to recover come the next
reconnection attempt.

* kvdb: use sql.LevelSerializable for all backends

In this commit, we modify the default isolation level to be
`sql.LevelSerializable. This is the strictness isolation type for
postgres. For sqlite, there's only ever a single writer, so this doesn't
apply directly.

* kvdb/sqlbase: add randomized exponential backoff for serialization failures

In this commit, we add randomized exponential backoff for serialization
failures. For postgres, we''ll his this any time a transaction set fails
to be linearized. For sqlite, we'll his this if we have many writers
trying to grab the write lock at time same time, manifesting as a
`SQLITE_BUSY` error code.

As is, we'll retry up to 10 times, waiting a minimum of 50 miliseconds
between each attempt, up to 5 seconds without any delay at all. For
sqlite, this is also bounded by the busy timeout set, which applies on
top of this retry logic (block for busy timeout seconds, then apply this
back off logic).

* docs/release-notes: add entry for sqlite/postgres tx retry

---------

Co-authored-by: ziggie <ziggie1984@protonmail.com>
2023-08-30 16:48:00 -07:00