When a channel is closed, we can forget the data from historical HTLCs
sent and received through that channel (which is otherwise required to
punish cheating attempts by our peer).
We previously synchronously removed that data from the DB when the closing
transaction confirmed. However, this could create performance issues as
the `htlc_infos` table can be very large for busy nodes and many concurrent
writes may be happening at the same time.
We don't need to get rid of this data immediately: we only want to remove
it to avoid degrading the performance of active channels that read and
write to the `htlc_infos` table. We now mark channels as closed in a
dedicated table, and run a background actor that deletes batches of
obsolete htlc data at regular intervals. This ensures that the table is
eventually cleaned up, without impacting the performance of active
channels.
When a splice transaction confirms, all the revoked commitment transactions
that only applied to the previous funding transaction cannot be published
anymore, because the previous funding output has already been spent.
We can thus forget all the historical HTLCs that were included in those
commitments, because we will never need to generate the corresponding
penalty transactions.
This ensures that the growth of our DB is bounded, and will shrink every
time a splice transaction is confirmed.
Fixes#2610, #2702 and #2740
* Test bitcoind wallet behavior during mempool eviction
When transactions are being evicted from the mempool, we want to make
sure that bitcoind keeps the corresponding wallet inputs locked to
avoid accidentally double-spending ourselves. We update our unit test
to be closer to common scenarios (chain of unconfirmed splice txs and
commit txs replaced by the remote version).
This test confirms that bitcoind keeps wallet inputs locked, and we
have to call `abandontransaction` to free up utxos used in txs that
have been permanently double-spent.
* Test bitcoind wallet behavior during double spend
When one of our wallet transactions is directly double-spent, the
bitcoin wallet detects that and automatically frees up wallet inputs
from the permanently double-spent wallet transaction. This is great,
that means we only need to manually call `abandontransaction` for
transactions that have been invalidated because one of their parents
has been double-spent.
We had some cases where channels were staying stuck in the CLOSING state
without going to the CLOSED state in previous versions of eclair where
we published different versions of our 3rd-stage transactions after a
restart and a feerate increase.
This happened in two cases:
- when we published a different version of our claim-main, since we only
store the latest (it's an `Option`, not a `Seq`) and put a watch on
this specific `txid` instead of watching the corresponding output
- when we re-published 3rd-stage txs after receiving `CMD_FULFILL_HTLC`
because we replaced the previous `claim-htlc-delayed` transactions
This was on my backlog for a while, and I wanted to take another look at
it after the splicing implementation. Fortunately, this was fixed as
part of the splicing changes, but it's worth having a test to ensure
that there's no regression from future changes.
When a transaction is directly double-spent, bitcoind is able to
automatically detect it and free up wallet inputs that are used
in double-spent transactions. However, when a transaction becomes
invalid because one of its ancestors is double-spent, bitcoind is
not able to detect it and will keep wallet inputs locked forever.
The only cases where this can happen are:
- splice transactions when a commitment transaction that is not
based on the latest splice is confirmed
- anchor transactions in the above case, or when a different
version of the commitment confirms (e.g. remote commit while
we're trying to CPFP ou local commit)
When we detect that this happened, we abandon the corresponding
transactions, which ensures that we don't end up with unusable
liquidity in our wallet.
We also remove cases where we eagerly abandoned transactions,
which could lead us to double-spend ourselves inadvertently
because the abandoned transaction could still confirm while
we reused its inputs in another transaction (which is a real
issue when using 0-conf).
* Add configurable threshold on maximum anchor fee
We previously bumped up to 5% of our channel balance when no HTLCs were
at risk. For large channels, 5% is an unreasonably high value. In most
cases it doesn't matter, because the transaction confirms before we try
to bump it to unreasonable levels. But if the commitment transaction
was pruned and couldn't be relayed to miners, then eclair would keep
trying to bump until it reached that threshold. We now restrict this to
a value configurable by the node operator. Note that when HTLCs are at
risk, we still bump up to the HTLC amount, which may be higher than the
new configuration parameter: we want that behavior as a scorched earth
strategy against pinning attacks.
* Add feerate upper bound from fee estimator
We add a new limit to the feerate used for fee-bumping, based on the
fastest feerate returned by our fee estimator. It doesn't make sense
to use much higher values, since this feerate should guarantee that
the transaction is included in the next block.
Allow trampoline to pay a list of blinded paths instead of a node id.
Only the last trampoline hop can target blinded paths, trampoline nodes are still reached with their node id, not with blinded paths.
Co-authored-by: t-bast <bastien@acinq.fr>
The size of trampoline onions was fixed at 400 bytes. We remove this limit, allowing larger trampoline onions when necessary.
The size of the trampoline onion is still constrained by the size of the standard onion (1300 bytes).
Co-authored-by: t-bast <bastien@acinq.fr>
When opening an important channel we may be willing to pay a high feerate that we wouldn't want to use when consolidating UTXOs. However because we delegate the transaction creation to bitcoin core, we can sometimes be surprised by funding transactions that are much bigger than expected. This PR protects us from such cases by refusing to open the channel if the transaction fee is too high.
Based on original work from @thomash-acinq.
If we get disconnected after sending `commit_sig`, we will retransmit
that message when reconnecting if our peer has not received it.
When we have multiple commitments, we need to retransmit one message
per commitment, and include the `batch_size` tlv.
We incorrectly stored `commit_sig` without the `batch_size` tlv in the
next remote commit field, and thus retransmitted without that tlv.
When an inactive revoked commitment containing outgoing HTLCs
is confirmed, we must consider the outgoing HTLC failed and relay that
failure upstream. Because of the alternative commit tx mechanism,
those inactive revoked commitments match the remote commit,
so we were waiting for tx confirmation before failing back upstream.
Last command of eclair-cli is a curl piped to jq. In case of an
error with the curl command (for instance, eclair service is not
running) the error code was swallowed by the pipe wich will
return the exit code of the last command of the pipe (here, jq).
Setting `pipefail` ensures that the error code of curl is
preserve in case of HTTP issue.
This can come handy, for instance, to check that eclair is not
ok:
```
if eclair-cli getinfo
then
echo eclair is answering
else
echo problem communicating with eclair
fi
```
#2761 introduced the ability for the HTLC sender to let a remote initiator
dip into its reserve to unblock channels after a large splice. However, we
relaxed that condition for all channels, even those that don't use splice.
This creates compatibility issues with other implementations that are
stricter than what the specification requires, and will force-close in
those situations.
This version of logback fixes the following CVE:
"a potential denial of service attack on a centralized logback receiver
when a third party controlling a remote appender connects to said
receiver and could shut down or slow down logging of events."
Eclair isn't affected since we don't use logback receivers, but if there
are applications or plugins that depend on eclair and use logback
receivers, it's better to use the logback version containing the fix.
The format used by bitcoinheaders.net is changing to use whole bytes
instead of nibles, which is easier to parse. We start using the v2 format
exclusively, which will allow deprecating the previous format.
Fixes#2786
* Add a txOut field to our InteractiveTxBuilder.Input interface
This will help us add support for taproot inputs: to create taproot signatures we need all prevouts (and not just prevouts for tx inputs that spend taproot outputs).
Offers (https://github.com/lightning/bolts/pull/798) allow nodes to be identified using either the public key or a pair channel id and direction.
The goal is to save bytes as channel id and direction only use 9 bytes instead of 33 for a public key.
When we receive a `channel_update` in `update_fail_htlc`, we should take
it into account to exclude channels or correctly retry with updated fees,
but we shouldn't apply it to our routing table.
If we did, that could be exploited to deanonymize our payments.
It shouldn't be necessary anyway, as honest nodes should broadcast
those `channel_update`s publicly, so we would receive them through the
normal gossip mechanism.
Fixes#2767
As soon as a channel is transitioning to a closing state (mutual or
unilateral close), it cannot be used to relay HTLCs anymore. We should
notify the network by sending a disabled `channel_update`.
Fixes#2766
If remote has a positive contribution, we do not check their
post-splice reserve level, because they are improving their
situation, even if they stay below the requirement. Note that if
local splices-in some funds in the same operation, remote
post-splice reserve may actually be worse than before, but that's
not their fault.
Splicing (and dual funding as well) introduce a new scenario that could
not happen before, where the channel initiator (who pays the fees for the
commit tx) can end up below the channel reserve, or right above it.
In that case it means that most of the channels funds are on the non
initiator side, so we should allow HTLCs from the non-initiator to the
initiator to move funds towards the initiator. We allow slightly dipping
into the channel reserve in that case, for at most 5 pending HTLCs.
If we abort the interactive-tx protocol while we're waiting for bitcoind
to fund the transaction, we must avoid sending an obsolete `tx_add_input`
for the funded transaction. There is a check in `Channel.scala` to only
send an outgoing interactive-tx message if we have an active interactive-tx
session, but this isn't sufficient in the following scenario:
- we disconnect and thus abort the interactive-tx session while bitcoind
is funding
- we reconnect and restart another interactive-tx session
- bitcoind responds for the first interactive-tx session
In that scenario we would mix `tx_add_input` from the old and the new
interactive-tx session, which leads to a duplicate `serial_id`.
We were incorrectly splitting the shared input amount based on each peer's
balance, but for the signing order the peer that adds the shared input
takes credit for the whole amount of that input.
Whenever we use the local commit, we must use the local dust limit to
trim HTLCs. Similarly, whenever we use the remote commit, we must use
the remote dust limit.
Sometimes the notifier for node A has not been killed before node A connects. This causes an extra GetPeerInfo message to be sent to the switchboard and the `assert(request3.remoteNodeId == remoteNodeId2)` check to fail when the GetPeerInfo for node A comes first. Just waiting to receive the `AsyncPaymentTimeout` message is not enough.
If node A connects after node B then this race does not occur.
Other implementations and Eclair (#2703) disable channels only when the other peer is offline.
So disabled channels should no be used for routing messages.
This PR removes `ActiveEdge` that was introduced in #2656
Some channels have only a few sats available to send but other nodes don't know it so they try to use them and fail.
When the balance goes below configurable thresholds we now advertize a lower maxHtlcAmount.
This should reduce the number of failed attempts and benefit the network.
We assumed that once quiescent, the local and remote commitment index
would be the same, but that's not true. Those two indices may diverge
because of concurrent updates. We need to keep using the right index for
each commitment during and after a splice, otherwise it leads to "invalid
commit sig" errors.
When we start, bitcoind may not be ready to handle requests if we restarted
it around the same time. We thus poll it for 30 seconds before giving up.
This is especially useful for tests (e.g. on regtest).
The `ChannelFeatures`->`ChannelType` conversion is problematic for the following reasons:
- It creates an implicit bidirectional mapping between the two classes, which is illogical and error-prone because both directions must be kept in sync.
- Not only must the conversions be kept in sync, it must also be stable over time.
- From the spec p.o.v., the scope of `ChannelType` is really limited to channel opening negotiation, while we currently make it a permanent concept.
- It has weird semantic side effects where we would check channel types in places where we really meant to check commitment formats.
If a channel has negotiated quiescence with pending htlcs then `InteractiveTxBuilder` will create/verify commit txs which include htlc outputs and create/verify htlc signatures when exchanging `CommitSig`.
The new attribute `htlcsAmount` of `Output.Shared` and `Input.Shared` accounts for the value in htlcs when computing fees.
* Move some logs to debug level
This should reduce the pressure on the file system and RAM without impacting
our ability to troubleshoot common issues.
* Avoid herd effect watching external channels
When we restart, we put watches on every public channel in the network.
That creates a lot of RPC calls to bitcoind, which aren't time-sensitive.
It's ok if we don't see immediately that an external channel was closed,
the spec even recommends waiting for 12 blocks to distinguish a channel
close from a splice.
By default, we now smooth that over a 1 hour period. This means we should
also allow our peers to be late at discovering that a channel was closed.
We thus stop sending a `warning` in that case and increase our tolerance
to that kind of behavior.
* Avoid herd effect watching local channels
When we restart, we set watches on our funding transactions. But we don't
actually need to watch them immediately, we just need enough time to react
to our peer broadcasting their commitment. We use long `cltv_delta` delays
to guarantee funds safety, so we can spread out the watches across several
blocks to reduce the start-up load. It essentially is the same thing as
receiving mempool transactions or blocks after a delay, which is something
that our threat model already takes into account.
* Allow splicing on non dual-funded channels
We currently allow splicing on top of a channel that was created without
using dual funding, but it results in an incorrect channel reserve being
used post-splice.
* Rename requestedChannelReserve_opt
This field only applies to the first funding index, so we rename it to
make that explicit. Once a splice has happened, it will be ignored.
A minor fix for tests that rely on the minimum feerate being low.
---------
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>