If the remote commit confirms before our local commit, there is no reason
to try to publish our HTLC transactions: we will instead directly claim
the HTLC outputs from the remote commit.
We previously checked timelocks before checking preconditions, which in
this case meant we would wait forever for our local commit to confirm.
We now check preconditions before timelocks, and we added a precondition
that verifies that the remote commit isn't confirmed before publishing
our HTLC txs.
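A minimal sketch of the new ordering, with illustrative names (this is not eclair's actual publishing code): preconditions are evaluated first, so a tx whose precondition can never be met is abandoned instead of waiting on its timelock forever.
```scala
sealed trait PublishDecision
case object Abandon extends PublishDecision
case object WaitForTimelock extends PublishDecision
case object Publish extends PublishDecision

def decide(preconditionsOk: Boolean, timelockExpired: Boolean): PublishDecision =
  if (!preconditionsOk) Abandon             // e.g. the remote commit is already confirmed
  else if (!timelockExpired) WaitForTimelock // only then do we wait on the timelock
  else Publish
```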
The output of `getsentinfo` didn't include the `nodeId` of the failing node.
This PR adds it, as it can be used by external apps that build routes
themselves instead of relying on eclair's internals (e.g. for channel rebalancing).
The scodec magic was quite hard to read, and the use of the prefix wasn't
very intuitive since Sphinx uses both a prefix and a suffix.
Also added more codec tests.
We rename the EncryptedRecipientData types.
The data it contains is specific to route blinding use cases, so we make
that explicit.
This way, if future scenarios use another kind of encrypted tlv stream
(e.g. encrypted state backup), we won't have name clashes.
We also update the route blinding test vectors to the final spec version.
Add basic support for onion messages (lightning/bolts#759)
Add functions and codecs to create, read and process onion messages. Does not use any of them yet.
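As a rough illustration of what such a codec looks like in eclair's scodec style (the field layout here is simplified; see the spec PR for the real wire format):
```scala
import scodec.Codec
import scodec.bits.ByteVector
import scodec.codecs._

// Simplified onion message structure: a blinding point followed by the onion packet.
case class OnionMessage(blindingKey: ByteVector, onionPacket: ByteVector)

val onionMessageCodec: Codec[OnionMessage] = (
  ("blindingKey" | bytes(33)) ::
  ("onionPacket" | variableSizeBytes(uint16, bytes))
).as[OnionMessage]
```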
We previously relied on bitcoind's dumpprivkey RPC in some of our tests.
That RPC isn't available with descriptor wallets, and descriptor wallets
are now the default wallet type.
We previously added restrictions in Sphinx.scala to only allow two types
of onions: a 1300-byte one for HTLCs and a 400-byte one for trampoline.
This doesn't make sense anymore: the latest version of trampoline allows
any onion size, onion messages allow any onion size, and the Sphinx
protocol itself doesn't care about the size of the payload.
Another reason to remove it is that it didn't work well with pattern
matching because of type erasure.
The caller must now explicitly set the length of the payload, which is
more flexible. Verifying that the correct length is used is deferred to
higher-level components.
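A sketch of the new shape (parameter names are assumptions, not the exact eclair signature):
```scala
import scodec.bits.ByteVector

// The payload length is now an explicit argument: the same code can build
// 1300-byte payment onions, trampoline onions or onion messages, and Sphinx
// itself no longer enforces a particular size.
case class OnionPacket(version: Int, publicKey: ByteVector, payload: ByteVector, hmac: ByteVector)

def makePacket(packetPayloadLength: Int, publicKey: ByteVector, payload: ByteVector, hmac: ByteVector): OnionPacket = {
  // higher-level components are responsible for choosing an appropriate length
  require(payload.length == packetPayloadLength, s"payload must be exactly $packetPayloadLength bytes")
  OnionPacket(version = 0, publicKey = publicKey, payload = payload, hmac = hmac)
}
```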
Fixes #1995, which was due to a pattern matching error on the expected response type of the `sendToX` helper methods in `EclairImpl`, and had nothing to do with json serialization. Added a few non-regression tests.
In the second commit I also set a basic "ok" json serializer for all default `RES_SUCCESS` messages, but didn't follow https://github.com/ACINQ/eclair/issues/1995#issuecomment-940821678, because we would either completely break backwards compatibility, or create inconsistencies with non-default command responses like `RES_GETINFO` and with other API calls not related to channels.
If a _local_ mutual close transaction is published from outside of the actor state machine, the channel will fail to recognize it and will move to the `ERR_INFORMATION_LEAK` state. We could instead log a warning and handle it gracefully, since no harm has been done.
This is different from a local force close, because we no longer keep the fully-signed local commit tx, so an unexpected published tx would indeed be very fishy in that case. But we do store the best fully-signed, ready-to-publish mutual close tx in the channel data, so we must be ready to handle the case where the operator manually publishes it for whatever reason.
We used to store UNIX timestamps in the `waitingSince` field before
moving to block counts. To ensure backward compatibility, we converted
from timestamp to block height based on the value: anything over
1 500 000 was considered a timestamp. But that threshold is much too
low: on testnet the block height is already greater than 2 000 000.
We use 1 500 000 000 instead, which corresponds to mid-2017.
Another way to deal with this would be to simply remove the
compatibility code.
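A sketch of the compatibility heuristic (names are illustrative):
```scala
// Values above 1 500 000 000 (mid-2017 as a unix timestamp) are treated as legacy
// timestamps; anything below is already a block height. The previous cutoff of
// 1 500 000 broke on testnet, whose height exceeds 2 000 000.
def waitingSinceBlocks(waitingSince: Long, currentBlockHeight: Long, nowSeconds: Long): Long =
  if (waitingSince > 1500000000L) {
    // legacy unix timestamp: estimate elapsed blocks at ~10 minutes per block
    currentBlockHeight - (nowSeconds - waitingSince) / 600
  } else {
    waitingSince // already a block height
  }
```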
This PR adds an alternate strategy for handling unhandled exceptions. The goal is to avoid unnecessary mass force-closes, but it is reserved for advanced users who closely monitor their node.
Available strategies are:
- local force close of the channel (default)
- log an error message and stop the node
Default settings maintain the same behavior as before.
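The switch can be sketched as a simple ADT (names are close to, but not guaranteed to match, the actual eclair code):
```scala
sealed trait UnhandledExceptionStrategy
object UnhandledExceptionStrategy {
  // Default: force-close the channel to protect funds, as before.
  case object LocalClose extends UnhandledExceptionStrategy
  // For advanced users with close monitoring: log an error and stop the node,
  // giving the operator a chance to fix the issue without a mass force-close.
  case object Stop extends UnhandledExceptionStrategy
}
```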
In an "outdated commitment" scenario where we are on the up-to-date side, we always react by force-closing the channel immediately, not giving our peer a chance to fix their data and restart. On top of that, we consider this a commitment sync error, instead of clearly logging that our counterparty is using outdated data.
Addressing this turned out to be rabbit-holey: our sync code is quite complicated and is a bit redundant because we separate between:
- checking whether we are late
- deciding what messages we need to retransmit
Also, discovered a missing corner case when syncing in SHUTDOWN state.
Add support for cookie authentication with bitcoind instead of
user/password. This is recommended when running eclair and
bitcoind on the same machine: it ensures that only processes with
read permission on the bitcoind cookie file are able to call the
RPC, which is safer than a user/password pair.
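In `eclair.conf`, this looks roughly like the following (key names are from memory; check the reference configuration for the exact ones):
```
eclair.bitcoind {
  auth = "safecookie" // instead of "password"
  cookie = "/path/to/bitcoin/datadir/.cookie"
}
```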
The default upper bound was `Long.MaxValue unixsec`, which overflowed when converted to `TimestampMilli`. We now enforce `min` and `max` values on timestamp types.
API tests didn't catch it because eclair is mocked and the conversion happens later.
Fixes #2031.
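The overflow itself is easy to reproduce (a sketch; the exact bounds chosen for the timestamp types are in the code):
```scala
val maxSeconds: Long = Long.MaxValue
val asMillis: Long = maxSeconds * 1000L // wraps around: the result is negative
assert(asMillis < 0)
// hence TimestampSecond/TimestampMilli now enforce bounded min/max values
// that remain safe under the seconds -> milliseconds conversion.
```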
Add a new log file for important notifications that require action from
the node operator.
A separate log file is easier to work with than grepping specific messages
from the standard logs, and lets us use a different style of messaging,
where we provide more information about the steps to take to resolve
the issue.
We rely on an event sent to the event stream so that plugins can also pick
it up and connect to notification systems (push, messages, emails, etc.).
On slow CI machines, the "recv WatchFundingConfirmedTriggered" test was
flaky because there was a race between Alice publishing her
TransactionPublished event (before going to the WaitForFundingLocked
state) and the test registering its event listeners (after going to the
WaitForFundingLocked state).
It doesn't make sense to throw away this information, and it's useful in
some scenarios such as onion messages.
The ephemeral keys aren't part of the route; they're usually derived hop
by hop instead. We only need to keep the first one, which must somehow
be sent to the introduction node.
For incoming htlcs, the amount needs to be included in our balance if we know the preimage, even if the htlc hasn't yet been formally settled.
We were already taking into account preimages in the `pending_commands` database.
But as soon as we have sent and signed an `update_fulfill_htlc`, we clean up the `pending_commands` database, so we also need to look at our current sent changes.
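A minimal sketch of the rule (types are illustrative, not eclair's actual commitment model):
```scala
import scodec.bits.ByteVector

case class IncomingHtlc(amountMsat: Long, paymentHash: ByteVector)

// An incoming htlc counts toward our balance as soon as we know its preimage,
// whether the fulfill is still in pending_commands or already signed and sent.
def balanceMsat(toLocalMsat: Long, incoming: Seq[IncomingHtlc], knownPreimageHashes: Set[ByteVector]): Long =
  toLocalMsat + incoming.filter(h => knownPreimageHashes.contains(h.paymentHash)).map(_.amountMsat).sum
```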
Add options to ignore specific channels or nodes for
findRoute* APIs, and an option to specify a flat maximum
fee.
With these new parameters, it's now possible to do circular
rebalancing of your channels.
Co-authored-by: Roman Taranchenko <romantaranchenko@Romans-MacBook-Pro.local>
Co-authored-by: t-bast <bastuc@hotmail.fr>
Add API to delete an invoice.
This only works if the invoice wasn't paid yet.
Co-authored-by: Roman Taranchenko <romantaranchenko@Romans-MacBook-Pro.local>
Co-authored-by: t-bast <bastuc@hotmail.fr>
Cryptographic functions to blind and unblind a route and its associated
encrypted payloads.
Decrypt and decode the contents of an `encrypted_recipient_data` tlv field.
We could share the tlv namespace with onion tlvs, but it's cleaner to
separate them: they have a few common fields but already diverge on
others, and will diverge even more in the future.
* Check serialization consistency in all channel tests
We add a simple wrapper over the channels db, used in all channel state unit tests, which basically checks
that deserialize(serialize(state)) == state.
* Add getChannel() method to ChannelsDb interface
This makes our serialization checks cleaner: we now test that read(write(channel)) == channel
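The wrapper can be sketched like this (a pared-down `ChannelsDb`; the real interface has more methods):
```scala
case class ChannelState(channelId: String, data: String) // stand-in for the real channel data

trait ChannelsDb {
  def addOrUpdateChannel(state: ChannelState): Unit
  def getChannel(channelId: String): Option[ChannelState]
}

class RoundTripCheckingChannelsDb(underlying: ChannelsDb) extends ChannelsDb {
  override def addOrUpdateChannel(state: ChannelState): Unit = {
    underlying.addOrUpdateChannel(state)
    // read(write(channel)) == channel, made possible by the new getChannel()
    assert(getChannel(state.channelId).contains(state), "serialization round-trip mismatch")
  }
  override def getChannel(channelId: String): Option[ChannelState] = underlying.getChannel(channelId)
}
```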
We define `TimestampSecond` and `TimestampMilli` for second- and millisecond-precision UNIX-style timestamps.
Let me know what you think of the syntactic sugar, I went for `123456 unixsec` and `123456789 unixms`.
Json serialization is as follows for second and millisecond precision respectively. Note that in both cases we display the unix format with second precision, while the iso format is more precise:
```
{
  "iso": "2021-10-04T14:32:41Z",
  "unix": 1633357961
}
{
  "iso": "2021-10-04T14:32:41.456Z",
  "unix": 1633357961
}
```
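The sugar itself can be sketched as follows (the actual implementation may differ):
```scala
case class TimestampSecond(underlying: Long)
case class TimestampMilli(underlying: Long)

implicit class TimestampLongOps(private val n: Long) extends AnyVal {
  def unixsec: TimestampSecond = TimestampSecond(n)
  def unixms: TimestampMilli = TimestampMilli(n)
}

val paymentTime = 1633357961L.unixsec
val preciseTime = 1633357961456L.unixms
```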
* use a map for feature->channelType resolution
Instead of explicitly listing all the combinations of features, and
risking inconsistency, we may as well build the reverse map using the
channel type objects.
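A sketch of the idea with simplified types:
```scala
sealed trait ChannelType { def features: Set[String] }
case object Standard extends ChannelType { val features = Set.empty[String] }
case object StaticRemoteKey extends ChannelType { val features = Set("option_static_remotekey") }
case object AnchorOutputs extends ChannelType { val features = Set("option_static_remotekey", "option_anchor_outputs") }

// Build the reverse map from the channel type objects themselves, so the two
// directions cannot drift apart.
val channelTypes: Map[Set[String], ChannelType] =
  Set[ChannelType](Standard, StaticRemoteKey, AnchorOutputs).map(ct => ct.features -> ct).toMap
```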
* better and less spammy logs
We can switch the "funding tx already spent" router log from _warn_ to
_debug_: as soon as there are more than 10 of them, the peer's
announcements will be ignored and a warning message will be logged
about that.
* timedOutHtlcs -> trimmedOrTimedOutHtlcs
Clarify that trimmed htlcs can be failed as soon as the commitment tx
has been confirmed.
* proper logging of outgoing messages
It is also logical to make `Outgoing` a command of `Peer`: it would
have been done this way from the start if `Peer` had been a typed actor.
* fixed mixed up comments
Discovered this while working on #1838.
In the following scenario, at reconnection:
- `localCommit.index = 7`
- `nextRemoteRevocationNumber = 6`
So when `localCommit.index == nextRemoteRevocationNumber + 1`, we must retransmit the revocation.
```
local remote
| |
| (no pending sig) |
commit = 6 | | next rev = 6
|<----- sig 7 ------|
commit = 7 | |
|-- rev 6 --> ? |
| |
| (disconnection) |
| |
```
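In code, the rule derived from this diagram is simply:
```scala
// Illustrative: we owe the remote a revocation for commitment 6 that was lost
// in the disconnection, so we must retransmit it.
def mustRetransmitRevocation(localCommitIndex: Long, nextRemoteRevocationNumber: Long): Boolean =
  localCommitIndex == nextRemoteRevocationNumber + 1
```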
* reproduce bug causing API hang at open
In case of an error when validating channel parameters, we do not
return a message to the origin actor, which translates into the API
hanging until it times out.
Took the opportunity to test return values in other cases too.
* return an error to origin actor for invalid params
* WaitForFundingCreatedInternal -> WaitForFundingInternal
* add tests to WaitForFundingInternalStateSpec
* add tests to WaitForFundingConfirmedStateSpec
* API nits
We probably don't need to print the stack trace for API errors, and the
open timeout of 10s was a bit short (it has to be << 30s though).
Add config fields for max dust htlc exposure.
These configuration fields let node operators decide on the amount of dust
htlcs that can be in flight in each channel. If the channel is
force-closed, up to this amount may be lost in miner fees.
When sending and receiving htlcs, we check whether they would overflow
our configured dust exposure, and fail them instantly if they do.
A large `update_fee` may overflow our dust exposure by trimming htlcs
that previously had an output in the commit tx.
Node operators can choose to automatically force-close when that happens,
to avoid risking losing large dust amounts to miner fees.
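A sketch of the check (illustrative types and thresholds, not eclair's exact code):
```scala
case class Htlc(amountSat: Long)

// Dust htlcs are those below the trim threshold: they have no output on the
// commit tx, so their value goes to miner fees if the channel is force-closed.
def wouldOverflowDustExposure(pending: Seq[Htlc], candidate: Htlc, trimThresholdSat: Long, maxDustExposureSat: Long): Boolean = {
  val dustExposureSat = (pending :+ candidate).filter(_.amountSat < trimThresholdSat).map(_.amountSat).sum
  dustExposureSat > maxDustExposureSat // if true, fail the candidate htlc instantly
}
```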
The script wasn't length-delimited.
Fortunately, this feature was disabled by default.
Since no one reported the issue, we can probably just apply this simple fix.
Unfortunately, `context.log` is *not* thread safe and shouldn't be used
in future continuations. We should instead use `pipeToSelf` when we want
to act on the result of a `Future`.
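For example (the message protocol below is made up, but `pipeToSelf` is the real Akka Typed API):
```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors
import scala.concurrent.Future
import scala.util.{Failure, Success, Try}

sealed trait Command
case class WrappedResult(result: Try[Int]) extends Command

def behavior(fut: Future[Int]): Behavior[Command] = Behaviors.setup[Command] { context =>
  // WRONG: fut.onComplete(r => context.log.info(s"$r")) -- runs on the future's
  // thread, and context.log is not thread safe.
  // RIGHT: route the result back through the actor's own mailbox.
  context.pipeToSelf(fut)(WrappedResult)
  Behaviors.receiveMessage {
    case WrappedResult(Success(value)) =>
      context.log.info(s"future completed with $value") // safe: actor thread
      Behaviors.same
    case WrappedResult(Failure(t)) =>
      context.log.error("future failed", t)
      Behaviors.same
  }
}
```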
Allow any feerate when using anchor outputs and we are the fundee.
This will prevent unwanted channel closures.
This can be unsafe in a high-fee environment if the commit tx feerate is
below the propagation threshold. However, by the time we discover it,
it's too late anyway, so our only option is to wait for package relay to
save the day.
Ensure the feerate is always above the propagation threshold when we are
the funder. We lift the limit configured by the node operator when it is
below the network propagation threshold.
We previously computed the on-chain fees we paid after the fact, when
receiving a notification that a transaction was confirmed. This worked
because lightning transactions used to have a single input, which we
stored in our DB to allow computing the fee.
With anchor outputs, this mechanism doesn't work anymore: some txs have
their fees paid by a child tx, and may have more than one input.
We completely change our model to store every transaction we publish,
along with the fee we're paying for it. We then separately store every
transaction that confirms, which lets us join these two data sets to
compute how much on-chain fee we paid.
This has the added benefit that we can now audit every transaction we
tried to publish, which lets node operators audit the anchor outputs
internal RBF mechanism and the entire on-chain footprint of a given
channel.
It's quite cumbersome to investigate complex MPP payment failures: we
need to grep on the parent ID, group together the logs for each child
payment, and only then are we ready for some analysis.
Most of the time, a quick look at the breakdown of all intermediate
failures is all we need to diagnose the problem. This PR adds such a
log line.
Having basic documentation in place by providing examples in
`eclair.conf` is great and very convenient, but in the case of
path-finding, defining experiments takes so much space that it makes
the whole configuration file harder to understand.
And since we don't want to enable experiments by default, the user would
still have to figure out what to change to actually enable AB-testing.
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
* make custom serializers objects instead of classes
* reorder json format definition
* use minimal serializers
Having custom serializers depend on an external format would introduce
infinite recursion at runtime if we're not careful. Thankfully, none of
our serializers use it, so we may as well remove the possibility entirely.
* simplify serializers further
We don't need to type the serializers: this is required for deserializing,
not serializing, and we are not deserializing.
The fact that we had a type mismatch here shows it:
```scala
object TransactionSerializer extends MinimalSerializer[TransactionWithInputInfo]
```
* new generic json serializer
Instead of providing a `MyClass => JValue` conversion method, we
provide a `MyClass => MyClassJson` method, with the assumption that
`MyClassJson` is serializable using the base serializers.
The rationale is that it's easier to define the structure with types
than by building json objects.
This also means that the serialization of the attributes of class C is
out of scope when defining the serializer for class C: see for example
how `DirectedHtlcSerializer` no longer needs to bring in lower-level
serializers.
It also has the advantage of removing recursion from custom serializers,
which sometimes generated weird stack overflows.
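The pattern, with made-up types (not eclair's actual classes):
```scala
import org.json4s.{DefaultFormats, Extraction, Formats, JValue}

case class Channel(id: Array[Byte], capacitySat: Long) // domain type
case class ChannelJson(id: String, capacitySat: Long)  // json-friendly view

// The serializer is just MyClass => MyClassJson: no JValue-building, and the
// attributes of ChannelJson are handled by the base serializers.
def toJsonView(c: Channel): ChannelJson =
  ChannelJson(id = c.id.map("%02x".format(_)).mkString, capacitySat = c.capacitySat)

implicit val formats: Formats = DefaultFormats
def serialize(c: Channel): JValue = Extraction.decompose(toJsonView(c))
```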
We are slowly dropping support for non-segwit outputs, as proposed in
https://github.com/lightningnetwork/lightning-rfc/pull/894
We can thus safely allow dust limits all the way down to 354 satoshis.
In very rare cases where `dust_limit_satoshis` is negotiated to a low
value, our peer may generate closing txs that will not relay correctly
on the bitcoin network due to dust relay policies.
When that happens, we detect it and force-close instead of completing
the mutual close flow.
The fee recorded in the path-finding metrics should include the local channel. Without it, we record failing payments with a fee budget larger than the fee recorded for a successful payment to the same node.
* optionally record path-finding metrics
This is useful in feature branches.
Also rename `recordMetrics` to `recordPathFindingMetrics`.
* rename channel features attribute
It was a leftover from init features, which can be activated or not. But
in the context of the `ChannelFeatures` object, that naming was
confusing because all features are activated in that context.
* minor refactoring on channel_type
Moved some logic outside `Peer`.
* refactor `RouteParams`
`PathFindingConf` and `RouteParams` have almost the same set of params,
but some of them don't have the same structure (flat vs hierarchical).