Fixes #1995, which was due to a pattern matching error on the expected response type of the `sendToX` helper methods in `EclairImpl`, and had nothing to do with json serialization. Added a few non-regression tests.
In the second commit I also set a basic "ok" json serializer for all default `RES_SUCCESS` messages, but didn't follow https://github.com/ACINQ/eclair/issues/1995#issuecomment-940821678, because we would either completely break backwards compatibility, or create inconsistency with non-default command responses like `RES_GETINFO`, and with other API calls not related to channels.
If a _local_ mutual close transaction is published from outside of the actor state machine, the channel will fail to recognize it, and will move to the `ERR_INFORMATION_LEAK` state. We could instead log a warning and handle it gracefully, since no harm has been done.
This is different from a local force close, because we do not keep the fully-signed local commit tx anymore, so an unexpected published tx would indeed be very fishy in that case. But we do store the best fully-signed, ready-to-publish mutual close tx in the channel data, so we must be ready to handle the case where the operator manually publishes it for whatever reason.
We used to store UNIX timestamps in the `waitingSince` field before
moving to block count. In order to ensure backward compatibility, we
converted from timestamps to blockheight based on the value. Anything
over 1 500 000 was considered a timestamp. But this value is much too
low: on testnet the blockheight is already greater than 2 000 000.
We can use 1 500 000 000 instead, which is somewhere in 2017.
Another way to deal with this would be to simply remove this
compatibility code.
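A rough sketch of the heuristic (the names and the block-height estimation are illustrative, not the exact eclair code):
```scala
def waitingSinceToBlockHeight(waitingSince: Long, currentBlockHeight: Long, nowSeconds: Long): Long =
  if (waitingSince > 1500000000L) {
    // legacy value: a UNIX timestamp in seconds, estimate the height assuming ~10 minutes per block
    currentBlockHeight - (nowSeconds - waitingSince) / 600
  } else {
    // recent value: already a block height
    waitingSince
  }
```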
This PR adds an alternate strategy to handle unhandled exceptions. The goal is to avoid unnecessary mass force-closes, but it is reserved for advanced users who closely monitor their node.
Available strategies are:
- local force close of the channel (default)
- log an error message and stop the node
Default settings maintain the same behavior as before.
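A minimal sketch of how the strategy could be modeled (illustrative names, not necessarily the exact eclair types):
```scala
sealed trait UnhandledExceptionStrategy
object UnhandledExceptionStrategy {
  // force-close the channel locally (default, same behavior as before)
  case object LocalClose extends UnhandledExceptionStrategy
  // log an error message and stop the node, leaving the channel as-is for manual intervention
  case object Stop extends UnhandledExceptionStrategy
}
```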
In an "outdated commitment" scenario where we are on the up-to-date side, we always react by force-closing the channel immediately, not giving our peer a chance to fix their data and restart. On top of that, we consider this a commitment sync error, instead of clearly logging that our counterparty is using outdated data.
Addressing this turned out to be rabbit-holey: our sync code is quite complicated and a bit redundant because we separately handle:
- checking whether we are late
- deciding what messages we need to retransmit
I also discovered a missing corner case when syncing in the SHUTDOWN state.
Add support for cookie authentication with bitcoind instead of
user/password. This is recommended when running eclair and
bitcoind on the same machine: it ensures that only processes with
read permission on the bitcoind cookie file are able to call the
RPC, which is safer than a user/password pair.
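For reference, bitcoind writes a `.cookie` file containing `user:password` credentials that rotate on every restart; a minimal sketch of reading them (names are illustrative, not the actual eclair plumbing):
```scala
import java.nio.file.{Files, Paths}

def readCookieCredentials(cookiePath: String): (String, String) = {
  // the cookie rotates whenever bitcoind restarts, so it must be re-read on connection failures
  val content = new String(Files.readAllBytes(Paths.get(cookiePath))).trim
  val Array(user, password) = content.split(":", 2)
  (user, password)
}
```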
Default upper bound was `Long.MaxValue unixsec` which overflowed when converted to `TimestampMilli`. We now enforce `min` and `max` values on timestamp types.
API tests didn't catch it because eclair is mocked and the conversion happens later.
Fixes #2031.
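A minimal sketch of the overflow this guards against (the bound used here is illustrative):
```scala
case class TimestampSecond(underlying: Long) {
  // without bounds, Long.MaxValue * 1000 silently overflows to a negative value
  require(underlying >= 0 && underlying <= 253402300799L, "invalid unix timestamp") // 9999-12-31T23:59:59Z
  def toTimestampMilli: Long = underlying * 1000
}
```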
Add a new log file for important notifications that require an action from
the node operator.
Using a separate log file is easier than grepping specific messages out of
the standard logs, and lets us use a different style of messaging, where we
provide more information about what steps to take to resolve the issue.
We rely on an event sent to the event stream so that plugins can also pick
it up and connect with notification systems (push, messages, mails, etc).
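For example, a plugin could subscribe to these events on the actor system's event stream; the event class below is hypothetical:
```scala
import akka.actor.Actor

case class NotifyNodeOperator(severity: String, message: String) // hypothetical event published by eclair

class NotificationForwarder extends Actor {
  override def preStart(): Unit = context.system.eventStream.subscribe(self, classOf[NotifyNodeOperator])
  override def receive: Receive = {
    case n: NotifyNodeOperator =>
      // forward to a push/mail/messaging system of your choice
      println(s"[${n.severity}] ${n.message}")
  }
}
```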
On slow CI machines, the "recv WatchFundingConfirmedTriggered" test was
flaky because there was a race between Alice publishing her
TransactionPublished event (before going to the WaitForFundingLocked state)
and the test registering its event listeners (after going to the
WaitForFundingLocked state).
It doesn't make sense to throw away this information, and it's useful in
some scenarios such as onion messages.
The ephemeral keys aren't part of the route; they're usually derived hop
by hop instead. We only need to keep the first one, which must somehow be
sent to the introduction node.
For incoming htlcs, the amount needs to be included in our balance if we know the preimage, even if the htlc hasn't yet been formally settled.
We were already taking into account preimages found in the `pending_commands` database.
But as soon as we have sent and signed an `update_fulfill_htlc`, we clean up the `pending_commands` database, so we also need to look at the current sent changes.
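A rough sketch of the resulting balance computation (types and names are simplified, not the actual eclair code):
```scala
case class IncomingHtlc(id: Long, paymentHash: String, amountMsat: Long)

def settledIncomingBalance(htlcs: Seq[IncomingHtlc],
                           hashesWithKnownPreimage: Set[String], // e.g. from pending_commands
                           fulfilledInSentChanges: Set[Long]): Long = // ids we already signed a fulfill for
  htlcs
    .filter(h => hashesWithKnownPreimage.contains(h.paymentHash) || fulfilledInSentChanges.contains(h.id))
    .map(_.amountMsat)
    .sum
```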
Add options to ignore specific channels or nodes for
findRoute* APIs, and an option to specify a flat maximum
fee.
With these new parameters, it's now possible to do circular
rebalancing of your channels.
Co-authored-by: Roman Taranchenko <romantaranchenko@Romans-MacBook-Pro.local>
Co-authored-by: t-bast <bastuc@hotmail.fr>
Add API to delete an invoice.
This only works if the invoice wasn't paid yet.
Co-authored-by: Roman Taranchenko <romantaranchenko@Romans-MacBook-Pro.local>
Co-authored-by: t-bast <bastuc@hotmail.fr>
Cryptographic functions to blind and unblind a route and its associated
encrypted payloads.
Decrypt and decode the contents of an `encrypted_recipient_data` tlv field.
We could share the tlv namespace with onion tlvs, but it's cleaner to
separate them. They have a few common fields, but already diverge on
others, and will diverge even more in the future.
* Check serialization consistency in all channel tests
We add a simple wrapper over the channels db used in all channel state unit tests, which will basically check
that deserialize(serialize(state)) == state.
* Add getChannel() method to ChannelsDb interface
This makes our serialization checks cleaner: we now test that read(write(channel)) == channel
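A minimal sketch of the wrapper idea (the real `ChannelsDb` interface is larger, names are simplified):
```scala
trait SimpleChannelsDb[K, V] {
  def addOrUpdateChannel(key: K, state: V): Unit
  def getChannel(key: K): Option[V]
}

class SerializationCheckingDb[K, V](underlying: SimpleChannelsDb[K, V],
                                    serialize: V => Array[Byte],
                                    deserialize: Array[Byte] => V) extends SimpleChannelsDb[K, V] {
  override def addOrUpdateChannel(key: K, state: V): Unit = {
    // fail the test early if the channel state doesn't survive a round-trip
    assert(deserialize(serialize(state)) == state, "serialization round-trip mismatch")
    underlying.addOrUpdateChannel(key, state)
  }
  override def getChannel(key: K): Option[V] = underlying.getChannel(key)
}
```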
We define `TimestampSecond` and `TimestampMilli` for second and millisecond precision UNIX-style timestamps.
Let me know what you think of the syntactic sugar, I went for `123456 unixsec` and `123456789 unixms`.
Json serialization is as follows for second and millisecond precision respectively. Note that in both cases we display the unix format in second precision, but the iso format is more precise:
```
{
"iso": "2021-10-04T14:32:41Z",
"unix": 1633357961
}
{
"iso": "2021-10-04T14:32:41.456Z",
"unix": 1633357961
}
```
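A sketch of what the sugar could look like (illustrative, the real types carry more operations and bounds):
```scala
object TimestampSyntax {
  case class TimestampSecond(underlying: Long)
  case class TimestampMilli(underlying: Long)

  implicit class TimestampLongOps(n: Long) {
    def unixsec: TimestampSecond = TimestampSecond(n)
    def unixms: TimestampMilli = TimestampMilli(n)
  }
}

// usage:
// import TimestampSyntax._
// val t1 = 1633357961L.unixsec
// val t2 = 1633357961456L.unixms
```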
* use a map for feature->channelType resolution
Instead of explicitly listing all the combinations of features, and risking
inconsistency, we may as well build the reverse map using the channel
type objects.
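An illustrative sketch of the idea (the channel types and feature sets are placeholders):
```scala
sealed trait ChannelType { def features: Set[String] }
case object Standard extends ChannelType { val features = Set.empty[String] }
case object StaticRemoteKey extends ChannelType { val features = Set("option_static_remotekey") }
case object AnchorOutputs extends ChannelType { val features = Set("option_static_remotekey", "option_anchor_outputs") }

// the reverse map is derived from the channel type objects themselves,
// so it can never get out of sync with them
val channelTypes: Map[Set[String], ChannelType] =
  Set[ChannelType](Standard, StaticRemoteKey, AnchorOutputs).map(ct => ct.features -> ct).toMap
```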
* better and less spammy logs
We can switch the "funding tx already spent" router log from _warn_ to
_debug_ because as soon as there are more than 10 of them, the peer's
announcements will be ignored and there will be a warning message about
that.
* timedOutHtlcs -> trimmedOrTimedOutHtlcs
Clarify that trimmed htlcs can be failed as soon as the
commitment tx has been confirmed.
* proper logging of outgoing messages
It is also logical to make `Outgoing` a command of `Peer`. It should
have been done this way from the start if `Peer` had been a typed actor.
* fixed mixed up comments
Discovered this while working on #1838.
In the following scenario, at reconnection:
- `localCommit.index = 7`
- `nextRemoteRevocationNumber = 6`
So when `localCommit.index == nextRemoteRevocationNumber + 1` we must retransmit the revocation.
```
local remote
| |
| (no pending sig) |
commit = 6 | | next rev = 6
|<----- sig 7 ------|
commit = 7 | |
|-- rev 6 --> ? |
| |
| (disconnection) |
| |
```
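In code, the added condition boils down to something like (simplified):
```scala
def mustRetransmitRevocation(localCommitIndex: Long, nextRemoteRevocationNumber: Long): Boolean =
  // our peer already has our latest commit_sig but never received our last revocation
  localCommitIndex == nextRemoteRevocationNumber + 1
```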
* reproduce bug causing API hang at open
In case of an error when validating channel parameters, we do not
return a message to the origin actor. That translates to the API hanging
until the timeout expires.
Took the opportunity to test return values in other cases too.
* return an error to origin actor for invalid params
* WaitForFundingCreatedInternal -> WaitForFundingInternal
* add tests to WaitForFundingInternalStateSpec
* add tests to WaitForFundingConfirmedStateSpec
* API nits
We probably don't need to print the stack trace for API errors, and the
open timeout of 10s was a bit short (it has to be << 30s though).
Add config fields for max dust htlc exposure.
These configuration fields let node operators decide on the amount of dust
htlcs that can be in-flight in each channel.
In case the channel is force-closed, up to this amount may be lost in
miner fees.
When sending and receiving htlcs, we check whether they would overflow
our configured dust exposure, and fail them instantly if they do.
A large `update_fee` may overflow our dust exposure by trimming htlcs
that were previously untrimmed, removing them from the commit tx.
Node operators can choose to automatically force-close when that happens,
to avoid risking losing large dust amounts to miner fees.
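A rough sketch of the check (the real computation uses the trimmed-to-dust threshold, which also accounts for htlc transaction fees):
```scala
case class PendingHtlc(amountSat: Long)

def wouldOverflowDustExposure(pendingHtlcs: Seq[PendingHtlc],
                              newHtlc: PendingHtlc,
                              dustThresholdSat: Long,
                              maxDustExposureSat: Long): Boolean = {
  val dustHtlcs = (pendingHtlcs :+ newHtlc).filter(_.amountSat < dustThresholdSat)
  dustHtlcs.map(_.amountSat).sum > maxDustExposureSat
}
```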
The script wasn't length-delimited.
Fortunately this feature was disabled by default.
Since no-one reported the issue, we can probably just do this simple fix.
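Conceptually, the fix is to use a length-prefixed codec instead of one that consumes all remaining bytes; an illustrative scodec example (the actual length prefix size may differ):
```scala
import scodec.codecs._

// before: `bytes` greedily reads everything that follows
// after: the script is prefixed with its length on 2 bytes
val scriptCodec = variableSizeBytes(uint16, bytes)
```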
Unfortunately, `context.log` is *not* thread safe and shouldn't be used
in `Future` continuations. We should instead use `pipeToSelf` when we want
to act on the result of a `Future`.
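The safe pattern with akka typed looks roughly like this (message types here are illustrative):
```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors
import scala.concurrent.Future
import scala.util.{Failure, Success}

sealed trait Command
case class WrappedResult(value: Int) extends Command
case class WrappedFailure(cause: Throwable) extends Command

def behavior(someFuture: Future[Int]): Behavior[Command] = Behaviors.setup[Command] { context =>
  // don't touch context.log (or any actor state) in someFuture.onComplete:
  // pipe the result back to ourselves and handle it as a regular message instead
  context.pipeToSelf(someFuture) {
    case Success(value) => WrappedResult(value)
    case Failure(cause) => WrappedFailure(cause)
  }
  Behaviors.receiveMessage {
    case WrappedResult(value) =>
      context.log.info("future completed with value={}", value) // safe: we're on the actor's thread
      Behaviors.same
    case WrappedFailure(cause) =>
      context.log.error("future failed", cause)
      Behaviors.same
  }
}
```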
Allow any feerate when using anchor outputs and we're fundee.
This will prevent unwanted channel closure.
This can be unsafe in a high fee environment if the commit tx feerate is below
the propagation threshold. However, by the time we discover it, it's too late
anyway, so our only option is to wait for package relay to save the day.
Ensure feerate is always above propagation threshold when we're funder.
We lift the limit configured by the node operator when it is below the
network propagation threshold.
We previously computed the on-chain fees paid by us after the fact, when
receiving a notification that a transaction was confirmed. This worked
because lightning transactions had a single input, which we stored in
our DB to allow us to compute the fee.
With anchor outputs, this mechanism doesn't work anymore. Some txs have
their fees paid by a child tx, and may have more than one input.
We completely change our model to store every transaction we publish,
along with the fee we're paying for this transaction. We then separately
store every transaction that confirms, which lets us join these two data
sets to compute how much on-chain fees we paid.
This has the added benefit that we can now audit every transaction that
we tried to publish, which lets node operators audit the anchor outputs
internal RBF mechanism and the overall on-chain footprint of a given channel.
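Conceptually, the fee computation becomes a join between the two data sets (simplified sketch, the real implementation lives in the audit db):
```scala
case class PublishedTx(txId: String, channelId: String, miningFeeSat: Long)
case class ConfirmedTx(txId: String)

def onChainFeesPerChannel(published: Seq[PublishedTx], confirmed: Seq[ConfirmedTx]): Map[String, Long] = {
  val confirmedIds = confirmed.map(_.txId).toSet
  published
    .filter(tx => confirmedIds.contains(tx.txId)) // only count fees for txs that actually confirmed
    .groupBy(_.channelId)
    .view.mapValues(_.map(_.miningFeeSat).sum)
    .toMap
}
```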
It's quite cumbersome to investigate complex MPP payment failures.
We need to grep on the parent ID, then group together logs for each child
payment, and then we're ready for some analysis.
Most of the time, a quick look at the breakdown of all intermediate failures
is all we need to diagnose the problem. This PR adds such a log line.
Having basic documentation in-place by providing examples in
`eclair.conf` is great and very convenient, but in the case of
path-finding, defining experiments takes so much space that it makes
the whole configuration file actually harder to understand.
And since we don't want to enable experiments by default, the user still
has to figure out what to change to actually enable AB-testing.
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
* make custom serializers objects instead of classes
* reorder json format definition
* use minimal serializers
Having custom serializers depend on the external format would introduce
infinite recursion at runtime if we're not careful. Thankfully, none of our
serializers use it, so we may as well remove the possibility entirely.
* simplify serializers further
We don't need to type the serializers: this is required for deserializing,
not serializing, and we are not using it.
The fact that we had a type mismatch here shows it:
```scala
object TransactionSerializer extends MinimalSerializer[TransactionWithInputInfo]
```
* new generic json serializer
Instead of providing a `MyClass => JValue` conversion method, we
provide a `MyClass => MyClassJson` method, with the assumption that
`MyClassJson` is serializable using the base serializers.
The rationale is that it's easier to define the structure with types
rather than by building json objects.
This also means that the serialization of the attributes of class C is out
of scope when defining the serializer for class C. See for example
how `DirectedHtlcSerializer` no longer needs to bring in
lower-level serializers.
It also has the advantage of removing recursion from custom serializers
which sometimes generated weird stack overflows.
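An illustrative sketch of the pattern with json4s (the domain and json classes here are made up):
```scala
import org.json4s.{CustomSerializer, Extraction}

case class Channel(shortChannelId: Long, capacitySat: Long)       // domain class
case class ChannelJson(shortChannelId: String, capacitySat: Long) // what we want in the json

class ChannelSerializer extends CustomSerializer[Channel](format => (
  PartialFunction.empty, // we only serialize, we never deserialize
  { case c: Channel =>
    // map to the json view and let the base serializers turn it into a JValue
    Extraction.decompose(ChannelJson(c.shortChannelId.toString, c.capacitySat))(format)
  }
))
```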
We are slowly dropping support for non-segwit outputs, as proposed in
https://github.com/lightningnetwork/lightning-rfc/pull/894
We can thus safely allow dust limits all the way down to 354 satoshis.
In very rare cases where dust_limit_satoshis is negotiated to a low value,
our peer may generate closing txs that will not correctly relay on the
bitcoin network due to dust relay policies.
When that happens, we detect it and force-close instead of completing the
mutual close flow.
The fee that's recorded in the path-finding metrics should count the local channel. Without it we record failing payments with a fee budget larger than the fee recorded for the successful payment to the same node.
* optionally record path-finding metrics
This is useful in feature branches.
Also rename `recordMetrics` to `recordPathFindingMetrics`.
* rename channel features attribute
It was a leftover from init features, which can be activated or not. But
in the context of the `ChannelFeatures` object, that naming was
confusing because all features are activated in that context.
* minor refactoring on channel_type
Moved some logic outside `Peer`.
* refactor `RouteParams`
`PathFindingConf` and `RouteParams` have almost the same set of params,
but some of them don't have the same structure (flat vs hierarchical).
The separation between `ExtendedBitcoinClient` and `BitcoinCoreWallet` has
become very blurry since anchor outputs: eclair now requires fee bumping
utilities from the underlying bitcoin wallet, and it's not yet clear what
the interface should be. The on-chain utility methods that were added to
the eclair API also made it awkward to cleanly separate concerns.
We completely remove the `BitcoinCoreWallet` and merge it inside the
bitcoin client. We may in the future re-introduce a cleaner on-chain wallet
abstraction, but that can only happen once we have stable fee bumping
mechanisms.
Add support for https://github.com/lightningnetwork/lightning-rfc/pull/824
When the channel type is anchor outputs with zero fee htlc txs, we set
the fees for the htlc txs to 0.
An important side-effect is that it changes the trimmed-to-dust calculation,
and outputs that were previously dust can now be included in the commit tx.
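A simplified sketch of the trimming threshold for an offered htlc, showing why zero-fee htlc txs change which outputs are dust (names and parameters are illustrative):
```scala
def offeredHtlcTrimThresholdSat(dustLimitSat: Long,
                                feeratePerKw: Long,
                                htlcTimeoutWeight: Int,
                                zeroFeeHtlcTx: Boolean): Long = {
  // with zero-fee htlc txs the htlc transaction fee is 0, so only the dust limit matters
  val htlcTxFeeSat = if (zeroFeeHtlcTx) 0L else feeratePerKw * htlcTimeoutWeight / 1000
  dustLimitSat + htlcTxFeeSat
}
```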
* disable pg lock auto-release in tests
It relies on akka's coordinated shutdown and causes the test jvm to
halt.
* fixup! Remove `messageFlags` from `ChannelUpdate` (#1941)
Add AB testing framework:
- Experiments are added by adding a section in router.path-finding config. Each experiment can have different parameters.
- Traffic is randomly split among the different experiments. The size of each experiment is configurable. 0% experiments don't affect traffic but can be triggered manually with the API.
- Metrics are recorded in the audit database
We make it a serialization detail, which it should be. The `derive`
method from `scodec` makes it very easy to do. We should probably always
use a dedicated class to handle flags, instead of using the `byte` codec
and binary operators.
This allows us to remove the `require` in the `ChannelUpdate` definition,
which recently bit us in testing.
The only annoying thing is that we still need to expose a `messageFlags`
method in order to populate the `ChannelDisabled` error message.
We also typeify channel flags, as an alternative to passing around a `Byte`.
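A sketch of what the typed flags could look like with scodec (the bit layout shown is illustrative):
```scala
import scodec.Codec
import scodec.codecs._

case class ChannelFlags(announceChannel: Boolean)

val channelFlagsCodec: Codec[ChannelFlags] =
  ignore(7).dropLeft(bool).xmap(b => ChannelFlags(announceChannel = b), _.announceChannel)
```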