* Split feerate mismatch configuration
We want to be much stricter with feerates that are below our estimation
than feerates that are above it.
This also makes this configuration parameter easier to understand
for end users.
* Tolerate feerate mismatch while channel is unused
We can relax the conditions where we close a channel because of a feerate
mismatch: when the channel has no pending HTLCs, it's ok to temporarily
disagree on the feerate.
While we disagree on feerates, we don't use this channel to offer outgoing
HTLCs. If we receive an incoming HTLC, we have to close the channel because
that HTLC would be at risk (incorrect feerate).
This mechanism gives us time to adapt to feerate changes, hopefully reducing
the number of unnecessary channel closures.
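
The two changes above can be sketched together as a single decision function. This is a minimal illustration, not the actual implementation: `FeerateTolerance`, `ratioLow`, `ratioHigh`, `Decision` and `decide` are hypothetical names, and the ratios used in practice are whatever the node operator configures.

```scala
object FeerateMismatchSketch {
  // Hypothetical config: separate tolerance below and above our own estimate.
  final case class FeerateTolerance(ratioLow: Double, ratioHigh: Double)

  sealed trait Decision
  case object Accept extends Decision        // feerates agree closely enough
  case object PauseOutgoing extends Decision // mismatch, but no HTLC at risk: stop offering HTLCs
  case object CloseChannel extends Decision  // mismatch while HTLCs are pending

  // True when the channel feerate falls outside the tolerated band around our estimate.
  def isMismatch(tolerance: FeerateTolerance, ourEstimate: Long, channelFeerate: Long): Boolean =
    channelFeerate < ourEstimate * tolerance.ratioLow ||
      channelFeerate > ourEstimate * tolerance.ratioHigh

  def decide(tolerance: FeerateTolerance, ourEstimate: Long, channelFeerate: Long, pendingHtlcs: Int): Decision =
    if (!isMismatch(tolerance, ourEstimate, channelFeerate)) Accept
    else if (pendingHtlcs == 0) PauseOutgoing
    else CloseChannel
}
```

With a strict lower ratio and a loose upper ratio, a feerate well above our estimate is accepted while one well below it is not, which is the asymmetry the split configuration is meant to allow.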
The channelstats API only returns results for the *outgoing* channels
used when relaying. We must also include the *incoming* channels, otherwise
it looks like they're inactive which doesn't reflect their real usage.
Fixes #1465
- Test was not executed (because the "tests" variable was an iterator that was emptied by the call to .size())
- HTLC regex had to be updated to skip over the HTLC number that was added to the reference test vectors
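
The first problem is easy to reproduce in isolation. The sketch below uses hypothetical names (`runBuggy`, `runFixed`) and assumes an iterator that does not know its size up front, so that `size` has to walk it to the end in order to count the elements.

```scala
object IteratorSizeSketch {
  // Buggy shape: size consumes the iterator, so the loop body never runs.
  // Returns (number of tests counted, number of tests executed).
  def runBuggy(tests: Iterator[String]): (Int, Int) = {
    var executed = 0
    val total = tests.size            // walks the iterator to its end
    tests.foreach(_ => executed += 1) // nothing left to iterate over
    (total, executed)
  }

  // Fixed shape: materialize once, then size and foreach both see every element.
  def runFixed(tests: Iterator[String]): (Int, Int) = {
    var executed = 0
    val materialized = tests.toList
    val total = materialized.size
    materialized.foreach(_ => executed += 1)
    (total, executed)
  }
}
```

The Scala documentation recommends discarding an iterator after calling a method such as `size` on it, so converting to a strict collection first is the safer pattern.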
* Add metric to track onion payload format
This will be useful to decide when we can safely phase out support for
the legacy format.
* Add metric to track htlcs in flight
We track both the number of HTLCs and their amounts.
We track this at the channel level and globally.
If all trampoline retries fail, we should convert the error to a route
not found. We tried multiple fee targets and none of those was enough to
allow the trampoline node to find a satisfying route.
MPP lifecycle shares preimage as soon as received.
This allows removing the use of the node-relayer as a passthrough for
fulfills: it can now simply listen to this event.
Long term, this could be sent to the event stream to share with more actors.
* PaymentLifecycle cleanup
Remove temporary hooks added for first version of MPP (route prefix,
empty routes, etc).
Allow specifying the whole route (not only nodeIds) in SendPaymentToRoute.
* Rework MultiPartPaymentLifecycle
Use the Router's new MPP RouteRequest.
Remove the "blind" split based on channel balances.
Reactivate trampoline relay to MPP non-trampoline recipient.
* Add MPP payment metrics
* Activate MPP feature by default
This provider will save the feerates retrieved by another provider to
database.
This feature can be used to retrieve the last used feerates when starting
the node, which will save time. This can have a significant effect on nodes
running with a slow connection (e.g. mobile devices).
Note that this commit does not affect the current setup and does not
actually create the database, the feature must be implemented separately.
Fixes #1447
Legacy codecs are isolated in a separate file, with visibility restricted to "package" in order to reduce the risk of using those codecs. The codecs are also restricted to `decodeOnly` for the same reason.
* localPaymentBasepoint->staticPaymentBasepoint
Use `getOrElse` on the option value to decide between static/dynamic
payment point, instead of the `ChannelVersion` bit.
* define explicit method to test channel features
Also reduce visibility of a few members of `ChannelVersion`, and some
cleanup.
* use `if/else` instead of `match` for version bit
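
As a rough illustration of the `getOrElse` approach, here is a simplified sketch. `LocalParams`, `paymentPoint` and the string-typed keys are stand-ins invented for this example; they are not the real types.

```scala
object PaymentPointSketch {
  // Hypothetical, simplified stand-in: keys are plain strings for illustration only.
  final case class LocalParams(staticPaymentBasepoint: Option[String])

  // Use the static basepoint when one is present, otherwise fall back to a
  // dynamically derived point. The fallback is only evaluated when needed.
  def paymentPoint(params: LocalParams, deriveDynamic: () => String): String =
    params.staticPaymentBasepoint.getOrElse(deriveDynamic())
}
```

The presence of the optional value alone drives the choice, so no separate version bit has to be consulted at the call site.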
Leverage Yen's k-shortest paths and a simple split algorithm
to move MPP entirely inside the Router.
This is currently unused: the multipart payment lifecycle needs
to be updated to leverage this new algorithm.
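
To give an idea of what a simple split over ranked candidate routes can look like, here is a greedy sketch. It is an assumption-laden illustration, not the Router's algorithm: `Candidate`, `maxSendableMsat` and `split` are hypothetical, and the candidates are assumed to arrive already ranked (for example by a k-shortest-paths search).

```scala
object SimpleSplitSketch {
  // Hypothetical candidate route: an identifier plus the most it can carry (in msat).
  final case class Candidate(id: String, maxSendableMsat: Long)

  // Walk the candidates in order and assign as much as each can carry until the
  // amount is covered. Returns None when the candidates cannot carry it together.
  def split(amountMsat: Long, candidates: List[Candidate]): Option[List[(String, Long)]] = {
    @scala.annotation.tailrec
    def loop(remaining: Long, todo: List[Candidate], acc: List[(String, Long)]): Option[List[(String, Long)]] =
      if (remaining <= 0) Some(acc.reverse)
      else todo match {
        case Nil => None
        case c :: rest =>
          val part = math.min(remaining, c.maxSendableMsat)
          if (part <= 0) loop(remaining, rest, acc) // skip routes that cannot carry anything
          else loop(remaining - part, rest, (c.id, part) :: acc)
      }
    loop(amountMsat, candidates, Nil)
  }
}
```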
We go to the `CLOSING` state if and only if the funding tx has been spent by
one transaction. The `require` is absolutely necessary. We could
probably enforce this constraint at the compilation level by more clever
typing but that's another matter.
It should return stats for all channels (funder and fundee).
The previous code was incorrectly filtering on channels for which we paid
on-chain fees, excluding the channels for which we were fundee.
Fixes #1449.
Light clients don't always validate channels by fetching the blockchain tx.
That means they don't have access to the exact capacity.
When that happens, we can fall back to htlc_maximum_msat if
available, or to a default capacity (otherwise path-finding will ignore
these edges).
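
The fallback order can be expressed as a chain of options. In this sketch `effectiveCapacitySat` and `DefaultCapacitySat` are hypothetical names, and the default value is an arbitrary placeholder chosen for illustration.

```scala
object CapacityFallbackSketch {
  // Placeholder default (in satoshis); the real default is an assumption here.
  val DefaultCapacitySat: Long = 1000000L

  // Prefer the capacity from the validated funding tx, then htlc_maximum_msat
  // (converted from millisatoshis to satoshis), then the default.
  def effectiveCapacitySat(validatedCapacitySat: Option[Long], htlcMaximumMsat: Option[Long]): Long =
    validatedCapacitySat
      .orElse(htlcMaximumMsat.map(_ / 1000))
      .getOrElse(DefaultCapacitySat)
}
```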
When we announce a new public channel, make sure we don't override the
balance information with None.
Clean up IntegrationSpec warnings.
Fix PaymentLifecycle test: this test was broken by the recent changes in
BaseRouterSpec.
* Electrum client: downgrade log-level for errors that happened before a proper handshake
We only care about errors that we received during or after the "handshake" (i.e. the exchange of "version" messages).
Other errors, caused by dead or unresponsive servers (which are very common on testnet), just spam the logs.
* Electrum client pool: downgrade log level for errors with our secondary servers
We care about errors with our master electrum servers.
* Update Electrum checkpoints
Move the maximum fee computation outside of `findRoute`: this should be
done earlier in the payment pipeline if we want to allow accurate fee control
for MPP retries.
Right now MPP uses approximations when retrying which can lead to payments
that exceed the maximum configured fees. This is a first step towards
ensuring that this situation cannot happen anymore.
* Replace ArrayDeque by Queue
This is clearly not a hot path. This collection will have less than
10 elements and the bottleneck is rather on the dijkstra call.
ArrayDeque doesn't exist in Scala 2.11 so it makes the merge to
eclair-mobile more cumbersome.
* Use standard Scala collections in Dijkstra
Collections performance has improved greatly in Scala 2.13.
This change yields a consistent 10% improvement compared to the previous
implementation on my laptop based on the mainnet graph.
Both `Client` and `TransportHandler` were watching the connection actor,
which resulted in nondeterministic behavior during termination of
`PeerConnection`.
We now always return a message when a connection fails during
authentication.
Took the opportunity to add more typing (insert
deathtoallthestring.jpg).
When a new connection happens while already connected, the `Peer` will
switch to the new connection.
In the current implementation, upon receiving a `ConnectionReady`, the
`Peer` will kill the current connection, then send back the
`ConnectionReady` message to itself. In the meantime, the
`PeerConnection` will happily forward any incoming messages to the `Peer`.
This opens up to a race in the `Peer`'s mailbox between the
`ConnectionReady` message, and any incoming messages from the new
connection. If the latter win, they will get dropped because the `Peer`
is in state `DISCONNECTED`. Typically those are `ChannelReestablish`
messages and channels will get stuck in state `SYNCING`.
This PR makes the `Peer` atomically switch to the new connection, without
going back to the `DISCONNECTED` state. As a result, we now have a
`CONNECTED`->`CONNECTED` transition.
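
The atomic switch can be modeled as a pure state transition. This is a simplified sketch with hypothetical types (`State`, `Transition`, integer connection ids), not the actor code itself.

```scala
object PeerSwitchSketch {
  sealed trait State
  case object Disconnected extends State
  final case class Connected(connectionId: Int) extends State

  // Result of handling ConnectionReady: the next state and the connection to kill, if any.
  final case class Transition(next: State, toKill: Option[Int])

  // A new connection always wins. When already connected, we go straight from
  // Connected(old) to Connected(new) without passing through Disconnected.
  def onConnectionReady(current: State, newConnectionId: Int): Transition = current match {
    case Disconnected   => Transition(Connected(newConnectionId), None)
    case Connected(old) => Transition(Connected(newConnectionId), Some(old))
  }

  // Messages are only processed while connected, and only from the active connection.
  def accepts(state: State, fromConnectionId: Int): Boolean = state match {
    case Connected(id) => id == fromConnectionId
    case Disconnected  => false
  }
}
```

Because there is no intermediate `Disconnected` state, messages from the new connection that arrive right after the switch are accepted instead of being dropped.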
* Extract faulty channels selection from PaymentLifecycle
Move the logic of figuring out which channels/nodes should be ignored
when retrying after a payment failure out of the PaymentLifecycle.
We can figure this out looking only at the `PaymentFailure` generated,
and the multi-part logic could leverage these helpers.
* Refactor RouteResponse
It was useless to return `ignoreNodes` and `ignoreChannels`, it's rather
the responsibility of the caller (PaymentLifecycle) to store and update
these sets.
Preparing for the MPP move inside the router, we introduce a Route class
and let RouteResponse return a collection of Routes.
This creates some ugliness in PaymentLifecycle because of the `routePrefix`,
but this is just temporary: the `routePrefix` "hack" will be removed soon.
* Use channel capacity and balance in path-finding
The path finding algorithm uses channel capacity instead of htlcMaximumMsat.
It also takes into account channel balance when available and excludes
channels that don't have enough funds to relay the payment.
This change also fixes an off-by-one error in weight computation: we were
incorrectly applying a channel's fee to the amount that needs to be relayed
through that channel (whereas this is instead what the node needs to receive
to collect enough fee *before* relaying).
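
For reference, the relationship between what a relaying node forwards and what it must receive can be sketched as follows. This illustrates the standard BOLT 7 relay fee formula, not the specific fix in the weight computation; `Hop`, `nodeFee` and `amountToSend` are hypothetical names.

```scala
object RelayFeeSketch {
  // Fee parameters advertised for a channel by the node that relays over it.
  final case class Hop(feeBaseMsat: Long, feeProportionalMillionths: Long)

  // Fee charged for forwarding amountMsat (BOLT 7 formula, integer division).
  def nodeFee(hop: Hop, amountMsat: Long): Long =
    hop.feeBaseMsat + amountMsat * hop.feeProportionalMillionths / 1000000L

  // relayHops lists the channels used by relaying nodes, ordered from sender to
  // recipient (the sender's own channel is excluded since it charges no fee).
  // Working backwards from the recipient, each node must receive what it
  // forwards plus its fee on the forwarded amount.
  def amountToSend(finalAmountMsat: Long, relayHops: List[Hop]): Long =
    relayHops.foldRight(finalAmountMsat) { (hop, toForward) =>
      toForward + nodeFee(hop, toForward)
    }
}
```

The fee of a hop is always computed on the amount leaving that hop, which is why attaching it to the wrong amount shifts every upstream value.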
* Refactor Graph file
Add documentation, update comments, rename fields and reformat to (hopefully)
make the code clearer.
* Simplify path-finding implementation
There were a couple of confusing steps in the implementation of Yen's algorithm.
The first one was the computation of the `edgesToIgnore` and the specific
handling of the case i = 0. This specific case wasn't needed and made the
code a bit hard to read.
The second one was the weight provided to dijkstra for spur paths.
The weight of the root path was applied to the target node. It was probably
an attempt to take into account the fact that dijkstra wasn't computing
a complete path and that fees may not match, but it couldn't really work.
I removed that and added a fee check at the end of the path-finding.
* Update graph balance for duplicate channel_update
This case regularly happens after a restart: the router already has the
latest channel_update for that channel, but we want to update the graph's
balances because they are all at `None` after a restart.
* minor: catch harmless unhandled events
This prevents unnecessary warnings from clogging up the logs.
* fix race condition in test
Changing the fake ip address from 1.2.3.4:42000 to localhost:42000
in 32f15c85eb made the dummy connection
fail much faster, creating a race in the test. Reverting to the previous
ip and increasing the timeout should improve things a bit.
* Unlock transaction inputs if tx cannot be published
In some cases, funding a tx will work but publishing may fail (because mempool fees are not met for example).
In that case we need to make sure that the tx inputs are unlocked.
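
The intended behavior can be sketched with a toy wallet. Everything here is hypothetical (`Wallet`, `InMemoryWallet`, `publishOrUnlock`, string transaction ids) and only shows the shape of the error handling, not the real wallet API.

```scala
object UnlockOnFailureSketch {
  import scala.util.{Failure, Success, Try}

  // Hypothetical minimal wallet interface, for illustration only.
  trait Wallet {
    def publish(txId: String): Try[Unit]
    def unlock(txId: String): Unit
  }

  // Toy wallet that tracks which transactions currently hold locked inputs.
  final class InMemoryWallet(failPublish: Boolean) extends Wallet {
    private var locked = Set.empty[String]
    def lock(txId: String): Unit = locked += txId
    def isLocked(txId: String): Boolean = locked.contains(txId)
    def publish(txId: String): Try[Unit] =
      if (failPublish) Failure(new RuntimeException("mempool fees not met")) else Success(())
    def unlock(txId: String): Unit = locked -= txId
  }

  // Publish the tx; if publishing fails, release its inputs so they can be reused.
  def publishOrUnlock(wallet: Wallet, txId: String): Boolean =
    wallet.publish(txId) match {
      case Success(_) => true
      case Failure(_) =>
        wallet.unlock(txId)
        false
    }
}
```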
We should validate the announcement signatures in the channel too.
We still validate them in the router, but it's a different layer and the
check should happen as soon as possible.
We don't want to keep the channel open if the peer is sending us garbage.
This allows decoupling the reconnection task from the actual client
creation.
Also improved tests.
Co-Authored-By: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
This test was randomly failing. There may be a rounding error or a bit of
randomness somewhere in Bitcoin Core 0.19.1, sometimes a feerate right
below the limit is accepted into the mempool.
We've been witnessing random test suite freezes (for ages).
We've observed that when these freezes happen, there are usually a lot of
"too many open files" errors raised by the OS.
The backup handler is a likely culprit as the IntegrationSpec is running
multiple nodes and exchanging HTLCs at a fast rate.
At the very least, disabling it in tests won't hurt, and it will speed up
the test suite.
We also increase the file limits in CI providers, when possible.
Instead of waiting for htlc-success txs to be confirmed, eclair also looks
at mempool txs to detect preimages as soon as possible.
This has been the case for a very long time, but our integration tests
didn't showcase this correctly.
Refactored common watcher test helpers and added tests to
ZmqWatcherSpec.
This is almost a drop-in replacement. I had to relax compiler
parameters to allow deprecated features though.
Main changes:
- relaxed compiler parameters to minimize impact (e.g. allow
deprecated features)
- `scala.collection.JavaConverters` -> `scala.jdk.CollectionConverters`
- `MultiMap` -> `MultiDict`
Compilation is 25% faster on my machine, compiler is a bit more strict
(it found an "invalid comparison" bug).
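
As an illustration of the converters change, the Scala 2.13 import provides the same `asScala` / `asJava` extension methods under a new package. The helper names below are hypothetical.

```scala
object ConvertersSketch {
  // Scala 2.13 replacement for the deprecated scala.collection.JavaConverters.
  import scala.jdk.CollectionConverters._

  // Java -> Scala: wrap the Java list, then copy it into an immutable List.
  def toScala(javaList: java.util.List[String]): List[String] =
    javaList.asScala.toList

  // Scala -> Java: expose the Scala list through the java.util.List interface.
  def toJava(scalaList: List[String]): java.util.List[String] =
    scalaList.asJava
}
```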