The default is 360 minutes, which is very long.
Our builds usually take around 10 minutes, so 20 minutes is a good value
that prevents overbilling when a build ends up in an infinite loop.
We don't want a database error to cause a force-close of channels.
Database errors are more likely to happen when using Postgres, but can
also happen with Sqlite (e.g. in case of a full disk).
Since we always write to disk before sending messages, we should be able
to recover gracefully once the db issue is fixed and eclair is
restarted.
The `ReconnectionTask` was only catching
`ConnectionResult.Failure.ConnectionFailed`, which is a subset of
possible failures. It should instead have caught
`ConnectionResult.Failure`.
As a result, authentication and initialization failures were not caught
and didn't trigger reconnections.
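Below is a minimal, self-contained sketch (simplified names, not eclair's actual code) of why matching on the broader `ConnectionResult.Failure` type matters; the extra failure cases are assumptions added for illustration:
```scala
// Sketch only: a simplified ConnectionResult hierarchy, not eclair's real types.
object ReconnectionSketch extends App {
  sealed trait ConnectionResult
  object ConnectionResult {
    case object Connected extends ConnectionResult
    sealed trait Failure extends ConnectionResult
    object Failure {
      case object ConnectionFailed extends Failure      // TCP-level failure
      case object AuthenticationFailed extends Failure  // assumed example of another failure
      case object InitializationFailed extends Failure  // assumed example of another failure
    }
  }

  import ConnectionResult._

  // Before: only the TCP-level failure triggered a reconnection attempt.
  def shouldReconnectBefore(result: ConnectionResult): Boolean = result match {
    case Failure.ConnectionFailed => true
    case _ => false
  }

  // After: any failure triggers a reconnection attempt.
  def shouldReconnectAfter(result: ConnectionResult): Boolean = result match {
    case _: Failure => true
    case _ => false
  }

  assert(!shouldReconnectBefore(Failure.AuthenticationFailed)) // the bug: no reconnection
  assert(shouldReconnectAfter(Failure.AuthenticationFailed))   // the fix: reconnection scheduled
}
```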
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
Introduce a `TxPublisher` actor to publish channel txs.
Move logic from watcher to this new actor.
Remove the `TxSigningKit` abstraction that was introduced a bit too early.
The `TxPublisher` will hold all the publishing logic, so we start by providing
the full commitments; we'll extract more compact objects later.
We also now publish the commit-tx and its anchor-tx independently.
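A minimal sketch of the intended shape (assumed message names, not eclair's actual `TxPublisher`), with the commit tx and its anchor tx published independently:
```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

// Sketch only: "commitments" is a String placeholder for eclair's real Commitments type.
object TxPublisherSketch {
  sealed trait Command
  final case class PublishCommitTx(commitments: String) extends Command
  final case class PublishAnchorTx(commitments: String) extends Command

  def apply(): Behavior[Command] = Behaviors.receive { (context, message) =>
    message match {
      case PublishCommitTx(commitments) =>
        context.log.info(s"publishing commit tx for $commitments")
        Behaviors.same
      case PublishAnchorTx(commitments) =>
        // published independently from the commit tx
        context.log.info(s"publishing anchor tx for $commitments")
        Behaviors.same
    }
  }
}
```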
I'll add two of mine later, once the documentation is there and I've made sure they work as intended.
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
* preserve pg lock exception cause
* specialize connections by backend type
* added concurrency test on channels table
This test unveils a concurrency issue in the upsert logic of the local
channels db, with the following error being thrown when we update many
channels concurrently:
```
Canceled on identification as a pivot, during conflict out checking
```
* use pg upsert construct
This is the recommended pattern according to the postgres doc
(https://www.postgresql.org/docs/current/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE):
> It is recommended that applications use INSERT with ON CONFLICT DO
> UPDATE rather than actually using this pattern.
(A sketch of this upsert is included after this list.)
* reproduce and fix same issue in peers db
* proper formatting
* prefix sqlite-specific tests
* fix double init tests
Due to how the `TestDatabases` is instantiated, calling `dbs.dbName`
twice was a no-op.
* add jdbcurl files to gitignore
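A minimal sketch of the upsert pattern mentioned above, assuming simplified table and column names (the real schema may differ):
```scala
import java.sql.Connection

object UpsertSketch {
  /** A single atomic statement: safe when many updates of the same channel run concurrently. */
  def upsertChannel(pg: Connection, channelId: Array[Byte], data: Array[Byte]): Unit = {
    val sql =
      """INSERT INTO local_channels (channel_id, data)
        |VALUES (?, ?)
        |ON CONFLICT (channel_id)
        |DO UPDATE SET data = EXCLUDED.data""".stripMargin
    val statement = pg.prepareStatement(sql)
    try {
      statement.setBytes(1, channelId)
      statement.setBytes(2, data)
      statement.executeUpdate()
    } finally {
      statement.close()
    }
  }
}
```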
Electrum support was only provided for mobile wallets; server nodes should always
run a bitcoind node, as this provides more control (especially for utxo
management with anchor outputs channels).
Since wallets will use https://github.com/acinq/eclair-kmp instead of eclair,
we can now remove Electrum support from eclair, as well as the 3rd-party fee
API providers that were only used by wallets.
We need to wait for the channel relayer to be registered to the event stream
before publishing an event, otherwise the channel relayer will never receive it.
We send it a first message and wait for its response to ensure it has had time
to initialize correctly.
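A minimal sketch of this hand-shake (message and type names are assumptions, not eclair's actual relayer API): the caller asks a first question and only starts publishing events once it has received the answer.
```scala
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.eventstream.EventStream

object RelayerInitSketch {
  final case class ChannelEvent(description: String)

  sealed trait Command
  final case class GetRegisteredChannels(replyTo: ActorRef[Set[String]]) extends Command
  private final case class WrappedEvent(event: ChannelEvent) extends Command

  def apply(): Behavior[Command] = Behaviors.setup { context =>
    // the event stream subscription is part of the actor's initialization...
    val adapter = context.messageAdapter[ChannelEvent](WrappedEvent.apply)
    context.system.eventStream ! EventStream.Subscribe(adapter)
    Behaviors.receiveMessage {
      // ...so if we were able to answer this first query, initialization has completed
      case GetRegisteredChannels(replyTo) =>
        replyTo ! Set.empty
        Behaviors.same
      case WrappedEvent(event) =>
        context.log.info(s"relaying ${event.description}")
        Behaviors.same
    }
  }
}
```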
More symmetry between postgres and sqlite init.
* define SqliteDatabases and PostgresDatabase
Those classes implement the traits `Databases`, `FileBackup` and
`ExclusiveLock`.
The goal is to have access to backend-specific attributes, particularly
in tests. It arguably also makes `Databases` cleaner and simpler, with a
nice symmetry between the `apply` methods (see the sketch after this list).
* replace 5s lock timeout by NOLOCK
* use chaindir instead of datadir for jdbcurl file
It is more consistent with sqlite, and makes sense because we don't want
to mix up testnet and mainnet databases.
* add tests on locks and jdbc url check
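A minimal sketch of the backend-specific classes referenced above (trait names come from the description; their members and which backend implements which trait are assumptions for illustration):
```scala
import java.io.File

trait Databases
trait FileBackup { def backup(target: File): Boolean }
trait ExclusiveLock { def obtainExclusiveLock(): Unit }

// assumption: sqlite databases can be backed up by copying files
class SqliteDatabasesSketch extends Databases with FileBackup {
  override def backup(target: File): Boolean = true
}

// assumption: postgres relies on an exclusive lock so a single eclair instance uses the db
class PostgresDatabasesSketch extends Databases with ExclusiveLock {
  override def obtainExclusiveLock(): Unit = ()
}
```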
Reverse the flow of Yen's k-shortest paths algorithm: go backwards, like
we do in Dijkstra.
Better tracking of already explored spur paths, which improves
performance (especially tail latency).
An interesting side-effect of anchor outputs is that htlc txs can be merged
when they have the same lockTime (thanks to sighash flags).
We're not currently doing that, but our peers may do it, so we need to handle
it in the revoked commit tx case and correctly claim multiple outputs if
necessary.
With anchor outputs, we not only deduct the commit tx fee from the funder's
main output, but the cost of the anchors as well.
We rename the corresponding function for clarity.
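A minimal sketch of that computation (function and parameter names are made up for illustration; the 330 sat per-anchor amount comes from the anchor outputs spec):
```scala
object CommitFeeSketch extends App {
  val anchorAmountSat = 330L // per-anchor amount defined by the anchor outputs spec

  // The funder pays the commit tx fee, plus the two anchors when anchor outputs are used.
  def funderMainOutputSat(toLocalSat: Long, commitTxFeeSat: Long, anchorOutputs: Boolean): Long = {
    val deduction = if (anchorOutputs) commitTxFeeSat + 2 * anchorAmountSat else commitTxFeeSat
    toLocalSat - deduction
  }

  println(funderMainOutputSat(toLocalSat = 100000L, commitTxFeeSat = 2000L, anchorOutputs = true)) // 97340
}
```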
It can be useful for newcomers to have a high-level view of the main
components in eclair. This will help them quickly find where they should
start digging into actual code to achieve what they want.
We're leaving a lot of details out to ensure this document stays up-to-date
and doesn't need to completely change every time we slightly rework internal
details of our architecture.
This commit clarifies some parts of the code on which we regularly have
questions during pull requests.
We also add a test for an edge case in shutdown that was correctly handled,
but not properly tested, to prevent regressions.
Re-work the `CommitPublished` types to work better with anchor outputs.
We previously stored the txs spending utxos that we could claim: this
doesn't make sense anymore if these txs may be RBF-ed, because the final
tx will be different from the initial one.
We instead track what `OutPoint`s we can claim, and the information
necessary to claim them. This way we can in the future let a different
actor finalize the txs that spend these outpoints (set the fees and sign).
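A minimal sketch of the idea (field and type names are assumptions, not eclair's actual `CommitPublished` types): track claimable outpoints and the data needed to spend them, instead of pre-built transactions that become stale once fee-bumped versions are published.
```scala
object CommitPublishedSketch {
  final case class OutPoint(txId: String, outputIndex: Long)
  final case class ClaimInfo(amountSat: Long, redeemInfo: String)

  final case class LocalCommitPublished(
    commitTxId: String,
    // what we can claim and how: setting fees and signing can be done later, by another actor
    claimableOutpoints: Map[OutPoint, ClaimInfo],
    // outpoints already spent by a confirmed tx, keyed to the spending txid
    irrevocablySpent: Map[OutPoint, String]
  )
}
```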
We also add information on mutual close txs to immediately identify our
output and its amount: this makes it very easy to audit from the API how many
sats we'll get back when we have many channels to watch.
This commit contains a DB migration of the channel data types, but in a
backwards-compatible way: we can still read old data. The only
scenario impacted is channels that started force-closing before the migration.
They need special care to handle the fact that they had less data than
migrated channels, which is why we keep some legacy code around.
The goal is to ensure maximum safety when migrating data.
To achieve this, we strictly segregate each codec version (to make sure
that we don't accidentally mix codec versions), while still letting
tests access each unitary codec (which using inner `private` classes
would have prevented). Relevant tests only need to be moved to the
appropriate package.
The package structure is now:
```
wire
|-- internal
|   |-- channel
|   |   |-- version0
|   |   |   `-- ChannelCodecs0
|   |   |-- version1
|   |   |   `-- ChannelCodecs1
|   |   `-- ChannelCodecs
|   `-- CommandCodecs
`-- others
```
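A minimal scodec sketch of the top-level dispatch (tags and payload codecs are made up; eclair's real codecs are far more involved): each version lives in its own object, and the version byte read from the db selects which codec decodes the payload.
```scala
import scodec.Codec
import scodec.codecs._

// Sketch only: each version's codec is isolated in its own object.
object ChannelCodecs0Sketch { val dataCodec: Codec[String] = utf8_32 }
object ChannelCodecs1Sketch { val dataCodec: Codec[String] = utf8_32 }

object ChannelCodecsSketch {
  // the version byte selects the codec used to decode the payload
  val channelDataCodec: Codec[String] = discriminated[String].by(uint8)
    .typecase(0x00, ChannelCodecs0Sketch.dataCodec)
    .typecase(0x01, ChannelCodecs1Sketch.dataCodec)
}
```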
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
We incorrectly applied the error handlers at each sub-route instead of applying
them once after grouping all sub-routes together. The result was that only `getinfo`
could actually be called.
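A minimal akka-http sketch of the fix (handler and route names are assumptions): the handlers wrap the concatenation of all sub-routes, not each sub-route separately.
```scala
import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.{ExceptionHandler, RejectionHandler, Route}

object ApiErrorHandlingSketch {
  val apiExceptionHandler: ExceptionHandler = ExceptionHandler {
    case t: Throwable => complete(StatusCodes.InternalServerError, t.getMessage)
  }
  val apiRejectionHandler: RejectionHandler = RejectionHandler.default

  // error handling is applied once, around the concatenation of all sub-routes
  def finalRoute(getInfo: Route, channels: Route, payments: Route): Route =
    handleExceptions(apiExceptionHandler) {
      handleRejections(apiRejectionHandler) {
        concat(getInfo, channels, payments)
      }
    }
}
```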
* rename `Auditor` to `DbEventHandler`
Move it to the `db` package (it was in the `payments` package for
historical reasons but no longer deals only with payments).
Better typing for channel lifecycle events.
* add meta info to local_channels table
Here is the rationale for implementing channel metadata as additional
columns in the `local_channels` table of the `channels` db, as opposed
to a dedicated `channel_metadata` table in the `audit` db:
1) There is a migration to do (in the `local_channels` table, no less!),
but it's only a table migration, as opposed to the data migration we
would need if we had to populate a new table in a separate database.
2) We don't need to worry about creating a new metadata row when a new
channel is created (no add-or-update logic): we only _update_ optional
columns in a best-effort manner.
3) We don't need to worry about inconsistencies between two tables
located in two separate databases (that's a big one).
4) We may want to use the metadata during operations, not just for
audit purposes, for example to close channels that have stayed unused
for a long time.
5) The audit db is an append-only log of events and shouldn't be used
for anything else: there is no `UPDATE` sql statement in
`*AuditDb.scala`, and a `channel_metadata` table would break that invariant.
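A minimal sketch of the corresponding table migration (column names are assumptions; the actual metadata columns may differ):
```scala
import java.sql.Connection

object ChannelMetadataMigrationSketch {
  // Optional (nullable) columns added to local_channels: no second table to keep in sync.
  def migrate(connection: Connection): Unit = {
    val statement = connection.createStatement()
    try {
      statement.executeUpdate("ALTER TABLE local_channels ADD COLUMN created_timestamp BIGINT")
      statement.executeUpdate("ALTER TABLE local_channels ADD COLUMN last_payment_sent_timestamp BIGINT")
      statement.executeUpdate("ALTER TABLE local_channels ADD COLUMN last_payment_received_timestamp BIGINT")
    } finally {
      statement.close()
    }
  }
}
```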
When a node returns a TemporaryChannelFailure, we should ignore that channel
when retrying. In some cases (such as channels coming from routing hints) this
was not correctly handled.
Fixes #1725
When we receive a trampoline payment asking us to relay with an
expiry that is already below the current chain height, we know that
this payment will fail when it reaches the next trampoline node.
Instead of waiting for the next trampoline node to reject it, we reject
it immediately.
Refactor the API handlers.
Split handlers and directives into several files to make them more composable.
Co-authored-by: Pierre-Marie Padiou <pm47@users.noreply.github.com>
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
We previously relied on `context.child` to check whether we already had a
relay handler for a given payment_hash.
Unfortunately, this could return an actor that is currently stopping itself.
When that happens, our relay command can end up in dead letters, and the
payment will neither be relayed nor failed upstream.
We fix that by maintaining the list of current relay handlers in the
NodeRelayer and removing them from the list before stopping them.
This is similar to what's done in the MultiPartPaymentFSM.
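A minimal sketch of that pattern (simplified message and type names, not eclair's actual NodeRelayer): the map of handlers is owned by the relayer, and an entry is removed before its handler is stopped.
```scala
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

object NodeRelayerSketch {
  type PaymentHash = String

  sealed trait Command
  final case class Relay(paymentHash: PaymentHash) extends Command
  final case class RelayComplete(paymentHash: PaymentHash) extends Command

  def apply(children: Map[PaymentHash, ActorRef[Relay]] = Map.empty): Behavior[Command] =
    Behaviors.receive { (context, message) =>
      message match {
        case relay @ Relay(paymentHash) =>
          children.get(paymentHash) match {
            case Some(handler) =>
              handler ! relay
              Behaviors.same
            case None =>
              val handler = context.spawnAnonymous(relayHandler())
              handler ! relay
              apply(children + (paymentHash -> handler))
          }
        case RelayComplete(paymentHash) =>
          // remove the handler from the map before stopping it: new commands for the same
          // payment_hash will spawn a fresh handler instead of hitting a dying actor
          children.get(paymentHash).foreach(context.stop)
          apply(children - paymentHash)
      }
    }

  // placeholder for the per-payment relay handler
  private def relayHandler(): Behavior[Relay] = Behaviors.receiveMessage(_ => Behaviors.same)
}
```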
It makes sense to let node operators configure the value they want
to use as a maximum threshold for the anchor outputs commitment tx feerate.
This lets them raise this value when mempools start getting full,
in anticipation of a potential rise of the min-relay-fee.
This value can also be overridden for specific nodes.
We previously used a Set, which means a feature could theoretically be
activated as both `optional` and `mandatory`.
We change that to a Map of `feature -> support`.
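A minimal sketch of the difference (feature names are just examples):
```scala
object FeaturesSketch {
  sealed trait FeatureSupport
  case object Optional extends FeatureSupport
  case object Mandatory extends FeatureSupport

  sealed trait Feature
  case object DataLossProtect extends Feature
  case object AnchorOutputs extends Feature

  // With a Set[(Feature, FeatureSupport)], (AnchorOutputs, Optional) and
  // (AnchorOutputs, Mandatory) could coexist; a Map keyed by feature makes that impossible.
  val activated: Map[Feature, FeatureSupport] = Map(
    DataLossProtect -> Mandatory,
    AnchorOutputs -> Optional
  )
}
```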
* add tests on funding mindepth
We verify that when using wumbo channels:
- if we are the funder, we keep our regular min_depth
- if we are the fundee, we use a greater min_depth
* use lenses to simplify tags handling
Co-authored-by: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>
When using anchor outputs, the commitment feerate is kept low (<10 sat/byte).
When we need to force-close a channel, we must ensure the commit tx and htlc
txs confirm before a given deadline, so we need to increase their feerates.
This is currently done only once, at broadcast time.
We use CPFP for the commit tx and RBF for the htlc txs.
If publishing fails because we don't have enough utxos available, it will
be retried after the next block is confirmed.
Note that it's still not recommended to activate anchor outputs.
More work needs to be done on this fee bumping logic and on utxo management.
If a channel closes after we've received an UpdateFailHtlc and signed it, but
have not yet received our peer's revocation, we need to fail the htlc upstream.
That specific scenario was not correctly handled, resulting in upstream
htlcs that were never failed, which would force our upstream peer to close
the channel.
Since we almost always know which transactions will spend the utxos that we are watching, we can optimize the watcher to look for those instead of starting from scratch.
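A minimal sketch of the idea (type names are assumptions, not the watcher's actual API): when the spending transaction is already known, the watch carries candidate txids so the watcher can look for them directly instead of scanning every block from scratch.
```scala
object WatcherHintSketch {
  final case class OutPoint(txId: String, outputIndex: Long)

  sealed trait Watch
  // generic watch: we don't know the spending tx, so we must scan blocks for any spender
  final case class WatchSpent(utxo: OutPoint) extends Watch
  // optimized watch: we already know which txs may spend the utxo (e.g. our own htlc txs),
  // so the watcher can check for those txids first
  final case class WatchSpentWithHints(utxo: OutPoint, candidateTxIds: Set[String]) extends Watch
}
```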