* added a revocation timeout
If a peer doesn't quickly reply to a `commit_sig`, we assume that it is
experiencing technical issues and we disconnect. This causes pending
(unsigned) `update_add_htlc`s to be fast-failed, and will hopefully limit
the number of htlcs that time out in the network.
By default we wait 20 seconds, configurable with
`eclair.revocation-timeout`.
This fixes #745.
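As a rough sketch of the mechanism, with illustrative names rather than eclair's actual `Channel` internals:
```scala
import scala.concurrent.duration._

// mirrors the eclair.revocation-timeout setting (default: 20 seconds)
val revocationTimeout: FiniteDuration = 20.seconds

// armed when we send commit_sig, cleared when revoke_and_ack arrives
def armDeadline(): Deadline = revocationTimeout.fromNow

// when the timer fires: if the peer still hasn't replied, disconnect,
// which fast-fails the pending unsigned update_add_htlc messages
def shouldDisconnect(pendingRevocation: Option[Deadline]): Boolean =
  pendingRevocation.exists(_.isOverdue())
```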
We need to put back watches on restart in "future remote commit published" scenarios, otherwise we will never consider the channel closed if we restart before the "claim main output" tx is confirmed. Note that there was no risk of losing funds, but the channel would have lingered forever.
We can now use the `overrideDefaults` parameter in `Setup` to programmatically
provide a custom electrum address when starting eclair, by setting the
`eclair.electrum.host` and `eclair.electrum.port` entries in the configuration.
When these entries are set, eclair-core will always try to connect to this server
instead of relying on a random server picked from the preset lists.
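For example (a sketch; check the actual `Setup` constructor signature for your eclair version before relying on it):
```scala
import com.typesafe.config.ConfigFactory

// build a Config that overrides only the Electrum server entries
// (electrum.example.com is a hypothetical host)
val customElectrum = ConfigFactory.parseString(
  """
    |eclair.electrum.host = "electrum.example.com"
    |eclair.electrum.port = 50002
  """.stripMargin)

// then pass it when starting eclair programmatically, e.g.:
// val setup = new Setup(datadir, overrideDefaults = customElectrum)
```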
We persist htlc data in order to be able to claim htlc outputs in
case a revoked tx is published by our counterparty, so only htlcs
above the remote's `dust_limit` matter.
Removed the TODO because we need data to be indexed by commit number,
so it is ok to write the same htlc data for every commitment it is
included in.
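A simplified sketch of the idea (real trimming also accounts for the htlc transaction fee; names here are illustrative):
```scala
final case class Htlc(id: Long, amountSat: Long)

// only htlcs above the remote dust_limit create an output in the remote
// commitment tx, so only those can be claimed if a revoked tx is published
def htlcsToPersist(htlcs: Seq[Htlc], remoteDustLimitSat: Long): Seq[Htlc] =
  htlcs.filter(_.amountSat >= remoteDustLimitSat)

// data is indexed by commit number, so the same htlc is written once per
// commitment that includes it
type HtlcsByCommitNumber = Map[Long, Seq[Htlc]]
```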
* Add instructions for Bitcoin Core 0.17.0 [ci skip]
Bitcoin Core 0.17.0 deprecates the `signrawtransaction` RPC call, which will be removed in version 0.18.0. You need to re-enable this call (by adding `deprecatedrpc=signrawtransaction` to your `bitcoin.conf`) if you want your eclair node to use a 0.17.0 node.
* README: add an example of how to use the new bitcoin.conf sections [ci skip]
* updated to scalatest 3.0.5
* use scalatest runner instead of junit
Output is far more readable, and makes console (incl. travis) reports
actually usable.
Turned off test logs as error reporting is enough to figure out what
happens.
The only downside is that we can't use junit's categories to group
tests, like we did for docker-related tests. We could use nested suites,
but that seems overkill, so I just removed the categories. Users
will only have the option to either skip or run all tests.
* update scala-maven-plugin to 3.4.2
NB: This requires maven 3.5.4, which means that we currently need to
manually install maven on travis.
Also updated Docker java version to 8u181 (8u171 for compiling).
* Logging: use a rolling file appender
Use one file per day, and keep 90 days of logs with a total size
capped at 5 GB.
* Router: log routing broadcast in debug level only
When updating the relay fee in state OFFLINE, the new `channel_update` must
have the disabled flag set.
This caused tests to be flaky; added the necessary checks to make sure
they always fail if that kind of regression happens again.
Previously it was only possible to update the relay fee in NORMAL state,
which is not very convenient because at any given time there are usually
some channels in OFFLINE state.
This works like the NORMAL case, except that the new `channel_update`
won't be broadcast immediately. It will be sent out next time the
channel goes back to NORMAL, in the same `channel_update` that sets the
`enable` flag to true.
Also added a default handler that properly rejects the
`CMD_UPDATE_RELAY_FEE` command in all other states.
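A minimal sketch of the resulting behavior (the state names match eclair's, the function itself is illustrative):
```scala
sealed trait State
case object NORMAL extends State
case object OFFLINE extends State
case object CLOSING extends State

final case class ChannelUpdate(feeBaseMsat: Long, feeProportionalMillionths: Long, enable: Boolean)

def updateRelayFee(state: State, base: Long, proportional: Long): Either[String, ChannelUpdate] =
  state match {
    // NORMAL: the new enabled channel_update is broadcast immediately
    case NORMAL  => Right(ChannelUpdate(base, proportional, enable = true))
    // OFFLINE: store a disabled channel_update; the enabled version goes
    // out when the channel comes back to NORMAL
    case OFFLINE => Right(ChannelUpdate(base, proportional, enable = false))
    // every other state rejects the command
    case _       => Left("CMD_UPDATE_RELAY_FEE is not valid in this state")
  }
```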
This fixes #695, and also adds the channel point to the default channel output.
```bash
$ ./eclair-cli channel 00fd4d56d94af93765561bb6cb081f519b9627d3f455eba3215a7846a1af0e46
{
  "nodeId": "0232e20e7b68b9b673fb25f48322b151a93186bffe4550045040673797ceca43cf",
  "shortChannelId": "845230006070001",
  "channelId": "00fd4d56d94af93765561bb6cb081f519b9627d3f455eba3215a7846a1af0e46",
  "state": "NORMAL",
  "balanceSat": 9858759,
  "capacitySat": 10000000,
  "channelPoint": "470eafa146785a21a3eb55f4d327969b511f08cbb61b566537f94ad9564dfd00:1"
}
```
Bitcoin Core returns a `missing inputs (code: -25)` error if the tx that we want to publish has already been published and its outputs have been spent. When we receive this error, we try to fetch the tx, in order to know whether it is in the blockchain, or whether its inputs were spent by another tx.
Note: if the outputs of the tx were still unspent, Bitcoin Core would return `transaction already in block chain (code: -27)`, and this case is already handled.
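The decision logic, sketched (the error codes are Bitcoin Core's; the surrounding names are illustrative):
```scala
sealed trait PublishResult
case object AlreadyInBlockchain extends PublishResult
case object InputsSpentByConflictingTx extends PublishResult
case object UnexpectedError extends PublishResult

def handlePublishError(code: Int, txFoundInBlockchain: Boolean): PublishResult =
  code match {
    case -27                        => AlreadyInBlockchain // outputs still unspent
    case -25 if txFoundInBlockchain => AlreadyInBlockchain // tx known, nothing to do
    case -25                        => InputsSpentByConflictingTx
    case _                          => UnexpectedError
  }
```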
This is a simple optimisation: we don't have to keep every `update_fee`, just the last one.
Cf. BOLT 2:
> An update_fee message is sent by the node which is paying the Bitcoin fee. Like any update, it's first committed to the receiver's commitment transaction and then (once acknowledged) committed to the sender's. Unlike an HTLC, update_fee is never closed but simply replaced.
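A sketch of the replacement over the list of proposed (unsigned) changes; the types are illustrative stand-ins for eclair's update messages:
```scala
sealed trait UpdateMessage
final case class UpdateFee(feeratePerKw: Long) extends UpdateMessage
final case class UpdateAddHtlc(id: Long, amountMsat: Long) extends UpdateMessage

// an update_fee is never "closed", just replaced: drop any pending
// update_fee before appending the new one
def addFee(proposed: List[UpdateMessage], fee: UpdateFee): List[UpdateMessage] =
  proposed.filterNot(_.isInstanceOf[UpdateFee]) :+ fee
```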
* Fix handling of born again channels
When we receive a recent update for a channel that we had marked as stale,
we must send a query to the underlying transport, not to the origin of the
update (which would send the query back to the router).
If we don't have the origin, it means that we have already forwarded the fulfill, so that's not a big deal.
This can happen if they send a signature containing the fulfill, then fail the channel before we have had time to sign it.
* Router: reset sync state on reconnection
When we're reconnected to a peer we will start a new sync process and should reset our sync
state with that peer.
Fixed a regression caused by 2c1811d: we no longer force sending a
channel_update at the same time as the channel_announcement.
This greatly simplifies the rebroadcast logic, and is what caused the
integration test to fail.
Added a proper test on Peer that tests the actor, not only static methods.
Currently we don't remember channels that we have pruned, so we will happily revalidate the same channels again if a node re-sends them to us, and prune them again, a.k.a. the "zombie churn".
Before channel queries, rejecting a stale channel without validating it wasn't trivial: nodes sent us the `channel_announcement` before the `channel_update`s, and only after receiving the `channel_update`s could we know whether the channel was still stale. Since we had no way of requesting the `channel_announcement` for one particular channel, we would have had to buffer it, which would lead to potential DOS issues.
But now that we have channel queries, we can be much more efficient. The process goes like this:
(1) channel x is detected as stale and gets pruned; its id is added to the pruned db
(2) later on, we receive a `channel_announcement` from Eve; we ignore it because the channel is in the pruned db
(3) we also receive old `channel_update`s from Eve; we just ignore them because we don't know the channel
(4) then one day some other node, George, sends us the `channel_announcement`; we still ignore it because the channel is still in the pruned db
(5) but then George sends us recent `channel_update`s, and we know that the channel is back from the dead. We ignore those `channel_update`s, but we also remove the channel id from the pruned db, and we request the announcements for that channel from George
(6) George sends us the `channel_announcement` again, and we validate it
(7) George then sends us the `channel_update`s again, and we process them
(8) done!
This also allows removing the pruning that we were doing on the fly when answering routing table sync requests.
Needless to say, this leads to a huge reduction in CPU/bandwidth usage on well-connected nodes.
Fixes #623, #624.
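A sketch of the checks used in steps (2), (4) and (5), assuming an illustrative `PrunedDb` interface:
```scala
trait PrunedDb {
  def contains(shortChannelId: Long): Boolean
  def remove(shortChannelId: Long): Unit
}

// BOLT 7 suggests pruning a channel when its latest update is older than 2 weeks
def isStale(updateTimestamp: Long, now: Long): Boolean =
  now - updateTimestamp > 14 * 24 * 3600

// steps (2)/(4): ignore announcements for channels that sit in the pruned db
def shouldProcessAnnouncement(db: PrunedDb, shortChannelId: Long): Boolean =
  !db.contains(shortChannelId)

// step (5): a *recent* update for a pruned channel means it is back from the
// dead: un-prune it and ask the sender for the announcements
def handleUpdate(db: PrunedDb, shortChannelId: Long, timestamp: Long, now: Long): Option[Long] =
  if (db.contains(shortChannelId) && !isStale(timestamp, now)) {
    db.remove(shortChannelId)
    Some(shortChannelId) // id to request from the sender
  } else None
```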
Nodes currently receive tons of bogus channel_announcements, mainly with nonexistent or long-spent funding transactions. Of course those don't pass validation and are rejected, but that takes a significant amount of resources: bandwidth, multiple calls to bitcoind, etc.
On top of that, we forget those announcements as soon as we have rejected them, and will happily revalidate them the next time we receive them. As a result, a typical node on mainnet will validate tens of thousands of useless announcements every day.
As far as we know, this is apparently due to a bug in another implementation, but it may very well be used as a DOS attack vector in the future.
This PR adds a simple mechanism to react to misbehaving peers, handling three types of misbehavior:
(a) bad announcement sigs: this is a serious offense, so for now we just close the connection; in the future we will ban the peer for this kind of thing (the same way Bitcoin Core does)
(b) funding tx already spent: the peer sends us a channel_announcement, but the channel has been closed (funding tx already spent); if we receive too many of those, we will ignore future announcements from this peer for a given time
(c) same as (b), but the channel doesn't even exist (funding tx not found); this may be due to reorgs on testnet
Here again, this leads to a big reduction in CPU/bandwidth usage on well-connected nodes.
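A sketch of the reaction policy; the threshold and names are assumptions, not eclair's actual values:
```scala
sealed trait Misbehavior
case object BadAnnouncementSig extends Misbehavior // (a)
case object FundingTxSpent extends Misbehavior     // (b)
case object FundingTxNotFound extends Misbehavior  // (c)

sealed trait Reaction
case object CloseConnection extends Reaction
case object IgnoreAnnouncements extends Reaction // for a given time
case object Tolerate extends Reaction

def react(m: Misbehavior, recentOffenses: Int, threshold: Int = 5): Reaction =
  m match {
    case BadAnnouncementSig => CloseConnection // serious offense; later: ban
    case FundingTxSpent | FundingTxNotFound =>
      if (recentOffenses >= threshold) IgnoreAnnouncements else Tolerate
  }
```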
* Improve startup error handling
* minor changes to ZMQActor
The scheduler now sends messages instead of directly calling our checkXX methods.
It works the same, but should fix the NPE we sometimes get when stopping the app.
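A sketch of the pattern with Akka classic (the actor and message names are illustrative):
```scala
import akka.actor.Actor
import scala.concurrent.duration._

case object CheckBlock

class ZmqWatcherSketch extends Actor {
  import context.dispatcher // execution context for the scheduler

  // deliver a message to our own mailbox instead of calling a check method
  // directly: once the actor is stopped, nothing runs anymore
  private val task =
    context.system.scheduler.schedule(1.second, 1.second, self, CheckBlock)

  override def postStop(): Unit = task.cancel()

  def receive: Receive = {
    case CheckBlock => // poll bitcoind / zmq here
  }
}
```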
* Correctly handle multiple channel_range_replies
The scheme we used to keep track of channel queries with each peer would forget
about missing data when several channel_range_replies were sent back for a single
channel_range_query (see the sketch after this list of fixes).
* RoutingSync: remove peer entry properly
* Remove the peer entry from our sync map only when we've received a `reply_short_channel_ids_end` message.
* Make routing sync test more explicit
* Routing Sync: rename Sync.count to Sync.totalMissingCount
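A sketch of the accumulation fix mentioned above, using the renamed `totalMissingCount` field (the structure is illustrative):
```scala
final case class Sync(missing: Set[Long], totalMissingCount: Int)

// each channel_range_reply for the same query must *add to* the missing
// ids, not overwrite them
def onChannelRangeReply(current: Option[Sync], newlyMissing: Set[Long]): Sync =
  current match {
    case Some(s) =>
      val all = s.missing ++ newlyMissing
      Sync(all, all.size)
    case None =>
      Sync(newlyMissing, newlyMissing.size)
  }
```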
When we have just signed an outgoing htlc, it is only present in the next
remote commit (which will become the remote commit once the current one
is revoked).
If we unilaterally close the channel and our commitment is confirmed,
then the htlc will never reach the chain: it has been "overridden" and
should be failed ASAP. This has been correctly handled since
6d5ec8c4fa.
But if the remote unilaterally closes the channel with its *current*
commitment (which doesn't contain the htlc), then the same thing happens:
the htlc is also "overridden", and we should fail it.
This fixes #691.
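The detection itself boils down to a set difference (a sketch with illustrative names):
```scala
final case class Htlc(id: Long)

// htlcs that we signed but that only exist in the *next* remote commit will
// never reach the chain if the published commitment doesn't contain them:
// they are "overridden" and must be failed upstream
def overriddenHtlcs(nextRemoteCommit: Set[Htlc], publishedCommit: Set[Htlc]): Set[Htlc] =
  nextRemoteCommit -- publishedCommit
```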
* permissive codec for failure messages
Some implementations were including/omitting the message type when
embedding a `channel_update` in failure messages.
We now use a codec that supports both versions for decoding, and
will always encode *with* the message type.
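In byte terms (channel_update's message type is 258, i.e. 0x0102), the permissive behavior looks roughly like this; a plain-bytes illustration, not eclair's actual scodec definition:
```scala
// decode: accept the embedded channel_update with or without its 2-byte type
def decodePermissive(bytes: Array[Byte]): Array[Byte] =
  if (bytes.length >= 2 && bytes(0) == 0x01 && bytes(1) == 0x02) bytes.drop(2)
  else bytes

// encode: always include the message type
def encodeWithType(channelUpdatePayload: Array[Byte]): Array[Byte] =
  Array[Byte](0x01, 0x02) ++ channelUpdatePayload
```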
* make router handle updates from failure messages
This is a regression caused by 9f708acf04,
which made the `Peer` class encapsulate network announcements inside
`PeerRoutingMessage`, in order to preserve the origin of the messages.
But channel updates received as part of payment failure messages
weren't encapsulated, and were thus ignored by the router.
Re-enabled a non-regression test in `IntegrationSpec` which will prevent
this from happening in the future.
* improve handling of `Update` payment failures
Exclude nodes that send us multiple `Update` failures for the same
payment (they may be bad actors).
There are two different expiry checks:
(a) relative checks: when relaying an htlc, we need to be sure that
the difference of expiries between the outgoing htlc and the incoming
htlc is equal to the `cltv_expiry_delta` that we advertise in the
`channel_update` for the outgoing channel;
(b) absolute checks: we need to make sure that those values are not too
early or too far compared to the current "blockchain time".
The check for (a) needs to be done in the `Relayer`, which is the case
currently. This means that we will check the expiry delta *after* having
signed the incoming htlc, and we will fail the htlc (not the channel) if
the delta is incorrect.
The check for (b) was done in the `Commitments.receiveAdd` method. This
seems to make sense, because we would want to make sure as early as
possible that an incoming htlc has correct expiries, but it is actually
incorrect: the spec tells us to accept (=cross-sign) the htlc, and only
then to fail it before relaying it to the next node.
Indeed, there is no risk in accepting an htlc that has an expiry in the
past, or an expiry very far in the future, because this only affects the
counterparty's balance. We just don't want to sign that kind of outgoing
htlc.
Moving the check to `sendAdd` results in an error being returned
to the relayer, which will then fail the corresponding incoming htlc.
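A sketch of the two checks (the constants are illustrative, not eclair's actual values):
```scala
// (a) relative check, done in the Relayer when relaying: the expiry
// difference must match the cltv_expiry_delta we advertise
def relativeExpiryOk(incomingExpiry: Long, outgoingExpiry: Long, cltvExpiryDelta: Long): Boolean =
  incomingExpiry - outgoingExpiry == cltvExpiryDelta

// (b) absolute check, now done in sendAdd: refuse to *send* an htlc whose
// expiry is too close to, or too far from, the current block height
def absoluteExpiryOk(expiry: Long, blockHeight: Long, minDelta: Long = 9, maxDelta: Long = 2016): Boolean =
  expiry > blockHeight + minDelta && expiry < blockHeight + maxDelta
```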
* removed max body size in http client
This is required because since f3676c6497
we retrieve multiple full blocks in parallel.
* trivial: removed unused code
* trivial: added log
* trivial: more unused code removal
* Add a "smooth" fee provider
It returns the moving average of the fee estimates provided by
its internal fee provider, over a user-defined window whose size is
set with `eclair.smooth-feerate-window`.
* Use a default fee rate smoothing window of 3
This should smooth out our fee rate when there is a sudden change in
on-chain fees, and prevent channels with c-lightning nodes from being
closed because they disagree with our fee rate.
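A sketch of the smoothing (the default window size of 3 matches the text above; names are illustrative):
```scala
// keep the last `windowSize` feerates from the underlying provider and
// return their average
final case class Smoother(window: Vector[Long], windowSize: Int = 3) {
  def feed(feeratePerKw: Long): (Smoother, Long) = {
    val w = (window :+ feeratePerKw).takeRight(windowSize)
    (copy(window = w), w.sum / w.size)
  }
}

// e.g. a sudden jump from 1000 to 4000 sat/kw is absorbed gradually:
// Smoother(Vector(1000, 1000)).feed(4000) yields a feerate of 2000
```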