We were previously handling `UpdateFailHtlc` and
`UpdateFailMalformedHtlc` similarly to `UpdateFulfillHtlc`, but that is
wrong:
- a fulfill needs to be propagated as soon as possible, because it
allows us to pull funds from upstream
- a fail needs to be cross-signed downstream (=irrevocably confirmed)
before forwarding it upstream, because it means that we won't
be able to pull funds anymore. In other words we need to be absolutely
sure that the htlc won't be fulfilled downstream if we fail it upstream,
otherwise we risk losing money.
Also added tests.
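As a rough illustration of the difference (hypothetical types and method names, not eclair's actual relayer code): a fulfill is forwarded upstream immediately, while a fail is buffered until the downstream commitment that removes the htlc has been cross-signed.
```scala
// Sketch only: hypothetical types and method names, not eclair's relayer.
sealed trait HtlcResult
case class Fulfill(htlcId: Long, paymentPreimage: String) extends HtlcResult
case class Fail(htlcId: Long, reason: String) extends HtlcResult

class RelaySketch(relayUpstream: HtlcResult => Unit) {
  // fails are buffered until the downstream commitment removing the htlc is irrevocably committed
  private var pendingFails: Map[Long, Fail] = Map.empty

  def onDownstreamMessage(msg: HtlcResult): Unit = msg match {
    case fulfill: Fulfill =>
      // the preimage lets us pull funds upstream: forward it as soon as we see it
      relayUpstream(fulfill)
    case fail: Fail =>
      // do NOT forward yet: the downstream htlc could still end up being fulfilled
      pendingFails += (fail.htlcId -> fail)
  }

  // called once the downstream commitment that removes these htlcs has been cross-signed
  def onDownstreamRevocation(removedHtlcIds: Set[Long]): Unit = {
    val (ready, stillPending) = pendingFails.partition { case (id, _) => removedHtlcIds(id) }
    ready.values.foreach(relayUpstream)
    pendingFails = stillPending
  }
}
```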
* Consider htlc_minimum/maximum_msat when computing a route (see the sketch after this list)
* Compare shortChannelIds first as it is less costly than comparing the pubkeys
* Remove export to dot functionality
* Remove dependency jgraph
* Add optimized constructor to build the graph faster
* Use Fibonacci heaps from jheaps.org
* Use Set instead of Seq for extraEdges, remove redundant publishing of channel updates
* Use Set for ignored edges
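A minimal sketch of the first item above, the htlc_minimum/maximum_msat check (the edge model below is hypothetical, not eclair's actual graph types): an edge is only usable for a payment if the amount falls within the bounds advertised in its channel_update.
```scala
// Sketch only: hypothetical edge model, not eclair's graph types.
case class ChannelEdge(shortChannelId: Long,
                       htlcMinimumMsat: Long,
                       htlcMaximumMsat: Option[Long]) // htlc_maximum_msat is optional in channel_update

object RouteFilter {
  // an edge can relay `amountMsat` only if the amount is within the advertised bounds
  def canRelay(edge: ChannelEdge, amountMsat: Long): Boolean =
    amountMsat >= edge.htlcMinimumMsat && edge.htlcMaximumMsat.forall(amountMsat <= _)

  // keep only the edges that can relay the payment amount before running the path search
  def usableEdges(edges: Seq[ChannelEdge], amountMsat: Long): Seq[ChannelEdge] =
    edges.filter(canRelay(_, amountMsat))
}
```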
`NioEventLoopGroup` is expensive to create and should be shared by all clients. This also prevents issues when several event loop groups are created and not shut down correctly.
* replaced akka.io by netty in electrum client and enabled ssl support
* updated docker-testkit to 0.9.8 so that electrum tests pass on windows
* use ssl port on testnet/mainnet
* removed experimental warning on electrum
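To illustrate the shared `NioEventLoopGroup` mentioned above, here is a minimal sketch (the object and method names are hypothetical, not eclair's actual wiring): one group is created once, reused by every client `Bootstrap`, and shut down a single time when the application stops.
```scala
// Sketch only: hypothetical wrapper around Netty's client bootstrap.
import io.netty.bootstrap.Bootstrap
import io.netty.channel.nio.NioEventLoopGroup
import io.netty.channel.socket.nio.NioSocketChannel

object NettyClients {
  // one event loop group for the whole application, reused by every client connection
  val sharedGroup = new NioEventLoopGroup()

  def newClientBootstrap(): Bootstrap =
    new Bootstrap()
      .group(sharedGroup)                 // reuse the shared group instead of creating one per client
      .channel(classOf[NioSocketChannel])

  // call once at application shutdown to release the threads
  def shutdown(): Unit = sharedGroup.shutdownGracefully()
}
```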
* added a revocation timeout
If a peer doesn't quickly reply to a `commit_sig`, we assume that it is
experiencing technical issues, and we disconnect. This will cause pending
(unsigned) `update_add_htlc`s to be fast-failed and will hopefully limit
the number of htlcs that time out in the network.
By default we wait 20 seconds, configurable with
`eclair.revocation-timeout`.
This fixes #745.
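A rough sketch of the idea, assuming an akka actor with simplified string messages (not eclair's actual Channel FSM): a timer starts when we send `commit_sig`, is cancelled by the peer's `revoke_and_ack`, and triggers a disconnect if it fires first.
```scala
// Sketch only: hypothetical actor, simplified string messages.
import scala.concurrent.duration._
import akka.actor.{Actor, ActorRef, Cancellable}

case object RevocationTimeout

class RevocationWatchdog(peer: ActorRef, revocationTimeout: FiniteDuration = 20.seconds) extends Actor {
  import context.dispatcher

  private var pending: Option[Cancellable] = None

  def receive: Receive = {
    case "commit_sig_sent" =>
      // start the countdown when we send commit_sig and expect a revoke_and_ack back
      pending = Some(context.system.scheduler.scheduleOnce(revocationTimeout, self, RevocationTimeout))
    case "revoke_and_ack_received" =>
      // the peer answered in time: cancel the timeout
      pending.foreach(_.cancel())
      pending = None
    case RevocationTimeout =>
      // the peer is unresponsive: disconnect so that unsigned update_add_htlc can be fast-failed
      peer ! "disconnect"
  }
}
```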
We need to put back watches on restart in "future remote commit published" scenarios, otherwise we will never consider the channel closed if we restart before the "claim main output" tx is confirmed. Note that there was no risk of losing funds, but the channel would have lingered forever.
We can now use the `overrideDefaults` parameter in `Setup` to programmatically
provide a custom electrum address when starting eclair, by setting the
`eclair.electrum.host` and `eclair.electrum.port` entries in the configuration.
When these entries are set, eclair-core will always try to connect to this server
instead of relying on a random server picked from the preset lists.
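For example, something along these lines (the host and port are placeholders, and the exact `Setup` constructor may differ between versions):
```scala
// Sketch only: the host and port below are placeholders.
import com.typesafe.config.ConfigFactory

object CustomElectrumServer {
  val overrides = ConfigFactory.parseString(
    """
      |eclair.electrum.host = "my.electrumx.example.com"
      |eclair.electrum.port = 50002
    """.stripMargin)

  // the exact Setup constructor may differ between versions; the relevant part is the
  // overrideDefaults parameter:
  // val setup = new Setup(datadir, overrideDefaults = overrides)
}
```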
We persist htlc data in order to be able to claim htlc outputs in
case a revoked tx is published by our counterparty, so only htlcs
above remote's `dust_limit` matter.
Removed the TODO: since we need data to be indexed by commit number, it is
ok to write the same htlc data for every commitment it is included in.
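A minimal sketch of the filtering idea (hypothetical types, and ignoring the htlc-transaction fee that the real trimming rule also takes into account): htlcs at or below the remote `dust_limit` have no output in the remote commitment tx, so there is nothing to claim from a revoked tx and nothing worth persisting.
```scala
// Sketch only: hypothetical types; the real trimming rule also accounts for the htlc tx fee.
object HtlcPersistenceSketch {
  case class Htlc(id: Long, amountSat: Long)

  // htlcs at or below the remote dust_limit have no output in the remote commitment tx,
  // so there is nothing to claim from a revoked tx and no data worth persisting
  def htlcsWorthPersisting(htlcs: Seq[Htlc], remoteDustLimitSat: Long): Seq[Htlc] =
    htlcs.filter(_.amountSat > remoteDustLimitSat)
}
```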
* Add instructions for Bitcoin Core 0.17.0 [ci skip]
Bitcoin Core 0.17.0 deprecates the `signrawtransaction` RPC call, which will be removed in version 0.18.0. You need to enable this call (with `deprecatedrpc=signrawtransaction` in your bitcoin.conf) if you want your eclair node to use a 0.17.0 node.
* README: add an example of how to use the new bitcoin.conf sections [ci skip]
* updated to scalatest 3.0.5
* use scalatest runner instead of junit
Output is far more readable, and makes console (incl. travis) reports
actually usable.
Turned off test logs as error reporting is enough to figure out what
happens.
The only downside is that we can't use junit's categories to group
tests, like we did for docker related tests. We could use nested suites,
but that seems overkill, so I just removed the categories. Users
now only have the option to either skip or run all tests.
* update scala-maven-plugin to 3.4.2
NB: This requires maven 3.5.4, which means that we currently need to
manually install maven on travis.
Also updated Docker java version to 8u181 (8u171 for compiling).
* Logging: use a rolling file appender
Use one file per day, keep 90 days of logs with a total maximum size
capped at 5 GB
* Router: log routing broadcast in debug level only
When updating relay fee in state OFFLINE, the new channel_update must
have the disabled flag on.
This caused tests to be flaky, added necessary checks to always make
them fail in case that kind of regression happens again.
Previously it was only possible to update relay fee in NORMAL state,
which is not very convenient because there are usually
some channels in OFFLINE state.
This works like the NORMAL case, except that the new `channel_update`
won't be broadcast immediately. It will be sent out next time the
channel goes back to NORMAL, in the same `channel_update` that sets the
`enable` flag to true.
Also added a default handler that properly rejects the
CMD_UPDATE_RELAY_FEE command in all other states.
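A simplified sketch of the state handling (hypothetical types, not eclair's Channel FSM): in NORMAL the new fees are broadcast immediately, in OFFLINE they are only stored, and every other state rejects the command.
```scala
// Sketch only: hypothetical types and states, not eclair's Channel FSM.
sealed trait ChannelState
case object NORMAL extends ChannelState
case object OFFLINE extends ChannelState
case object CLOSING extends ChannelState // stands in for "all other states"

case class RelayFees(feeBaseMsat: Long, feeProportionalMillionths: Long)
case class CmdUpdateRelayFee(fees: RelayFees)

class RelayFeeSketch(var state: ChannelState, var fees: RelayFees, broadcast: RelayFees => Unit) {
  // returns true if the command was accepted
  def onCommand(cmd: CmdUpdateRelayFee): Boolean = state match {
    case NORMAL =>
      fees = cmd.fees
      broadcast(fees) // the new channel_update is sent out immediately
      true
    case OFFLINE =>
      fees = cmd.fees // stored now, broadcast with the re-enabling channel_update when we go back to NORMAL
      true
    case _ =>
      false // properly reject the command in all other states
  }
}
```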
This fixes #695, and also adds the channel point in the default channel output.
```bash
$ ./eclair-cli channel 00fd4d56d94af93765561bb6cb081f519b9627d3f455eba3215a7846a1af0e46
{
  "nodeId": "0232e20e7b68b9b673fb25f48322b151a93186bffe4550045040673797ceca43cf",
  "shortChannelId": "845230006070001",
  "channelId": "00fd4d56d94af93765561bb6cb081f519b9627d3f455eba3215a7846a1af0e46",
  "state": "NORMAL",
  "balanceSat": 9858759,
  "capacitySat": 10000000,
  "channelPoint": "470eafa146785a21a3eb55f4d327969b511f08cbb61b566537f94ad9564dfd00:1"
}
```
Bitcoin Core returns an error `missing inputs (code: -25)` if the tx that we want to publish has already been published and its outputs have been spent. When we receive this error, we try to get the tx, in order to know if it is in the blockchain, or if its inputs were spent by another tx.
Note: If the outputs of the tx were still unspent, bitcoin core would return "transaction already in block chain (code: -27)" and this is already handled.
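A sketch of the recovery logic, assuming hypothetical `publish`/`getTransaction` wrappers around the bitcoind RPC calls:
```scala
// Sketch only: publish and getTransaction are hypothetical wrappers around bitcoind RPC calls.
import scala.concurrent.{ExecutionContext, Future}

object PublishSketch {
  case class JsonRpcError(code: Int, message: String) extends RuntimeException(message)

  def publishRobustly(txId: String,
                      publish: () => Future[String],
                      getTransaction: String => Future[Option[String]])
                     (implicit ec: ExecutionContext): Future[String] =
    publish().recoverWith {
      // -27: "transaction already in block chain", outputs still unspent -> already handled, treat as success
      case JsonRpcError(-27, _) => Future.successful(txId)
      // -25: "missing inputs" -> either the tx is already confirmed and its outputs were spent,
      // or its inputs were spent by a conflicting tx; look the tx up to tell the two apart
      case JsonRpcError(-25, _) =>
        getTransaction(txId).map {
          case Some(_) => txId // the tx is in the blockchain: nothing left to do
          case None    => throw JsonRpcError(-25, s"inputs of $txId were spent by another tx")
        }
    }
}
```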
This is a simple optimisation, we don't have to keep all `update_fee`, just the last one.
cf BOLT 2:
> An update_fee message is sent by the node which is paying the Bitcoin fee. Like any update, it's first committed to the receiver's commitment transaction and then (once acknowledged) committed to the sender's. Unlike an HTLC, update_fee is never closed but simply replaced.
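A minimal sketch of the replacement rule (hypothetical message types, not eclair's Commitments implementation): an unsigned `update_fee` replaces any previously proposed unsigned `update_fee`, while htlc updates are all kept.
```scala
// Sketch only: hypothetical message types, not eclair's Commitments.
sealed trait UpdateMessage
case class UpdateAddHtlc(id: Long, amountMsat: Long) extends UpdateMessage
case class UpdateFee(feeratePerKw: Long) extends UpdateMessage

object ProposedChanges {
  // an unsigned update_fee simply replaces any previously proposed unsigned update_fee,
  // while htlc updates are all kept
  def addProposed(proposed: List[UpdateMessage], msg: UpdateMessage): List[UpdateMessage] = msg match {
    case fee: UpdateFee => proposed.filterNot(_.isInstanceOf[UpdateFee]) :+ fee
    case other          => proposed :+ other
  }
}
```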
* Fix handling of born again channels
When we receive a recent update for a channel that we had marked as stale we
must send a query to the underlying transport, not the origin of the update (which
would send the query back to the router)
If we don't have the origin, it means that we already have forwarded the fulfill so that's not a big deal.
This can happen if they send a signature containing the fulfill, then fail the channel before we have time to sign it.
* Router: reset sync state on reconnection
When we're reconnected to a peer we will start a new sync process and should reset our sync
state with that peer.
Fixed regression caused by 2c1811d: we now don't force sending a
channel_update at the same time as the channel_announcement.
This greatly simplifies the rebroadcast logic, and is what caused the
integration test to fail.
Added proper test on Peer, testing the actor, not only static methods.
Currently we don't remember channels that we have pruned, so we will happily revalidate the same channels again if a node re-sends them to us, and prune them again, a.k.a. the "zombie churn".
Before channel queries, rejecting a stale channel without validating it wasn't trivial, because nodes sent us the `channel_announcement` before `channel_update`s, and only after receiving the `channel_update` could we know if the channel was still stale. Since we had no way of requesting the `channel_announcement` for one particular channel, we would have to buffer it, which would lead to potential DOS issues.
But now that we have channel queries, we can be much more efficient. The process goes like this:
(1) channel x is detected as stale, gets pruned, and its id is added to the pruned db
(2) later on we receive a `channel_announcement` from Eve, we ignore it because the channel is in the pruned db
(3) we also receive old `channel_update`s from Eve; we just ignore them because we don't know the channel
(4) then one day some other node George sends us the `channel_announcement`, we still ignore it because the channel is still in the pruned db
(5) but then George sends us recent `channel_update`s, and we know that the channel is back from the dead. We ignore those `channel_update`s, but we also remove the channel id from the pruned db, and we request announcements for that node from George
(6) George sends us the `channel_announcement` again, we validate it
(7) George then sends us the `channel_update`s again, we process them
(8) done!
This also allows removing the pruning code that we were doing on-the-fly when answering to routing table sync requests.
Needless to say that this leads to a huge reduction in CPU/bandwidth usage on well-connected nodes.
Fixes #623, #624.
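A rough sketch of the key step in the process above, steps (4)-(5) (the db interface and the staleness helper are hypothetical): a recent `channel_update` for a pruned channel removes it from the pruned db and triggers a query to the sending peer.
```scala
// Sketch only: hypothetical db interface, illustrating the process described above.
object PrunedChannelsSketch {
  trait PrunedDb {
    def contains(shortChannelId: Long): Boolean
    def remove(shortChannelId: Long): Unit
  }

  // roughly matches the "not updated for two weeks" pruning rule
  def isStale(updateTimestamp: Long, nowSeconds: Long): Boolean =
    nowSeconds - updateTimestamp > 14 * 24 * 3600

  // returns the short channel id whose announcement we should re-request from this peer, if any
  def handleChannelUpdate(db: PrunedDb, shortChannelId: Long, updateTimestamp: Long, nowSeconds: Long): Option[Long] =
    if (db.contains(shortChannelId) && !isStale(updateTimestamp, nowSeconds)) {
      // the channel is back from the dead: forget that we pruned it and query the peer
      db.remove(shortChannelId)
      Some(shortChannelId)
    } else None
}
```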
Nodes currently receive tons of bogus channel_announcements, mainly with nonexistent or long-spent funding transactions. Of course those don't pass validation and are rejected, but that takes a significant amount of resources: bandwidth, multiple calls to bitcoind, etc.
On top of that, we forget those announcements as soon as we have rejected them, and will happily revalidate them next time we receive them. As a result, a typical node on mainnet will validate 10s of thousands of useless announcements every day.
As far as we know, this is apparently due to a bug in another implementation, but it may very well be used as a DOS attack vector in the future.
This PR adds a simple mechanism to react to misbehaving peers and handle three types of misbehavior (see the sketch after this list):
(a) bad announcement sigs: that is a serious offense, for now we just close the connection, but in the future we will ban the peer for that kind of thing (the same way Bitcoin Core does)
(b) funding tx already spent: a peer sends us a channel_announcement, but the channel has been closed (funding tx already spent); if we receive too many of those, we will ignore future announcements from this peer for a given time
(c) same as (b), but the channel doesn't even exist (funding tx not found). That may be due to reorgs on testnet.
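A minimal sketch of such a per-peer strike counter (the names and thresholds are hypothetical, not eclair's actual values): too many announcements with spent or missing funding txs, and the peer's announcements are ignored for a while.
```scala
// Sketch only: hypothetical tracker, names and thresholds are made up.
import scala.concurrent.duration._

class PeerStrikes(maxStrikes: Int = 5, ignoreDuration: FiniteDuration = 1.hour) {
  private var strikes = Map.empty[String, Int]       // nodeId -> number of bad announcements
  private var ignoredUntil = Map.empty[String, Long] // nodeId -> timestamp (ms) until which we ignore it

  // called when a peer sends a channel_announcement whose funding tx is spent or not found
  def recordBadAnnouncement(nodeId: String, nowMs: Long): Unit = {
    val count = strikes.getOrElse(nodeId, 0) + 1
    strikes += (nodeId -> count)
    if (count >= maxStrikes) ignoredUntil += (nodeId -> (nowMs + ignoreDuration.toMillis))
  }

  def shouldIgnore(nodeId: String, nowMs: Long): Boolean =
    ignoredUntil.get(nodeId).exists(_ > nowMs)
}
```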
* Improve startup error handling
* minor changes to ZMQActor
Scheduler now sends messages instead of directly calling our checkXX methods.
It will work the same but should fix the NPE we sometimes get when we stop the app.
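A rough sketch of the pattern, assuming an akka classic actor (the names are hypothetical): the scheduler only delivers a message, and the periodic work runs inside the actor's receive, so nothing is called on the actor after it has stopped.
```scala
// Sketch only: hypothetical actor illustrating "scheduler sends messages instead of calling methods".
import scala.concurrent.duration._
import akka.actor.Actor

case object CheckBlock

class ZmqWatchdogSketch extends Actor {
  import context.dispatcher

  // the scheduler only delivers a message; the actual check runs inside the actor,
  // so nothing is invoked on a stopped actor when the app shuts down
  private val timer = context.system.scheduler.schedule(10.seconds, 10.seconds, self, CheckBlock)

  override def postStop(): Unit = timer.cancel()

  def receive: Receive = {
    case CheckBlock => // perform the periodic check here
  }
}
```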