I also got an error under CI; it seems the sleep() was insufficient.
So try adding a sleep inside the check_coin_moves, which should cover
everyone.
```
acct_moves = acct_moves[number_moves:]
else:
> if not move_matches(m, acct_moves[0]):
E IndexError: list index out of range
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is doing this, we can't close as soon as we send,
and in fact we can't do 'fail write' either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These would have to be done by connectd, not the local daemon, once it's
intermediating. Otherwise the remote peer won't see any change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to stash/save the amount of the lease fees on a leased channel,
we do this by re-using the 'push' amount field on channel (which is
technically correct, since we're essentially pushing the fee amount to
the peer).
Also updates a bit of how the pushes are accounted for (pushed to now
has an event; their channel will open at zero but then they'll
immediately register a push event).
Leases fees are treated exactly the same as pushes, except labeled
differently.
Required adding a 'lease_fee' field to the inflights so we keep track of
the fee for the lease until the open happens.
The old model of coin movements attempted to compute fees etc and log
amounts, not utxos. This is not as robust, as multi-party opens and dual
funded channels make it hard to account for fees etc correctly.
Instead, we move towards a 'utxo' view of the onchain events. Every
event is either the creation or 'destruction' of a utxo. For cases where
the value of the utxo is not (fully) debited/credited to our account, we
also record the output_value. E.g. channel closings spend a utxo who's
entire value we may not own.
Since we're now tracking UTXOs onchain, we can now do more complex
assertions about the onchain footprint of them. The integration tests
have been updated to now use more 'chain aware' assertions about the
ending state.
If it reconnects by itself, it will get a warning message:
```
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard: Sent reestablish, waiting for theirs
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: peer_in WIRE_ERROR
lightningd-2: 2021-10-08T01:40:42.447Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard perm: Received error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
lightningd-2: 2021-10-08T01:40:42.448Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#3: Peer transient failure in CHANNELD_NORMAL: channeld WARNING: error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
```
And this will make CI complain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Send a ping every 15-45 seconds. If we try to send another one and we
haven't got a reply, hang up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: Send regular pings to detect dead connections (particularly for Tor).
To minimize the diffs, we #if 0 the code. We'll reenable it once
channeld is ready.
We also temporarily disable the ping tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We make it a first-class citizen internally, even though we won't use
it over the wire (at least, non-experimental builds). This scheme
follows the latest draft, in which features are flagged compulsory.
We also add several helper functions.
Since uses the *even* bits (as per latest spec), not the *odd* bits,
we have some other fixups.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pointed out by @fiatjaf, and indeed it happened to me as well; a peer with
a high feerate reconnects and sends a similar (but now ludicrous) feerate,
and we get upset:
```
$ lightning-cli listpeers 039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36
{
"peers": [
{
"id": "039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36",
"connected": false,
"channels": [
{
"state": "CHANNELD_NORMAL",
"scratch_txid": "d796aa9c44920cc7169cdb61e36437bf180cedaec44103a69591ce2baac9b1d9",
"last_tx_fee": "14329000msat",
"last_tx_fee_msat": "14329000msat",
"feerate": {
"perkw": 19791,
"perkb": 79164
},
```
Then in the logs:
```
2021-07-23T19:34:56.227Z DEBUG 039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36-channeld-chan#39381: billboard perm: update_fee 17055 outside range 253-7210
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In fact, we make it compulsory, which means if you don't understand it
you'll hang up on us!
Add some logging for that in future.
Changelog-Changed: Protocol: All new invoices require a payment_secret (i.e. modern TLV format onion)
Changelog-Changed: Protocol: We can no longer connect to peers which don't support `payment_secret`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let the callers do that (only channeld needs to do this).
We temporarily send an error on unknown reestablish in openingd, as
this mimic previous behavior and avoids breaking tests (it does leave
a BROKEN message in the logs though, so
test_funding_external_wallet_corners needs to ignore that for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It handles all the cases of retransmission, and in the normal case
retransmits shutdown and immediately returns for us to run closingd.
This is actually far simpler and reduces code duplication.
[ Includes fixup to stop warn_unused_result from Christian ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We could get stuck on signature exchange if we needed to retransmit the final revoke_and_ack.
In general, it's better to omit a field than put in a 'null', and
putting variable-named fields in an object is also a bad idea.
This is reflected in how hard this is to express in JSON schema, too.
Others:
1. Remove the obsolete "funding": "LOCAL" from unopened channels, but add
"opener": "local" as used in normal channels.
2. htlc cltv_expiry is a u16.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `funding_allocation_msat` and `funding_msat`: use `funding`.
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `last_tx_fee`: use `last_tx_fee_msat`.
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `closer` is now omitted if it does not apply, not JSON `null`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: `fundchannel` and `multifundchannel` will now reserve funding they use for 2 weeks instead of 12 hours.
test_funding_cancel_race uses fundchannel_open etc; this new test does a
similar (but not exact thing, as 'aborts' dont work after an update
is confirmed) thing, using openchannel_update and openchannel_abort.
We set the timeout on first HTLC, but didn't clear it if that HTLC failed.
It's saner to have a per-HTLC timeout (since that's what it is!) and
also our timer infra is specially coded to scale approximately infinitely so
trying to optimize this is vastly premature.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We would sometimes gratuitously disconnect 30 seconds after an HTLC failed.
If l2 didn't get FUNDING_LOCKED from l1 before it disconnected, it
won't be in state CHANNELD_NORMAL: it will be in DUALOPEND_AWAITING_LOCKIN.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This would flake fairly regularly, what we really care about is
asserting that the l2 node is in CHANNELD_NORMAL state, while the l1
node hasn't progressed that far yet.
Trying to put all the disconnect logic into the same path was a dumb
idea. If you asked to reconnect but passed in an 'unsaved' channel, we
would not call the 'reconnect' code.
Instead, we make a differentiation between "unsaved" channels
(ones that we haven't received commitment tx for) and handle the
disconnect for these separate from where we want to do a reconnect.
Tests that will only run when !EXPERIMENTAL_DUAL_FUND:
@pytest.marker.openchannel('v1')
def test_...()
Tests that will only run when EXPERIMENTAL_DUAL_FUND:
@pytest.marker.openchannel('v2')
def test_...()
Users are more upset recently with the cost of unilateral closes
than they are the risk of being cheated. While we complete our
anchor implementation so we can use low fees there, let's
get less aggressive (we already have 34 or 18 blocks to close
in the worst case).
The changes are:
- Commit transactions were "2 CONSERVATIVE" now "6 ECONOMICAL".
- HTLC resolution txs were "3 CONSERVATIVE" now "6 ECONOMICAL".
- Penalty txs were "3 CONSERVATIVE" now "12 ECONOMICAL".
- Normal txs were "4 ECONOMICAL" now "12 ECONOMICAL".
There can be no perfect levels, but we have had understandable
complaints recently about how high our default fee levels are.
Changelog-Changed: Protocol: channel feerates reduced to bitcoind's "6 block ECONOMICAL" rate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
You can now activate dual-funded channels using the
`--experimental-dual-fund` flag
Changelog-Changed: Config: `--experimental-dual-fund` runtime flag will enable dual-funded protocol on this node
Otherwise, we might find an address other than the one given and
the user might think that address worked.
Fixes: #4185
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `connect` returns `address` it actually connected to
And update all the in-tree callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Deprecated: JSON-RPC: `fundchannel_complete` `txid` and `txout` parameters (use `psbt`)
Users have no idea what they would pay for unilateral closes.
At least this gives them a clue!
Reported-by: @az0re on IRC.
Changelog-Added: JSON-RPC: `listpeers` now shows latest feerate and unilaral close fee.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They need to specify fees to get a channel before this, but it's possible.
The test revealed no surprises (other than the last change).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Saves a great deal of confusion for regtest users.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON-RPC: If bitcoind won't give a fee estimate in regtest, use minimum.
Nested `with` exception checks don't work; fix flakes i'm seeing with
valgrind causing failures because blockchain not up to date when
`fundchannel` called.
It was using a trick to only shut down the first node, and forgetting
about the others. This could lead to processes not being stopped
correctly and to test failures because the directory isn't cleaned up
correctly.
Now we use the executor to shut as many nodes as possible in parallel.
Since we turned many errors into warnings, we want our tests to fail
when they happen unexpectedly. We make WARNING clear in the strings
we print, too, to help out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And make all the callers choose which one. In general, I prefer warn,
which lets them reconnect and try again, however some places are either
stated that they must be errors in the spec itself, or in openingd
where we abandon the channel when we close the connection anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we now send warning messages and close the connection, except on unrecoverable errors.
If we're quick (or the node is slow) we end up reconnecting before our
counterparty has realized the state transition, resulting in an
unexpected re-establish.
There's a case where a dropped funding_locked will result in the peer
moving onto channeld, while we stay in dualopend. As we haven't
received their funding_locked, we retransmit tx_sigs, which channeld
will need to handle.
With the patch the peer drops it on the floor; the peer will resend
funding_locked on reconnect, which will correctly advance us to
channeld and CHANNELD_NORMAL
If we're already attempting to connect to a peer, we would ignore
new connection requests. This is problematic if your node has bad
connection details for the node -- you can't update it while inflight.
This patch appends new connection suggestions to the list of connections
to try.
Fixes#4154
Since `fundchannel` now supports the 'close_to' argument, we can remove
all the logic needed to call fundchannel_start here.
Underneath, we're still calling `fundchannel_start`, we're just one (or
two, if you count multifundchannel) call levels away from it now.
We had one report of this, and then Eugene and Roasbeef of Lightning
Labs confirmed it; they saw misordered HTLCs on reconnection too.
Since we didn't enforce this when we receive HTLCs, we never noticed :(
Fixes: #3920
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: fixed retransmission order of multiple new HTLCs (causing channel close with LND)
We didn't care, but other implementations (particularly lnd) do. And it
does violate the spec.
(We need to use skip not xfail on the test which catches this, since
xfail doesn't seem to stop errors reported by cleanup)
(Includes Christian's typo fix!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Marking spent means if the transaction doesn't confirm for some
reason, the user will need to force a rescan to find the funds. Now
we have timed reservations, reserving for (an additional) 12 hours
should be sufficient.
We also take this opportunity (now we have our own callback path)
to record the tx in the wallet only on success.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some minor phrasing differences cause test changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: txprepare reservations stay across restarts: use fundpsbt/reservepsbt/unreservepsbt
Changelog-Removed: txprepare `destination` `satoshi` argument form removed (deprecated v0.7.3)
With a feerate of 7500perkw and subtracting 660 sats for anchors, a
20,000 sat channel has capacity about 9800 sat, below our default:
You gave bad parameters: channel capacity with funding 20000sat, reserves 546sat/546sat, max_htlc_value_in_flight_msat is 18446744073709551615msat, channel capacity is 9818sat, which is below 10000000msat
So bump channel amounts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And document exactly what it does: insist that an HTLC can pass of
this value (module assumptions of feerate).
Note that we remove the "is_opener" test from the capacity calculation
for anchor fees: it doesn't matter which side it is, someone has to pay
for anchor fees to it deducts from capacity.
This change breaks the test, which we rewrite.
Changelog-Changed: config: `min-capacity-sat` is now stricter about checking usable capacity of channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's actually not possible to currently tell if you're using anchor_outputs
with a peer (since it depends on whether you both supported it at *channel open*).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-added: JSON-RPC: `listpeers` shows `features` list for each channel.
Anchor outputs break many assumptions in our tests:
1. Remove some hardcoded numbers in favor of a fee calc, so we only have to
change in one place.
FIXME: This should also be done for elements!
2. Do binary search to get feerate for a given closing fee.
3. Don't assume output #0: anchor outputs perturb them.
4. Don't assume we can make 1ksat channels (anchors cost 660 sats!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And when it's set, and we're SLOW_MACHINE, simply disable valgrind.
Since Travis (SLOW_MACHINE=1) only does VALGRIND=1 DEVELOPER=1 tests,
and VALGRIND=0 DEVELOPER=0 tests, it was missing tests which needed
DEVELOPER and !VALGRIND.
Instead, this demotes them to non-valgrind tests for SLOW_MACHINEs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I started replacing all get_node() calls, but got bored, so then just did the
tests which call get_node() 3 times or more.
Ends up not making a measurable speed difference, but it does make some
things neater and more standard.
Times with SLOW_MACHINE=1 (given that's how Travis tests):
Time before (non-valgrind):
393 sec (had 3 failures?)
Time after (non-valgrind):
410 sec
Time before (valgrind):
890 seconds (had 2 failures)
Time after (valgrind):
892 sec
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I thought this was timing out because I made it slow with the
change to txprepare as a plugin. In fact, it was timing out
because sometimes gossip comes so fast it gets suppressed
and we never get the log messags.
Still, before this it took 98 seconds under valgrind and
24 under non-valgrind, so it's an improvement to time as
well as robustness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `fundchannel_cancel` will now succeed even when executed while a `fundchannel_complete` is ongoing; in that case, it will be considered as cancelling the funding *after* the `fundchannel_complete` succeeds.
Let me introduce the concept of "Sequential Consistency":
All operations on parallel processes form a single total order agreed upon by all processes.
So for example, suppose we have parallel invocations of `fundchannel_complete` and `fundchannel_cancel`:
+--[fundchannel_complete]-->
|
--[fundchannel_start]-+
|
+--[fundchannel_cancel]---->
What "Sequential Consistency" means is that the above parallel operations can be serialized as a single total order as:
--[fundchannel_start]--[fundchannel_complete]--[fundchannel_cancel]-->
Or:
--[fundchannel_start]--[fundchannel_cancel]--[fundchannel_complete]-->
In the first case, `fundchannel_complete` succeeds, and the `fundchannel_cancel` invocation also succeeds, sending an `error` to the peer to make them forget the chanel.
In the second case, `fundchannel_cancel` succeeds, and the succeeding `fundchannel_complete` invocation fails, since the funding is already cancelled and there is nothing to complete.
Note that in both cases, `fundchannel_cancel` **always** succeeds.
Unfortunately, prior to this commit, `fundchannel_cancel` could fail with a `Try fundchannel_cancel again` error if the `fundchannel_complete` is ongoing when the `fundchannel_cancel` is initiated.
This violates Sequential Consistency, as there is no single total order that would have caused `fundchannel_cancel` to fail.
This commit is a minimal patch which just reschedules `fundchannel_cancel` to occur after any `fundchannel_complete` that is ongoing.
I noticed the following in logs for tests/test_connection.py::test_feerate_stress:
```
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Failing HTLC 18446744073709551615 due to peer death
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: local_routing_failure: 8194 (WIRE_TEMPORARY_NODE_FAILURE)
```
This is because it reports the (transient) node_failure error, because
our channel_failure message is incomplete. Fix this wart up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we don't wait for close tx to reach mempool, it might not get to
depth 100, and we don't get 'onchaind complete, forgetting peer'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously we've used the term 'funder' to refer to the peer
paying the fees for a transaction; v2 of openchannel will make
this no longer true. Instead we rename this to 'opener', or the
peer sending the 'open_channel' message, since this will be universally
true in a dual-funding world.
The new `keysend` plugin modifies the node features that we send to
peers. This commit breaks out the 'expected_features' we use for tests
to encompass this differentiation.
Sending update_fee immediately after channel establishment seems to
upset LND, so work around it by deferring it. The reason we increase
the fee after establishment is because now we might need to close the
channel in a hurry due to htlcs, but until there are htlcs that's
unnecessary.
Fixes: #3596
Changelog-Changed: Added workaround for lnd rejecting our commitment_signed when we send an update_fee after channel confirmed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>