Changelog-Experimental: option `--lease-fee-base-msat` renamed to `--lease-fee-base-sat`
Changelog-Experimental: option `--lease-fee-base-msat` deprecated and will be removed next release
These tests have proven to be:
a) very expensive, as they spin up many nodes, and perform long setup
b) are not testing anything specific, they just fuzz functionality
that is already tested otherwise
c) have not helped pinpoint any issues in living memory
d) are very flaky, making for really bad signal-to-noise, so much
that devs usually just restart without even looking at the logs
e) even if we were to look at the logs, we'd be unable to reproduce
due to the inherent randomness involved in these tests
f) are really noisy neighbors, causing other tests to flake as well,
further muddying the water
All in all, these tests are a waste of time, and source of
frustration.
[ Cleaned up python unused imports --RR ]
Changelog-None
We really need our own lnprototest tests for packet-based stuff;
these message-based tests are inherently delicate and awkward.
In particular, connectd now does dev-disconnect, so the socket is not
immediately closed after a dev-disconnect command. In this case, the
WIRE_SHUTDOWN has often already been written from connectd to channeld.
But it sometimes works, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we call update_channel_from_inflight *twice* with the same inflight, we
will get bad results. Using tal_steal() here was a premature optimization:
```
Valgrind error file: valgrind-errors.496395
==496395== Invalid read of size 8
==496395== at 0x22A9D3: to_tal_hdr (tal.c:174)
==496395== by 0x22B4B5: tal_steal_ (tal.c:498)
==496395== by 0x16A13D: update_channel_from_inflight (peer_control.c:1225)
==496395== by 0x16A4C7: funding_depth_cb (peer_control.c:1299)
==496395== by 0x182807: txw_fire (watch.c:232)
==496395== by 0x182AA9: watch_topology_changed (watch.c:300)
==496395== by 0x1290ED: updates_complete (chaintopology.c:624)
==496395== by 0x129BF4: get_new_block (chaintopology.c:835)
==496395== by 0x125EEF: getrawblockbyheight_callback (bitcoind.c:362)
==496395== by 0x176ECC: plugin_response_handle (plugin.c:584)
==496395== by 0x1770F5: plugin_read_json_one (plugin.c:690)
==496395== by 0x1772D9: plugin_read_json (plugin.c:735)
==496395== Address 0x89fbb08 is 24 bytes inside a block of size 104 free'd
==496395== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==496395== by 0x22B193: del_tree (tal.c:421)
==496395== by 0x22B461: tal_free (tal.c:486)
==496395== by 0x16A123: update_channel_from_inflight (peer_control.c:1223)
==496395== by 0x16A4C7: funding_depth_cb (peer_control.c:1299)
==496395== by 0x182807: txw_fire (watch.c:232)
==496395== by 0x182AA9: watch_topology_changed (watch.c:300)
==496395== by 0x1290ED: updates_complete (chaintopology.c:624)
==496395== by 0x129BF4: get_new_block (chaintopology.c:835)
==496395== by 0x125EEF: getrawblockbyheight_callback (bitcoind.c:362)
==496395== by 0x176ECC: plugin_response_handle (plugin.c:584)
==496395== by 0x1770F5: plugin_read_json_one (plugin.c:690)
==496395== Block was alloc'd at
==496395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==496395== by 0x22AC1C: allocate (tal.c:250)
==496395== by 0x22B1DD: tal_alloc_ (tal.c:428)
==496395== by 0x22B3A6: tal_alloc_arr_ (tal.c:471)
==496395== by 0x22C094: tal_dup_ (tal.c:805)
==496395== by 0x12B274: new_inflight (channel.c:187)
==496395== by 0x136D4C: wallet_commit_channel (dual_open_control.c:1260)
==496395== by 0x13B084: handle_commit_received (dual_open_control.c:2839)
==496395== by 0x13B6AF: dual_opend_msg (dual_open_control.c:2976)
==496395== by 0x1809FF: sd_msg_read (subd.c:553)
==496395== by 0x218F5D: next_plan (io.c:59)
==496395== by 0x219B65: do_plan (io.c:407)
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we fund a channel between two nodes, then mine all the blocks to
announce it, any other nodes may see the announcement before the
blocks, causing CI to complain about "bad gossip":
```
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Ignoring future channel_announcment for 113x1x1 (current block 112)
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/0
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/1
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_NODE_ANNOUNCEMENT before announcement 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e
```
Add a new helper for this case, and use it where there are more than 2 nodes.
Cleans up test_routing_gossip and a few other places which did this manually.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes lightningd's chronic weight underestimate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: closingd: more accurate weight estimation helps mutual closing near min/max feerates.
There's actually a bug in our closing tx size estimation; I'll do
a separate patch for this, though.
Seems this used to be flaky, now we always flush queues, so it's
more reliably caught.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is doing this, we can't close as soon as we send,
and in fact we can't do 'fail write' either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fire off a snapshot of current account balances (node wallet + every
'active' channel) after we've caught up to the chain tip for the *first*
time (in other words, on start).
We need to stash/save the amount of the lease fees on a leased channel,
we do this by re-using the 'push' amount field on channel (which is
technically correct, since we're essentially pushing the fee amount to
the peer).
Also updates a bit of how the pushes are accounted for (pushed to now
has an event; their channel will open at zero but then they'll
immediately register a push event).
Leases fees are treated exactly the same as pushes, except labeled
differently.
Required adding a 'lease_fee' field to the inflights so we keep track of
the fee for the lease until the open happens.
The old model of coin movements attempted to compute fees etc and log
amounts, not utxos. This is not as robust, as multi-party opens and dual
funded channels make it hard to account for fees etc correctly.
Instead, we move towards a 'utxo' view of the onchain events. Every
event is either the creation or 'destruction' of a utxo. For cases where
the value of the utxo is not (fully) debited/credited to our account, we
also record the output_value. E.g. channel closings spend a utxo who's
entire value we may not own.
Since we're now tracking UTXOs onchain, we can now do more complex
assertions about the onchain footprint of them. The integration tests
have been updated to now use more 'chain aware' assertions about the
ending state.
shutdown_subdaemons frees the channel and calls destroy_close_command_on_channel_destroy, see gdb:
0 destroy_close_command_on_channel_destroy (_=0x55db6ca38e18, cc=0x55db6ca43338) at lightningd/closing_control.c:94
1 0x000055db6a8181b5 in notify (ctx=0x55db6ca38df0, type=TAL_NOTIFY_FREE, info=0x55db6ca38e18, saved_errno=0) at ccan/ccan/tal/tal.c:237
2 0x000055db6a8186bb in del_tree (t=0x55db6ca38df0, orig=0x55db6ca38e18, saved_errno=0) at ccan/ccan/tal/tal.c:402
3 0x000055db6a818a47 in tal_free (ctx=0x55db6ca38e18) at ccan/ccan/tal/tal.c:486
4 0x000055db6a73fffa in shutdown_subdaemons (ld=0x55db6c8b4ca8) at lightningd/lightningd.c:543
5 0x000055db6a741098 in main (argc=21, argv=0x7ffffa3e8048) at lightningd/lightningd.c:1192
Before this PR, there was no io_loop after shutdown_subdaemons and client side raised a
general `Connection to RPC server lost.`
Now we test the more specific `Channel forgotten before proper close.`, which is good!
BTW, this test was added recently in PR #4599.
We're about to require that fundchannel_complete() take a PSBT, where it
does sanity checks to avoid this error, making this a difficult mistake
to make.
Is it time to remove this functionality anyway? @cdecker?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows cmdline users to have more idea what's going on.
Inspired-by: https://github.com/ElementsProject/lightning/issues/4777
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: `close` now notifies about the feeranges each side uses.
That was quick!
We remove the 50% test, since the default is now to use quickclose.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: We now perform quick-close if the peer supports it.
This affects the range we offer even without quick-close, but it's
more critical for quick-close.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSONRPC: `close` now takes a `feerange` parameter to set min/max fee rates for mutual close.
This is now allowed for anchors (as per https://github.com/lightningnetwork/lightning-rfc/pull/847).
We need to play with feerates, since we don't put a discount on anchor
commitments yet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Based on a commit by @niftynei, but:
- Separated quickclose logic from main loop.
- I made it indep of anchor_outputs, use and option instead.
- Disable if they've specified how to negotiate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually were waiting for l3 to disconnect, not l2.
But in general we should be looking for status rather than grovelling
in the logs where possible, so I changed all those.
```
2021-08-17T04:40:34.9015538Z # l2 leases a channel from l3
2021-08-17T04:40:34.9016520Z l2.rpc.connect(l3.info['id'], 'localhost', l3.port)
2021-08-17T04:40:34.9017570Z rates = l2.rpc.dev_queryrates(l3.info['id'], amount, amount)
2021-08-17T04:40:34.9018724Z l3.daemon.wait_for_log('disconnect')
2021-08-17T04:40:34.9019851Z l2.rpc.connect(l3.info['id'], 'localhost', l3.port)
2021-08-17T04:40:34.9021467Z l2.rpc.fundchannel(l3.info['id'], amount, request_amt=amount,
2021-08-17T04:40:34.9022865Z feerate='{}perkw'.format(feerate), minconf=0,
2021-08-17T04:40:34.9024000Z > compact_lease=rates['compact_lease'])
...
2021-08-17T04:40:34.9103422Z > raise RpcError(method, payload, resp['error'])
2021-08-17T04:40:34.9106829Z E pyln.client.lightning.RpcError: RPC call failed: method: fundchannel, payload: {'id': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d', 'amount': 500000, 'feerate': '2000perkw', 'announce': True, 'minconf': 0, 'request_amt': 500000, 'compact_lease': '029a00640064000000644c4b40'}, error: {'code': 400, 'message': 'Unable to connect, no address known for peer', 'data': {'id': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d', 'method': 'connect'}}
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We fail waiting for 'Resolved OUR_UNILATERAL/DELAYED_OUTPUT_TO_US by our proposal OUR_DELAYED_RETURN_TO_WALLET'
because we close *two* channels, but only wait for one transaction before mining a block.
This means we might only have one, and we immediately mine the next wait_for_mempool=1,
so the OUR_DELAYED_RETURN_TO_WALLET isn't mined.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If l3 is too slow, it can get channel_announcement after channel
is closed, so it gets upset at the node_announcement which follows:
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Updated pending announce with update 103x1x1/1
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: channel_announcement: no unspent txout 103x1x1
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Bad gossip order: WIRE_NODE_ANNOUNCEMENT before announcement 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we now use 'compact_lease' to gate an open (if the rates have
changed, we fail), we no longer need to rely on query rates for figuring
things out, so we make it dev-only.
Changelog-Changed: JSON-API: queryrates is now developer only
We were actually using the last commit tx's size, since we were
setting it in lightningd. Instead, hand the min and desired feerates
to closingd, and (as it knows the weight of the closing tx), and have
it start negotiation from there.
This can be significantly less when anchor outputs are enabled: for
example in test_closing.py, the commit tx weight is 1124 Sipa, the
close is 672 Sipa!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: Use a more accurate fee for mutual close negotiation.
We set amounts to a list, but then don't use the values except
as a boolean.
Make it a boolean, which should also speed the test up!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us transition (with a few supporting changes) to closingd,
which will happily let them mutual close with us.
We already handle the case where this mutual close is redundant (for
packet loss), so this is easy.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: We will now reestablish and negotiate mutual close on channels we've already closed (great if peer has lost their database).
In particular, if one side sees the final CLOSING_SIGNED and the other
doesn't, we won't talk when it reconnects.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It handles all the cases of retransmission, and in the normal case
retransmits shutdown and immediately returns for us to run closingd.
This is actually far simpler and reduces code duplication.
[ Includes fixup to stop warn_unused_result from Christian ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We could get stuck on signature exchange if we needed to retransmit the final revoke_and_ack.
This was turned by at random by CI:
1. Alice has sent shutdown, but it still waiting for revoke_and_ack.
2. Bob has sent and received shutdown, and sent revoke_and_ack,
so it considers it time for signature exchange.
3. Disconnect before Alice received revoke_and_ack.
4. Reconnect, Bob is in closingd, which doesn't rexmit revoke_and_ack.
5. Timeout.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Temporarily rename old getroute to getrouteold (we will remove this).
Changelog-Changed: JSON-RPC: `getroute` is now implemented in a plugin.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Prior to this, sending a v1 address (or, in fact, any random crap!)
would cause the unsupporting node to unilaterally close.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This includes anysegwit and the updated HTLC tiebreak test vector. It
also adds explicit wording for invalid per_commitment_secret (which
nicely matches our code already!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>