These tests have proven to be:
a) very expensive, as they spin up many nodes, and perform long setup
b) are not testing anything specific, they just fuzz functionality
that is already tested otherwise
c) have not helped pinpoint any issues in living memory
d) are very flaky, making for really bad signal-to-noise, so much
that devs usually just restart without even looking at the logs
e) even if we were to look at the logs, we'd be unable to reproduce
due to the inherent randomness involved in these tests
f) are really noisy neighbors, causing other tests to flake as well,
further muddying the water
All in all, these tests are a waste of time, and source of
frustration.
[ Cleaned up python unused imports --RR ]
Changelog-None
If we fund a channel between two nodes, then mine all the blocks to
announce it, any other nodes may see the announcement before the
blocks, causing CI to complain about "bad gossip":
```
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Ignoring future channel_announcment for 113x1x1 (current block 112)
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/0
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/1
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_NODE_ANNOUNCEMENT before announcement 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e
```
Add a new helper for this case, and use it where there are more than 2 nodes.
Cleans up test_routing_gossip and a few other places which did this manually.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were relying on the fee update to create an additional tx. That's
ugly; do an actual payment and make sure we definitely complete a new
tx by waiting for that *then* both revoke_and_ack.
(Without this, we could get a unilateral close instead of a penalty).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
l1.rpc.disconnect(l2.info['id'], force=True)
l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
> l1.daemon.wait_for_log('option_static_remotekey enabled at 2/2')
tests/test_connection.py:3653:
```
If l2's channeld gets killed (due to reconnect) before it tells
lightningd it got the revoke_and_ack it will need a retransmission
*again*.
This makes the test more robust, and does more checks too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
l1 might split in a commitment_signed before it notices the disconnect, and this test fails:
```
for i in range(0, len(disconnects)):
with pytest.raises(RpcError):
l1.rpc.sendpay(route, rhash, payment_secret=inv['payment_secret'])
> l1.rpc.waitsendpay(rhash)
E Failed: DID NOT RAISE <class 'pyln.client.lightning.RpcError'>
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I also got an error under CI; it seems the sleep() was insufficient.
So try adding a sleep inside the check_coin_moves, which should cover
everyone.
```
acct_moves = acct_moves[number_moves:]
else:
> if not move_matches(m, acct_moves[0]):
E IndexError: list index out of range
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is doing this, we can't close as soon as we send,
and in fact we can't do 'fail write' either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These would have to be done by connectd, not the local daemon, once it's
intermediating. Otherwise the remote peer won't see any change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to stash/save the amount of the lease fees on a leased channel,
we do this by re-using the 'push' amount field on channel (which is
technically correct, since we're essentially pushing the fee amount to
the peer).
Also updates a bit of how the pushes are accounted for (pushed to now
has an event; their channel will open at zero but then they'll
immediately register a push event).
Leases fees are treated exactly the same as pushes, except labeled
differently.
Required adding a 'lease_fee' field to the inflights so we keep track of
the fee for the lease until the open happens.
The old model of coin movements attempted to compute fees etc and log
amounts, not utxos. This is not as robust, as multi-party opens and dual
funded channels make it hard to account for fees etc correctly.
Instead, we move towards a 'utxo' view of the onchain events. Every
event is either the creation or 'destruction' of a utxo. For cases where
the value of the utxo is not (fully) debited/credited to our account, we
also record the output_value. E.g. channel closings spend a utxo who's
entire value we may not own.
Since we're now tracking UTXOs onchain, we can now do more complex
assertions about the onchain footprint of them. The integration tests
have been updated to now use more 'chain aware' assertions about the
ending state.
If it reconnects by itself, it will get a warning message:
```
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard: Sent reestablish, waiting for theirs
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: peer_in WIRE_ERROR
lightningd-2: 2021-10-08T01:40:42.447Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard perm: Received error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
lightningd-2: 2021-10-08T01:40:42.448Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#3: Peer transient failure in CHANNELD_NORMAL: channeld WARNING: error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
```
And this will make CI complain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Send a ping every 15-45 seconds. If we try to send another one and we
haven't got a reply, hang up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: Send regular pings to detect dead connections (particularly for Tor).
To minimize the diffs, we #if 0 the code. We'll reenable it once
channeld is ready.
We also temporarily disable the ping tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We make it a first-class citizen internally, even though we won't use
it over the wire (at least, non-experimental builds). This scheme
follows the latest draft, in which features are flagged compulsory.
We also add several helper functions.
Since uses the *even* bits (as per latest spec), not the *odd* bits,
we have some other fixups.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pointed out by @fiatjaf, and indeed it happened to me as well; a peer with
a high feerate reconnects and sends a similar (but now ludicrous) feerate,
and we get upset:
```
$ lightning-cli listpeers 039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36
{
"peers": [
{
"id": "039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36",
"connected": false,
"channels": [
{
"state": "CHANNELD_NORMAL",
"scratch_txid": "d796aa9c44920cc7169cdb61e36437bf180cedaec44103a69591ce2baac9b1d9",
"last_tx_fee": "14329000msat",
"last_tx_fee_msat": "14329000msat",
"feerate": {
"perkw": 19791,
"perkb": 79164
},
```
Then in the logs:
```
2021-07-23T19:34:56.227Z DEBUG 039c73f53daad1050a6a72afb5353a2152f3152ee17168cd0ab28c2cb3e0050e36-channeld-chan#39381: billboard perm: update_fee 17055 outside range 253-7210
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In fact, we make it compulsory, which means if you don't understand it
you'll hang up on us!
Add some logging for that in future.
Changelog-Changed: Protocol: All new invoices require a payment_secret (i.e. modern TLV format onion)
Changelog-Changed: Protocol: We can no longer connect to peers which don't support `payment_secret`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let the callers do that (only channeld needs to do this).
We temporarily send an error on unknown reestablish in openingd, as
this mimic previous behavior and avoids breaking tests (it does leave
a BROKEN message in the logs though, so
test_funding_external_wallet_corners needs to ignore that for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It handles all the cases of retransmission, and in the normal case
retransmits shutdown and immediately returns for us to run closingd.
This is actually far simpler and reduces code duplication.
[ Includes fixup to stop warn_unused_result from Christian ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We could get stuck on signature exchange if we needed to retransmit the final revoke_and_ack.
In general, it's better to omit a field than put in a 'null', and
putting variable-named fields in an object is also a bad idea.
This is reflected in how hard this is to express in JSON schema, too.
Others:
1. Remove the obsolete "funding": "LOCAL" from unopened channels, but add
"opener": "local" as used in normal channels.
2. htlc cltv_expiry is a u16.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `funding_allocation_msat` and `funding_msat`: use `funding`.
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `last_tx_fee`: use `last_tx_fee_msat`.
Changelog-Deprecated: JSON-RPC: `listfunds` `channels` `closer` is now omitted if it does not apply, not JSON `null`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: `fundchannel` and `multifundchannel` will now reserve funding they use for 2 weeks instead of 12 hours.
test_funding_cancel_race uses fundchannel_open etc; this new test does a
similar (but not exact thing, as 'aborts' dont work after an update
is confirmed) thing, using openchannel_update and openchannel_abort.
We set the timeout on first HTLC, but didn't clear it if that HTLC failed.
It's saner to have a per-HTLC timeout (since that's what it is!) and
also our timer infra is specially coded to scale approximately infinitely so
trying to optimize this is vastly premature.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We would sometimes gratuitously disconnect 30 seconds after an HTLC failed.