Got complaints about us hanging up on some nodes because they don't respond
to pings in a timely manner (e.g. ACINQ?), but that turned out to be something
else.
Nonetheless, we've had reports in the past of LND badly prioritizing gossip
traffic, and thus important messages can get queued behind gossip dumps!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: connectd: give busy peers more time to respond to pings.
If we can't broadcast the tx, confirm that it didn't end up in the
mempool or the utxo set before throwing an error.
Note that this doesn't protect us in the case where the funding
output has already been *spent*... but that's extremely rare, right?
Fixes#5296
Reported-By: @rustyrussell
Collab-With: @vincenzopalazzo
There's a 1 in 256 chance that our signature on the transaction is 70,
not 71 bytes long. This changes the freerate. So fix up the weight in
this case, to be the expected weight.
```
@unittest.skipIf(TEST_NETWORK == 'liquid-regtest', "Fees on elements are different")
@pytest.mark.developer("uses dev-fail")
@pytest.mark.openchannel('v1') # v2 the weight calculation is off by 3
deftest_multifunding_feerates(node_factory, bitcoind):
'''
Test feerate parameters for multifundchannel
'''
funding_tx_feerate = '10000perkw'
commitment_tx_feerate_int = 2000
commitment_tx_feerate = str(commitment_tx_feerate_int) + 'perkw'
l1, l2, l3 = node_factory.get_nodes(3, opts={'log-level': 'debug'})
l1.fundwallet(1 << 26)
def_connect_str(node):
return'{}@localhost:{}'.format(node.info['id'], node.port)
destinations = [{"id": _connect_str(l2), 'amount': 50000}]
res = l1.rpc.multifundchannel(destinations, feerate=funding_tx_feerate,
commitment_feerate=commitment_tx_feerate)
entry = bitcoind.rpc.getmempoolentry(res['txid'])
weight = entry['weight']
expected_fee = int(funding_tx_feerate[:-5]) * weight // 1000
> assert expected_fee == entry['fees']['base'] * 10 ** 8
E AssertionError: assert 7000 == (Decimal('0.00007010') * (10 ** 8))
tests/test_connection.py:1982: AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can fail to use larger channel if it's not ready yet:
```
2022-05-23T01:20:05.5325600Z # Check it used the larger channel!
2022-05-23T01:20:05.5326376Z > assert before[chan23a_idx]['to_us_msat'] == after[chan23a_idx]['to_us_msat']
2022-05-23T01:20:05.5326961Z E assert 1000000000msat == 900000000msat
2022-05-23T01:20:05.5327240Z
2022-05-23T01:20:05.5327621Z tests/test_connection.py:3896: AssertionError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Over time, it has cost us more developer cycles than it has gained.
It has hidden intermittant bugs, and allowed cruft to accumulate:
when we eventually tried to figure out what was going wrong, the
actual change which caused it was now stale and forgotten.
This was a particular bane during the connectd rewrite, and I
worked through some issues which had occurred before, but were not
more likely.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should be using amount_msat always. Many tests were not. Plus,
deprecating it simplifies the code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Deprecated: JSONRPC: `sendpay` `route` elements `msatoshi` (use `amount_msat`)
The new msat fields are turned into Millisatoshi, so handle that correctly
too in tests too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Deprecated: Plugins: `coin_movement` notification: `balance`, `credit`, `debit` and `fees` (use `balance_msat`, `credit_msat`, `debit_msat` and `fees_msat`)
Before this fix, there was the situation where a DEVELOPER=1 node would
announce non-public addresses on mainnet if detected. Since there
are some nodes on the internet that falsely report local addresses
we move this 'testing feature' to 'dev-allow-locahost' nodes.
Changelog-None
We call out to connectd to activate the peer, and while we do that,
channel->owner is NULL. A better pattern would be to set up the unsaved
channel once connectd has given us the peer, but this works for now.
Fixes: #5204
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make it always a number; this makes the JSON request specification
simpler. We allowed a number since v0.10.1.
(reserve=True is the default anyway, so usually it can be omitted:
reserve=False becomes reserve=0).
Changelog-Deprecated: JSON-RPC: `fundpsbt`/`utxopsbt` `reserve` must be a number, not bool (for `true` use 72/don't specify, for `false` use 0). Numbers have been allowed since v0.10.1.
I have a separate branch which fixes this race properly, but it's not anything
to do with this PR.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was missed in e8d2176e6b.
```
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E - lightningd-2: had bad gossip messages
E - lightningd-3: had bad gossip messages
E Global errors:
contrib/pyln-testing/pyln/testing/fixtures.py:201: ValueError
...
0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-gossipd: Ignoring future channel_announcment for 105x1x2 (current block 104)
0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 105x1x2/0
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make sure it sees disconnect before reconnect, otherwise the next command
fails since we're now disconnected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We may not see a disconnect instantly:
```
> assert len(l2.rpc.listpeers()['peers']) == 0
E assert 1 == 0
E +1
E -0
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is generally verboten now, since there can be multiple. There are a
few exceptions:
1. We sometimes want to know if there are *any* active channels.
2. Some dev commands still take peer id when they mean channel_id.
3. We still allow peer id when it's fully determined.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON-RPC: `close` by peer id will fail if there is more than one live channel (use `channel_id` or `short_channel_id` as id arg).
Rather than intuiting whether this is a new channel / active channel,
use the channel_id. This simplifies things and makes them explicit,
and prepares for multiple live channels per peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either because lightningd tells us it wants to talk, or because the peer
says something about a channel.
We also introduce a behavior change: we disconnect after a failed open.
We might want to modify this later, but we it's a side-effect of openingd
not holding onto idle connections.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
openingd currently holds the connection to idle peers, but we're about
to change that: it will only look after peers which are actively
opening a connection. We can start this process by disconnecting
whenever we have a negotiation failure.
We could stay connected if we wanted to, but that would be up to
connectd to decide. Right now it's easier if we disconnect from any
idle peer once it's been active.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When compiled without DEVELOPER this will now filter out `remote_addr` that
come from localhost. The testcase checks for DEVELOPER to test for correct
function of `remote_addr`.
Also, I renamed "test_connect" to "test_connect_basic" so it can be started
without all the other tests in that file that start with "test_connect..."
These tests have proven to be:
a) very expensive, as they spin up many nodes, and perform long setup
b) are not testing anything specific, they just fuzz functionality
that is already tested otherwise
c) have not helped pinpoint any issues in living memory
d) are very flaky, making for really bad signal-to-noise, so much
that devs usually just restart without even looking at the logs
e) even if we were to look at the logs, we'd be unable to reproduce
due to the inherent randomness involved in these tests
f) are really noisy neighbors, causing other tests to flake as well,
further muddying the water
All in all, these tests are a waste of time, and source of
frustration.
[ Cleaned up python unused imports --RR ]
Changelog-None
If we fund a channel between two nodes, then mine all the blocks to
announce it, any other nodes may see the announcement before the
blocks, causing CI to complain about "bad gossip":
```
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Ignoring future channel_announcment for 113x1x1 (current block 112)
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/0
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/1
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_NODE_ANNOUNCEMENT before announcement 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e
```
Add a new helper for this case, and use it where there are more than 2 nodes.
Cleans up test_routing_gossip and a few other places which did this manually.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were relying on the fee update to create an additional tx. That's
ugly; do an actual payment and make sure we definitely complete a new
tx by waiting for that *then* both revoke_and_ack.
(Without this, we could get a unilateral close instead of a penalty).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
l1.rpc.disconnect(l2.info['id'], force=True)
l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
> l1.daemon.wait_for_log('option_static_remotekey enabled at 2/2')
tests/test_connection.py:3653:
```
If l2's channeld gets killed (due to reconnect) before it tells
lightningd it got the revoke_and_ack it will need a retransmission
*again*.
This makes the test more robust, and does more checks too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
l1 might split in a commitment_signed before it notices the disconnect, and this test fails:
```
for i in range(0, len(disconnects)):
with pytest.raises(RpcError):
l1.rpc.sendpay(route, rhash, payment_secret=inv['payment_secret'])
> l1.rpc.waitsendpay(rhash)
E Failed: DID NOT RAISE <class 'pyln.client.lightning.RpcError'>
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I also got an error under CI; it seems the sleep() was insufficient.
So try adding a sleep inside the check_coin_moves, which should cover
everyone.
```
acct_moves = acct_moves[number_moves:]
else:
> if not move_matches(m, acct_moves[0]):
E IndexError: list index out of range
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is doing this, we can't close as soon as we send,
and in fact we can't do 'fail write' either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These would have to be done by connectd, not the local daemon, once it's
intermediating. Otherwise the remote peer won't see any change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>