Make it easier to see what's going on: use common variables for the
paths etc.
Also make the final `rm` step recursive, matching the `find` step above
which patches the files.
The files are in `contrib/pyln-grpc-proto/pyln/grpc`, but the upper-level
`find` command operates one directory above that.
At cold start, if your node is behind the blocktip and you've sent your
peer a blockheight counter from the future, we shouldn't confuse ourselves
with our rollback/replay.
Should fix flakes in CI that were spotting BROKEN blockheight updates.
Logs below are from a previous CI failure (edited for relative clarity).
The line that says "{ SENT_ADD_ACK_REVOCATION:111 }, our current 108" is
the tell; the last line is the node finally catching up to the tip.
In the test we get into this state by stopping and restarting the node.
```
2023-07-22T11:24:28.2754533Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#2: Already have funding locked in
2023-07-22T11:24:28.2755486Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#2: attempting update blockheight a5b23dff5177badd6df725cefeb83ceccbfc52dc64a16b38894a41f0ad8fa181
2023-07-22T11:24:28.2755778Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG lightningd: update_blockheight: height = 108
2023-07-22T11:24:28.2766210Z lightningd-1 2023-07-22T11:19:34.210Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: init LOCAL: remote_per_commit = 029563e7c8985d8b95bdfe19e47e494bb8ec8d53ff4edb93f156be57667bfee8c9, old_remote_per_commit = 02bf3117c149d324361f0b418db8984b1e29af70c773eb2865a41ff7f583c7c9ed next_idx_local = 3 next_idx_remote = 3 revocations_received = 2 feerates { SENT_ADD_ACK_REVOCATION:3750 } range 253-150000 blockheights { SENT_ADD_ACK_REVOCATION:111 }, our current 108
2023-07-22T11:24:28.2768866Z lightningd-1 2023-07-22T11:19:34.211Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: peer_out WIRE_CHANNEL_REESTABLISH
2023-07-22T11:24:28.2769416Z lightningd-1 2023-07-22T11:19:34.211Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: billboard: Sent reestablish, waiting for theirs
2023-07-22T11:24:28.2771115Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: peer_in WIRE_CHANNEL_REESTABLISH
2023-07-22T11:24:28.2774150Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Got reestablish commit=3 revoke=2
2023-07-22T11:24:28.2776056Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: next_revocation_number = 2
2023-07-22T11:24:28.2805639Z lightningd-1 2023-07-22T11:19:34.239Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2823960Z lightningd-1 2023-07-22T11:19:34.240Z DEBUG lightningd: Adding block 109: 5f67b6e110eb3c3457bea4fcf0d04ce9be90efeee5df8e083ed4266074ca911f
2023-07-22T11:24:28.2833154Z lightningd-1 2023-07-22T11:19:34.251Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2833630Z lightningd-1 2023-07-22T11:19:34.252Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Trying commit
2023-07-22T11:24:28.2834165Z lightningd-1 2023-07-22T11:19:34.252Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2835070Z lightningd-1 2023-07-22T11:19:34.252Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Can't send commit: nothing to send, feechange not wanted ({ SENT_ADD_ACK_REVOCATION:3750 }) blockheight not wanted ({ SENT_ADD_ACK_REVOCATION:111 })
2023-07-22T11:24:28.2835516Z lightningd-1 2023-07-22T11:19:34.350Z DEBUG lightningd: Adding block 110: 5f43f3ac9d808e3a309720d1b0727a00d5a3d3ddca71d97401e233637e87639c
2023-07-22T11:24:28.2835962Z lightningd-1 2023-07-22T11:19:34.476Z DEBUG lightningd: Adding block 111: 55b0d1e0a08ff6233e186e6735cb1cbec33e2b0a6e7d08f2622e8c1db30b54b9
```
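A rough sketch of the shape of such a test, using the usual pyln-testing helpers (the test name, the leased-channel setup and reconnect options are omitted or assumed here, so treat this as an illustration rather than the actual regression test):
```
from pyln.testing.utils import sync_blockheight


def test_blockheight_catchup(node_factory, bitcoind):
    # Assumes a channel that exchanges update_blockheight (i.e. a leased
    # channel); the open options and may_reconnect settings are omitted.
    l1, l2 = node_factory.line_graph(2)

    # Stop l1, then move the chain forward so that l1 is behind the tip
    # when it comes back, while l2 remembers the newer blockheight.
    l1.stop()
    bitcoind.generate_block(3)
    l1.start()

    # On reestablish l1 sees a committed blockheight from "the future"
    # (111 vs its current 108 in the log above); it should quietly catch
    # up instead of logging BROKEN.
    l2.rpc.connect(l1.info['id'], 'localhost', l1.port)
    sync_blockheight(bitcoind, [l1, l2])
    assert not l1.daemon.is_in_log('current blockheight .* less than last')
```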
This looked like a test flake, but was real:
```
    l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent")
    # We won't gossip the dead channel any more (but we still propagate
    # node_announcement).  But connectd is not explicitly synced, so wait for "a bit".
    time.sleep(1)
> assert len(get_gossip(l1)) == 2
E assert 4 == 2
```
We can see that two channel_updates come in *after* we mark it dying:
```
gossipd: channel 103x1x0 closing soon due to the funding outpoint being spent
gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/0 now DISABLED
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/1 now DISABLED
```
We should mark these late channel_updates as dying too, the same way we
marked the existing ones.
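A hedged sketch of how the test can make this deterministic and check the fix (it reuses the get_gossip() helper from the test quoted further down; the real test's wait may differ):
```
# Make sure gossipd has actually processed both late channel_updates
# before counting what it gossips; since the channel is dying, they
# must not be passed on.
l1.daemon.wait_for_logs([r'Received channel_update for channel 103x1x0/0',
                         r'Received channel_update for channel 103x1x0/1'])
# Only the two node_announcements should still be propagated.
assert len(get_gossip(l1)) == 2
```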
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We get intermittent reports of subd->conn being leaked, but I could never find it.
That's because it's actually subd which is not referenced any more: subd->conn
gets reported because it's subd's tal_parent (and, except for the reference in
subd, not referenced either).
The real issue is that the channel->owner is reassigned to the new subdaemon,
and the old one is still exiting. During that time, we can see a "leak".
```
- Node /tmp/ltests-hkr089bp/test_sql_1/lightning-3/ has memory leaks: [
    {
        "backtrace": [
            "ccan/ccan/tal/tal.c:477 (tal_alloc_)",
            "ccan/ccan/io/io.c:91 (io_new_conn_)",
            "lightningd/subd.c:774 (new_subd)",
            "lightningd/subd.c:828 (new_channel_subd_)",
            "lightningd/dual_open_control.c:3662 (peer_restart_dualopend)",
            "lightningd/peer_control.c:1161 (connect_activate_subd)",
            "lightningd/peer_control.c:1273 (peer_connected_hook_final)",
            "lightningd/plugin_hook.c:213 (plugin_hook_callback)",
            "lightningd/plugin.c:591 (plugin_response_handle)",
            "lightningd/plugin.c:702 (plugin_read_json_one)",
            "lightningd/plugin.c:747 (plugin_read_json)",
            "ccan/ccan/io/io.c:59 (next_plan)",
            "ccan/ccan/io/io.c:407 (do_plan)",
            "ccan/ccan/io/io.c:417 (io_ready)",
            "ccan/ccan/io/poll.c:453 (io_loop)",
            "lightningd/io_loop_with_timers.c:22 (io_loop_with_timers)",
            "lightningd/lightningd.c:1249 (main)"
        ],
        "label": "ccan/ccan/io/io.c:91:struct io_conn",
        "parents": [
            "lightningd/lightningd.c:107:struct lightningd"
        ],
        "value": "0x556c63c859f8"
    }
]
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
channeld WARNING: Bad update_fail_malformed_htlc failure code 4103
```
Warren Togami reports this happening with Bitrefill on every reconnect,
so it's clearly something LND does.
(4103 is TEMPORARY_CHANNEL_FAILURE, which does not belong in update_fail_malformed_htlc).
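For reference, the rule being enforced is BOLT #2's requirement that `failure_code` in `update_fail_malformed_htlc` has the BADONION bit set; a small sketch of that check (constants spelled out for illustration):
```
BADONION = 0x8000                        # bit required for malformed-onion codes
UPDATE = 0x1000
TEMPORARY_CHANNEL_FAILURE = UPDATE | 7   # == 4103, what LND sends here


def valid_malformed_failure_code(code):
    # update_fail_malformed_htlc is only meant for BADONION-class errors.
    return bool(code & BADONION)


assert not valid_malformed_failure_code(TEMPORARY_CHANNEL_FAILURE)
```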
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: We allow update_fail_malformed_htlc with invalid error codes (LND?)
GH was complaining about Node v12 being used in v2.2.4. We used to
hold back from this upgrade because v3 was breaking quotas, but we
should try again.
The channel selection did not consider the amounts that we are trying
to transfer which, in a world with multiple channels per peer, could
end up always selecting a channel that is too small for the payment.
We also log which channel was selected, based on the selector that is
passed in, allowing us to better follow the decisions.
Changelog-Fixed: pay: `sendonion` and `sendpay` will now consider the amounts involved when picking one channel for a peer
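As a rough Python illustration of the idea (the real selector lives in the pay plugin's C code, and field names such as `spendable_msat` are assumptions about what it consults):
```
def pick_channel_for_peer(channels, amount_msat, selector="payment"):
    """Pick one of a peer's channels that can actually carry amount_msat."""
    usable = [c for c in channels if c.get('spendable_msat', 0) >= amount_msat]
    if not usable:
        return None
    # One plausible policy: prefer the channel with the most spendable room.
    chosen = max(usable, key=lambda c: c['spendable_msat'])
    print(f"{selector}: selected {chosen['short_channel_id']} "
          f"(spendable {chosen['spendable_msat']}) for {amount_msat}msat")
    return chosen
```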
Debugging a number of payments showed that we sometimes waste several
attempts routing through a channel via its alias rather than its scid.
This is because, while we annotate the hint for the scid when it has
been set, we do not do so for the alias. The alias is then picked for
routing despite not having enough capacity, failing the attempt
locally. It can also happen that we alternate between scid and alias,
doubling the number of failed attempts before we can make progress.
This patch sets the hint for the alias to a capacity of 0 and disables
it, as if the peer were offline. This means that whenever the scid is
available we'll use it, which is also far easier to read in the logs.
Changelog-Fixed: pay: We now track spendable amounts when routing on both the local alias and the short channel ID
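A minimal sketch of the hint adjustment, assuming a dict of channel hints keyed by scid/alias (the real hints are C structures inside the pay plugin, so names here are illustrative):
```
def add_local_channel_hints(hints, scid, alias, spendable_msat):
    # The real scid carries the actual spendable amount.
    hints[scid] = {'estimated_capacity': spendable_msat, 'enabled': True}
    if alias is not None:
        # Zero out and disable the alias, as if the peer were offline,
        # so routing always falls back to the scid.
        hints[alias] = {'estimated_capacity': 0, 'enabled': False}
```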
If you miss a wait event, you can catch up by calling listinvoices and
taking the max of the new index fields. It's also a good debugging clue.
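A hedged sketch of that catch-up, assuming the new per-invoice fields are exposed as `created_index`/`updated_index`:
```
invoices = l1.rpc.listinvoices()['invoices']
latest = max((inv.get('updated_index', inv.get('created_index', 0))
              for inv in invoices), default=0)
# Anything beyond `latest` is an event we haven't seen yet.
```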
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now that we have defined the ordering, we can add a `start` param.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `listinvoices` has `index` and `start` parameters for listing control.
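A hedged usage example (the accepted `index` values, e.g. 'created', are my assumption):
```
# Page through invoices in creation order, starting at index 100.
l1.rpc.listinvoices(index='created', start=100)
```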
This will initially be for listinvoices, but can be expanded to other
list commands.
It's documented, but the documentation makes promises we don't yet keep:
* listinvoices does not support `index` or `start` yet.
* It doesn't actually fire when invoices change yet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `wait`: new generic command to wait for events.
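A hedged usage example, assuming the generic form is wait(subsystem, indexname, nextvalue) as the listinvoices work above suggests:
```
# Block until the invoice "created" index reaches at least 1.
l1.rpc.wait(subsystem='invoices', indexname='created', nextvalue=1)
```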
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `setchannel` adds a new `ignorefeelimits` parameter to allow peer to set arbitrary commitment transaction fees on a per-channel basis.
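A hedged usage example of the new parameter (`scid` below is a placeholder for the channel's short channel id):
```
# Let this peer set whatever commitment feerate it likes on this channel.
l1.rpc.setchannel(scid, ignorefeelimits=True)
```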
Recent cppcheck doesn't like our code; until we fix that, make it easy
to run every other source check.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We tried to fix this flake before, but now that it has actually happened
again, it's clear that b5845afd43 wasn't correct.
```
    # If this happens fast enough, connect fails with "disconnected
    # during connection"
    try:
        l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
    except RpcError as err:
> assert "disconnected during connection" in err.error
E assert 'disconnected during connection' in {'code': 402, 'message': 'disconnected during connection'}
E + where {'code': 402, 'message': 'disconnected during connection'} = RpcError("RPC call failed: method: connect, payload: {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'host': 'localhost', 'port': 41849}, error: {'code': 402, 'message': 'disconnected during connection'}").error
tests/test_misc.py:2728: AssertionError
```
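One plausible shape of the fix (an illustration, not the exact diff): look inside the error object rather than treating it as a string.
```
try:
    l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
except RpcError as err:
    # err.error is a dict, so check its 'message' field explicitly.
    assert err.error['message'] == 'disconnected during connection'
```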
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
l1 shuts down too fast, channeld doesn't get to tell gossipd about the channel,
and l2 sends a (private) channel_update, so we complain:
```
lightningd-2 2023-07-20T03:42:37.744Z DEBUG gossipd: received private channel announcement from channeld for 103x1x0
lightningd-2 2023-07-20T03:42:37.791Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_CUPDATE_SIG_REQ
lightningd-2 2023-07-20T03:42:37.796Z DEBUG hsmd: Client: Received message 3 from client
lightningd-1 2023-07-20T03:42:37.857Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 103x1x0/0
lightningd-1 2023-07-20T03:42:37.864Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 103x1x0/0
```
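Not necessarily what this change does, but for reference, pyln-testing can whitelist such messages on a node (assuming allow_bad_gossip covers this case):
```
# Tolerate "Bad gossip order" complaints on l1 during the shutdown race.
l1 = node_factory.get_node(allow_bad_gossip=True)
```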
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Writing to the gossip_store file is not explicitly synchronized, so it
seems that connectd has not caught up with the dying flags we've set.
Simply wait for a second. Hacky, but should work.
```
def test_gossip_not_dying(node_factory, bitcoind):
    l1 = node_factory.get_node()
    l2, l3 = node_factory.line_graph(2, wait_for_announce=True)

    l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
    # Wait until it sees all the updates, node announcements.
    wait_for(lambda: len([n for n in l1.rpc.listnodes()['nodes'] if 'alias' in n])
             + len(l1.rpc.listchannels()['channels']) == 4)

    def get_gossip(node):
        out = subprocess.run(['devtools/gossipwith',
                              '--initial-sync',
                              '--timeout-after=2',
                              '{}@localhost:{}'.format(node.info['id'], node.port)],
                             check=True,
                             timeout=TIMEOUT, stdout=subprocess.PIPE).stdout

        msgs = []
        while len(out):
            l, t = struct.unpack('>HH', out[0:4])
            msg = out[2:2 + l]
            out = out[2 + l:]
            # Ignore pings, timestamp_filter
            if t == 265 or t == 18:
                continue
            # channel_announcement, node_announcement or channel_update
            assert t == 256 or t == 257 or t == 258
            msgs.append(msg)

        return msgs

    assert len(get_gossip(l1)) == 5

    # Close l2->l3, mine block.
    l2.rpc.close(l3.info['id'])
    bitcoind.generate_block(1, wait_for_mempool=1)

    l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent")

    # We won't gossip the dead channel any more (but we still propagate node_announcement)
> assert len(get_gossip(l1)) == 2
E assert 4 == 2
E + where 4 = len([b'\x01\x01L\xc2\xbe\x08\xbb\xa8~\x8f\x80R\x9e`J\x1cS\x18|\x12\n\xe5_6\xb0\xa6S\x9fU\xae\x19\x9c\x1fXB\xab\x81N\x13\xdc\x8e}\xb9\xb0\xb6\xe6\x14h\xd4:\x90\xce\xc3\xad\x9ezR`~\xba@\xc9\x91e\x89\xab\x00\x07\x88\xa0\x00\n\x02i\xa2d\xb8\xa9`\x02-"6 \xa3Y\xa4\x7f\xf7\xf7\xacD|\x85\xc4l\x92=\xa53\x89"\x1a\x00T\xc1\x1c\x1e<\xa3\x1dY\x02-"SILENTARTIST-27fc801-modded\x00\x00\x00\x00\x00\x00\x00', b'\x01\x01M\x00\x86\x8e4\xc8\x90p\n\x98\xf7\xce4\x1e\xd9\xd6-6\xfb(\xf0\xe4\xb7\x90\x7f\x89\xb9\xfa\x00\x82\x1b\xeb\x1fY\x93\x1e\xe0c\xb2\x0e<\xe6\x06x\xb7\xe54};\xfbd\xa0\x01S\xcf\xe8{\xf8\x8f/\xa7\xc0\xe2h\x00\x07\x88\xa0\x00\n\x02i\xa2d\xb8\xa9`\x03]+\x11\x92\xdf\xba\x13N\x10\xe5@\x87]6n\xbc\x8b\xc3S\xd5\xaavk\x80\xc0\x90\xb3\x9c:]\x88]\x03]+HOPPINGFIRE-27fc801-modded\x00\x00\x00\x00\x00\x00\x00\x00', b'\x01\x02~\xe0\x13\xb4\x84Gz\xcf(\xd4w\xa7\x9bZ\x1a\xe82\xd1\xe1\x1bLm\xc8\n\xcd\xd4\xfb\x88\xf8\xc6\xdbt\\v\x89~\xd1.e\xc8\xa8o\x9c`\xd5\xa8\x97\x11l\xf2g\xcb\xa8\xcf\r\x869\xd3\xb5\xd5\x9a\xa0my\x9f\x87\xebX\x0b\x9e_\x11\xdc!\x1e\x9f\xb6j\xbb6\x99\x99\x90D\xf8\xfe\x14h\x01\x16#\x936B\x86\xc6\x00\x00g\x00\x00\x01\x00\x00d\xb8\xa9d\x01\x02\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\n\x00\x00\x00\x00;\x023\x80', b"\x01\x0284\xf1a\x86z\x8e\xf2\xe5'\xf7\xfe1\x8d\x96R\x0c\xe7\x1fj#\xaf\xbd/\xba\x10e\xd1\xccQ-\xcf/>\xa5g\xc6\xd8\x9cO \xe7~\xb3\xda\xe0\\vg\xfb\x02&T\x93\xa0\xd4\x95\x8e\xd5L\x12\x9a\xf7\xe6\x9f\x87\xebX\x0b\x9e_\x11\xdc!\x1e\x9f\xb6j\xbb6\x99\x99\x90D\xf8\xfe\x14h\x01\x16#\x936B\x86\xc6\x00\x00g\x00\x00\x01\x00\x00d\xb8\xa9d\x01\x03\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\n\x00\x00\x00\x00;\x023\x80"])
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rune functionality moved into core from commando plugin.
Changelog-Added: JSON-RPC: `checkrune`: check rune validity for authorization; `createrune` to create/modify rune; `listrunes` to list existing runes; `blacklistrune` to revoke permission of rune
Changelog-Deprecated: JSON-RPC: `commando-rune`, `commando-listrunes` and `commando-blacklist`.
No-schema-diff-check
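A hedged usage sketch of the new commands (the restriction syntax and parameter names follow the commando-era API and may differ in detail):
```
# Create a rune restricted to a single method, and list existing runes.
rune = l1.rpc.createrune(restrictions=[["method=listinvoices"]])['rune']
l1.rpc.listrunes()

# Check whether a given request would be authorized by that rune.
l1.rpc.checkrune(rune=rune, nodeid=l2.info['id'],
                 method='listinvoices', params={})

# Revoke it by blacklisting its unique id (runes are numbered from 0).
l1.rpc.blacklistrune(start=0, end=0)
```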
This extracts the core checking functionality for runes, so it can
easily be used more widely than just in commando.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>