This looked like a test flake, but was real:
```
    l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent")

    # We won't gossip the dead channel any more (but we still propagate node_announcement).
    # But connectd is not explicitly synced, so wait for "a bit".
    time.sleep(1)
>   assert len(get_gossip(l1)) == 2
E   assert 4 == 2
```
We can see that two channel_updates come in *after* we mark it dying:
```
gossipd: channel 103x1x0 closing soon due to the funding outpoint being spent
gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/0 now DISABLED
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/1 now DISABLED
```
We should keep marking channel_updates the same way, even when they arrive after the channel has been marked dying.
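For flakes like this it helps to see which message types the leftover gossip actually consists of. A small sketch reusing the test's `get_gossip()` helper (the type numbers are the same BOLT #7 gossip types the test already asserts on):
```
import struct

def tally_gossip_types(msgs):
    """Count gossip messages by their leading 2-byte big-endian type."""
    names = {256: 'channel_announcement', 257: 'node_announcement', 258: 'channel_update'}
    counts = {}
    for m in msgs:
        t, = struct.unpack('>H', m[:2])
        counts[names.get(t, t)] = counts.get(names.get(t, t), 0) + 1
    return counts

# Here this would show 2 node_announcements plus the 2 late DISABLED
# channel_updates that arrived after the channel was marked dying.
print(tally_gossip_types(get_gossip(l1)))
```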
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We get intermittent reports of subd->conn being leaked, but I could never find it.

That's because it's actually subd which is no longer referenced: subd->conn gets reported because it is subd's tal parent (and, apart from the reference in subd, is not referenced either).

The real issue is that channel->owner is reassigned to the new subdaemon while the old one is still exiting. During that window, we can see a "leak".
```
- Node /tmp/ltests-hkr089bp/test_sql_1/lightning-3/ has memory leaks: [
    {
        "backtrace": [
            "ccan/ccan/tal/tal.c:477 (tal_alloc_)",
            "ccan/ccan/io/io.c:91 (io_new_conn_)",
            "lightningd/subd.c:774 (new_subd)",
            "lightningd/subd.c:828 (new_channel_subd_)",
            "lightningd/dual_open_control.c:3662 (peer_restart_dualopend)",
            "lightningd/peer_control.c:1161 (connect_activate_subd)",
            "lightningd/peer_control.c:1273 (peer_connected_hook_final)",
            "lightningd/plugin_hook.c:213 (plugin_hook_callback)",
            "lightningd/plugin.c:591 (plugin_response_handle)",
            "lightningd/plugin.c:702 (plugin_read_json_one)",
            "lightningd/plugin.c:747 (plugin_read_json)",
            "ccan/ccan/io/io.c:59 (next_plan)",
            "ccan/ccan/io/io.c:407 (do_plan)",
            "ccan/ccan/io/io.c:417 (io_ready)",
            "ccan/ccan/io/poll.c:453 (io_loop)",
            "lightningd/io_loop_with_timers.c:22 (io_loop_with_timers)",
            "lightningd/lightningd.c:1249 (main)"
        ],
        "label": "ccan/ccan/io/io.c:91:struct io_conn",
        "parents": [
            "lightningd/lightningd.c:107:struct lightningd"
        ],
        "value": "0x556c63c859f8"
    }
]
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
channeld WARNING: Bad update_fail_malformed_htlc failure code 4103
```
Warren Togami reports this happening with Bitrefill on every reconnect,
so it's clearly something LND does.
(4103 is TEMPORARY_CHANNEL_FAILURE, which does not belong in update_fail_malformed_htlc).
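For reference, the BOLT #4 flag bits show why 4103 is out of place here: 4103 is 0x1007, i.e. the UPDATE flag (0x1000) plus code 7 (temporary_channel_failure), and it lacks the BADONION flag (0x8000) that codes in update_fail_malformed_htlc are supposed to carry:
```
# BOLT #4 failure-code flag bits
BADONION = 0x8000
PERM = 0x4000
NODE = 0x2000
UPDATE = 0x1000

code = 4103
assert code == UPDATE | 7        # temporary_channel_failure
assert not (code & BADONION)     # hence invalid in update_fail_malformed_htlc
```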
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: We allow update_fail_malformed_htlc with invalid error codes (LND?)
GH was complaining about Node v12 being used in v2.2.4. We used to
hold back from this upgrade because v3 was breaking quotas, but we
should try again.
The channel selection did not consider the amounts we are trying
to transfer, which in a world of multiple channels per peer could
end up always selecting a channel that is too small for the payment.
We also now log which channel was selected based on the selector
that is passed in, allowing us to better follow the decisions.
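A minimal sketch of the idea (hypothetical helper; `spendable_msat` and `short_channel_id` are assumed to be the relevant per-channel fields):
```
def pick_channel(channels, amount_msat):
    """Pick a channel to the peer that can actually carry the amount, and log it."""
    usable = [c for c in channels if c.get('spendable_msat', 0) >= amount_msat]
    if not usable:
        return None
    # Smallest channel that still fits, leaving larger ones for larger payments.
    chosen = min(usable, key=lambda c: c['spendable_msat'])
    print("selected channel {} (spendable {} msat) for {} msat".format(
        chosen.get('short_channel_id'), chosen['spendable_msat'], amount_msat))
    return chosen
```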
Changelog-Fixed: pay: `sendonion` and `sendpay` will now consider the amounts involved when picking a channel for a peer
Debugging a number of payments showed that we sometimes waste several
attempts routing through a channel via its alias, rather than its
scid. This is because, while we annotate the scid when it has been set,
we do not do so for the alias. The alias is then picked for routing
despite not having enough capacity, failing the attempt locally.
It can also happen that we alternate between scid and alias, doubling
the number of failed attempts before we can make progress.

This patch sets the hint for the alias to a capacity of 0 and disables
it as if the peer were offline. This means that, when available, we'll
always use the scid, which is also far easier to read in the logs.
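A sketch of the hint adjustment described above (hypothetical structure; the real bookkeeping lives in the pay plugin's channel hints):
```
def add_alias_hint(hints, scid_hint, alias):
    """Mirror the scid hint onto the alias, but make the alias unusable so
    routing always prefers the real short channel ID."""
    alias_hint = dict(scid_hint)
    alias_hint['short_channel_id'] = alias
    alias_hint['estimated_capacity_msat'] = 0  # never enough capacity
    alias_hint['enabled'] = False              # as if the peer were offline
    hints.append(alias_hint)
```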
Changelog-Fixed: pay: We now track spendable amounts when routing on both the local alias and the short channel ID
If you miss a wait event, you can catch up by doing listinvoices and
getting the max of these fields. It's also a good debugging clue.
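For example, assuming the new per-invoice fields are `created_index` and `updated_index`, catching up after a missed event could look like this (pyln-client-style sketch):
```
invoices = rpc.listinvoices()['invoices']
# The highest index we have definitely seen; anything beyond it was missed.
last_created = max((inv.get('created_index', 0) for inv in invoices), default=0)
last_updated = max((inv.get('updated_index', 0) for inv in invoices), default=0)
```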
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now that we have a defined ordering, we can add a `start` param.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `listinvoices` has `index` and `start` parameters for listing control.
This will initially be for listinvoices, but can be expanded to other
list commands.
It's documented, but it makes promises which aren't implemented yet:
* listinvoices does not support `index` or `start` yet.
* It doesn't actually fire when invoices change yet.
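A hedged sketch of the intended usage once those promises are kept (the `subsystem`/`indexname`/`nextvalue` parameter names are assumptions about the eventual interface):
```
# Suppose we last saw created_index == 5; block until invoice #6 is created
# (assumed call shape).
result = rpc.wait(subsystem='invoices', indexname='created', nextvalue=6)
```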
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `wait`: new generic command to wait for events.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `setchannel` adds a new `ignorefeelimits` parameter to allow the peer to set arbitrary commitment transaction fees on a per-channel basis.
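A usage sketch (assuming a pyln-client-style call, with `channel_id` standing in for the channel or peer identifier):
```
# Accept whatever commitment-transaction feerate the peer proposes on this channel.
rpc.setchannel(channel_id, ignorefeelimits=True)
```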
Recent cppcheck doesn't like our code; until we fix that, make it easy
to run every other source check.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We tried to fix this flake before, but now that it has actually happened again, it shows that b5845afd43 wasn't correct.
```
# If this happens fast enough, connect fails with "disconnected
# during connection"
try:
l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
except RpcError as err:
> assert "disconnected during connection" in err.error
E assert 'disconnected during connection' in {'code': 402, 'message': 'disconnected during connection'}
E + where {'code': 402, 'message': 'disconnected during connection'} = RpcError("RPC call failed: method: connect, payload: {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'host': 'localhost', 'port': 41849}, error: {'code': 402, 'message': 'disconnected during connection'}").error
tests/test_misc.py:2728: AssertionError
```
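The failure shows that `err.error` is a dict rather than a string, so the `in` test only checks the dict's keys; one way to make the assertion itself robust is to compare against the `message` field (a sketch):
```
try:
    l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
except RpcError as err:
    assert "disconnected during connection" in err.error['message']
```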
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
l1 shuts down too fast, channeld doesn't get to tell gossipd about the channel,
and l2 sends a (private) channel_update, so we complain:
```
lightningd-2 2023-07-20T03:42:37.744Z DEBUG gossipd: received private channel announcement from channeld for 103x1x0
lightningd-2 2023-07-20T03:42:37.791Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_CUPDATE_SIG_REQ
lightningd-2 2023-07-20T03:42:37.796Z DEBUG hsmd: Client: Received message 3 from client
lightningd-1 2023-07-20T03:42:37.857Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 103x1x0/0
lightningd-1 2023-07-20T03:42:37.864Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 103x1x0/0
```
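One way a test could avoid this race (a sketch, not necessarily the fix adopted here) is to make sure channeld has told gossipd about the channel before l1 goes down:
```
# Hypothetical guard: wait until gossipd has the private channel announcement
# before shutting l1 down.
l1.daemon.wait_for_log(r'received private channel announcement from channeld for 103x1x0')
l1.stop()
```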
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Writing to the gossip_store file is not explicitly synchronized, so it
seems that connectd has not caught up with the dying flags we've set.
Simply wait for a second. Hacky, but should work.
```
def test_gossip_not_dying(node_factory, bitcoind):
    l1 = node_factory.get_node()
    l2, l3 = node_factory.line_graph(2, wait_for_announce=True)

    l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
    # Wait until it sees all the updates, node announcements.
    wait_for(lambda: len([n for n in l1.rpc.listnodes()['nodes'] if 'alias' in n])
             + len(l1.rpc.listchannels()['channels']) == 4)

    def get_gossip(node):
        out = subprocess.run(['devtools/gossipwith',
                              '--initial-sync',
                              '--timeout-after=2',
                              '{}@localhost:{}'.format(node.info['id'], node.port)],
                             check=True,
                             timeout=TIMEOUT, stdout=subprocess.PIPE).stdout

        msgs = []
        while len(out):
            l, t = struct.unpack('>HH', out[0:4])
            msg = out[2:2 + l]
            out = out[2 + l:]
            # Ignore pings, timestamp_filter
            if t == 265 or t == 18:
                continue
            # channel_announcement node_announcement or channel_update
            assert t == 256 or t == 257 or t == 258
            msgs.append(msg)

        return msgs

    assert len(get_gossip(l1)) == 5

    # Close l2->l3, mine block.
    l2.rpc.close(l3.info['id'])
    bitcoind.generate_block(1, wait_for_mempool=1)

    l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent")

    # We won't gossip the dead channel any more (but we still propagate node_announcement)
> assert len(get_gossip(l1)) == 2
E assert 4 == 2
E + where 4 = len([b'\x01\x01L\xc2\xbe\x08\xbb\xa8~\x8f\x80R\x9e`J\x1cS\x18|\x12\n\xe5_6\xb0\xa6S\x9fU\xae\x19\x9c\x1fXB\xab\x81N\x13\xdc\x8e}\xb9\xb0\xb6\xe6\x14h\xd4:\x90\xce\xc3\xad\x9ezR`~\xba@\xc9\x91e\x89\xab\x00\x07\x88\xa0\x00\n\x02i\xa2d\xb8\xa9`\x02-"6 \xa3Y\xa4\x7f\xf7\xf7\xacD|\x85\xc4l\x92=\xa53\x89"\x1a\x00T\xc1\x1c\x1e<\xa3\x1dY\x02-"SILENTARTIST-27fc801-modded\x00\x00\x00\x00\x00\x00\x00', b'\x01\x01M\x00\x86\x8e4\xc8\x90p\n\x98\xf7\xce4\x1e\xd9\xd6-6\xfb(\xf0\xe4\xb7\x90\x7f\x89\xb9\xfa\x00\x82\x1b\xeb\x1fY\x93\x1e\xe0c\xb2\x0e<\xe6\x06x\xb7\xe54};\xfbd\xa0\x01S\xcf\xe8{\xf8\x8f/\xa7\xc0\xe2h\x00\x07\x88\xa0\x00\n\x02i\xa2d\xb8\xa9`\x03]+\x11\x92\xdf\xba\x13N\x10\xe5@\x87]6n\xbc\x8b\xc3S\xd5\xaavk\x80\xc0\x90\xb3\x9c:]\x88]\x03]+HOPPINGFIRE-27fc801-modded\x00\x00\x00\x00\x00\x00\x00\x00', b'\x01\x02~\xe0\x13\xb4\x84Gz\xcf(\xd4w\xa7\x9bZ\x1a\xe82\xd1\xe1\x1bLm\xc8\n\xcd\xd4\xfb\x88\xf8\xc6\xdbt\\v\x89~\xd1.e\xc8\xa8o\x9c`\xd5\xa8\x97\x11l\xf2g\xcb\xa8\xcf\r\x869\xd3\xb5\xd5\x9a\xa0my\x9f\x87\xebX\x0b\x9e_\x11\xdc!\x1e\x9f\xb6j\xbb6\x99\x99\x90D\xf8\xfe\x14h\x01\x16#\x936B\x86\xc6\x00\x00g\x00\x00\x01\x00\x00d\xb8\xa9d\x01\x02\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\n\x00\x00\x00\x00;\x023\x80', b"\x01\x0284\xf1a\x86z\x8e\xf2\xe5'\xf7\xfe1\x8d\x96R\x0c\xe7\x1fj#\xaf\xbd/\xba\x10e\xd1\xccQ-\xcf/>\xa5g\xc6\xd8\x9cO \xe7~\xb3\xda\xe0\\vg\xfb\x02&T\x93\xa0\xd4\x95\x8e\xd5L\x12\x9a\xf7\xe6\x9f\x87\xebX\x0b\x9e_\x11\xdc!\x1e\x9f\xb6j\xbb6\x99\x99\x90D\xf8\xfe\x14h\x01\x16#\x936B\x86\xc6\x00\x00g\x00\x00\x01\x00\x00d\xb8\xa9d\x01\x03\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\n\x00\x00\x00\x00;\x023\x80"])
```
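The change itself is just the sleep already quoted in the excerpt at the top of this series, added before the final assertion:
```
    # We won't gossip the dead channel any more (but we still propagate node_announcement).
    # But connectd is not explicitly synced, so wait for "a bit".
    time.sleep(1)
    assert len(get_gossip(l1)) == 2
```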
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rune functionality moved into core from commando plugin.
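A usage sketch of the new calls (the parameter names mirror the old commando equivalents and are assumptions here; `peer_id` is a placeholder for the node being authorized):
```
# Create a rune limited to list/get-style read-only methods
# (hypothetical restriction string).
rune = rpc.createrune(restrictions=[["method^list", "method^get"]])['rune']

# Later, check whether a given node may invoke a method with that rune.
ok = rpc.checkrune(nodeid=peer_id, rune=rune, method='listinvoices', params={})
```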
Changelog-Added: JSON-RPC: `checkrune`: check rune validity for authorization; `createrune` to create/modify a rune; `listrunes` to list existing runes; `blacklistrune` to revoke a rune's permissions
Changelog-Deprecated: JSON-RPC: `commando-rune`, `commando-listrunes` and `commando-blacklist`.
No-schema-diff-check
This extracts the core checking functionality for a rune, so it can
easily be used more widely than just in commando.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We also document how the grpc test mode works, and why it currently
lacks coverage.
Changelog-Changed: pyln-testing: The grpc dependencies are now optional.