We can now receive more gossip_filter messages, and we can also go over
max-messages, so increase that limit too.
```
del tally['query_short_channel_ids']
del tally['query_channel_range']
del tally['ping']
> assert tally == {'channel_announce': 1,
'channel_update': 3,
'node_announce': 1,
'gossip_filter': 1}
E AssertionError: assert {'channel_ann..._announce': 1} == {'channel_ann..._announce': 1}
E Omitting 2 identical items, use -vv to show
E Differing items:
E {'gossip_filter': 2} != {'gossip_filter': 1}
E {'channel_update': 2} != {'channel_update': 3}
E Full diff:
E {
E 'channel_announce': 1,...
E
E ...Full output truncated (10 lines hidden), use '-vv' to show
tests/test_gossip.py:2326: AssertionError
```
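A sketch of the kind of tolerance the assertion could use (not necessarily the
exact change made here): stop pinning the gossip_filter count to exactly one.
```
# Sketch only: accept one or more gossip_filter messages.
assert tally.pop('gossip_filter') >= 1
assert tally == {'channel_announce': 1,
                 'channel_update': 3,
                 'node_announce': 1}
```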
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually mine *300* blocks, not 200, and if the timing is right the txid
can already have been mined before mine_txid_or_rbf() checks the mempool:
```
def test_onchaind_replay(node_factory, bitcoind):
disconnects = ['+WIRE_REVOKE_AND_ACK', 'permfail']
# Feerates identical so we don't get gratuitous commit to update them
l1, l2 = node_factory.line_graph(2, opts=[{'watchtime-blocks': 201, 'cltv-delta': 101,
'disconnect': disconnects,
'feerates': (7500, 7500, 7500, 7500)},
{'watchtime-blocks': 201, 'cltv-delta': 101}],
wait_for_announce=True)
inv = l2.rpc.invoice(10**8, 'onchaind_replay', 'desc')
rhash = inv['payment_hash']
routestep = {
'amount_msat': 10**8 - 1,
'id': l2.info['id'],
'delay': 101,
'channel': first_scid(l1, l2)
}
l1.rpc.sendpay([routestep], rhash, payment_secret=inv['payment_secret'])
l1.daemon.wait_for_log('sendrawtx exit 0')
bitcoind.generate_block(1, wait_for_mempool=1)
# Wait for nodes to notice the failure, this seach needle is after the
# DB commit so we're sure the tx entries in onchaindtxs have been added
l1.daemon.wait_for_log("Deleting channel .* due to the funding outpoint being spent")
l2.daemon.wait_for_log("Deleting channel .* due to the funding outpoint being spent")
# We should at least have the init tx now
assert len(l1.db_query("SELECT * FROM channeltxs;")) > 0
assert len(l2.db_query("SELECT * FROM channeltxs;")) > 0
# Generate some blocks so we restart the onchaind from DB (we rescan
# last_height - 100)
bitcoind.generate_block(100)
sync_blockheight(bitcoind, [l1, l2])
# l1 should still have a running onchaind
assert len(l1.db_query("SELECT * FROM channeltxs;")) > 0
l2.rpc.stop()
l1.restart()
# Can't wait for it, it's after the "Server started" wait in restart()
assert l1.daemon.is_in_log(r'Restarting onchaind \(ONCHAIN\): closed in block 109')
# l1 should still notice that the funding was spent and that we should react to it
_, txid, blocks = l1.wait_for_onchaind_tx('OUR_DELAYED_RETURN_TO_WALLET',
'OUR_UNILATERAL/DELAYED_OUTPUT_TO_US')
assert blocks == 200
bitcoind.generate_block(200)
# Could be RBF!
> l1.mine_txid_or_rbf(txid)
tests/test_closing.py:1864:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
contrib/pyln-testing/pyln/testing/utils.py:1375: in mine_txid_or_rbf
wait_for(lambda: rbf_or_txid_broadcast(txids))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
success = <function LightningNode.mine_txid_or_rbf.<locals>.<lambda> at 0x7f9b129c4550>
timeout = 180
def wait_for(success, timeout=TIMEOUT):
start_time = time.time()
interval = 0.25
while not success():
time_left = start_time + timeout - time.time()
if time_left <= 0:
> raise ValueError("Timeout while waiting for {}".format(success))
E ValueError: Timeout while waiting for <function LightningNode.mine_txid_or_rbf.<locals>.<lambda> at 0x7f9b129c4550>
```
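One way the test could tolerate this race, as a sketch (`txid_mined` is a
hypothetical helper, not part of pyln-testing): check whether the txid already
landed in a recent block before waiting on the mempool.
```
def txid_mined(bitcoind, txid, depth=3):
    """Hypothetical helper: was txid included in one of the last `depth` blocks?"""
    blockhash = bitcoind.rpc.getbestblockhash()
    for _ in range(depth):
        block = bitcoind.rpc.getblock(blockhash)
        if txid in block['tx']:
            return True
        blockhash = block['previousblockhash']
    return False

# Only wait on the mempool if the spend wasn't already mined.
if not txid_mined(bitcoind, txid):
    l1.mine_txid_or_rbf(txid)
```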
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can fix the median calc by removing the (unused) reverse edges.
Also analyze the failure case in test_real_data: it's a real edge case, so
hardcode that one as "ok".
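A minimal sketch of the median fix, with illustrative field names (not
askrene's actual data structures): take the medians only over arcs that really
exist, so the unused reverse arcs no longer drag them down.
```
from statistics import median

def cost_medians(arcs):
    """Medians of fee and probability costs over existing arcs only."""
    real = [a for a in arcs if a['exists']]   # skip the unused reverse arcs
    return (median(a['fee_cost'] for a in real),
            median(a['prob_cost'] for a in real))
```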
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The ratio of the medians of the fee and probability costs is, overall, not a
bad factor for combining these two features. This is what test_real_data
shows.
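A rough sketch of the combination (names illustrative, not askrene's
internals): the ratio of the medians acts as the scale that puts both costs in
comparable units, with an extra tuning factor such as the 1000x mentioned
below.
```
from statistics import median

def combined_costs(fee_costs, prob_costs, k=1.0):
    """Sketch: scale probability costs by the ratio of medians, then add.
    k is an extra tuning factor (e.g. the 1000x mentioned below)."""
    scale = median(fee_costs) / median(prob_costs)
    return [f + k * scale * p for f, p in zip(fee_costs, prob_costs)]
```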
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The fee_fallback test would fail after fixing the computation of the median.
We can restore it by making the probability cost factor 1000x higher than the
ratio of the medians. This shows how hard it is to combine fee and probability
costs, and why the current approach is so fragile.
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The calculation of the median values of the probability and fee costs in the
linear approximation had a bug: it counted non-existing arcs.
Changelog-none: askrene: fix the median
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
It is possible for prevtx to be larger than the maximum packet size, so for shared outputs (currently only the funding tx) we add support for sending only the `txid` across the wire and filling in the prevtx locally.
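A sketch of the idea (field names hypothetical, not the actual wire format):
for a shared output the peer already has the transaction, so the txid alone is
enough and the receiver fills in prevtx from its own copy.
```
def serialize_input(inp, is_shared_output):
    """Sketch: omit the potentially huge prevtx for shared outputs."""
    if is_shared_output:
        return {'prev_txid': inp['prev_txid']}
    return {'prev_txid': inp['prev_txid'], 'prevtx': inp['prevtx']}

def deserialize_input(msg, lookup_local_tx):
    """Sketch: reconstruct prevtx locally when only the txid was sent."""
    if 'prevtx' not in msg:
        msg = dict(msg, prevtx=lookup_local_tx(msg['prev_txid']))
    return msg
```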
Changelog-None
As we can see from the previous test, l3 tells us why it rejected the payment:
```
lightningd-3 2024-11-19T03:56:27.151Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Failing HTLC because of an invalid payload (TLV 10 pos 104): cltv_expiry 136 > payment_constraint 130
```
It turns out we were not updating the block height in the plugin! Without
this, when we create a (non-dummy) blinded path we set too low a CLTV
restriction, and it stops working after a few blocks!
Note that we were actually triggering this error in the xpay tests!
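A minimal sketch of keeping a plugin's height current via the `block_added`
notification (illustrative, not the offers plugin's actual code):
```
from pyln.client import Plugin

plugin = Plugin()
plugin.current_height = 0

@plugin.subscribe("block_added")
def on_block_added(plugin, block_added, **kwargs):
    # Anything computing CLTV constraints (e.g. for blinded paths) should
    # read this, not a height cached once at init.
    plugin.current_height = block_added['height']

plugin.run()
```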
Reported-by: Vincenzo Palazzo
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Offers: Receiving bolt12 payments where we have no public channels would fail a few blocks after startup.
Don't reply with update_fail_malformed_htlc, even though WIRE_INVALID_ONION_BLINDING
has BADONION set. Fail it with a normal error message.
This fixes a known FIXME.
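A sketch of the resulting behaviour (illustrative Python, not lightningd's
actual code):
```
def blinded_entry_failure(is_final_node, real_failcode):
    """Sketch: fail an HTLC whose blinded path starts with us."""
    if is_final_node:
        # Final node: the payer may learn the real error.
        return ('update_fail_htlc', real_failcode)
    # Intermediate entry point: reply invalid_onion_blinding, but as a normal
    # update_fail_htlc rather than update_fail_malformed_htlc.
    return ('update_fail_htlc', 'invalid_onion_blinding')
```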
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: entry to blinded paths returns more useful errors (e.g. if it's the final node, you get a real error; otherwise you get invalid_onion_blinding).
We actually pruned before we got all the channels. Extend the pruning time,
which unfortunately makes the test slower.
```
node_factory = <pyln.testing.utils.NodeFactory object at 0x7ff72969e820>
bitcoind = <pyln.testing.utils.BitcoinD object at 0x7ff72968fe20>

def test_gossip_pruning(node_factory, bitcoind):
""" Create channel and see it being updated in time before pruning
"""
l1, l2, l3 = node_factory.get_nodes(3, opts={'dev-fast-gossip-prune': None,
'allow_bad_gossip': True})

l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
l2.rpc.connect(l3.info['id'], 'localhost', l3.port)

scid1, _ = l1.fundchannel(l2, 10**6)
scid2, _ = l2.fundchannel(l3, 10**6)

mine_funding_to_announce(bitcoind, [l1, l2, l3])
l1_initial_cupdate_timestamp = only_one(l1.rpc.listchannels(source=l1.info['id'])['channels'])['last_update']

# Get timestamps of initial updates, so we can ensure they change.
# Channels should be activated locally
> wait_for(lambda: [c['active'] for c in l1.rpc.listchannels()['channels']] == [True] * 4)
```
We can see in the logs that it had actually started pruning already:
```
lightningd-1 2024-11-18T01:52:03.570Z DEBUG gossipd: Pruning channel 105x1x0 from network view (ages 1731894723 and 0)
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes l1 ratelimits before l2 does, so it is l2 that receives the warning message, not l1:
```
> assert l1.daemon.is_in_log('WARNING: Ratelimited onion_message: exceeded one per 250msec')
E AssertionError: assert None
E + where None = <bound method TailableProc.is_in_log of <pyln.testing.utils.LightningD object at 0x7f13435f45b0>>('WARNING: Ratelimited onion_message: exceeded one per 250msec')
E + where <bound method TailableProc.is_in_log of <pyln.testing.utils.LightningD object at 0x7f13435f45b0>> = <pyln.testing.utils.LightningD object at 0x7f13435f45b0>.is_in_log
E + where <pyln.testing.utils.LightningD object at 0x7f13435f45b0> = <fixtures.LightningNode object at 0x7f13435cbb80>.daemon
...
lightningd-1 2024-11-19T00:45:43.721Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_in WIRE_ONION_MESSAGE
lightningd-1 2024-11-19T00:45:43.721Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_out WIRE_WARNING
lightningd-2 2024-11-19T00:45:43.722Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: peer_out WIRE_ONION_MESSAGE
lightningd-2 2024-11-19T00:45:43.722Z DEBUG connectd: REPLY WIRE_CONNECTD_INJECT_ONIONMSG_REPLY with 0 fds
lightningd-2 2024-11-19T00:45:43.722Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: peer_in WIRE_WARNING
lightningd-2 2024-11-19T00:45:43.722Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Received WIRE_WARNING: WARNING: Ratelimited onion_message: exceeded one per 250msec
```
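A direction-agnostic form of the check, as a sketch of the obvious fix:
```
needle = 'WARNING: Ratelimited onion_message: exceeded one per 250msec'
# Whichever side hit the ratelimit first sent the warning; the other received it.
assert l1.daemon.is_in_log(needle) or l2.daemon.is_in_log(needle)
```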
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I can't reproduce this, but CI did (with Elements):
```
[gw3] linux -- Python 3.8.18 /home/runner/.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/bin/python
node_factory = <pyln.testing.utils.NodeFactory object at 0x7fd0e20f57f0>
bitcoind = <pyln.testing.utils.ElementsD object at 0x7fd0e307dbe0>
executor = <concurrent.futures.thread.ThreadPoolExecutor object at 0x7fd0e307da30>
@pytest.mark.openchannel('v1')
@pytest.mark.openchannel('v2')
def test_lightningd_still_loading(node_factory, bitcoind, executor):
"""Test that we recognize we haven't got all blocks from bitcoind"""
mock_release = Event()
# This is slow enough that we're going to notice.
def mock_getblock(r):
conf_file = os.path.join(bitcoind.bitcoin_dir, 'bitcoin.conf')
brpc = RawProxy(btc_conf_file=conf_file)
if r['params'][0] == slow_blockid:
mock_release.wait(TIMEOUT)
return {
"result": brpc._call(r['method'], *r['params']),
"error": None,
"id": r['id']
}
# Start it, establish channel, get extra funds.
l1, l2, l3 = node_factory.get_nodes(3, opts=[{'may_reconnect': True,
'wait_for_bitcoind_sync': False},
{'may_reconnect': True,
'wait_for_bitcoind_sync': False},
{}])
node_factory.join_nodes([l1, l2])
# Balance l1<->l2 channel
l1.pay(l2, 10**9 // 2)
l1.stop()
# Now make sure l2 is behind.
bitcoind.generate_block(2)
# Make sure l2/l3 are synced
sync_blockheight(bitcoind, [l2, l3])
# Make it slow grabbing the final block.
slow_blockid = bitcoind.rpc.getblockhash(bitcoind.rpc.getblockcount())
l1.daemon.rpcproxy.mock_rpc('getblock', mock_getblock)
l1.start(wait_for_bitcoind_sync=False)
# It will warn about being out-of-sync.
assert 'warning_bitcoind_sync' not in l1.rpc.getinfo()
assert 'warning_lightningd_sync' in l1.rpc.getinfo()
# Make sure it's connected to l2 (otherwise we get TEMPORARY_CHANNEL_FAILURE)
wait_for(lambda: only_one(l1.rpc.listpeers(l2.info['id'])['peers'])['connected'])
# Payments will succced.
l1.pay(l2, 1000)
> assert l1.daemon.is_in_log(r"Sending HTLC while still syncing with bitcoin network \(104 vs 105\)")
E AssertionError: assert None
E + where None = <bound method TailableProc.is_in_log of <pyln.testing.utils.LightningD object at 0x7fd0e20f9fa0>>('Sending HTLC while still syncing with bitcoin network \\(104 vs 105\\)')
E + where <bound method TailableProc.is_in_log of <pyln.testing.utils.LightningD object at 0x7fd0e20f9fa0>> = <pyln.testing.utils.LightningD object at 0x7fd0e20f9fa0>.is_in_log
E + where <pyln.testing.utils.LightningD object at 0x7fd0e20f9fa0> = <fixtures.LightningNode object at 0x7fd0e20f59d0>.daemon
```
What was in the logs was:
```
lightningd-1 2024-11-18T05:33:50.634Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Sending HTLC while still syncing with bitcoin network (103 vs 105)
```
Implying that l1 was an extra block behind.
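A more tolerant assertion, as a sketch: accept l1 being one or two blocks
behind.
```
assert l1.daemon.is_in_log(r"Sending HTLC while still syncing with bitcoin network \(10[34] vs 105\)")
```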
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to wait for *l2* to see the channel in CHANNELD_NORMAL,
otherwise the array here is empty:
```
chan = only_one([c for c in l1.rpc.listpeerchannels(l2.info['id'])['channels'] if c['state'] == 'CHANNELD_NORMAL'])
amount = chan['funding']['local_funds_msat']
assert amount > Millisatoshi(str((1 << 24) - 1) + "sat")
# We should know we can spend that much!
spendable = chan['spendable_msat']
assert spendable > Millisatoshi(str((1 << 24) - 1) + "sat")
# So should peer.
> chan = only_one([c for c in l2.rpc.listpeerchannels(l1.info['id'])['channels'] if c['state'] == 'CHANNELD_NORMAL'])
tests/test_connection.py:3552:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
arr = []
def only_one(arr):
"""Many JSON RPC calls return an array; often we only expect a single entry
"""
> assert len(arr) == 1
E AssertionError
```
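A sketch of the wait implied above (not necessarily the exact fix):
```
# Make sure l2 also sees the channel as CHANNELD_NORMAL before querying it.
wait_for(lambda: [c['state'] for c in l2.rpc.listpeerchannels(l1.info['id'])['channels']]
         == ['CHANNELD_NORMAL'])
chan = only_one([c for c in l2.rpc.listpeerchannels(l1.info['id'])['channels']
                 if c['state'] == 'CHANNELD_NORMAL'])
```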
This is required for VLS, which wants to know about (and potentially decline)
invoices we're trying to pay.
As a nice side effect, our "check" command for xpay now does much more thorough
checking of arguments.
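For example (hedged: `invstring` is assumed to be xpay's parameter name),
`check` can now validate an invoice without paying it:
```
inv = l2.rpc.invoice(10000, 'label', 'desc')['bolt11']
# Validates xpay's arguments (including the invoice) without paying.
l1.rpc.check('xpay', invstring=inv)
```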
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the first user of a persistent layer, which tripped tests that
assumed the datastore would be empty!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are automatically marked "important", in the sense that we won't start up
if they are not working, but this wasn't meant to disallow stopping them.
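Example (plugin name illustrative):
```
# Built-in plugins can now be stopped like any other plugin.
l1.rpc.plugin_stop('offers')
```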
Changelog-Changed: JSON-RPC: built-in plugins can now be stopped using "plugin stop".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note: this won't work with grpc (or probably other tools), since the output
is different. But it's good for testing.
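A test-style sketch of how it could be used (assumed option handling):
```
l1, l2 = node_factory.get_nodes(2, opts={'xpay-handle-pay': True})
node_factory.join_nodes([l1, l2])
inv = l2.rpc.invoice(10000, 'label', 'desc')
# `pay` is now handled by xpay; note the output differs from pay's.
l1.rpc.pay(inv['bolt11'])
```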
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Config: option `xpay-handle-pay` can be used to call xpay when pay is used in many cases (but output is different from pay!)
It only works on BOLT11, and has long been replaced by the more
generic "decode".
Removing it will stop the confusion!
(Note: documentation claims it was introduced in 23.08, but that was
wrong, as it's been in CLN since the beginning).
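Migration is mechanical; a hedged example:
```
b11 = l2.rpc.invoice(1000, 'label', 'desc')['bolt11']
# Before (BOLT11 only, now deprecated):
#   l1.rpc.decodepay(b11)
# After (handles BOLT11 and BOLT12 strings alike):
decoded = l1.rpc.decode(b11)
assert decoded['valid']
```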
[ Fixup from: niftynei <niftynei@gmail.com> ]
Fixes: https://github.com/ElementsProject/lightning/issues/6419
Changelog-Deprecated: JSON-RPC: `decodepay`: use `decode`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fails when l3 doesn't know an address for l1 to connect to:
```
2024-11-16T04:45:42.2243366Z lightningd-3 2024-11-16T04:35:10.582Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_in WIRE_ONION_MESSAGE
2024-11-16T04:45:42.2244342Z lightningd-3 2024-11-16T04:35:10.582Z DEBUG lightningd: Got onionmsg reply_path
2024-11-16T04:45:42.2245398Z lightningd-3 2024-11-16T04:35:10.582Z DEBUG plugin-offers: Note: disallowing deprecated onion_message_recv.blinding
2024-11-16T04:45:42.2246408Z lightningd-3 2024-11-16T04:35:10.586Z UNUSUAL plugin-offers: No incoming channel for 5msat, so no blinded path
2024-11-16T04:45:42.2247289Z lightningd-3 2024-11-16T04:35:10.605Z DEBUG hsmd: Client: Received message 25 from client
2024-11-16T04:45:42.2248372Z lightningd-3 2024-11-16T04:35:10.606Z DEBUG plugin-offers: connecting directly to 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518
2024-11-16T04:45:42.2249451Z lightningd-3 2024-11-16T04:35:10.606Z DEBUG gossipd: REPLY WIRE_GOSSIPD_GET_ADDRS_REPLY with 0 fds
2024-11-16T04:45:42.2250743Z lightningd-3 2024-11-16T04:35:10.607Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Failed connected out: Unable to connect, no address known for peer
```
This is because the test which was supposed to wait for addresses is
wrong: it passes when l3 knows nothing! (`all([])` == `True`)
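A sketch of a stricter wait (node count and field names assumed from the test
context): also require a non-empty result, since `all([])` is vacuously
`True`.
```
# Buggy form: passes immediately while l3 still knows nothing.
#   wait_for(lambda: all('addresses' in n for n in l3.rpc.listnodes()['nodes']))
wait_for(lambda: len(l3.rpc.listnodes()['nodes']) == 2
         and all('addresses' in n for n in l3.rpc.listnodes()['nodes']))
```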
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I couldn't figure out why hsmtool.proc.wait(WAIT_TIMEOUT) returns 1:
hsmtool doesn't ever seem to exit with status 1!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Looking at the logs (and comparing a successful run), it seems the connect happens before
the connect_stream is ready, so we miss it:
```
________________________ test_grpc_connect_notification ________________________
[gw7] linux -- Python 3.8.18 /home/runner/.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/bin/python
node_factory = <pyln.testing.utils.NodeFactory object at 0x7fb08bb969d0>
def test_grpc_connect_notification(node_factory):
l1, l2 = node_factory.get_nodes(2)
# Test the connect notification
connect_stream = l1.grpc.SubscribeConnect(clnpb.StreamConnectRequest())
l2.connect(l1)
> for connect_event in connect_stream:
tests/test_cln_rs.py:425:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/lib/python3.8/site-packages/grpc/_channel.py:543: in __next__
return self._next()
../../../.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/lib/python3.8/site-packages/grpc/_channel.py:960: in _next
_common.wait(self._state.condition.wait, _response_ready)
../../../.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/lib/python3.8/site-packages/grpc/_common.py:156: in wait
_wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
../../../.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/lib/python3.8/site-packages/grpc/_common.py:116: in _wait_once
wait_fn(timeout=timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Condition(<unlocked _thread.RLock object owner=0 count=0 at 0x7fb089730f00>, 0)>
timeout = 0.1
def wait(self, timeout=None):
"""Wait until notified or until a timeout occurs.
If the calling thread has not acquired the lock when this method is
called, a RuntimeError is raised.
This method releases the underlying lock, and then blocks until it is
awakened by a notify() or notify_all() call for the same condition
variable in another thread, or until the optional timeout occurs. Once
awakened or timed out, it re-acquires the lock and returns.
When the timeout argument is present and not None, it should be a
floating point number specifying a timeout for the operation in seconds
(or fractions thereof).
When the underlying lock is an RLock, it is not released using its
release() method, since this may not actually unlock the lock when it
was acquired multiple times recursively. Instead, an internal interface
of the RLock class is used, which really unlocks it even when it has
been recursively acquired several times. Another internal interface is
then used to restore the recursion level when the lock is reacquired.
"""
if not self._is_owned():
raise RuntimeError("cannot wait on un-acquired lock")
waiter = _allocate_lock()
waiter.acquire()
self._waiters.append(waiter)
saved_state = self._release_save()
gotit = False
try: # restore state no matter what (e.g., KeyboardInterrupt)
if timeout is None:
waiter.acquire()
gotit = True
else:
if timeout > 0:
> gotit = waiter.acquire(True, timeout)
E Failed: Timeout >1200.0s
```
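One way the test could sidestep the race, as a sketch (not necessarily the
actual fix): generate another connect event in case the first one was missed
before the subscription was registered.
```
connect_stream = l1.grpc.SubscribeConnect(clnpb.StreamConnectRequest())
l2.connect(l1)
# If the first connect raced the subscription, force a fresh event.
l2.rpc.disconnect(l1.info['id'])
l2.connect(l1)
for connect_event in connect_stream:
    break
```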
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>