We have CI runs which time out (after 2 hours). It's not clear why,
but we can at least eliminate CLN lockups as the answer.
Since pytest disables the --timeout option on test shutdown, we could be
seeing an issue where stopping takes a long time?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The updated API requires typed htables to explicitly state whether they
allow duplicates: for most cases we don't, but we've had issues in the
past.
This is a big patch, but mainly mechanical.
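For illustration, the shape of the mechanical change, assuming the updated
CCAN macro names (`HTABLE_DEFINE_NODUPS_TYPE` / `HTABLE_DEFINE_DUPS_TYPE`);
the `peer_htable` definition here is a made-up example:
```
/* Before: whether duplicates were allowed was implicit. */
HTABLE_DEFINE_TYPE(struct peer, peer_keyof, peer_hash, peer_eq,
		   peer_htable);

/* After: the macro name states the policy; most tables forbid dups. */
HTABLE_DEFINE_NODUPS_TYPE(struct peer, peer_keyof, peer_hash, peer_eq,
			  peer_htable);
```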
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cut & paste from the forwarding code, where we don't let onions use the
unannounced scids. Obviously local commands can use them.
Reported-by: @michael1011
Changelog-Fixed: JSON-RPC: xpay now works through unannounced channels.
If they give us the invstring, we can at least set who signed the invoice. Of course,
it might not be a real node_id (with blinded paths).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This appears in listsendpays / listpays, and is useful information (if we know!).
This doesn't fix old payments, but means that xpay can use this for new payments.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ratelimited DEBUG messages, but that can be annoying and cause us to miss things.
We demoted the worst offenders to TRACE level in the last release.
Now, only log TRACE if it's wanted, and never suppress DEBUG.
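A minimal sketch of the resulting policy (simplified, not the actual logging
code; CLN's real `enum log_level` has more members):
```
enum log_level { LOG_TRACE, LOG_DBG, LOG_INFORM, LOG_UNUSUAL, LOG_BROKEN };

/* TRACE is emitted only when explicitly wanted; DEBUG and above are
 * never dropped by ratelimiting. */
static bool keep_log(enum log_level level, enum log_level wanted)
{
	if (level == LOG_TRACE)
		return wanted == LOG_TRACE;
	return level >= wanted;
}
```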
Changelog-Changed: Logging: we no longer suppress DEBUG messages from subdaemons.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: https://github.com/ElementsProject/lightning/issues/7917
We check if `txw->depth` is `-1` and then print it, when we
clearly want `depth` instead.
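A reconstruction of the bug's shape (struct members and helper names are
assumed; this is not the literal diff):
```
/* txw->depth is unsigned, and (unsigned)-1 means "unknown", so
 * printing it yields 4294967295 rather than the new depth. */
static void depth_changed(struct txwatch *txw, unsigned int depth)
{
	if (txw->depth == (unsigned int)-1)
		log_debug(txw->log, "New depth %u", txw->depth); /* BUG: wants `depth` */
	txw->depth = depth;
}
```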
Changelog-Fixed: Logging: when printing depths, some unsigned numbers could overflow.
It's undocumented and only used in one place, so we can change it (it
was new in 24.08).
We really want to be able to just handle a raw onionmessage: this allows
oblivious sending of messages, but also, in combination with the coming
onionmessage_forward_fail notification, allows us to connect then
reinject.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's freaking people out when they see things like:
```
2024-11-11T05:26:41.281Z DEBUG ...53c-connectd: peer_out INVALID 22859
```
Fixes: https://github.com/ElementsProject/lightning/issues/7802
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: connectd: log unknown messages as "UNKNOWN" not "INVALID" to avoid freaking people out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we now create a low-priority (2016 down to 12 blocks fee target) anchor for low-fee unilateral closes even if there's no urgency.
We're going to call this twice, so this patch also takes care
that a failed attempt to create an anchor doesn't alter other
variables!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. merge_deadlines can't really fail.
2. We don't care about incoming HTLCs with no preimage.
3. Debug-print anchor points as soon as we calculate them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cleans up the API: we have two functions now, one of which is explicitly for
"I'm failing this because I saw this tx onchain".
Now we can correctly report the tx which closed the channel (previously
we would always report our own tx(s)!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: `close` now correctly reports the txid of the remote onchain unilateral tx if it races with a peer close.
Changelog-Fixed: Protocol: we no longer try to spend anchors if a commitment tx is already mined (reported by @niftynei).
Fixes: #7526
Rather than have lightningd call us repeatedly to try to connect, have
it tell us which peers are transient and which aren't, and connectd will
automatically try to maintain those connections.
There's a new "downgrade_peer" message to tell it a peer is now
transient: to make a peer non-transient, we simply tell connectd to
connect to it as a non-transient peer.
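A hypothetical sketch of the lightningd side (the towire helper name is
inferred from the message name, not taken from the patch):
```
/* Hypothetical: once the last important channel with a peer is gone,
 * tell connectd the peer is now merely transient. */
static void downgrade_peer(struct lightningd *ld, const struct node_id *id)
{
	subd_send_msg(ld->connectd,
		      take(towire_connectd_downgrade_peer(NULL, id)));
}
```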
The first time, I missed that dual_open_control does its own state
transitions :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `connectd` now handles maintaining/reconnecting to important peers, and we remember the last successful address we connected to.
This is more useful than the last address, which may be from the peer
connecting to us. And use it when we restore the peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we connected out, remember that address. We always remember the last
address, but that may be an incoming address. This is explicitly the last
outgoing address which worked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to use this to ask if there are any channels which make it
important to reconnect to the peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In fact, only 951 of 17419 (5%) node announcements are missing an address
(and gossipd doesn't know if we can connect to Tor addresses anyway), so
just check that it *has* a node_announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let lightningd feed us hints to try first, but we can extract the
addresses from node_announcement messages ourselves.
(Lightningd used to ask gossipd on our behalf: this is far simpler!)
One side effect of this is that we don't hand back address hints given to us
by lightningd: it would use these again for reconnecting. This breaks
test_sendpay_grouping, so we disable it temporarily.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was a bit harder to identify: during an `io_loop` run we suspend
the current span before handing over to `io_loop`, and later, when a callback
is called, we resume the span again. Depending on how we return from
the `io_loop` instance that is used to drive the startup, we either
have resumed the last span, or we haven't. Since we start a span before
`io_loop` and want it to be emitted afterwards, we need to take care
of the case where we returned from a callback that did not resume, and
therefore the current context is empty.
Making `trace_span_resume` idempotent means we can just resume it
manually.
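Schematically, the idempotence is just an early return when the span is
already current (`current_span`, `span->suspended` and `trace_span_find`
stand in for the file's internals):
```
void trace_span_resume(const void *key)
{
	struct span *span = trace_span_find(key);
	if (current_span == span)
		return;	/* Idempotent: already resumed. */
	current_span = span;
	span->suspended = false;
}
```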
Ideally we'd push the suspend / resume logic down into `io_loop`
itself, and then we'd have just one place. Maybe suspend and resume
callbacks that can be configured in `io_loop`?
lightningd/test/run-find_my_abspath.c includes ../lightningd.c, which includes
header_versions_gen.h, a generated header file.
lightningd/Makefile correctly declares that lightningd/lightningd.o depends on
header_versions_gen.h, but lightningd/test/Makefile lacks any such declaration
regarding lightningd/test/run-find_my_abspath.c, which leads to build failure:
```
In file included from lightningd/test/run-find_my_abspath.c:5:
lightningd/test/../lightningd.c:64:10: fatal error: header_versions_gen.h: No such file or directory
   64 | #include <header_versions_gen.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~
```
Declare the missing dependency in lightningd/test/Makefile so that Make will
ensure that header_versions_gen.h is generated before it attempts to build
lightningd/test/run-find_my_abspath.o.
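Concretely, the declaration is a one-liner along these lines (the exact rule
in the patch may differ):
```
lightningd/test/run-find_my_abspath.o: header_versions_gen.h
```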
Changelog-None
This does not validate a node announcement and address, but it
does select a node at random from the gossmap and ask lightningd
to attempt a connection to it.
Gossipd uses this to ask lightningd -> connectd to initiate
a connection to a new gossip peer. This can be used when
there are insufficient peers already connected to gossip with.
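A sketch of the selection (gossmap iterators as in `common/gossmap.h`; the
helper itself is illustrative):
```
#include <common/gossmap.h>
#include <common/pseudorand.h>

/* Illustrative: walk to a uniformly random node in the gossmap. */
static struct gossmap_node *pick_random_node(struct gossmap *map)
{
	struct gossmap_node *n = gossmap_first_node(map);

	if (!n)
		return NULL;
	for (u64 i = pseudorand(gossmap_num_nodes(map)); i > 0; i--)
		n = gossmap_next_node(map, n);
	return n;
}
```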
Changelog-Changed: Gossipd can now request connections to additional nodes for improved gossip sync
While we have (I hope!) fixed the underlying sync problem, this can still happen with
older gossip. So now we tell it the channel is dead, so it won't happen more than once.
Fixes: #7703
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Set the remote funding pubkey on both lightningd and channeld when mutual splice lock is achieved.
This will be needed once rotating funding keys is enabled during splicing.
Changelog-None
Enable storing the remote funding pubkey in the DB if the channel peer decides to change it during splicing. It needs to be in the DB in case of restarts mid-splice.
Changelog-None
Added and updated error messages when splicing to make it clearer to the user why a splice is failing.
Changelog-Changed: Improved error messaging for splice commands.
Don't reply with update_fail_malformed_htlc, even though WIRE_INVALID_ONION_BLINDING
has BADONION set. Fail it with a normal error message.
This fixes a known FIXME.
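Schematically, the resulting rule (a sketch, not channeld's actual code):
```
#include <wire/onion_wire.h>

/* If we're the final destination, we may return the real error;
 * otherwise everything becomes invalid_onion_blinding, sent as a
 * normal error rather than update_fail_malformed_htlc, despite the
 * BADONION bit. */
static enum onion_wire blinded_path_errcode(bool we_are_final,
					    enum onion_wire err)
{
	return we_are_final ? err : WIRE_INVALID_ONION_BLINDING;
}
```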
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: entry to blinded paths returns more useful errors (e.g. if it's the final node, you get a real error; otherwise you get invalid_onion_blinding).
Payer metadata is a field that controls the payer ID
provided during the fetchinvoice process.
There are use cases where this is highly useful, such as
proving that the payer has paid for the correct item.
Imagine visiting a merchant's website to pay for multiple offers, where
one of these offers is a default offer (with no description and no set amount).
In this scenario, the merchant could claim not to have received
payment for a specific item. Since the same offer may be used to
fetch invoices for different products, there needs to be a way to
identify which product the invoice corresponds to.
With this commit, it will be possible to inject payer metadata,
which helps solve the issue described above.
For example, possible payer metadata could be `to_hex(b"{payer_node_id}.{product_id}.{created_at}")`.
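For instance, a usage sketch via pyln-client (all values are placeholders;
only `payer_metadata` is the new parameter):
```
from pyln.client import LightningRpc

rpc = LightningRpc("/run/lightning/lightning-rpc")  # placeholder path

payer_node_id = "02aabb..."  # placeholder
product_id = "sku-1234"      # placeholder
created_at = 1731000000      # placeholder

metadata = f"{payer_node_id}.{product_id}.{created_at}".encode().hex()
invoice = rpc.call("fetchinvoice", {"offer": "lno1...",  # placeholder
                                    "payer_metadata": metadata})
```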
Changelog-Added: JSON-RPC: `fetchinvoice` allows setting invreq_metadata via `payer_metadata` parameter.
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
We actually pruned before we got all the channels. Extend the pruning time,
which unfortunately makes the test slower.
```
node_factory = <pyln.testing.utils.NodeFactory object at 0x7ff72969e820>
bitcoind = <pyln.testing.utils.BitcoinD object at 0x7ff72968fe20>

def test_gossip_pruning(node_factory, bitcoind):
    """ Create channel and see it being updated in time before pruning
    """
    l1, l2, l3 = node_factory.get_nodes(3, opts={'dev-fast-gossip-prune': None,
                                                 'allow_bad_gossip': True})

    l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
    l2.rpc.connect(l3.info['id'], 'localhost', l3.port)

    scid1, _ = l1.fundchannel(l2, 10**6)
    scid2, _ = l2.fundchannel(l3, 10**6)

    mine_funding_to_announce(bitcoind, [l1, l2, l3])
    l1_initial_cupdate_timestamp = only_one(l1.rpc.listchannels(source=l1.info['id'])['channels'])['last_update']

    # Get timestamps of initial updates, so we can ensure they change.
    # Channels should be activated locally
>   wait_for(lambda: [c['active'] for c in l1.rpc.listchannels()['channels']] == [True] * 4)
```
We can see in the logs that it had actually already started pruning:
```
lightningd-1 2024-11-18T01:52:03.570Z DEBUG gossipd: Pruning channel 105x1x0 from network view (ages 1731894723 and 0)
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>