Large nodes were not always getting their own channel gossip out
reliably. The number of peers we spam our own channel gossip to
is limited to save large nodes on startup, but this should be
relaxed slightly to ensure propagation.
Changelog-Fixed: Own-channel gossip is broadcast to more peers on connect.
Using jsonrpc_request_sync, layers are loaded before we finish init,
so we never can be asked to create a layer before we've loaded it
(xpay creates a layer immediately on startup).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers new this release.
Create lower-level versions of routines to create biases, layers,
constraints, etc and only save the ones called from the public APIs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers were only added in this release
They're not always 34 (aka '"'). This is a side-effect of ids
changing from u64 to strings quite a while ago.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we now create a low-priority (2016 down to 12 blocks fee target) anchor for low-fee unilateral closes even if there's no urgency.
We're going to call this twice. But this patch also takes care
that a failed attempt to create an anchor doesn't alter other
variables!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. merge_deadlines can't really fail.
2. We don't care about incoming HTLCs with no preimage.
3. Debug print anchor points as soon as we calc them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cleans up the API: we have two functions now, one which is explicitly for
"I'm failing this because I saw this tx onchain".
Now we can correctly report the tx which closed the channel (previously
we would always report our own tx(s)!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: `close` now correctly reports the txid of the remote onchain unilateral tx if it races with a peer close.
Changelog-Fixed: Protocol: we no longer try to spend anchors if a commitment tx is already mined (reported by @niftynei).
Fixes: #7526
The seeker can send a full gossip query, which means the ping doesn't happen
(it needs 14-45 seconds of quiet!).
We disable the gossip_queries feature, so it doesn't ask.
```
def test_ping_timeout(node_factory):
# Disconnects after this, but doesn't know it.
l1_disconnects = ['xWIRE_PING']
l1, l2 = node_factory.get_nodes(2, opts=[{'dev-no-reconnect': None,
'disconnect': l1_disconnects},
{'dev-no-ping-timer': None}])
l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
# This can take 10 seconds (dev-fast-gossip means timer fires every 5 seconds)
l1.daemon.wait_for_log('seeker: startup peer finished', timeout=15)
# Ping timers runs at 15-45 seconds, *but* only fires if also 60 seconds
# after previous traffic.
> l1.daemon.wait_for_log('dev_disconnect: xWIRE_PING', timeout=60 + 45 + 5)
tests/test_connection.py:4194:
...
> raise TimeoutError('Unable to find "{}" in logs.'.format(exs))
E TimeoutError: Unable to find "[re.compile('dev_disconnect: xWIRE_PING')]" in logs.
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wait until a connection fails, or a subd is connected to the peer,
before letting another one through. This should prevent us from
overwhelming lightningd on large nodes, but unlike the previous back-off,
it's based on how fast lightningd is, not an arbitrary time.
We also let one through each second, in case we're connecting to many,
but not doing anything but gossip (e.g. 100 explicit connect
commands).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Reconnecting to peers at startup should be significantly faster (dependent on machine speed).
Rather than have lightningd call us repeatedly to try to connect, have
it tell us what peers are transient and aren't, and connectd will
automatically try to maintain that connection.
There's a new "downgrade_peer" message to tell it a peer is now
transient: to make it non-transient we simply tell connectd to
connect as a non-transient.
The first time, I missed that dual_open_control does its own state
transitions :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `connectd` now handles maintaining/reconnecting to important peers, and we remember the last successful address we connected to.
This is more useful than the last address, which may be it connecting
to us. And use it when we restore it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we connected out, remember that address. We always remember the last
address, but that may be an incoming address. This is explicitly the last
outgoing address which worked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to use this to ask if there are any channels which make it
important to reconnect to the peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In fact, only 951 of 17419 (5%) of node announcements are missing an address
(and gossipd doesn't know if we can connect to Tor addresses anyway) so
just check it *has* a node_announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let lightningd feed us hints to try first, but we can extract the
addresses from node_announcement messages ourselves.
(Lightningd used to ask gossipd on our behalf: this is far simpler!)
One side effect of this is that we don't hand back address hints given to us
by lightningd: it would use these again for reconnecting. This is breaks
test_sendpay_grouping, so we disable it temporarily.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It turns out that under some circumstances we end up clearing the
pointee of `current` but not the pointer. Thus when we select the next
slot we can end up reusing the same slot, making it its own parent.
We forcefull break these cycles by enforcing that `current` should
never be returned and be set as its own parent.
Changelog-None
Trace spans form a tree, but we don't actually check that the
structure doesn't break. Breakage can for example come if we use the
same key accidentally, making a new span its own ancestor.
We have the space in memory set aside anyway, so let's just copy the
`trace_id` into the span itself, rather than resolving the `root` at
time of emission.
This was a bit harder to identify: during an `io_loop` run we suspend
the current span before handing over to `io_loop`, and later when a callback
is called we resume the span again. Depending on how we return from
the `io_loop` instance that is used to drive the startup, we either
have resumed the last span, or we don't. Since we start a span before
`io_loop` and want it to be emitted afterwards, we need to take care
of the case where we returned from a callback that did not resume, and
therefore the current context is empty.
Making `trace_span_resume` idempotent means we can just resume it
manually.
Ideally we'd push the suspend / resume logic down into `io_loop`
itself, and then we'd have just one place. Maybe suspend and resume
callbacks that can be configured in `io_loop`?
After adding the DB query instrumentation we ran into a couple of
issues, with spans not being resumed correctly, and it was rather hard
to identify the problem. This adds debug statements so we can trace
the tracing (traception if you will).
Changelog-None
lightningd/test/run-find_my_abspath.c includes ../lightningd.c, which includes
header_versions_gen.h, a generated header file.
lightningd/Makefile correctly declares that lightningd/lightningd.o depends on
header_versions_gen.h, but lightningd/test/Makefile lacks any such declaration
regarding lightningd/test/run-find_my_abspath.c, which leads to build failure:
In file included from lightningd/test/run-find_my_abspath.c:5:
lightningd/test/../lightningd.c:64:10: fatal error: header_versions_gen.h: No such file or directory
64 | #include <header_versions_gen.h>
| ^~~~~~~~~~~~~~~~~~~~~~~
Declare the missing dependency in lightningd/test/Makefile so that Make will
ensure that header_versions_gen.h is generated before it attempts to build
lightningd/test/run-find_my_abspath.o.
Changelog-None
Due to Darwin-arm64 conditional setting of `CPPFLAGS`, the subsequent
`CPPFLAGS +=` is resolved earlier on ARM macOS which results in empty
paths being used.
Changelog-None