Commit Graph

11422 Commits

Author SHA1 Message Date
Alex Myers
08a2b3b86c pytest: test_gossip_ratelimit checks routing graph and squelch 2022-07-06 14:31:19 +09:30
Alex Myers
9dc794dba8 gossipd: make use of new ratelimit bit in gossip_store length mask
routing.c now flags rate-limited gossip as it enters the gossip_store but
still uses it when updating the routing graph. Flagged gossip is not
rebroadcast to gossip peers.

Changelog-Changed: gossipd: now accepts spam gossip, but squelches it for
peers.
2022-07-06 14:31:19 +09:30
Alex Myers
cbafc0fa33 gossip_store: add flag for spam gossip, update to v10
This will be used to decouple internal use of gossip from what is
passed to gossip peers. Updates GOSSIP_STORE_VERSION to 10.

Changelog-Changed: gossip_store updated to version 10.
2022-07-06 14:31:19 +09:30
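
The flag described in cbafc0fa33 is carried in the length field of each gossip_store record. The following is a minimal sketch of how a reader might separate the payload length from such flags. The constant names and bit positions are assumptions chosen for illustration, not values taken from this log.

```python
# Hypothetical flag bits packed into the top of a 32-bit length field.
# Names and positions are assumptions for illustration only.
LEN_DELETED_BIT = 0x80000000
LEN_PUSH_BIT = 0x40000000
LEN_RATELIMIT_BIT = 0x20000000
LEN_MASK = ~(LEN_DELETED_BIT | LEN_PUSH_BIT | LEN_RATELIMIT_BIT) & 0xFFFFFFFF


def split_length_field(raw):
    """Return (payload_length, flags) for a raw 32-bit length field."""
    flags = {
        "deleted": bool(raw & LEN_DELETED_BIT),
        "push": bool(raw & LEN_PUSH_BIT),
        "ratelimited": bool(raw & LEN_RATELIMIT_BIT),
    }
    return raw & LEN_MASK, flags


def should_rebroadcast(raw):
    """Skip deleted and rate-limited records when streaming to peers."""
    _, flags = split_length_field(raw)
    return not flags["deleted"] and not flags["ratelimited"]
```

With this split, a rate-limited record can still feed the local routing graph while `should_rebroadcast` keeps it away from gossip peers.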
Christian Decker
669bca4a02 ld: Use the local alias in the htlc_accepted hook
If we have no real short-channel-id this is the best we can do. Use
the local one since we can be sure we have assigned one.
2022-07-04 22:14:06 +02:00
Christian Decker
d115d01105 zeroconf: Add channel_type variant support
If zeroconf was negotiated we'll add it to the basic channel
type. Similarly we'll accept it if it was negotiated too.
2022-07-04 22:14:06 +02:00
Christian Decker
a07797c166 features: Add function to unset a featurebit
Needed so we can blank optional bits when comparing channel_types.
2022-07-04 22:14:06 +02:00
Christian Decker
695a98e5d8 pyln-testing: Add gossip_store parser to testing framework
I had to parse quite a few of these files debugging zeroconf, so I
thought it might be nice to have direct access here.

Changelog-Added: pyln-testing: Added utilities to read and parse `gossip_store` file for nodes.
2022-07-04 22:14:06 +02:00
Christian Decker
db61b048a9 zeroconf: Announce the channel with the real scid as well as aliases
With zeroconf we have to duplicate the `local_channel_announcement`
since we locally announce the aliased version, and then on the first
confirmation we also add the funding scid version.
2022-07-04 22:14:06 +02:00
Christian Decker
29157735fb channeld: Track the funding depth while awaiting lockin
We used to agree upon the `minimum_depth` with the peer, thus when
they told us that the funding locked we'd be sure we either have a
scid or we'd trigger the state transition when we do. However if we
had a scid, and we got a funding_locked we'd trust them not to have
sent it early. Now we explicitly track the depth in the channel while
waiting for the funding to confirm.

Changelog-Fixed: channeld: Enforce our own `minimum_depth` beyond just confirming
2022-07-04 22:14:06 +02:00
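
The depth tracking described in 29157735fb can be pictured roughly as follows. This is a sketch under assumptions: the class and method names are hypothetical and do not mirror the channeld code. It only illustrates that the peer's `funding_locked` alone does not complete lock-in.

```python
from dataclasses import dataclass


@dataclass
class FundingState:
    """Hypothetical model of a channel awaiting lock-in."""
    minimum_depth: int            # our own requirement, in confirmations
    depth: int = 0                # confirmations seen so far
    remote_locked: bool = False   # peer has sent funding_locked

    def on_depth_change(self, depth):
        self.depth = depth

    def on_remote_funding_locked(self):
        self.remote_locked = True

    def we_are_ready(self):
        # Our side is ready only once our own minimum_depth is reached.
        return self.depth >= self.minimum_depth

    def locked_in(self):
        # The peer's funding_locked is necessary but never sufficient.
        return self.remote_locked and self.we_are_ready()
```

With `minimum_depth=0` the same check passes immediately, which is the zeroconf case handled by the surrounding commits.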
Christian Decker
692a001198 ld: Use the local alias when reporting failures with zeroconf
Ran into this with a zeroconf channel, without confs, that was
disconnected.
2022-07-04 22:14:06 +02:00
Christian Decker
19f8ed3fe1 channeld: Explicitly use the first commitment point on reconnect
The spec explicitly asks for the first point, while we were using the
most recent one. This worked fine before zeroconf, but with zeroconf
it can happen.
2022-07-04 22:14:06 +02:00
Christian Decker
b195e6d9d4 pytest: Add test for zeroconf channels transitioning to be public
We need to switch to the real short_channel_id at 6 confirmations, and
gossip messages should reflect that in order to be valid to others.
2022-07-04 22:14:06 +02:00
Christian Decker
306d26357e pytest: Add test for forwarding over zeroconf channels 2022-07-04 22:14:06 +02:00
Christian Decker
df739956ab pytest: Add a test for direct payment over zeroconf channels 2022-07-04 22:14:06 +02:00
Christian Decker
252ccfa7ab db: Store the local alias for forwarded incoming payments
Not only can the outgoing edge be a zeroconf channel, it can also be
the incoming channel. So we revert to the usual trick of using the
local alias if the short_channel_id isn't known yet.

We use the LOCAL alias instead of the REMOTE alias even though the
sender likely used the REMOTE alias to refer to the channel. This is
because we control the LOCAL alias, and we keep it stable during the
lifetime of the channel, whereas the REMOTE one could change or not be
there yet.
2022-07-04 22:14:06 +02:00
Christian Decker
92b891bee3 ld: Add function to retrieve either the scid or the local alias
We use this in a couple of places when we want to refer to a channel
by its `short_channel_id`. I'm moving this into a separate function
primarily to have a way to mark places where we do that.
2022-07-04 22:14:06 +02:00
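
The fallback described in 92b891bee3 and 252ccfa7ab amounts to preferring the real short_channel_id and otherwise using the local alias. A minimal sketch, with a hypothetical function name and plain strings standing in for the real types:

```python
from typing import Optional


def scid_or_local_alias(scid: Optional[str], alias_local: Optional[str]) -> str:
    """Prefer the real scid; fall back to the alias we assigned ourselves."""
    if scid is not None:
        return scid
    if alias_local is None:
        # Assumption for this sketch: a channel always has at least one of the two.
        raise ValueError("channel has neither a real scid nor a local alias")
    return alias_local
```

Routing every lookup through one helper also makes it easy to find the call sites that may see an alias instead of a real scid.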
Christian Decker
7930e34da3 pay: Populate the channel hints with either the scid or the alias
We'll use one of the two, and we reuse the `scid` field, since we
don't really care that much which one it is.
2022-07-04 22:14:06 +02:00
Christian Decker
0ce68b26c6 jsonrpc: Include the direction also if we have an alias
The direction only depends on the ordering between node_ids, not the
short_channel_id, so we can include it and it won't change. This was
causing some trouble loading the `channel_hints` in the `pay` plugin.
2022-07-04 22:14:06 +02:00
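
The point made in 0ce68b26c6 is that the direction can be computed without any short_channel_id. A small sketch, assuming the BOLT 7 convention that direction 0 belongs to the lexicographically smaller node_id (the function name is hypothetical):

```python
def channel_direction(our_node_id: bytes, peer_node_id: bytes) -> int:
    """Return 0 if our node_id sorts first, otherwise 1."""
    if our_node_id == peer_node_id:
        raise ValueError("a channel needs two distinct node_ids")
    # Python compares bytes objects lexicographically.
    return 0 if our_node_id < peer_node_id else 1
```

Because only the two node_ids are involved, the result is the same whether the channel is currently known by an alias or by its real scid.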
Christian Decker
22b6e33030 zeroconf: Trigger coin_movement on first real confirmation
We don't trigger on depth=0 since that'd give us bogus blockheights
and pointers into the chain; instead we defer until we get a first
confirmation. This extracts some of the logic from `lockin_complete`,
into the depth change listener (not the remote funding locked since at
that point we're certainly locked in and we don't really care about
that for bookkeeping anyway).
2022-07-04 22:14:06 +02:00
Christian Decker
2dc86bf29b db: Store the alias if that's all we got in a forward 2022-07-04 22:14:06 +02:00
Christian Decker
a4e6b58fa4 ld: Consider local aliases when forwarding 2022-07-04 22:14:06 +02:00
Christian Decker
1ae3dba529 invoice: Consider aliases too when selecting routehints 2022-07-04 22:14:06 +02:00
Christian Decker
78c9c6a9e0 ld: Allow lockin despite not having a scid yet
This is needed for us to transition to CHANNELD_NORMAL for zeroconf
channels, i.e., channels where we don't have a short channel ID yet.

We'll have to call lockin_complete a second time, once we learn the
real scid.
2022-07-04 22:14:06 +02:00
Christian Decker
cdedd433a4 jsonrpc: Add aliases to listpeers result 2022-07-04 22:14:06 +02:00
Christian Decker
5e74048508 gossip: Add both channel directions with their respective alias
We locally generate an update with our local alias, and get one from
the peer with the remote alias, so we need to add them both. We do so
only if using the alias in the first place though.
2022-07-04 22:14:06 +02:00
Christian Decker
bf44178047 gossipd: Use the remote alias if no real scid is known
This is for the local channel announcement that'll not leave this
host, as it doesn't have signatures.
2022-07-04 22:14:06 +02:00
Christian Decker
3e57d6f9d0 channeld: On funding_locked, remember either alias or real scid 2022-07-04 22:14:06 +02:00
Christian Decker
cf51edd95b channeld: Remember remote alias for channel announcements and update 2022-07-04 22:14:06 +02:00
Christian Decker
c98f011479 channeld: Send a depth=0 notification when channeld starts up
This is used in order to ensure zeroconf doesn't just wait for the
first confirmation despite mindepth being set to 0.
2022-07-04 22:14:06 +02:00
Christian Decker
b9817d395f zeroconf: Wire the aliases through channeld 2022-07-04 22:14:06 +02:00
Christian Decker
de1c0b51f0 zeroconf: Add alias_remote and alias_local to channel and DB
`alias_local` is generated locally and sent to the peer so it knows
what we're calling the channel, while `alias_remote` is received from
the peer so we know what to include in routehints when generating
invoices.
2022-07-04 22:14:06 +02:00
Christian Decker
9d3cb95489 wire: Add funding_locked tlv patch from PR lightning/bolts#910
Minimal set of changes to update the peer_wire.csv to include the TLV
field in the `funding_locked` message, and add type 1=alias from that
PR too.
2022-07-04 22:14:06 +02:00
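
To make the "type 1=alias" field from 9d3cb95489 concrete, here is a sketch of encoding a single TLV record that carries an alias. It assumes the alias is an 8-byte short_channel_id written in the usual `BLOCKxTXxOUT` form. The helper names are hypothetical and this is not the generated wire code.

```python
import struct


def parse_scid(text: str) -> int:
    """Pack 'BLOCKxTXxOUT' into the 8-byte short_channel_id integer."""
    block, txindex, outnum = (int(part) for part in text.split("x"))
    if not (0 <= block < 1 << 24 and 0 <= txindex < 1 << 24
            and 0 <= outnum < 1 << 16):
        raise ValueError("scid component out of range")
    return (block << 40) | (txindex << 16) | outnum


def encode_alias_tlv(alias: str) -> bytes:
    """Encode one TLV record: type 1, length 8, big-endian scid value."""
    value = struct.pack(">Q", parse_scid(alias))
    # Types and lengths below 0xfd fit in a single BigSize byte.
    return bytes([1, len(value)]) + value


def decode_alias_tlv(record: bytes) -> str:
    """Inverse of encode_alias_tlv for a single, exact-length record."""
    if len(record) != 10 or record[0] != 1 or record[1] != 8:
        raise ValueError("not a type-1, length-8 record")
    (raw,) = struct.unpack(">Q", record[2:])
    return f"{raw >> 40}x{(raw >> 16) & 0xFFFFFF}x{raw & 0xFFFF}"
```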
Christian Decker
3fbaac3fdb jsonrpc: Add option_zeroconf handling to listpeers 2022-07-04 22:14:06 +02:00
Christian Decker
8609f9e00d pytest: Test the mindepth customizations of fundchannel and hook
Just test that we can customize, and we'll add mindepth=0 support in
the next couple of commits.
2022-07-04 22:14:06 +02:00
Christian Decker
1477873190 plugin: Allow plugins to customize the mindepth in accept_channel
This is the counterpart of the `mindepth` parameter in `fundchannel`
and friends. Allows dynamic lookups of `node_id` and selectively
opting into `option_zeroconf` being used.

Changelog-Added: plugin: The `openchannel` hook may return a `mindepth` indicating how many confirmations are required.
2022-07-04 22:14:06 +02:00
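
As an illustration of the hook result described in 1477873190, the sketch below builds a response that opts selected peers into zero confirmations. The `trusted_peers` set, the function name, and the use of an `id` field in the hook payload are assumptions made for this example.

```python
def openchannel_response(openchannel: dict, trusted_peers: set) -> dict:
    """Build an `openchannel` hook result, optionally setting `mindepth`."""
    response = {"result": "continue"}
    # Assumption: the payload identifies the opening peer under "id".
    if openchannel.get("id") in trusted_peers:
        # Only peers we explicitly trust get a zero-confirmation channel.
        response["mindepth"] = 0
    return response
```

A plugin would return such a dict from its `openchannel` hook handler. Peers outside the set get no `mindepth` override at all.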
Christian Decker
adbb977053 openingd: If we have negotiated zeroconf we use our mindepth
With `option_zeroconf` we may now send `channel_ready` at any time we
want, rendering the `mindepth` parameter a mere heads up. We ignore it
in favor of our own value, since we plan to trigger releasing the
`channel_ready` once we reach our own depth.
2022-07-04 22:14:06 +02:00
Christian Decker
e4511452ac bolt: Reflect the zeroconf featurebits in code 2022-07-04 22:14:06 +02:00
Christian Decker
185cd81be4 jsonrpc: Add mindepth argument to fundchannel and multifundchannel
This will eventually enable us to specify 0 for zeroconf channels.

Changelog-Added: JSON-RPC: Added `mindepth` argument to specify the number of confirmations we require for `fundchannel` and `multifundchannel`
2022-07-04 22:14:06 +02:00
Christian Decker
6d07f4ed85 json: Add parser for u32 params 2022-07-04 22:14:06 +02:00
Vincenzo Palazzo
bad943da55 valgrind: ignore plugin build with rust
This should fix the following stack trace:


```
2022-06-29T14:19:41.183Z DEBUG   lightningd: Command returned result after jcon close
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.55581
==55581== Syscall param statx(file_name) points to unaddressable byte(s)
==55581==    at 0x4B0188E: statx (statx.c:29)
==55581==    by 0x1133481: std::sys::unix::fs::try_statx (weak.rs:178)
==55581==    by 0x11265E0: std::fs::buffer_capacity_required (fs.rs:851)
==55581==    by 0x112675B: <std::fs::File as std::io::Read>::read_to_string (fs.rs:644)
==55581==    by 0x10DACA8: num_cpus::linux::Cgroup::param (linux.rs:214)
==55581==    by 0x10DAB39: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==55581==    by 0x10DA9C2: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==55581==    by 0x10DA5A1: num_cpus::linux::load_cgroups (linux.rs:149)
==55581==    by 0x10DA23D: num_cpus::linux::init_cgroups (linux.rs:129)
==55581==    by 0x10DCDC8: core::ops::function::FnOnce::call_once (function.rs:227)
==55581==    by 0x10DC749: std::sync::once::Once::call_once::{{closure}} (once.rs:276)
==55581==    by 0x21EE89: std::sync::once::Once::call_inner (once.rs:434)
==55581==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==55581== 
==55581== Syscall param statx(buf) points to unaddressable byte(s)
==55581==    at 0x4B0188E: statx (statx.c:29)
==55581==    by 0x1133481: std::sys::unix::fs::try_statx (weak.rs:178)
==55581==    by 0x11265E0: std::fs::buffer_capacity_required (fs.rs:851)
==55581==    by 0x112675B: <std::fs::File as std::io::Read>::read_to_string (fs.rs:644)
==55581==    by 0x10DACA8: num_cpus::linux::Cgroup::param (linux.rs:214)
==55581==    by 0x10DAB39: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==55581==    by 0x10DA9C2: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==55581==    by 0x10DA5A1: num_cpus::linux::load_cgroups (linux.rs:149)
==55581==    by 0x10DA23D: num_cpus::linux::init_cgroups (linux.rs:129)
==55581==    by 0x10DCDC8: core::ops::function::FnOnce::call_once (function.rs:227)
==55581==    by 0x10DC749: std::sync::once::Once::call_once::{{closure}} (once.rs:276)
==55581==    by 0x21EE89: std::sync::once::Once::call_inner (once.rs:434)
==55581==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==55581==
--------------------------------------------------------------------------------
Leaving base_dir /tmp/ltests-hzt9ppqp intact, it still has test sub-directories with failure details: ['test_peers_1']
```

Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
2022-07-03 20:36:20 +09:30
Rusty Russell
1771b8ec22 CI: re-enable checks, by changing errant tab back to spaces.
And the Python contrib/ stuff seems to fail under VALGRIND, so attach
it to a normal make line.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-03 20:36:20 +09:30
Igor Bubelov
ee3f059e80 Update README.md 2022-07-03 12:41:07 +02:00
Igor Bubelov
50107754a7 Add README.md 2022-07-03 12:41:07 +02:00
Greg Sanders
9f953b5efb No funding_wscript arg in initial_commit_tx 2022-07-01 13:30:19 -05:00
Rusty Russell
2fe17a5837 CI: make sure *someone* runs check-units under valgrind!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-29 21:07:42 +09:30
Rusty Russell
9ab7c8aed3 connectd/test: fix memleak in test.
```
VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all connectd/test/run-netaddress > /dev/null
==2483395== 16 bytes in 1 blocks are still reachable in loss record 1 of 15
==2483395==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==2483395==    by 0x10D59A: autodata_register_ (autodata.c:20)
==2483395==    by 0x10EB26: register_autotype_type_to_string (type_to_string.h:77)
==2483395==    by 0x10EB6B: register_one_type_to_string0 (type_to_string.c:8)
==2483395==    by 0x188C0C: __libc_csu_init (in /home/rusty/devel/cvs/lightning/connectd/test/run-netaddress)
==2483395==    by 0x4A3A00F: (below main) (libc-start.c:264)
==2483395==
==2483395== 40 bytes in 1 blocks are still reachable in loss record 2 of 15
==2483395==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
...
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-29 21:07:42 +09:30
Rusty Russell
fd90e5746b connectd: don't keep around more than one old connection.
This was fixed in 1c495ca5a8 ("connectd:
fix accidental handling of old reconnections.") and then reverted by
the rework in "connectd: avoid use-after-free upon multiple
reconnections by a peer".

The latter made the race much less likely, since we cleaned up the
reconnecting struct once the connection was hung up by the remote
node, but it's still theoretically possible.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-28 13:47:27 +09:30
Rusty Russell
afbddcf7f3 lightningd: fix crash on rapid reconnect.
Happens occasionally when running
`tests/test_connection.py::test_mutual_reconnect_race` (which is too
flaky to add, without more fixes):


```
lightningd: lightningd/peer_control.c:1252: peer_active: Assertion `!channel->owner' failed.
lightningd: FATAL SIGNAL 6 (version v0.11.0.1-38-g4f167da)
0x5594a41f8f45 send_backtrace
	common/daemon.c:33
0x5594a41f8fef crashdump
	common/daemon.c:46
0x7f7cb585c08f ???
	/build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x7f7cb585c00b __GI_raise
	../sysdeps/unix/sysv/linux/raise.c:51
0x7f7cb583b858 __GI_abort
	/build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:79
0x7f7cb583b728 __assert_fail_base
	/build/glibc-SzIz7B/glibc-2.31/assert/assert.c:92
0x7f7cb584cfd5 __GI___assert_fail
	/build/glibc-SzIz7B/glibc-2.31/assert/assert.c:101
0x5594a41b45ca peer_active
	lightningd/peer_control.c:1252
0x5594a418794c connectd_msg
	lightningd/connect_control.c:457
0x5594a41cd457 sd_msg_read
	lightningd/subd.c:556
0x5594a41ccbe5 read_fds
	lightningd/subd.c:357
0x5594a4269fc2 next_plan
	ccan/ccan/io/io.c:59
0x5594a426abca do_plan
	ccan/ccan/io/io.c:407
0x5594a426ac0c io_ready
	ccan/ccan/io/io.c:417
0x5594a426ceff io_loop
	ccan/ccan/io/poll.c:453
0x5594a41930d9 io_loop_with_timers
	lightningd/io_loop_with_timers.c:22
0x5594a4199293 main
	lightningd/lightningd.c:1181
0x7f7cb583d082 __libc_start_main
	../csu/libc-start.c:308
0x5594a416e15d ???
	???:0
0xffffffffffffffff ???
	???:0
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-28 13:47:27 +09:30
Matt Whitlock
83c825945c connectd: avoid use-after-free upon multiple reconnections by a peer
`peer_reconnected` was freeing a `struct peer_reconnected` instance
while a pointer to that instance was registered to be passed as an
argument to the `retry_peer_connected` callback function. This caused a
use-after-free crash when `retry_peer_connected` attempted to reparent
the instance to the temporary context.

Instead, never have `peer_reconnected` free a `struct peer_reconnected`
instance, and only ever allow such an instance to be freed after the
`retry_peer_connected` callback has finished with it. To ensure that the
instance is freed even if the connection is closed before the callback
can be invoked, parent the instance to the connection rather than to the
daemon.

Absent the need to free `struct peer_reconnected` instances outside of
the `retry_peer_connected` callback, there is no use for the
`reconnected` hashtable, so remove it as well.

See: https://github.com/ElementsProject/lightning/issues/5282#issuecomment-1141454255
Fixes: #5282
Fixes: #5284
Changelog-Fixed: connectd no longer crashes when peers reconnect.
2022-06-28 13:47:27 +09:30
Rusty Russell
4ee55acc71 connectd: don't start connecting in parallel in peer_conn_closed.
The crash below from @zerofeerouting left me confused.  The invalid
value in fmt_wireaddr_internal is a telltale sign of use-after-free.

This backtrace shows us destroying the conn *twice*: what's happening?

Well, tal carefully protects against destroying twice: it's not that
unusual to free something in a destructor which has already been freed.
So this indicates that there are *two* io_conn hanging off one
struct connecting, which isn't supposed to happen!  We deliberately
call try_connect_one_addr() initially, then inside the io_conn destructor.

But due to races in connectd vs lightningd connection state, we added
a fix which allows a connect command to sit around while the peer is
cleaning up (6cc9f37cab) and get fired
off when it's done.

But what if, in the chaos, we are already connecting again?  Now we'll
end up with *two* connections.

Fortunately, we have a `conn` pointer inside struct connecting, which
(with a bit of additional care) we can ensure is only non-NULL while
we're actually trying to connect.  This lets us check that before
firing off a new connection attempt in peer_conn_closed.

```
lightning_connectd: FATAL SIGNAL 6 (version v0.11.2rc2-2-g8f7e939)
0x5614a4915ae8 send_backtrace
	common/daemon.c:33
0x5614a4915b72 crashdump
	common/daemon.c:46
0x7ffa14fcd72f ???
	???:0
0x7ffa14dc87bb ???
	???:0
0x7ffa14db3534 ???
	???:0
0x5614a491fc71 fmt_wireaddr_internal
	common/wireaddr.c:255
0x5614a491fc7a fmt_wireaddr_internal_
	common/wireaddr.c:257
0x5614a491ea6b type_to_string_
	common/type_to_string.c:32
0x5614a490beaa destroy_io_conn
	connectd/connectd.c:754
0x5614a494a2f1 destroy_conn
	ccan/ccan/io/poll.c:246
0x5614a494a313 destroy_conn_close_fd
	ccan/ccan/io/poll.c:252
0x5614a4953804 notify
	ccan/ccan/tal/tal.c:240
0x5614a49538d6 del_tree
	ccan/ccan/tal/tal.c:402
0x5614a4953928 del_tree
	ccan/ccan/tal/tal.c:412
0x5614a4953e07 tal_free
	ccan/ccan/tal/tal.c:486
0x5614a4908b7a try_connect_one_addr
	connectd/connectd.c:870
0x5614a490bef1 destroy_io_conn
	connectd/connectd.c:759
0x5614a494a2f1 destroy_conn
	ccan/ccan/io/poll.c:246
0x5614a494a313 destroy_conn_close_fd
	ccan/ccan/io/poll.c:252
0x5614a4953804 notify
	ccan/ccan/tal/tal.c:240
0x5614a49538d6 del_tree
	ccan/ccan/tal/tal.c:402
0x5614a4953e07 tal_free
	ccan/ccan/tal/tal.c:486
0x5614a4948f08 io_close
	ccan/ccan/io/io.c:450
0x5614a4948f59 do_plan
	ccan/ccan/io/io.c:401
0x5614a4948fe1 io_ready
	ccan/ccan/io/io.c:417
0x5614a494a8e6 io_loop
	ccan/ccan/io/poll.c:453
0x5614a490c12f main
	connectd/connectd.c:2164
0x7ffa14db509a ???
	???:0
0x5614a4904e99 ???
	???:0
0xffffffffffffffff ???
	???:0
```

Fixes: #5339
Changelog-Fixed: connectd: occasional crash when we reconnect to a peer quickly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-28 13:46:59 +09:30