It's actually two separate u16 fields, so actually treat it as
such!
Cleans up zombie handling code a bit too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The gossip_store version byte was unaccounted for in the initial traversal
of gossip_store_end. This lead to an offset and a bogus message length
field. As a result, an early portion of the gossip_store could have been
skipped, potentially leading to gossip propagation issues downstream.
Fixes#5572#5565
Changelog-fixed: proper gossip_store operation may resolve some previous gossip propagation issues
This contains the zeroconf stuff, with funding_locked renamed to
channel_ready. I change that everywhere, and try to fix up the
comments.
Also the `alias` field is called `short_channel_id`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: `funding_locked` is now called `channel_ready` as per latest BOLTs.
This will be used to decouple internal use of gossip from what is
passed to gossip peers. Updates GOSSIP_STORE_VERION to 10.
Changelog-Changed: gossip_store updated to version 10.
@whitslack complained of large CPU usage by connectd at startup;
I ran perf record on connectd on my machine (which sees a little spike, only)
and I see the cost of reading and discarding the entries:
```
- 95.52% 5.24% lightning_conne lightning_connectd [.] gossip_store_next
- 90.28% gossip_store_next
+ 40.27% tal_alloc_arr_
+ 22.78% tal_free
+ 11.74% crc32c
+ 9.32% fromwire_peektype
+ 4.10% __libc_pread64 (inlined)
1.70% be32_to_cpu
```
Much of this is caused by the search for our own gossip: keeping this separately
would be even better, but this fix is minimal.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: connectd: reduce initial CPU load when connecting to peers.
We now let gossipd do it.
This also means there's nothing left in 'struct per_peer_state' to
send across the wire (the fds are sent separately), so that gets
removed from wire messages too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually intercept the gossip_timestamp_filter, so the gossip_store
mechanism inside the per-peer daemon never kicks off for normal connections.
The gossipwith tool doesn't set OPT_GOSSIP_QUERIES, so it gets both, but
that only effects one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before:
Ten builds, laptop -j5, no ccache:
```
real 0m36.686000-38.956000(38.608+/-0.65)s
user 2m32.864000-42.253000(40.7545+/-2.7)s
sys 0m16.618000-18.316000(17.8531+/-0.48)s
```
Ten builds, laptop -j5, ccache (warm):
```
real 0m8.212000-8.577000(8.39989+/-0.13)s
user 0m12.731000-13.212000(12.9751+/-0.17)s
sys 0m3.697000-3.902000(3.83722+/-0.064)s
```
After:
Ten builds, laptop -j5, no ccache: 8% faster
```
real 0m33.802000-35.773000(35.468+/-0.54)s
user 2m19.073000-27.754000(26.2542+/-2.3)s
sys 0m15.784000-17.173000(16.7165+/-0.37)s
```
Ten builds, laptop -j5, ccache (warm): 1% faster
```
real 0m8.200000-8.485000(8.30138+/-0.097)s
user 0m12.485000-13.100000(12.7344+/-0.19)s
sys 0m3.702000-3.889000(3.78787+/-0.056)s
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They're not defined to be, though we've not seen this on Linux (testing
showed that it is page-level atomic, which means it can still happen across
page boundaries though!). This was pointed out by whitslack in
https://github.com/ElementsProject/lightning/issues/4288
In practice, this just means not complaining when it happens, and also
not trying to get tricky to use it on MacOS (we can safely seek & write,
since we're single-threaded).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Removed bogus UNUSUAL log about gossip_store 'short test'.
We should actually be including this (as it may define _GNU_SOURCE
etc) before any system headers. But where we include <assert.h> we
often didn't, because check-includes would complain that the headers
included it too.
Weaken that check, and include config.h in C files before assert.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids overwriting the ones in git, and generally makes things neater.
We have convenience headers wire/peer_wire.h and wire/onion_wire.h to
avoid most #ifdefs: simply include those.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happened on Travis, and the gossip_store was a suspicious 4096
bytes long. This implies they're using some non-atomic filesystem
(gossipd always does atomic writes to gossip_store), but if they are,
others surely are too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I decided to try a faster implementation, only to find our crc32c was
not correct! Ouch.
I removed the crc32c functions from ccan/crc, and added a new crc32c
module which has the Mark Adler x86-64-optimized variants.
We bump gossip_store version again, since csums have changed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we intercept the peer's gossip_timestamp_filter request
in the per-peer subdaemon itself. The rest of the semantics are fairly
simple however.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(We don't increment the gossip_store version, since there are only a
few commits since the last time we did this).
This lets the reader simply filter messages; this is especially nice since
the channel_announcement timestamp is *derived*, not in the actual message.
This also creates a 'struct gossip_hdr' which makes the code a bit
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project. But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.
This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.
We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the high bit of the length field: this way we can still check
that the checksums are valid on deleted fields.
Once this is done, serially reading the gossip_store file will result
in a complete, ordered, minimal gossip broadcast. Also, the horrible
corner case where we might try to delete things from the store during
load time is completely gone: we only load non-deleted things.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>