In particular, this lets you find the exact htlc_maximum_msat/htlc_minimum_msat
values.
This means we actually create real channel_updates for local mods, which
requires a second "local" scratch region.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we don't compact the gossmap on the fly (FIXME!) we can
easily surpass 4GB in the gossmap, and 32 bit offsets are not
sufficient.
I'm a bit surprised we don't crash immediately, but we've definitely
seen issues.
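To illustrate the failure mode (a minimal sketch, not the gossmap code; both helper names are hypothetical): once an offset passes 4GB, storing it in a 32-bit field silently wraps, so it ends up pointing at an unrelated place near the start of the file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does this file offset survive being stored in 32 bits? */
bool offset_fits_u32(uint64_t off)
{
	return (uint64_t)(uint32_t)off == off;
}

/* Hypothetical helper: what a 32-bit field would actually hold for this offset. */
uint32_t truncated_offset(uint64_t off)
{
	return (uint32_t)off;
}
```

An offset of 0x100000010 (just past 4GB) truncates to 0x10, which is why widening the offsets is the fix rather than adding more checks.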
Changelog-Fixed: gossipd: crash errors with large gossip_store (>4GB) growth on longer-running nodes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was weird not to have a capacity associated with localmods channels, and
fixing it has some very nice side effects.
Now the gossmap_chan_get_capacity() call never fails (we prevented reading
of channels from gossmap in the partially-written case already), so we
make it return the capacity. We do this in msat, because that's what
all the callers want.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is actually what we want in several places: to only override one or
two fields in a channel_update.
We add a gossmap_local_setchan() with a similar API to the old
gossmap_local_updatechan(), for the case where we want to set every
field.
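A minimal sketch of the "only override what you were given" idea, assuming a simplified struct and a hypothetical update_fields() helper rather than the real gossmap_local_updatechan() signature: a NULL pointer means "leave that field alone".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one direction of a channel. */
struct half_chan_sketch {
	bool enabled;
	uint64_t htlc_min_msat, htlc_max_msat;
	uint32_t base_fee_msat;
	uint32_t proportional_fee;
	uint16_t cltv_delta;
};

/* Overwrite only the fields whose pointer is non-NULL. */
void update_fields(struct half_chan_sketch *hc,
		   const bool *enabled,
		   const uint64_t *htlc_min_msat,
		   const uint64_t *htlc_max_msat,
		   const uint32_t *base_fee_msat,
		   const uint32_t *proportional_fee,
		   const uint16_t *cltv_delta)
{
	if (enabled)
		hc->enabled = *enabled;
	if (htlc_min_msat)
		hc->htlc_min_msat = *htlc_min_msat;
	if (htlc_max_msat)
		hc->htlc_max_msat = *htlc_max_msat;
	if (base_fee_msat)
		hc->base_fee_msat = *base_fee_msat;
	if (proportional_fee)
		hc->proportional_fee = *proportional_fee;
	if (cltv_delta)
		hc->cltv_delta = *cltv_delta;
}
```

A "set every field" variant is then just the same call with no NULL arguments.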
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We allow adding them, but crash when we remove the localmods. Yet
this could theoretically happen if a channel we modified was removed
from the gossmap, anyway.
Reported-by: Lagrang3 <lagrang3@protonmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This simplifies the callers significantly: all channel_announcements now
have an amount, so gossmap_chan_get_capacity() only fails on a local
modification.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we needed to iterate forward to find a timestamp (only happens if we have gossip older than
2 hours), we never exited the loop, because it didn't actually move the offset.
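Roughly, the forward scan has this shape (a sketch over a hypothetical in-memory record array, not the real on-disk gossip_store layout): every record we skip has to advance the offset, otherwise the loop condition never changes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record summary: timestamp plus total length in bytes. */
struct rec_sketch {
	uint32_t timestamp;
	uint32_t len;
};

/* Return the byte offset of the first record with timestamp >= target,
 * or the total length if there is none. */
uint64_t find_offset(const struct rec_sketch *recs, size_t n, uint32_t target)
{
	uint64_t off = 0;

	for (size_t i = 0; i < n; i++) {
		if (recs[i].timestamp >= target)
			break;
		/* The step that must not be forgotten: move past this record. */
		off += recs[i].len;
	}
	return off;
}
```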
Fixes: https://github.com/ElementsProject/lightning/issues/7462
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This seems to be happening to some people, so don't panic. Unfortunately we don't have
a good error callback here, so we print a message to stderr.
Fixes: https://github.com/ElementsProject/lightning/issues/7249
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We only write these in two places: one where we get a message from lightningd about
our own channel, and one where we get a reply from lightningd about a txout check.
In the former case we explicitly check that we don't already have it in gossmap, so
add checks to the latter case, and give verbose detail if it's found.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a u64, we should pass by copy. This is a big sweeping change,
but mainly mechanical (change one, compile, fix breakage, repeat).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wrote a test program which passed num_channel_updates_rejected as NULL
(which we don't usually do), and valgrind complained:
```
==1048302== Conditional jump or move depends on uninitialised value(s)
==1048302== at 0x118B90: update_channel (gossmap.c:550)
==1048302== by 0x119EEE: map_catchup (gossmap.c:663)
==1048302== by 0x11A299: load_gossip_store (gossmap.c:726)
==1048302== by 0x11A352: gossmap_load (gossmap.c:1052)
==1048302== by 0x125362: main (run-route-infloop.c:90)
```
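The trace alone doesn't say which value was uninitialised, so the following is only a sketch of one defensive pattern for an optional out-parameter, using a hypothetical count_rejected() helper: redirect a NULL counter to an initialised local so that every later branch reads a defined value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns how many entries were accepted, and
 * (optionally) reports how many were rejected via num_rejected. */
size_t count_rejected(const bool *accepted, size_t n, size_t *num_rejected)
{
	size_t unused = 0;

	/* Caller doesn't care: count into an initialised local instead. */
	if (!num_rejected)
		num_rejected = &unused;

	*num_rejected = 0;
	for (size_t i = 0; i < n; i++) {
		if (!accepted[i])
			(*num_rejected)++;
	}
	return n - *num_rejected;
}
```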
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Thanks to amazing debugging assistance from grubles, we figured out
that indeed, my memory was correct: write and mmap are not consistent
on all platforms. The easiest fix is to disable mmap on OpenBSD for now:
the better fix is to do in-place updates using the mmap, and only rely
on write() for append (which always causes a remap anyway before it's accessed).
Fixes: https://github.com/ElementsProject/lightning/issues/7109
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We never enabled it, because we seemed to be eliminating valid
channels. We discard zombie-marked records on loading.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, allow callers to see unknown records we ignore (and let
them fail as a result), and get called if we can't pack a
channel_update into our internal format.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The only way you'll see private channel_updates is if you put them
there yourself with localmods.
I also renamed the confusing gossmap_chan_capacity to gossmap_chan_has_capacity.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Doesn't happen on x86, but struct gossmap_chan defines:
```
u32 private: 1;
u32 plus_scid_off: 31;
```
And valgrind complains when we initialize plus_scid_off and access it later:
```
VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all plugins/renepay/test/run-mcf > /dev/null
==186886== Conditional jump or move depends on uninitialised value(s)
==186886== at 0x10076388: chan_iter (gossmap.c:1098)
==186886== by 0x100797F3: gossmap_next_chan (gossmap.c:1112)
==186886== by 0x1008C5AF: main (run-mcf.c:309)
==186886== Uninitialised value was created by a heap allocation
==186886== at 0x40F0A44: malloc (vg_replace_malloc.c:431)
==186886== by 0x10072BAF: allocate (tal.c:256)
==186886== by 0x100737A7: tal_alloc_ (tal.c:463)
==186886== by 0x100738DF: tal_alloc_arr_ (tal.c:506)
==186886== by 0x10079507: load_gossip_store (gossmap.c:690)
==186886== by 0x10079667: gossmap_load (gossmap.c:978)
==186886== by 0x1008C4AF: main (run-mcf.c:295)
```
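A sketch of the kind of initialisation that avoids this class of complaint, assuming a cut-down struct (the field is renamed private_flag here, and init_chan() is a hypothetical helper): both bitfields share one 32-bit word, so we set both before anything reads either.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for the two bitfields quoted above. */
struct chan_sketch {
	uint32_t private_flag: 1;
	uint32_t plus_scid_off: 31;
};

/* Hypothetical initialiser: write every bit of the shared word. */
void init_chan(struct chan_sketch *c, uint32_t off, bool is_private)
{
	c->private_flag = is_private ? 1 : 0;
	c->plus_scid_off = off;
}
```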
Reported-by: @grubles
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: #6557
This fixes a crash that I triggered on armv7.
Looking inside the coredump with gdb
(after adding an assert that n must not be
NULL), I get the following stacktrace:
```
(gdb) bt
#0 0x00000000 in ?? ()
#1 0x0043a038 in send_backtrace (why=0xbe9e3600 "FATAL SIGNAL 11") at common/daemon.c:36
#2 0x0043a0ec in crashdump (sig=11) at common/daemon.c:46
#3 <signal handler called>
#4 0x00406d04 in node_announcement (map=0x938ecc, nann_off=495146) at common/gossmap.c:586
#5 0x00406fec in map_catchup (map=0x938ecc, num_rejected=0xbe9e3a40) at common/gossmap.c:643
#6 0x004073a4 in load_gossip_store (map=0x938ecc, num_rejected=0xbe9e3a40) at common/gossmap.c:697
#7 0x00408244 in gossmap_load (ctx=0x0, filename=0x4e16b8 "gossip_store", num_channel_updates_rejected=0xbe9e3a40) at common/gossmap.c:976
#8 0x0041a548 in init (p=0x93831c, buf=0x9399d4 "\n\n{\"jsonrpc\":\"2.0\",\"id\":\"cln:init#25\",\"method\":\"init\",\"params\":{\"options\":{},\"configuration\":{\"lightning-dir\":\"/home/vincent/.lightning/testnet\",\"rpc-file\":\"lightning-rpc\",\"startup\":true,\"network\":\"te"..., config=0x939cdc) at plugins/topology.c:622
#9 0x0041e5d0 in handle_init (cmd=0x938934, buf=0x9399d4 "\n\n{\"jsonrpc\":\"2.0\",\"id\":\"cln:init#25\",\"method\":\"init\",\"params\":{\"options\":{},\"configuration\":{\"lightning-dir\":\"/home/vincent/.lightning/testnet\",\"rpc-file\":\"lightning-rpc\",\"startup\":true,\"network\":\"te"..., params=0x939c8c)
    at plugins/libplugin.c:1208
#10 0x0041fc04 in ld_command_handle (plugin=0x93831c, toks=0x939bec) at plugins/libplugin.c:1572
#11 0x00420050 in ld_read_json_one (plugin=0x93831c) at plugins/libplugin.c:1667
#12 0x004201bc in ld_read_json (conn=0x9391c4, plugin=0x93831c) at plugins/libplugin.c:1687
#13 0x004cb82c in next_plan (conn=0x9391c4, plan=0x9391d8) at ccan/ccan/io/io.c:59
#14 0x004cc67c in do_plan (conn=0x9391c4, plan=0x9391d8, idle_on_epipe=false) at ccan/ccan/io/io.c:407
#15 0x004cc6dc in io_ready (conn=0x9391c4, pollflags=1) at ccan/ccan/io/io.c:417
#16 0x004cf8cc in io_loop (timers=0x9383c4, expired=0xbe9e3ce4) at ccan/ccan/io/poll.c:453
#17 0x00420af4 in plugin_main (argv=0xbe9e3eb4, init=0x41a46c <init>, restartability=PLUGIN_STATIC, init_rpc=true, features=0x0, commands=0x6167e8 <commands>, num_commands=4, notif_subs=0x0, num_notif_subs=0, hook_subs=0x0, num_hook_subs=0, notif_topics=0x0, num_notif_topics=0) at plugins/libplugin.c:1891
#18 0x0041a6f8 in main (argc=1, argv=0xbe9e3eb4) at plugins/topology.c:679
```
I do not know if this is the right solution, because I do not know
when we can end up parsing a node announcement for a node that
is no longer in the gossip map.
So, I hope this is at least useful for @rustyrussell
Changelog-Fixed: fixes `FATAL SIGNAL 11` on gossmap node announcement parsing.
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
It's actually two separate u16 fields, so actually treat it as
such!
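As a sketch of what "treat it as two u16 fields" can look like, assuming a big-endian header with flags first and length second (the struct and function names are hypothetical, and the layout is an assumption for illustration):

```c
#include <stdint.h>

/* Hypothetical decoded view of the first four header bytes. */
struct hdr_sketch {
	uint16_t flags;
	uint16_t len;
};

/* Decode two big-endian u16 values instead of masking one u32. */
struct hdr_sketch parse_hdr(const uint8_t b[4])
{
	struct hdr_sketch h;

	h.flags = (uint16_t)((b[0] << 8) | b[1]);
	h.len = (uint16_t)((b[2] << 8) | b[3]);
	return h;
}
```

With separate fields, flag tests no longer need shifts or masks against the length.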
Cleans up zombie handling code a bit too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Though BOLT 7 says a channel may be pruned when one side becomes inactive
and fails to refresh their channel_update, in practice, the
channel_announcement can be difficult to recover if deleted entirely.
Here the channel_announcement is tagged as zombie such that gossip_store
consumers may safely ignore it, but it may be retained should the channel
come back online in the future. Node_announcements and channel_updates may
also be retained in such a fashion until the channel is ready to be
resurrected.
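From a consumer's point of view, the tagging amounts to a flag test before using a record. A minimal sketch, assuming a hypothetical flag value and helper names (the real bit assignment is not stated here):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, chosen for illustration only. */
#define SKETCH_ZOMBIE_BIT 0x1000

bool is_zombie(uint16_t flags)
{
	return (flags & SKETCH_ZOMBIE_BIT) != 0;
}

/* Count the records a reader would actually use, skipping zombies. */
size_t count_live(const uint16_t *flags, size_t n)
{
	size_t live = 0;

	for (size_t i = 0; i < n; i++) {
		if (!is_zombie(flags[i]))
			live++;
	}
	return live;
}
```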
Changelog-Fixed: Pruned channels are more reliably restored.
This is needed for offers to generate blinded paths.
No documentation changes since listincoming is an undocumented
internal hack interface which topology presents for production
of routehints.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We will now simply reject old-style ones as invalid. Turns out the
only trace we could find is a channel between two nodes unconnected to
the rest of the network.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: We now require all channel_update messages include htlc_maximum_msat (as per latest BOLTs)
Many changes to gossmap (including the pending ones!) don't actually
concern readers, as long as they obey certain rules:
1. Ignore unknown messages.
2. Treat all 16 upper bits of length as flags, ignore unknown ones.
So now we split the version byte into MAJOR and MINOR, and you can
ignore MINOR changes.
We don't expose the internal version (for creating the map)
programmatically: you should really hardcode what major version you
understand!
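A reader-side check might look like the following sketch. The 3-bit/5-bit split, the macro names and the hardcoded major value are assumptions for illustration; the point is only that the reader compares MAJOR against a constant it understands and ignores MINOR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed split: top 3 bits are MAJOR, low 5 bits are MINOR. */
#define SKETCH_MAJOR_MASK 0xE0
#define SKETCH_MINOR_MASK 0x1F
/* The one major version this hypothetical reader understands. */
#define SKETCH_UNDERSTOOD_MAJOR 0

uint8_t version_major(uint8_t v)
{
	return (uint8_t)((v & SKETCH_MAJOR_MASK) >> 5);
}

uint8_t version_minor(uint8_t v)
{
	return (uint8_t)(v & SKETCH_MINOR_MASK);
}

/* MINOR bumps are fine; only a MAJOR mismatch makes the file unreadable. */
bool reader_can_parse(uint8_t v)
{
	return version_major(v) == SKETCH_UNDERSTOOD_MAJOR;
}
```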
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They are surprisingly expensive!
Running `time ./plugins/renepay/test/run-not_mcf-gossmap gossip_store-sgl.rustcorp.com.au-2022-04-19 024b9a1fa8e006f1e3937f65f66c408e6da8e1ca728ea43222a7381df1cc449605 02ebb3b8a2316b3e876ea3f3d8124a3ab97f30b128f619608eb06b5251235dc2d9 10000000000 0.1`:
Before (-Og):
real 0m1.495s
Before (no opt):
real 0m2.552s
After (-Og):
real 0m0.579s
After (no opt):
real 0m1.061s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Apparently we had two private channel announcements (the !private assert
failed). While this shouldn't happen, don't crash because of it.
Fixes: #5578
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Plugins: topology plugin could crash when it sees duplicate private channel announcements.
Usually we won't see this, since private is deleted. But we could
have already read the private channel before that. Handle it properly.
(Tested by removing the gossip_store deletion code and making sure
this worked).
We have to fix up the test, which announces a channel twice!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a race under CI, where a channel is deleted then we see the
channel_update in the gossip store. We assumed this wouldn't happen,
but it can!
```
[gw1] [ 95%] FAILED tests/test_connection.py::test_multichan
[gw1] [ 95%] ERROR tests/test_connection.py::test_multichan
...
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E - lightningd-3: had BROKEN messages
E - lightningd-3: Node exited with return code 1
E Global errors:
...
lightningd-3: 2022-03-28T00:11:42.160Z DEBUG wallet: Owning output 0 100000sat (SEGWIT) txid 30616903feba1839a3834e2b3b6123759ce1fe0d76414ca77e2dbc17414772e0 CONFIRMED
lightningd-3: 2022-03-28T00:11:42.392Z DEBUG hsmd: Client: Received message 5 from client
lightningd-3: 2022-03-28T00:11:42.393Z DEBUG hsmd: new_client: 2
lightningd-3: 2022-03-28T00:11:42.398Z INFO plugin-topology: Killing plugin: exited during normal operation
lightningd-3: 2022-03-28T00:11:42.400Z **BROKEN** plugin-topology: Plugin marked as important, shutting down lightningd!
...
----------------------------- Captured stderr call -----------------------------
topology: update for channel 105x1x1 not found!
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before:
Ten builds, laptop -j5, no ccache:
```
real 0m36.686000-38.956000(38.608+/-0.65)s
user 2m32.864000-42.253000(40.7545+/-2.7)s
sys 0m16.618000-18.316000(17.8531+/-0.48)s
```
Ten builds, laptop -j5, ccache (warm):
```
real 0m8.212000-8.577000(8.39989+/-0.13)s
user 0m12.731000-13.212000(12.9751+/-0.17)s
sys 0m3.697000-3.902000(3.83722+/-0.064)s
```
After:
Ten builds, laptop -j5, no ccache: 8% faster
```
real 0m33.802000-35.773000(35.468+/-0.54)s
user 2m19.073000-27.754000(26.2542+/-2.3)s
sys 0m15.784000-17.173000(16.7165+/-0.37)s
```
Ten builds, laptop -j5, ccache (warm): 1% faster
```
real 0m8.200000-8.485000(8.30138+/-0.097)s
user 0m12.485000-13.100000(12.7344+/-0.19)s
sys 0m3.702000-3.889000(3.78787+/-0.056)s
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>