This involves removing some fields from the now-misnamed routing.h
datastructures, and various internal messages.
One non-obvious change is to our "keepalive" logic which refreshes
channels every 13 days: instead of using the 'enabled' flag on the
last channel broadcast to decide whether to refresh it, we use the
local connected status directly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No more sending "all-channel" errors; in particular, gossipd now only
sends warnings (which make us hang up), not errors, and peer_connected
rejections are warnings (and disconnect), not errors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: `peer_connected` rejections now send a warning, not an error, to the peer.
This overcomes the internal spam filter on updates, which can be useful
if we're actually trying to send through such a node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: always accept channel_updates from errors, even they'd otherwise be rejected as spam.
Fixes: #4300
Instead of a boutique message, use a "real" channel_announcement for
private channels (with fake sigs and pubkeys). This makes it far
easier for gossmap to handle local channels.
Backwards compatible update, since we update old stores.
We also fix devtools/dump-gossipstore to know about the tombstone markers.
Since we increment our channel_announce count for local channels now,
the stats in the tests changed too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids overwriting the ones in git, and generally makes things neater.
We have convenience headers wire/peer_wire.h and wire/onion_wire.h to
avoid most #ifdefs: simply include those.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are logging way too much from gossipd, causing noisy logs. This PR reduces
logs for incoming messages to those that actually caused a change in our
internal state (duplicate and old messages are just dropped silently now).
Changelog-Changed: gossipd: The `gossipd` is now a lot quieter, and will log only when a message changed our network topology.
See https://github.com/lightningnetwork/lightning-rfc/pull/767
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: channels now pruned after two weeks unless both peers refresh it (see lightning-rfc#767)
It's not all that rare to do these operations, and requiring annotations
for it is a little painful.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There were no channel updates in my log; because sendonion doesn't know the
actual node_ids or channel_ids, we can't tell gossipd what node/channel it was
so it can no longer remove them on PERM errors.
However, we can tell it the error message so it can apply the update.
Fixes: #3877
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This saves us keeping it in memory (so far, no channels have features), but
lets us optimize that case so we don't need to hit the disk for most of the
channels in listchannels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will be used when we want to specify these in a route. But for now, they
only alter gossipd, which always sets them to NULL.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
clang10 -DBINTOPKGLIBEXECDIR="\"../libexec/c-lightning\"" -Wall -Wundef -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wold-style-definition -Werror -std=gnu11 -g -fstack-protector -I ccan -I external/libwally-core/include/ -I external/libwally-core/src/secp256k1/include/ -I external/jsmn/ -I external/libbacktrace/ -I external/libbacktrace-build -I . -I/usr/local/include -DCCAN_TAKE_DEBUG=1 -DCCAN_TAL_DEBUG=1 -DCCAN_JSON_OUT_DEBUG=1 -DSHACHAIN_BITS=48 -DJSMN_PARENT_LINKS -DBUILD_ELEMENTS=1 -DBINTOPKGLIBEXECDIR="\"../libexec/c-lightning\"" -c -o gossipd/routing.o gossipd/routing.c
gossipd/routing.c:651:10: error: implicit conversion from 'unsigned long' to 'double' changes value
from 18446744073709551615 to 18446744073709551616 [-Werror,-Wimplicit-int-float-conversion]
if (r > UINT64_MAX)
~ ^~~~~~~~~~
```
It is ok to change the values because they are approximate anyway. Thus,
explicitly typecast to `double` to silence the warning without changing
behavior.
Changelog-None
This is a common thing to do, so create a macro.
Unfortunately, it still needs the type arg, because the paramter may
be const, and the return cannot be, and C doesn't have a general
"(-const)" cast.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It really, really doesn't matter. But we were dramatically reducing
our view of the network:
In my gossip_store (mainnet):
channel_announcement: 30349
channel_update: 55119
node_announcment: 1783
Changelog-Fixed: No longer discard most node_announcements (fixes#3194)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The flat feature PR changes the rules so these are OK to propagate.
That makes sense: the unsupported features means there's something
unsupported about the *node* or *channel*, not the msg itself
(for that we'd use a different message type).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This prevents a gratuitous lookup of we get a late channel_announce,
but even better, it suppresses the "bad gossip" messages in case of
a late channel_update, which have plagued Travis (especially since we
got aggressive in pushing our own updates).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a better fix than doing it manually, which turned out
to do it in the wrong order (node_announcement followed by
channel_announcement) anyway.
Should fix many "Bad gossip" messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we get a channel_update while we're still verifying the channel_announcement
we didn't set the peer pointer, so it didn't get credit. As a result, the
seeker tended to think we were done gossiping sooner than we were.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is mainly an internal-only change, especially since we don't
offer any globalfeatures.
However, LND (as of next release) will offer global features, and also
expect option_static_remotekey to be a *global* feature. So we send
our (merged) feature bitset as both global and local in init, and fold
those bitsets together when we get an init msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It usually means we're missing something, but there's no way to ask what.
Simply start a broad scid probe.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We assume that the time for gossip propagation is < 10 minutes, so by
going back that far from last gossip we won't miss anything,
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's simple: if we wouldn't accept the timestamp we see, don't put
the channel in the stale_scid_map.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We eliminate the "need peer" states and instead check if the
random_peer_softref has been cleared.
We can also unify our restart handlers for all these cases; even the
probe_scids case, by giving gossip credit for the scids as they come
in (at a discount, since scids are 8 bytes vs the ~200 bytes for
normal gossip messages).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we have to validate, there can be a delay (and peer might
vanish) between receiving the gossip and actually confirming it, hence
the use of softref.
We will use this information to check that the peers are making progress
as we start asking them for specific information.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We do this by keeping a current and an old map, and moving the current to old
every hour or 10,000 entries.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I was seeing some accidental pruning under load / Travis, and in
particular we stopped accepting channel_updates because they were 103
seconds old. But making it too long makes the prune test untenable,
so restore a separate flag that this test can use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we queue them, we should place a limit. It's not the worst thing in
the world if we discard them (we'll catch up eventually), but we should
try not to in case we're just a bit behind.
Our behaviour here is also O(n^2) so we don't want a massive queue
anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>