Asking for the last few blocks was logical, but my node is missing
most gossip in practice.
For the moment, simply ask a peer for every channel it knows, once
we're started up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When probing, no point probing for before lightning became cool. Current
logic means we often probe below block 500,000, and think things are OK
because there are no short_channel_ids.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We always ended up sending an empty query when we had stale scids!
And it turns out we consider such a query invalid:
Bad query_short_channel_ids query_flags 010506226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f000100010100
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We only chose 3 peers to gossip with us (down from 8 last release).
There's no justification for this number, or reason to believe that
it is sufficient to keep us in sync.
Be more conservative for now; we can always decrease it later once
we have more data.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
polling the last 32 is fairly useless in practice, since they tend to
be recent nodes; it won't detect long-forgotten ones.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I thought LND had a bug, but turns out it doesn't like out-of-order
short_channel_ids: in fact, the spec says they have to be in order!
This means we use uintmap instead of a htable for unknown_scids and
stale_scids so they're nicely ordered.
But our nodes-missing-announcements probe is harder since they can
also contain duplicates: we switch that to iterate through channels
rather than nodes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weren't supposed to do any gossiping until we were synced (and thus
knew blockheight), but our seeker_check() didn't wait for it! Move the
waiting all into seeker.c, so it can handle it all consistently.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
On testing, I found a node which would hang up every time we asked it
for query_short_channel_ids (despite it offering features 0x81, meaning
it should handle this message).
Then it would reconnect, and we'd choose it again as our
PROBING_NANNOUNCES peer!
Instead, leave finding another peer to the once-a-minute
seeker_check() function.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
By combining set_state() with selected_peer() we can give a single log
line describing what we're asking for, from whom.
We also add more verbosity to a few key areas, such as gossip rotation
and when gossipd tells a peer to send an error. And move a comment which
was above the wrong function (due to rebase?).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It usually means we're missing something, but there's no way to ask what.
Simply start a broad scid probe.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This should give more reliable results, though it risks us getting
suckered into always consulting the same peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We assume that the time for gossip propagation is < 10 minutes, so by
going back that far from last gossip we won't miss anything,
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's simple: if we wouldn't accept the timestamp we see, don't put
the channel in the stale_scid_map.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Just try to choose another. Under Travis, this causes many failures due
to slowness (they only get 10 seconds in -dev mode).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We eliminate the "need peer" states and instead check if the
random_peer_softref has been cleared.
We can also unify our restart handlers for all these cases; even the
probe_scids case, by giving gossip credit for the scids as they come
in (at a discount, since scids are 8 bytes vs the ~200 bytes for
normal gossip messages).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Build up a map of short_channel_ids which we have old info for (only
if peer supports gossip_query_ex).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This asks peers to append the timestamps or checksums: if it has
gossip_query_ex support, it will, otherwise it's ignored.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the peer supports `gossip_query_ex` we can use query_flags to simply
request the node_announcements when probing for nodes, rather than
getting everything. If a peer doesn't support `gossip_query_ex` then
it's harmless to add it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We pick some nodes which don't seem to have node_announcements and we
ask a channel associated with them. Again, if this reveals more
node_announcements, we probe for twice as many next time.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we have any unknown short_channel_ids, we ask a random peer for
those channels. Once it responds, we probe again for a small random
range in case more are missing, again enlarging if we find some.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of a linear array which is fairly inefficient if it turns out
we know nothing at all.
We remove the gossip_missing() call by changing the api to
remove_unknown_scid() to include a flag as to whether the scid turned
out to be real or not.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once we've finished streaming gossip from the first peer, we ask a
random peer (maybe the same one) for all short_channel_ids in the last
6 blocks from the latest channel we know about.
If this reveals new channels we didn't know about, we expand the probe
by a factor of 2 each time.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The seeker starts by asking a peer (the first peer!) for all gossip
since a minute before the modified time of the gossip store.
This algorithm is enhanced in successive patches.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>