This patch guts gossipd of all peer-related functionality, and hands
all the peer-related requests to connectd instead.
gossipd now gets the final announceable addresses in its init msg, since
it doesn't handle socket binding any more.
lightningd now actually starts connectd, and activates it. The init
messages for both gossipd and connectd still contain redundant fields
which need cleaning up.
There are shims to handle the fact that connectd's wire messages are
still (mostly) gossipd messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
connectd has a dedicated fd to gossipd, so it can ask for a new gossip_fd
for a peer.
gossipd has a standalone routine to create a remote peer (this will
eventually be the only way gossipd creates a new peer).
For now lightningd creates a socketpair but doesn't run connectd, so
gossipd never sees any requests on this fd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Store the two we care about as booleans. Once connectd is complete we won't
even have the feature bitmaps for peers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weren't waiting for gossipd to actually process the
dev_set_max_scids_encode_size message, so under Travis it sometimes
split the reply before processing that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that we mark both directions of the channel disabled immediately,
it's just the broadcast of the update which is delayed, just like the
ones generated when channeld tells us to.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We disable the channel every time the peer disconnects; if it reconnects
we get two updates.
The simplest solution: delay all updates by 15 seconds. Replace any
pending delayed update. If update is redundant after 15 seconds,
discard.
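A minimal sketch of the delay-and-replace logic, not the actual gossipd
code. `struct pending_update`, `queue_update` and `flush_update` are
hypothetical names, time is passed in explicitly as seconds, and a
channel update is reduced to a single `disabled` flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UPDATE_DELAY_SECS 15

/* Hypothetical per-channel-direction state (illustration only). */
struct pending_update {
	bool have_pending;       /* Is an update waiting to go out? */
	bool pending_disabled;   /* Value the pending update would announce */
	uint64_t due;            /* Time (seconds) at which it should go out */
	bool last_sent_disabled; /* Value in the last update we broadcast */
};

/* Queue an update, replacing any update which is already pending. */
static void queue_update(struct pending_update *p, bool disabled, uint64_t now)
{
	assert(p != NULL);
	p->have_pending = true;
	p->pending_disabled = disabled;
	p->due = now + UPDATE_DELAY_SECS;
}

/* Returns true if an update should be broadcast at time 'now'. */
static bool flush_update(struct pending_update *p, uint64_t now)
{
	assert(p != NULL);
	if (!p->have_pending || now < p->due)
		return false;

	p->have_pending = false;
	/* Redundant after the delay: nothing changed, so discard. */
	if (p->pending_disabled == p->last_sent_disabled)
		return false;

	p->last_sent_disabled = p->pending_disabled;
	return true;
}
```

With this shape, a disconnect followed by a reconnect inside the delay
window queues a disable, replaces it with an enable, and sends nothing.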
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This doesn't do anything for us now, since we actually tend to produce
DISABLE/ENABLE update pairs. But the infrastructure is useful for the
next patch.
We also add more details to the trace message in the core update code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_listpeers returns an array of peers, and an array of nodes: the latter
is a subset of the former, and is used for printing alias/color information.
This changes it so there is a 1:1 correspondence between the peer information
and nodes, meaning no more O(n^2) search.
If there is no node_announce for a peer, we use a negative timestamp
(already used to indicate that the rest of the gossip_getnodes_entry
is not valid).
Other fixes:
1. Use get_node instead of iterating through the node map.
2. A node without addresses is perfectly valid: we have to use the timestamp
to see if the alias/color are set. Previously we wouldn't print that
if it didn't also advertise an address.
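A rough sketch of the 1:1 lookup with the negative-timestamp
convention. `struct node_entry` and `peer_alias` are hypothetical
stand-ins here, not the real gossip_getnodes_entry layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down node entry (illustration only). */
struct node_entry {
	int64_t last_timestamp; /* < 0: no node_announce, rest is invalid */
	char alias[33];
};

/* nodes[i] describes peer i, so no search is needed.
 * Returns NULL if we have no node_announce for that peer. */
static const char *peer_alias(const struct node_entry *nodes, size_t num,
			      size_t peer_index)
{
	assert(nodes != NULL || num == 0);
	if (peer_index >= num)
		return NULL;
	if (nodes[peer_index].last_timestamp < 0)
		return NULL;
	return nodes[peer_index].alias;
}
```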
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
structeq() is too dangerous: if a structure has padding, it can fail
silently.
The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).
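To illustrate the danger (this is a hand-written sketch, not the
ccan/structeq macro itself; `struct example` and `example_eq` are made
up for the example): two structures can be equal field-by-field while
their padding bytes differ, which a raw memcmp() may report as a
mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert 3 padding bytes after 'flag'. */
struct example {
	uint8_t flag;
	uint32_t value;
};

/* Field-by-field comparison never looks at padding bytes. */
static bool example_eq(const struct example *a, const struct example *b)
{
	assert(a != NULL && b != NULL);
	return a->flag == b->flag && a->value == b->value;
}

/* Build two logically equal structs whose padding may differ. */
static bool padding_demo(void)
{
	struct example a, b;

	memset(&a, 0x00, sizeof(a));
	memset(&b, 0xFF, sizeof(b));
	a.flag = b.flag = 1;
	a.value = b.value = 42;

	/* memcmp(&a, &b, sizeof(a)) is not guaranteed to return 0 here
	 * if padding exists; the field-wise comparison is reliable. */
	return example_eq(&a, &b);
}
```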
Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.
Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wrap it in 'struct pubkey' for typesafety and consistency, and the
next patch takes advantage of that when we move to pubkey_eq.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Only --addr implies announce-if-public: --bind-addr does not.
It's also possible to have --bind-addr to an automatic Tor address:
you'd have to dig the onion address out of the logs or getinfo to use
it, but it's possible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a best effort attempt to skip connection attempts if we detect a broken
ISP resolver. A broken ISP resolver is a resolver that will replace NXDOMAIN
replies with a dummy response. This is best effort in that it'll only detect a
single fixed dummy reply, it'll check only on startup, and will not detect if we
switched networks. It should be good enough for most cases, and in the worst
case it will result in a connection attempt that does not complete.
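A sketch of how such a check could look, under stated assumptions: the
names `struct broken_resolver`, `probe_resolver` and `is_dummy_reply`
are hypothetical, the caller picks a hostname which should not exist,
and only the first address returned by getaddrinfo() is remembered.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical record of the single dummy reply seen at startup. */
struct broken_resolver {
	bool detected;
	socklen_t len;
	struct sockaddr_storage reply;
};

/* Resolve a name which should not exist; if we get an answer anyway,
 * remember it as the resolver's dummy reply. */
static void probe_resolver(struct broken_resolver *br, const char *bogus_host)
{
	struct addrinfo *res = NULL;

	assert(br != NULL && bogus_host != NULL);
	memset(br, 0, sizeof(*br));

	/* A healthy resolver fails here (typically EAI_NONAME). */
	if (getaddrinfo(bogus_host, NULL, NULL, &res) != 0)
		return;

	if (res->ai_addrlen <= sizeof(br->reply)) {
		br->detected = true;
		br->len = res->ai_addrlen;
		memcpy(&br->reply, res->ai_addr, res->ai_addrlen);
	}
	freeaddrinfo(res);
}

/* Should we skip connecting to this resolved address? */
static bool is_dummy_reply(const struct broken_resolver *br,
			   const struct sockaddr *addr, socklen_t len)
{
	assert(br != NULL && addr != NULL);
	return br->detected && len == br->len
		&& memcmp(&br->reply, addr, len) == 0;
}
```

As noted above, this only catches one fixed dummy address, and only if
probe_resolver() ran on the network we are currently using.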
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Glenn Willen <@gwillen>
Cut & paste means we sometimes sent NULL:
```
2018-06-15T00:13:51.908Z lightningd(23653): lightning_closingd-03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f chan #436: Gossipd gave us bad send_gossip message 0bc80000
```
Fixes: #1581
Reported-by: @Xian001
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this case, local and remote are *both* NULL; so if someone tries to
send a packet with take(), we need to free it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I think this is what is causing #1536: getting disconnected causes gossipd to
attempt to reach the peer again, unconditionally setting the flag to tell the
master. At the same time the master also issues a reaching command (which is
allowed since it is its first), but then it clashes on the already set
flag. Setting this flag only when the master actually needs to be told should
fix this.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
A failed compaction shouldn't be deadly, but we should also not attempt to do
one on every gossip message after the first one fails.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
`gossip_store_add` is the entry point for messages from the network, so it
should do the bookkeeping and disable on failures. `gossip_store_append` is the
shared function that wraps messages and writes them to the given file. It is
shared between the from-network path and the compaction path, so we don't
use the `gossip_store` instance directly, but `fd`s.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We write both when coming from outside, as well as when compacting, so we
extract the write functionality to use it in both cases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This makes the exposed interface much smaller, cleaner and will allow us to just
replay gossip messages from the broadcast.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Two cases:
1. Node no longer has any public channels: remove node_announcement.
2. Node's node_announcement now precedes all the channel_announcements:
move node_announcement to the end.
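The two cases can be sketched as a decision over broadcast indexes.
This is an illustration only: `enum na_action`,
`check_node_announcement` and the idea of passing the indexes in as a
plain array are assumptions, not the real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum na_action { NA_KEEP, NA_REMOVE, NA_MOVE_TO_END };

/* na_index: broadcast index of the node_announcement.
 * chan_indexes: broadcast indexes of the node's remaining public
 * channel_announcements (after one has just been deleted). */
static enum na_action check_node_announcement(uint64_t na_index,
					       const uint64_t *chan_indexes,
					       size_t num_chans)
{
	size_t i;

	assert(chan_indexes != NULL || num_chans == 0);

	/* Case 1: no public channels left. */
	if (num_chans == 0)
		return NA_REMOVE;

	/* Still fine if some channel_announcement comes first. */
	for (i = 0; i < num_chans; i++) {
		if (chan_indexes[i] < na_index)
			return NA_KEEP;
	}

	/* Case 2: node_announcement now precedes all of them. */
	return NA_MOVE_TO_END;
}
```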
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us detect if a node announce precedes a channel announce once we
delete the node announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We *accept* a node_announce if we have a channel_announce, but we
can't queue it until we queue the channel_announce, which we only do
once we have received a channel_update.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a function called 'wake_pkt_out' which is really 'start
gossiping', so rename it to 'wake_gossip_out'.
In addition, it's fired both on a timer, and in response to our first
gossip_timestamp_filter, which leads to very confusing (though,
technically, not incorrect) behavior.
Keep a single timer at all times, which now doubles as the flag
indicating whether we're syncing right now. Set it once we're done syncing
gossip.
Technically this means we go from once-every-60-seconds to
quiet-for-60-seconds-between-gossip, but that's OK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And initialize filter (to "never") when we negotiated LOCAL_GOSSIP_QUERIES,
and send initial filter message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is kind of orthogonal to the other changes, but makes sense: if we
would instantly or never prune the message, don't accept it.
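A sketch of that check, with loud assumptions: `struct gossip_limits`
and `timestamp_reasonable` are hypothetical names, "instantly prune" is
read as "older than the prune timeout", and "never prune" is read as
"timestamped too far in the future". The actual limits are whatever the
routing code uses; the caller supplies them here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, in seconds (values chosen by the caller). */
struct gossip_limits {
	uint32_t prune_timeout; /* Age at which we prune a message */
	uint32_t max_future;    /* How far ahead a timestamp may be */
};

/* Should we accept a message with this timestamp at time 'now'? */
static bool timestamp_reasonable(const struct gossip_limits *lim,
				 uint32_t timestamp, uint32_t now)
{
	int64_t ts = timestamp, t = now;

	assert(lim != NULL);

	/* Too far in the future: we would not prune it. */
	if (ts > t + lim->max_future)
		return false;

	/* Too old: we would prune it straight away. */
	if (ts < t - lim->prune_timeout)
		return false;

	return true;
}
```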
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.
As we trickle out replies to query_short_channel_ids, we remember the
pubkeys of nodes we mention. At the end, we sort and uniquify, and
then send any node_announcements we have for those.
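The sort-and-uniquify step can be sketched as follows. `struct node_id`
is a hypothetical stand-in for a serialized 33-byte pubkey, so that
plain memcmp() ordering is well-defined; the real code may compare
pubkeys differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NODE_ID_LEN 33

/* Hypothetical: a compressed pubkey as raw bytes (no padding). */
struct node_id {
	unsigned char k[NODE_ID_LEN];
};

static int node_id_cmp(const void *a, const void *b)
{
	const struct node_id *na = a, *nb = b;

	return memcmp(na->k, nb->k, NODE_ID_LEN);
}

/* Sorts ids in place and squeezes out duplicates.
 * Returns the number of unique entries left at the front. */
static size_t sort_and_uniquify(struct node_id *ids, size_t n)
{
	size_t i, out;

	assert(ids != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(ids, n, sizeof(*ids), node_id_cmp);

	/* After sorting, duplicates are adjacent. */
	out = 1;
	for (i = 1; i < n; i++) {
		if (node_id_cmp(&ids[i], &ids[out - 1]) != 0)
			ids[out++] = ids[i];
	}
	return out;
}
```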
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.
This is minimal infrastructure: we don't actually process the
query_short_channel_ids message yet, nor do we append node
announcements.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In general, we need to only publish node announcements after
publishing channel announcements, though we can accept node
announcements as soon as we see channel announcements. So we keep a
flag for those node_announcements which haven't been broadcast yet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
handle_pending_cannouncement might not actually add the announcement,
as it could be waiting for a channel_update. We need to wait for
the actual announcement before considering announcing our node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We generate new ones anyway; removing this code simplifies the fixes coming
up, which now only need to change one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't have any connection yet, so how could they be active? Disable both
sides to avoid trying to route through them or telling others to use them as
`contact_points` in invoices.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We're telling gossipd about disconnections anyway, so let's just use that signal
to disable both sides of the channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was failing some of our integration tests, i.e., the ones closing a channel
and not waiting for sigexchange. The remote node would often not be quick enough
to send us its disabling channel_update, and hence we'd still remember the
incoming direction. That could then be sent out as part of an invoice, and fail
subsequently. So just set both directions to be disabled and let the onchain
spend clean up once it happens.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This resolves the problem where both channeld and gossipd can generate
updates, and they can have the same timestamp. gossipd is always able
to generate them, so can ensure timestamp moves forward.
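A minimal sketch of "ensure timestamp moves forward", assuming a
hypothetical helper `next_update_timestamp` which is handed the
timestamp of the previous update for that channel direction, if any:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pick a timestamp for a new channel_update which is strictly greater
 * than the previous one, even if the clock has not advanced. */
static uint32_t next_update_timestamp(uint32_t now, bool have_previous,
				      uint32_t previous)
{
	if (have_previous && now <= previous) {
		/* Guard against wrapping the 32-bit timestamp. */
		assert(previous != UINT32_MAX);
		return previous + 1;
	}
	return now;
}
```

Two updates generated within the same second then get distinct,
increasing timestamps, so the second one is not ignored as a duplicate.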
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We erroneously create updates with the same timestamps when tests run
quickly, and the second one is ignored.
We've already noted that this should be fixed: gossipd should generate
all the updates, as it already has to in the case where channeld
crashed, for example. But that's a bigger change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@cdecker points out that in test_forward, where we manually create a route,
we get an error back which contains an update for an unknown channel.
We should still note this, but it's not an error for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is something which generally shouldn't happen, but we didn't
notice it previously.
We ignore this warning in the case where a channel was deleted: this
happens because one side can send an update while the other notices
that the channel is closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note: this will break the gossip_store if they have current channels,
but it will fail to parse and be discarded.
Have local_add_channel do just that: the update is logically separate
and can be sent separately.
This removes the ugly 'bool add_to_store' flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tor wasn't actually working for me to connect to anything, but it worked
for 'ssh -D' testing.
Note that the resulting 'netaddr' is a bit weird, but I guess it's honest.
$ ./cli/lightning-cli connect 021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b
{
  "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b"
}
$ ./cli/lightning-cli listpeers
{
  "peers": [
    {
      "state": "GOSSIPING",
      "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b",
      "netaddr": [
        "ln1qg0je0lugpzu5ttsv78vlrkhteyg9yy8fjw68qr57mfhsfyrxurzkq522ah.lseed.bitcoinstats.com:9735"
      ],
      "connected": true,
      "owner": "lightning_gossipd"
    }
  ]
}
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Good for debugging (you have to send SIGUSR1 to lightning_gossipd to turn
it on though, and --log-level=io on the lightningd cmdline to have it
output IO messages by default).
I also noticed that io_tor_connect_after_req_host() does a useless
test on reach->buffer[0] after it's *written*: remove it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use a wireaddr_internal directly (which is what we want).
Also, don't hardcode 9735, use DEFAULT_PORT internally in
seed_resolve_addr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>