Basically, if we don't have an announcement for the channel, stash it,
and once we get an announcement, replay if necessary.
Fixes: #1485
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means it will effect connect commands too (though it's too
late to stop DNS lookups caused by commandline options).
We also warn that this is one case where we allow forcing through Tor
without a proxy set: it just means all connections will fail.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This takes the Tor service address in the same option, rather than using
a separate one. Gossipd now digests this like any other type.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For the moment, this is a straight handing of current parameters through
from master to the gossip daemon. Next we'll change that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Risks leakage. We could do lookup via the proxy, but that's a TODO.
There's only one occurance of getaddrinfo (and no gethostbyname), so
we add a flag to the callers.
Note: the use of --always-use-proxy suppresses *all* DNS lookups, even
those from connect commands and the command line.
FIXME: An implicit setting of use_proxy_always is done in gossipd if it
determines that we are announcing nothing but Tor addresses, but that
does *not* suppress 'connect'.
This is fixed in a later patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Only force proxy use if we don't announce any non-TOR address.
There's no option to turn it off, so this makes more sense.
2. Don't assume we want an IPv4 socket to reach proxy, use the family
from the struct addrinfo.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of storing a wireaddr and converting to an addrinfo every
time, just convert once (which also avoids the memory leak in the
current code).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rename tor_proxyaddrs and tor_serviceaddrs to tor_proxyaddr and tor_serviceaddr:
the 's' at the end suggests that there can be more than one.
Make them NULL or non-NULL, rather than using all-zero if unset.
Hand them the same way to gossipd; it's a bit of a hack since we don't
have optional fields, so we use a counter which is always 0 or 1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All gossipd needs from common/tor is do_we_use_tor_addr(), so move
that and the rest of the tor-specific handshake code into gossip/tor.c
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a rebased and combined patch for Tor support. It is extensively
reworked in the following patches, but the basis remains Saibato's work,
so it seemed fairest to begin with this.
Minor changes:
1. Use --announce-addr instead of --tor-external.
2. I also reverted some whitespace and unrelated changes from the patch.
3. Removed unnecessary ';' after } in functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a leftover from before splitting the `gossip_store` injection path from
the handling of gossip messages.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Someone could try to announce an internal address, and we might probe
it.
This breaks tests, so we add '--dev-allow-localhost' for our tests, so
we don't eliminate that one. Of course, now we need to skip some more
tests in non-developer mode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're given a wildcard address, we can't announce it like that: we need
to try to turn it into a real address (using guess_address). Then we
use that address. As a side-effect of this cleanup, we only announce
*any* '--addr' if it's routable.
This fix means that our tests have to force '--announce-addr' because
otherwise localhost isn't routable.
This means that gossipd really controls the addresses now, and breaks
them into two arrays: what we bind to, and what we announce. That is
now what we return to the master for json_getinfo(), which prints them
as 'bindings' and 'addresses' respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6'
which means ':1234' can be used to set only the portnum.
2. Only add this protocol wildcard if --autolisten=1 (default)
and no other addresses specified.
3. Pass it down to gossipd, so it can handle errors correctly: in most cases,
it's fatal not to be able to bind to a port, but for this case, it's OK
if we can only bind to one of IPv4/v6 (fatal iff neither).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This replacement is a little menial, but it explicitly catches all
the places where we allow a local socket. The actual implementation of
opening a AF_UNIX socket is almost hidden in the patch.
The detection of "valid address" is now more complex:
p->addr.itype != ADDR_INTERNAL_WIREADDR || p->addr.u.wireaddr.type != ADDR_TYPE_PADDING
But most places we do this, we should audit: I'm pretty sure we can't
get an invalid address any more from gossipd (they may be in db, but
we should fix that too).
Closes: #1323
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It does all the other address handling, do this too. It also proves useful
as we clean up wildcard address handling.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's become clear that our network options are insufficient, with the coming
addition of Tor and unix domain support.
Currently:
1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline,
or any address is specified explicitly. If they're routable, we announce.
2. --addr is used to announce, but not to control binding.
After this change:
1. --port is deprecated.
2. --addr controls what we bind to and announce.
3. --bind-addr/--announce-addr can be used to control one and not the other.
4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable).
5. --offline still overrides listening (though announcing is still the same).
This means we can bind to as many ports/interfaces as we want, and for
special effects we can announce different things (eg. we're sitting
behind a port forward or a proxy).
What remains to implement is semi-automatic binding: we should be able
to say '--addr=0.0.0.0:9999' and have the address resolve at bind
time, or even '--addr=0.0.0.0:0' and have the port autoresolve too
(you could determine what it was from 'lightning-cli getinfo'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We set no_reconnect with --offline, but that doesn't work if !DEVELOPER.
Make the flag positive, and non-DEVELOPER mode for gossipd.
We also don't override portnum with --offline, but have an explicit
'listen' flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means gossipd is live and we can tell it things, but it won't
receive incoming connections. The split also means that the main daemon
continues (eg. loading peers from db) while gossipd is loading from the store,
potentially speeding startup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a connect fails, if it's an important peer, we set a timer. If
we have a manual connect command, this means we do this again, leading
to another timer.
For a manual command, free any existing timer; the normal fail logic
will start another if necessary.
Reported-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
At least say whether we failed to connect at all, or failed cryptographic
handshake, or failed reading/writing init messages.
The errno can be "Operation now in progress" if the other end closes the
socket on us: this happens when we handshake with the wrong key and it
hangs up on us. Fixing this would require work on ccan/io though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we get a reconnection, kill the current remote peer, and wait for the
master to tell us it's dead. Then we hand it the new peer.
Previously, we would end up with gossipd holding multiple peers, and
the logging was really hard to interpret; I'm not completely convinced
that we did the right thing when one terminated, either.
Note that this now means we can have peers with neither ->local nor ->remote
populated, so we check that more carefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently we intuit it from the fd being closed, but that may happen out
of order with when the master thinks it's dead.
So now if the gossip fd closes we just ignore it, and we'll get a
notification from the master when the peer is disconnected.
The notification is slightly ugly in that we have to disable it for
a channel when we manually hand the channel back to gossipd.
Note: as stands, this is racy with reconnects. See the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).
In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).
This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.
This turns out to be quite simple, and is probably how I should have
done it originally :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Lifetime of 'struct reaching' now only while we're actively doing connect.
2. Always free after a single attempt: if it's an important peer, retry
on a timer.
3. Have a single response message to master, rather than relying on
peer_connected on success and other msgs on failure.
4. If we are actively connecting and we get another command for the same
id, just increment the counter
The result is much simpler in the master daemon, and much nicer for
reconnection: if they say to connect they get an immediate response,
rather than waiting for 10 retries. Even if it's an important peer,
it fires off another reconnect attempt, unless it's actively
connecting now.
This removes exponential backoff: that's restored in next patch. It
also doesn't handle multiple addresses for a single peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than using a flag in reaching/peer; we make it self-contained
as the next patch puts it straight into a timer callback.
Also remove unused 'succeeded' field from struct peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And on channel_fail_permanent and closing (the two places we drop to
chain), we tell gossipd it's no longer important.
Fixes: #1316
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These don't have a maximum number of reconnect attempts, and ensure
that we try to reconnect when the peer dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to remove automatic retrying of connect, and that uncovered
that we actually print out our "Server started" message before we create
the listening socket.
Move the init higher (outside the db transaction) and make it a
request/response, the loop until it's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Internally both payment and routing use 64-bit, but the interface
between them used 32-bit.
Since both components already support 64-bit we should use that.
In particular, the main daemon and subdaemons share the backtrace code,
with hooks for logging.
The daemon hook inserts the io_poll override, which means we no longer
need io_debug.[ch]. Though most daemons don't need it, they still link
against ccan/io, so it's harmess (suggested by @ZmnSCPxj).
This was tested manually to make sure we get backtraces still.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we only remember the actions that added channels then we'd restore them when
re-reading the gossip_store, so put a tombstone in there to remember to delete
it. These will be cleared upon re-writing the store since the announcements wont
be written anymore.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This now works because we no longer call out to masterd or bitcoind to verify
the channels. It's also rather quick and silent so we can just process all
stored messages until we're done.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Messages from peers and messages from the gossip_store now have completely
different entrypoints, so we don't need to trace their origin around the message
handling code any longer.
This stores and reads the channel_announcements in the wrapping message which
allows us to store associated data with the raw channel_announcements.
The gossip_store applies channel_announcements directly but it also returns it,
and it gets discarded as a duplicate. In the next commit we'll have gossip_store
apply all changes, bypassing verification, so the duplication is only temporary.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we may want to extend the on-disk format by adding custom information we
may as well just go the extra mile and reuse the serialization primitives we
already have.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
When we read from the gossip_store we set store=false so that we don't duplicate
messages in the store.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Ee will be replaying gossip messages from the gossip_store soon. This means that
not all messages originate from a peer, so we move the queuing of error messages
up into the peer message handler.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If we're going to simply take() a pointer, don't allocate it off a random
object. Using NULL makes our intent clear, particularly with allocating
packets we're going to take() onto a queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now it just returns true if it queued something. This allows it
to queue multiple packets, and lets it share code paths with other code
in future patches.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently keep two copies; one in the broadcast structure to send
in order, and one in the routing information. Since we already keep
the broadcast index in the routing information, use that.
Conveniently, a zero index is the same as the old NULL test.
Rename struct node's announcement_idx to node_announce_msgidx to
make it match the other users.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. make queue_peer_msg() use both if branches, as both equally likely.
2. Remove redundant *scid = NULL in handle_channel_announcement.
3. Log failing pending channel_updates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As per BOLT #7.
We don't do this for channel_update which are queued because the
channel_announcement is pending though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We already have 'struct node', so rename 'struct routing_channel' to
'struct chan', and 'struct node_connection' to 'struct half_chan'.
Other minor changes:
1. rstate->channels -> rstate->chanmap.
2. 'connections' -> 'half'.
3. connection_to -> half_chan_to
4. connection_from -> half_chan_from
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Failure and pruning were the two places where a node_connection could
be freed; now they both deal with entire channels, we can remove the
NULL checks, and the destructor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is twice the 'update_channel_interval' we get handed.
We delete the non-existent channel_add_connection and delete_connection
declarations from the header too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently give them a free pass. The simplest fix is to give them
an old timestamp on initialization.
We still skip unannounced channels, on the assumption that they're
ours. And we set the last_update_timestamp to -1 when we convert to
gossip_getchannels_entry to indicate no update.
This breaks the DEVELOPER=1 pruning test, since we hardcode the 1
week timeout. That's fixed in the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We make new_routing_channel() populate both connections
(active=false), so local_add_channel becomes simpler. We also
suppress listchannels output of active=false unannounced channels, to
avoid breaking tests (also, these are unusable, so it makes sense to
omit them)
It also seems the logic in add_channel_direction is legacy: a
channel_announce cannot replace the scid (that would be a different
channel), we don't allow duplicate announcements, and the announcement
is never NULL.
And since we disallow repeated channel_announce already, I believe
'forward' is always true, greatly simplifying the logic in
handle_pending_cannouncement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This makes 'routing_channel' the primary object in the system; it can have
one or two 'node_connection's attached, and points to two nodes.
The nodes are freed when no more routing_channel refer to them. The
routing_channel are freed when they contain no more 'node_connection'.
This fixes#1072 which I surmise was caused by a dangling
routing_channel after pruning.
Each node contains a single array of 'routing_channel's, not one for
each direction. The 'routing_channel' itself orders nodes in key
order (conveniently the index is equal to the direction flag we use),
and 'node_connection' with source in the same order.
There are helpers to assist with common questions like "which
'node_connection' leads out of this node?".
There are now two ways to find a channel:
1. Direct scid lookup via rstate->channels map.
2. Node key lookup, followed by channel traversal.
Several FIXMEs are inserted for where we can now do things more optimally.
Fixes: #1072
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>