Commit Graph

72 Commits

Author SHA1 Message Date
Christian Decker
6dbd99ddc6 gossip: Fix a race condition between release_peer and fail_peer
There was a race condition that would cause an assertion to segfault
if a call to release_peer was interleaved with a fail_peer. The
release_peer was making the peer non-local, which was then causing the
assertion in fail_peer to fail. Now we just have 3 cases: not found,
local, and non-local.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2017-08-17 10:31:55 +09:30
Christian Decker
1a1e29a4bc gossip: Re-initiate the broadcast timer upon reconnect
We weren't registering reconnecting peers for broadcasts. Just
starting a timer is enough. Also added an integration test to check
that the gossip sync is being resumed.
2017-07-12 11:00:26 +09:30
Rusty Russell
4185153d81 gossipd: interface to get a client gossip_fd for a reconnect.
At the moment, master simply keeps the gossip fd open when peer
disconnects.  That's inefficient, and wrong anyway (it may want a
complete new sync, or may not, but we'll currently send all the
messages including stale ones).

This interface will be required for restart anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-27 10:25:53 +09:30
Rusty Russell
f2d4309add lightningd/subd: explicit failure reply support.
We had a terrible hack in gossip when a peer didn't exist.  Formalize
a pattern when code+200 is a failure (with no fds passed), and use it
here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-27 10:25:53 +09:30
Rusty Russell
996567c250 lightningd: update BOLT to add channel_reestablish message.
We don't handle it yet though.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-23 09:29:42 +09:30
Rusty Russell
d4618fa199 subdaemons: handle master or gossipd failing.
We should simply exit in this case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-07 09:19:04 +09:30
Rusty Russell
87fc3439c4 test_lightningd.py: fix obscure corner case in gossip.
I actually hit this very hard to reproduce race: if we haven't process
the channeld message when block #6 comes in, we won't send the gossip
message.  We wait for logs, but don't generate new blocks, and timeout
on l1.daemon.wait_for_log('peer_out WIRE_ANNOUNCEMENT_SIGNATURES').

The solution, which also tests that we don't send announcement signatures
immediately, is to generate a single block, wait for CHANNELD_NORMAL,
then (in gossip tests), generate 5 more.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-07 09:19:04 +09:30
Rusty Russell
693457a580 lightningd: remove unused offset field from CSV files.
The format we use to generate marshal/unmarshal code is from
the spec's tools/extract-formats.py which includes the offset:
we don't use it at all, so rather than having manually-calculated
(and thus probably wrong) values, or 0, emit it altogther.

Reported-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
f504f0be47 gossip: handle release race.
We can go to release a gossip peer, and it can fail at the same time.
We work around the problem that the reply must be a gossipctl_release_peer_reply
with two fds, but it's not pretty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
f4722e93a0 peer_fail: handle incoming reconnections.
We kill the existing connection if possible; this may mean simply
forgetting the prior peer altogether if it's in an early state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
ccad93edb3 gossipd: add fail_peer.
We use this if it reconnects via another fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
5fb1f20898 gossip: make connection own peer.
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
662dfef436 lightningd/gossip: Move INIT message handling to handshake daemon.
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.

Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Christian Decker
2d76b066c2 routing: Cleaning up old hostname and port handling
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
2017-05-10 12:37:44 +09:30
Christian Decker
e972b208c7 routing: Passing back all addresses on getnodes 2017-05-10 12:37:44 +09:30
Christian Decker
daf8866eb5 gossip: Implement the basic node_announcement
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisorial announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
2017-05-10 12:37:44 +09:30
Rusty Russell
cd90c8408c lightningd/gossip: clean up channel resolve failure handling.
We were getting an assert "!secp256k1_fe_is_zero(&ge->x)", because
an all-zero pubkey is invalid.  We allow marshal/unmarshal of NULL for
now, and clean up the error handling.

1. Use status_failed if master sends a bad message.
2. Similarly, kill the gossip daemon if it gives a bad reply.
3. Use an array for returned pubkeys: 0 or 2.
4. Use type_to_string(trc, struct short_channel_id, &scid) for tracing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-05 16:11:44 +09:30
Christian Decker
25f1cba3cf routing: Ask gossipd to resolve channel_id and forward HTLCs
Since we now use the short_channel_id to identify the next hop we need
to resolve the channel_id to the pubkey of the next hop. This is done
by calling out to `gossipd` and stuffing the necessary information
into `htlc_end` and recovering it from there once we receive a reply.
2017-05-02 11:49:14 +02:00
Christian Decker
b4beab6537 gossip: Make the broadcast interval configurable
Adds a new command line flag `--dev-broadcast-interval=<ms>` that
allows us to specify how often the staggered broadcast should
trigger. The value is passed down to `gossipd` via an init message.

This is mainly useful for integration tests, since we do not want to
wait forever for gossip to propagate.
2017-05-02 11:59:24 +09:30
Christian Decker
4bc6ee1088 gossip: Fix two bugs in the forwarding of gossip
We were using an uninitialized `broadcast_index` on the peer which
would occasionally result in no forwardings at all, segmenting the
network. And during the `msg_queue` refactor, some wait targets were
not updated, resulting in the waits never to be woken up.
2017-05-02 11:59:24 +09:30
Rusty Russell
6d55a17642 lightningd/dev_ping: expand to cover gossipd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-25 22:00:28 +02:00
Rusty Russell
d5be8d26f2 lightningd/ping: ping support.
A spec update brings ping support.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-25 22:00:28 +02:00
Rusty Russell
a436fa77fc lightningd/msg_queue: add msg_wake helper.
A cleaner wrapper than a raw io_wake.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-25 22:00:28 +02:00
Rusty Russell
d967c2ba64 lightningd/gossip: interleave local and gossip messages.
Rather than dumping all gossip messages then handling local ones again.
This should help us give timely ping replies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-25 22:00:28 +02:00
Rusty Russell
4f0f2c0f4e lightningd/gossip: use msg_queue instead of open-coded queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-25 22:00:28 +02:00
Rusty Russell
3509f10e30 lightningd/gossip: shut down daemon when status fd closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-12 09:09:19 -07:00
Rusty Russell
27764b65f9 lightningd: fix shachain to be 48-bits, with hack for legacy.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-01 23:59:46 +10:30
Rusty Russell
0197353bd9 lightningd/Makefile: put timeout into OLD_LIB_SRC for all daemons.
And remove the now-redundant mention in gossipd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-01 23:59:46 +10:30
Rusty Russell
e45bc0ce47 Update wire from spec 9a572ff0a3a4d6bceba9c0a9b859692180ecf18f
Only changes commit_sig to commitment_signed, which we don't implement
yet anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-01 23:59:46 +10:30
Christian Decker
638594f3c6 jsonrpc: Implemented getchannels JSON-RPC call 2017-03-24 13:24:58 +10:30
Christian Decker
c8da420a9d routing: Fill in the getroute functionality
Copied the JSON-request parsing from `pay.c`, passing through to
`gossipd`, filling the reply with the `route_hop` serialization, and
serializing as JSON-RPC response.
2017-03-23 13:34:03 +10:30
Christian Decker
9273b615ab gossip: Added scheleton for getroute calls
This includes the request/reply passed from the main daemon to
`gossipd` and the basic handlers.
2017-03-23 13:34:03 +10:30
Christian Decker
23a1d7d475 gossip: Do not log anything, it breaks a daemon connection!
This came up while debugging the gossip daemon breaking upon calling
`getroute`. It turns out that log was still writing to stdout, but
stdout had been reused for an inter-daemon socket, which would
break...
2017-03-23 13:34:03 +10:30
Rusty Russell
cf6d25cad6 lightningd/connection: rename to lightningd/daemon_conn
To match the structure name.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
5637564cd4 lightningd/status: support daemon_conn for status_trace and status_failed.
We remove the unused status_send_fd, and rename status_send_sync (it
should only be used for that case now).

We add a status_setup_async(), and wire things internally to use that
if it's set up: status_setup() is renamed status_setup_sync().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
f511012e29 lightningd/gossip: don't hand client fd until release.
The gossip subdaemon previously passed the fd after init: this is
unnecessary for peers which simply want to gossip (and not establish
channels).

Now we hand the gossip fd back with the peer fd.  This adds another
error message for when we fail to create the gossip fds.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
3938f40274 lightningd/gossip: use daemon_conn for status updates.
We're going to remove status_send (it's sync, and so doesn't play well with
async io).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
1f894b6234 lightningd/gossip: use daemon_conn for master daemon interaction.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
83466b2b32 ccan: update to get close option to io/fdpass.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
4bf398c4e7 status: move into lightningd/status.
It's really a lightningd-only thing, and we're about to do surgery on it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
1b0dc7414c lightningd/connection: use msg_queue_wait so it can wake us.
This measn that gossip (which also wants to wake it) needs to wake
the queue, not the daemon_conn.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-20 07:50:53 +10:30
Rusty Russell
de651bf8fb gossip: don't use hand-coded nested messages for getnodes array.
We can simply tell it 'nodes' is 'num_nodes*struct gossip_getnodes_entry'.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-16 14:35:26 +10:30
Rusty Russell
e042198cf8 tools/generate-wire.py: allow typename instead of type sizes.
We use the fourth value (size) to determine the type, unless the fifth
value is suppled.  That's silly: allow the fourth value to be a typename,
since that's the only reason we care about the size at all!

Unfortunately there are places in the spec where we use a raw fieldname
without '*1' for a length, so we have to distingish this from the
typename case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-03-16 14:35:25 +10:30
Christian Decker
d6918dc4ef gossip: Implemented getnodes JSON-RPC call 2017-03-15 21:32:55 +01:00
Christian Decker
05e64db6cd gossip: Refactored non-local peers to use daemon_conn 2017-03-13 11:26:48 +01:00
Christian Decker
fd1cbf9030 gossip: Refactoring message handling and removed redundant timers
We were firing off the wakeup timers all over the place, out of fear
that we would be triggering two concurrent broadcasts. This is not
really the case since the wakeup calls are idempotent. This also
allows us not to differentiate between triggering a broadcast on a
local peer or on a proxied peer.
2017-03-13 11:26:48 +01:00
Christian Decker
5721d7d194 gossip: Marking peer as non-local and forwarding broadcasts
This includes some code duplication, but since the two write targets
are fundamentally different we might need to refactor a bit more to
unify them again.
2017-03-13 11:26:48 +01:00
Christian Decker
0596a6cc2c gossip: Initialize the secp256k1_context
We will be doing signature verifications and creation so we better
have the context to do so.
2017-03-13 11:26:48 +01:00
Christian Decker
ab4f1f0550 gossip: Add missing log to gossip daemon
We will eventually ween off of the logging, or replace it with status
messages that log in `lightningd`, but for now we still have the
routing module that does some logging.
2017-03-13 11:26:48 +01:00
Christian Decker
11fc750383 gossip: Handle incoming client requests asynchronously 2017-03-13 11:26:48 +01:00