Commit Graph

572 Commits

Author SHA1 Message Date
Rusty Russell
954a3990fa gossipd: don't send a peer to master with half-written or half-read packet.
In this case, it was a gossip message half-sent, when we asked the peer
to be released.  Fix the problem in general by making send_peer_with_fds()
wait until after the next packet.

test_routing_gossip/lightning-4/log:
	b'lightning_openingd(8738): TRACE: First per_commit_point = 02e2ff759ed70c71f154695eade1983664a72546ebc552861f844bff5ea5b933bf'
	b'lightning_openingd(8738): TRACE: Failed hdr decrypt with rn=11'
	b'lightning_openingd(8738): STATUS_FAIL_PEER_IO: Reading accept_channel: Success'

test_routing_gossip/lightning-5/log:

	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightningd(8308): Failed to get netaddr for outgoing: Transport endpoint is not connected'

The problem occurs here on release, but could be on any place where we hand
a peer over when using ccan/io.  Note the other case (channel.c).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
ebdecebb1a channeld: send channel_announce and initial update to master, not gossipd.
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails.  The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
2394c9a2e7 crypto_state: move to its own file.
In particular, the main daemon needs to pass it about (marshal/unmarshal)
but it won't need to actually use it after the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
8f057f7fc7 Revert "gossip: send the *other* node's cltv_expiry_delta in channel_announce."
This reverts commit 297e278132.
2017-10-11 11:54:50 +02:00
Rusty Russell
297e278132 gossip: send the *other* node's cltv_expiry_delta in channel_announce.
Include tests from example doc.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-10 20:17:37 +02:00
Rusty Russell
2a28173891 Typo fix: CTLV -> CLTV.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-10 20:17:37 +02:00
Rusty Russell
e137e2527f Update BOLT references with typo fixes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-10 20:17:37 +02:00
Rusty Russell
32631b4278 generate-wire.py: add --bolt arg, use size->type hacks only when that's specified.
For our own internal comms CSVs, we should always name explicit types.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-29 14:40:34 +02:00
Rusty Russell
8bb20d127d channeld: add debugging into io_loop.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-29 10:20:08 +09:30
Rusty Russell
72b215f6fe Make all internal message numbers unique.
We were sending a channeld message to onchaind, which was v. confusing
due to overlap.  We make all the numbers distinct, which means we can
also add an assert() that it's valid for that daemon, which catches
such errors immediately.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-28 13:07:05 +09:30
Rusty Russell
ab8251c214 lightningd: dev-reenable-commit RPC command to re-enable commit timer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-28 13:07:05 +09:30
Rusty Russell
ce160d9b17 lightnind: _ dev-disconnect argument to suppress commit timer.
Required for catching daemon in exact state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-28 13:07:05 +09:30
Rusty Russell
ef28b6112c status: use common status codes for all the failures.
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.

It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-12 23:00:53 +02:00
Christian Decker
006d664b59 channeld: Make sure status_setup_sync is called before status_failed
This was still happening if reading the `channel_init` message failed.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2017-09-10 10:48:53 +09:30
Rusty Russell
cc34f572ca channeld: fix sync write to master.
We hit:
	assert(!peer->handle_master_reply);

#4  0x000055bba3b030a0 in master_sync_reply (peer=0x55bba41c0030, 
    msg=0x55bba41c6a80 "", replytype=WIRE_CHANNEL_GOT_COMMITSIG_REPLY, 
    handle=0x55bba3b041cf <handle_reply_wake_peer>) at channeld/channel.c:518
#5  0x000055bba3b049bc in handle_peer_commit_sig (conn=0x55bba41c10d0, 
    peer=0x55bba41c0030, msg=0x55bba41c6a80 "") at channeld/channel.c:959
#6  0x000055bba3b05c69 in peer_in (conn=0x55bba41c10d0, peer=0x55bba41c0030, 
    msg=0x55bba41c67c0 "") at channeld/channel.c:1339
#7  0x000055bba3b123eb in peer_decrypt_body (conn=0x55bba41c10d0, 
    pcs=0x55bba41c0030) at common/cryptomsg.c:155
#8  0x000055bba3b2c63b in next_plan (conn=0x55bba41c10d0, plan=0x55bba41c1100)
    at ccan/ccan/io/io.c:59

We got a commit_sig from the peer while waiting for the master to
reply to acknowledge the commitsig we want to send
(handle_sending_commitsig_reply).

The fix is to go always talk to the master synchronous, and not try to
process anything but messages from the master daemon.  This avoids the
whole class of problems.

There's a fairly simple way to do this, as ccan/io lets you override
its poll call: we process any outstanding master requests there, or
add the master fd to the pollfds array.

Fixes: #266
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-09 10:31:31 +09:30
Rusty Russell
5acbc04ec8 channeld: assert we're not somehow nonblocking in init_channel.
Christian reported seeing a zero-length packet come in; this seems the
most likely possibility.  

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-09 10:31:31 +09:30
Christian Decker
b0c0e28a43 gossip: Simplify announce_signature exchange
The logic of dispatching the announcement_signatures message was
distributed over several places and daemons. This aims to simplify it
by moving it all into `channeld`, making peer_control only report
announcement depth to `channeld`, which then takes care of the
rest. We also do not reuse the funding_locked tx watcher since it is
easier to just fire off a new watcher with the specific purpose of
waiting for the announcement_depth.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2017-09-05 12:47:25 +09:30
Rusty Russell
4e81d2431b channeld: fix corruption when dealing with queued packets.
master is not actually a tal object!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-04 20:46:26 +02:00
Rusty Russell
7e13e9e457 channeld: don't allow NULL htlcmap for full_channel
That was only for the initial state, which is now in initial_channel.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-03 02:01:54 +02:00
Rusty Russell
1cf33eefe2 lightningd: handle case where channeld fails locally-generated HTLC.
jl777 reported a crash when we try to pay past reserve.  Fix that (and
a whole class of related bugs) and add tests.

In test_lightning.py I had to make non-async path for sendpay() non-threaded
to get the exception passed through for testing.

Closes: #236
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-30 11:36:37 +02:00
Rusty Russell
52db7fd27b channeld: correctly send failure message on local HTLC failure.
valgrind was complaining about uninitialized bytes over the wire.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-30 11:36:37 +02:00
Rusty Russell
bbed5e3411 Rename subdaemons, move them into top level.
We leave the *build* results in lightningd/ for ease of in-place testing though.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-29 17:54:14 +02:00