Commit Graph

4829 Commits

Author SHA1 Message Date
Rusty Russell
06bc035f59 Minor fixes: feedback from Christian
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
7d5b923719 test_lightningd.py: test reconnect after funding_signed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
1df40156f6 test_lightningd.py: disconnection tests.
In particular, we test the cases where it will simply forget the old peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
ed16bb3134 channeld: send optional init message.
We use this to make it send the funding_signed message, rather than having
the master daemon do it (which was even more hacky).  It also means it
can handle the crypto, so no need for the packet to be handed up encrypted,
and also make --dev-disconnect "just work" for this packet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
9b1d240c1f lightningd: --dev-disconnect support.
We use a file descriptor, so when we consume an entry, we move past it
(and everyone shares a file offset, so this works).

The file contains packet names prefixed by - (treat fd as closed when
we try to write this packet), + (write the packet then ensure the file
descriptor fails), or @ ("lose" the packet then ensure the file
descriptor fails).

The sync and async peer-write functions hook this in automatically.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'test-run-cryptomsg__fix_compilation.patch':

test/run-cryptomsg: fix compilation.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
8f190c673c channel: don't send disable update to gossipd if we haven't announced channel yet.
Valgrind error file: /tmp/lightning-8k06jbb3/test_disconnect/lightning-7/valgrind-errors
==32307== Uninitialised byte(s) found during client check request
==32307==    at 0x11EBAD: memcheck_ (mem.h:247)
==32307==    by 0x11EC18: towire (towire.c:14)
==32307==    by 0x11EF19: towire_short_channel_id (towire.c:92)
==32307==    by 0x12203E: towire_channel_update (gen_peer_wire.c:918)
==32307==    by 0x1148D4: send_channel_update (channel.c:185)
==32307==    by 0x1175C5: peer_conn_broken (channel.c:1010)
==32307==    by 0x13186F: destroy_conn (poll.c:173)
==32307==    by 0x13188F: destroy_conn_close_fd (poll.c:179)
==32307==    by 0x13B279: notify (tal.c:235)
==32307==    by 0x13B721: del_tree (tal.c:395)
==32307==    by 0x13BB3A: tal_free (tal.c:504)
==32307==    by 0x130522: io_close (io.c:415)
==32307==  Address 0xffefff87d is on thread 1's stack
==32307==  in frame #2, created by towire_short_channel_id (towire.c:88)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
302ed0ca3b peer_control: always keep gossip_client_fd.
This is simpler than passing back and forth, for the moment at least.  That
means we don't need to ask for a new one on reconnect.

This partially reverts the gossip handling in openingd, since it no longer
passes the gossip fd back.  We also close it when peer is freed, so it
needs initializing to -1.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
d1fcc434c8 subd: use array of fd pointers, not fds, and use take().
This lets us specify that we want to keep some fds.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
f504f0be47 gossip: handle release race.
We can go to release a gossip peer, and it can fail at the same time.
We work around the problem that the reply must be a gossipctl_release_peer_reply
with two fds, but it's not pretty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
b3a2a0623c lightningd: set up reconnect timer if we don't want to forget peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
58bf67c310 dns: expose multiaddress connect code.
Internally the dns code has the ability to try connecting to multiple
addresses in a sequence.  Expose this, as we'll want it for reconnection.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
f4722e93a0 peer_fail: handle incoming reconnections.
We kill the existing connection if possible; this may mean simply
forgetting the prior peer altogether if it's in an early state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
3126eed4de patch peer_control-keep-init-information.patch 2017-05-25 14:24:47 +09:30
Rusty Russell
05875d7f35 opening: handle gossip.
We don't send out gossip, but we do relay it to the gossip daemon if we
get some.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
43ec37b865 openingd: fundee: don't send watch command to master.
Instead, send it the funding_signed message; it can watch, save to
database, and send it.

Now the openingd fundee path is a simple request and response, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
c2cfc3dd69 opening: funder: don't ask master for TXID, calculate it ourselves.
Simplifies state machine.  Master still has to calculate the tx to get
the signature and broadcast, but now the opening daemon funding path
is a simple request/response.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
4220362692 derive_basepoints: make arguments optional.
We want to use it in peer_control to generate the transaction, but we
really only need the funding_pubkey.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
7bfd282319 lightningd/utxo: helpers to translate from utxo * <-> utxo **
We need the former for marshalling, the latter for build_utxos and funding_tx.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
805228b939 lightningd/funding_tx: fix no-change-needed case.
We only allocate one output in that case, and changekey is undefined.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
3ab3281a4c lightningd/opening: rename functions and wire messages for clarity.
Each one is either funder or fundee now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
1e36a19164 lightningd/peer_control: keep cryptostate.
Like the fd, it's only useful when the peer is not in a daemon, so we
free & NULL it when that happens.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
ccad93edb3 gossipd: add fail_peer.
We use this if it reconnects via another fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
5fb1f20898 gossip: make connection own peer.
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
aef745e37d peer_control: simplify code flow in depth callback.
I modified it to use states, and messed it up.  Rewrite it to be
far simpler.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
780b3870ad lightningd: peer state cleanup.
1. We explicitly assert what state we're coming from, to make transitions
   clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explictly step though getting the HSM signature on the funding tx
   before starting channeld, rather than doing it in parallel: makes
   states clearer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
662dfef436 lightningd/gossip: Move INIT message handling to handshake daemon.
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.

Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
0c8b24cf97 daemon/dns: hand netaddr we connected to through to callback.
That way it doesn't have to extract it from fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
f6495d3310 lightningd/peer_control: don't create peer struct until we've connected.
We currently create a peer struct, then complete handshake to find out
who it is.  This means we have a half-formed peer, and worse: if it's
a reconnect we get two peers the same.

Add an explicit 'struct connection' for the handshake phase, and
construct a 'struct peer' once that's done.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
61a2ed97e1 lightningd/peer_control: start of reconnect logic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
fe1ff33419 lightningd/subd: don't take ownership of peer.
Use callback which fails the peer if subd dies: that will later allow
reconnect.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
34e6e56471 lightningd: introduce peer_state enum.
The actual state names are place holders for now, really.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
93849b1e02 lightningd/peer_control: save funder side in struct peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell
be9bb5f9cb lightningd: peer_fail helper to fail/reconnect peer.
This will eventually hook into restart logic.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Christian Decker
744d657860 doc: Updating README and related documentation
Like many I don't read any documentation besides the readme in the
repo, so I thought I could just pull some simple getting started info
into the readme to make it easy for people to get started :-)
2017-05-20 20:02:45 +09:30
Christian Decker
05e951d748 wire: Correct the short channel id serialization to use 3+3+2
Fixes the `short_channel_id` being serialized as 4 bytes block height,
3 bytes transaction index and 1 byte output number, to use 3+3+2 as
the spec says.

The reordering in the unit test structs is mainly to be able to still
use `eq_upto` for tests.
2017-05-20 20:01:34 +09:30
Christian Decker
b8ba8f003c logging: Removed automatic subprocess logging for bitcoind
This was responsible for a huge number of loglines simply because we
log every subprocess start and termination. Moving the logging
upstream to where it is really needed gets rid of the polling, that
are successful. A highly unscientific test shows a reduction in
loglines produced by lightningd from 17000 to 10000 lines.
2017-05-20 19:59:16 +09:30
Christian Decker
80bf908922 script: Consolidate pubkey comparison 2017-05-20 19:59:16 +09:30
Christian Decker
3f4e43081c bitcoind: Respect testnet for bitcoin-cli 2017-05-20 19:59:16 +09:30
Christian Decker
80486a6669 pytest: Do not check for valgrind errors if disabled
Checking for valgrind errors if we disabled valgrind will kill the
teardown, so don't!
2017-05-20 19:59:16 +09:30
Christian Decker
6154020f67 fix: Header include order 2017-05-19 14:06:44 +02:00
Rusty Russell
f4d92813a0 lightningd: handle bad failure message.
We used to core dump if unwrap_onionreply() returned NULL!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell
55510ea27a io_write_wire: always make a copy (or steal if take).
I caught the gossip daemon freeing a message, while it was queued to be
written.  Using tal_dup_arr() is the Right Thing, as it handles taken()
properly automatically.

------------------------------- Valgrind errors --------------------------------
Valgrind error file: /tmp/lightning-rvc7d5oi/test_forward/lightning-3/valgrind-errors
==11057== Invalid read of size 8
==11057==    at 0x1328F2: to_tal_hdr (tal.c:174)
==11057==    by 0x133894: tal_len (tal.c:659)
==11057==    by 0x11BBE7: do_write_wire (wire_io.c:103)
==11057==    by 0x127B95: do_plan (io.c:369)
==11057==    by 0x127C31: io_ready (io.c:390)
==11057==    by 0x129461: io_loop (poll.c:295)
==11057==    by 0x10CBB4: main (gossip.c:722)
==11057==  Address 0x55a99d8 is 24 bytes inside a block of size 200 free'd
==11057==    at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057==    by 0x133000: del_tree (tal.c:416)
==11057==    by 0x132F77: del_tree (tal.c:405)
==11057==    by 0x13333E: tal_free (tal.c:504)
==11057==    by 0x1123F1: queue_broadcast (broadcast.c:38)
==11057==    by 0x111EB0: handle_node_announcement (routing.c:918)
==11057==    by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057==    by 0x10B76B: owner_msg_in (gossip.c:335)
==11057==    by 0x12712E: next_plan (io.c:59)
==11057==    by 0x127BD0: do_plan (io.c:376)
==11057==    by 0x127C09: io_ready (io.c:386)
==11057==    by 0x129461: io_loop (poll.c:295)
==11057==  Block was alloc'd at
==11057==    at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057==    by 0x132AE7: allocate (tal.c:245)
==11057==    by 0x1330A3: tal_alloc_ (tal.c:443)
==11057==    by 0x1332A6: tal_alloc_arr_ (tal.c:491)
==11057==    by 0x133FEC: tal_dup_ (tal.c:846)
==11057==    by 0x112347: new_queued_message (broadcast.c:20)
==11057==    by 0x11240B: queue_broadcast (broadcast.c:43)
==11057==    by 0x111EB0: handle_node_announcement (routing.c:918)
==11057==    by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057==    by 0x10B76B: owner_msg_in (gossip.c:335)
==11057==    by 0x12712E: next_plan (io.c:59)
==11057==    by 0x127BD0: do_plan (io.c:376)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

wire_io: make a copy in io_write_wire (unless taken()).

I hit a corner case where gossipd freed a duplicate while it was being
sent out; this kind of thing doesn't happen if io_write_wire() makes
a copy by default.

We also do a memcheck() here; this gives us a caller in the backtrace
if there are uninitialized bytes, rather than waiting until the write
which happens later.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell
3d2f166364 test_lightningd.py: don't fail if valgrind turned off.
eg:
test_routing_gossip (__main__.LightningDTests) ... ERROR

======================================================================
ERROR: test_routing_gossip (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_lightningd.py", line 150, in tearDown
    err_count += self.printValgrindErrors(node)
  File "tests/test_lightningd.py", line 137, in printValgrindErrors
    errors, fname = self.getValgrindErrors(node)
  File "tests/test_lightningd.py", line 132, in getValgrindErrors
    with open(error_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lightning-l106st0a/test_routing_gossip/lightning-1/valgrind-errors'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell
811e14ff66 Update .gitignore files.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Rusty Russell
6e0e1c7067 Update to latest BOLT (hyphens changed to underscores).
Now in sync with 8ee57b97738b1e9467a1342ca8373d40f0c4aca5.

Our tool doesn't need to convert them any more, but we actually had a
mis-typed field in the HSM which needed fixing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Rusty Russell
e97046f797 BOLT update: temporary_channel_failure with update.
Aka d140405a6f0d95e3ccf650e3560383768cbf3e03.

This doesn't make it work, just compile.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Christian Decker
f6d60b9076 pytest: Fail tests that produce valgrind errors
Check and print all valgrind errors and fail tests that produce
errors. We abort using an exception since `fail()` is not allowed in
the teardown.
2017-05-10 12:38:12 +09:30
Christian Decker
2d76b066c2 routing: Cleaning up old hostname and port handling
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
2017-05-10 12:37:44 +09:30
Christian Decker
e972b208c7 routing: Passing back all addresses on getnodes 2017-05-10 12:37:44 +09:30
Christian Decker
26892e79bb routing: Reading multiple addresses from node_announcements 2017-05-10 12:37:44 +09:30