Like the fd, it's only useful when the peer is not in a daemon, so we
free & NULL it when that happens.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We explicitly assert what state we're coming from, to make transitions
clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explictly step though getting the HSM signature on the funding tx
before starting channeld, rather than doing it in parallel: makes
states clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.
Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently create a peer struct, then complete handshake to find out
who it is. This means we have a half-formed peer, and worse: if it's
a reconnect we get two peers the same.
Add an explicit 'struct connection' for the handshake phase, and
construct a 'struct peer' once that's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Like many I don't read any documentation besides the readme in the
repo, so I thought I could just pull some simple getting started info
into the readme to make it easy for people to get started :-)
Fixes the `short_channel_id` being serialized as 4 bytes block height,
3 bytes transaction index and 1 byte output number, to use 3+3+2 as
the spec says.
The reordering in the unit test structs is mainly to be able to still
use `eq_upto` for tests.
This was responsible for a huge number of loglines simply because we
log every subprocess start and termination. Moving the logging
upstream to where it is really needed gets rid of the polling, that
are successful. A highly unscientific test shows a reduction in
loglines produced by lightningd from 17000 to 10000 lines.
I caught the gossip daemon freeing a message, while it was queued to be
written. Using tal_dup_arr() is the Right Thing, as it handles taken()
properly automatically.
------------------------------- Valgrind errors --------------------------------
Valgrind error file: /tmp/lightning-rvc7d5oi/test_forward/lightning-3/valgrind-errors
==11057== Invalid read of size 8
==11057== at 0x1328F2: to_tal_hdr (tal.c:174)
==11057== by 0x133894: tal_len (tal.c:659)
==11057== by 0x11BBE7: do_write_wire (wire_io.c:103)
==11057== by 0x127B95: do_plan (io.c:369)
==11057== by 0x127C31: io_ready (io.c:390)
==11057== by 0x129461: io_loop (poll.c:295)
==11057== by 0x10CBB4: main (gossip.c:722)
==11057== Address 0x55a99d8 is 24 bytes inside a block of size 200 free'd
==11057== at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057== by 0x133000: del_tree (tal.c:416)
==11057== by 0x132F77: del_tree (tal.c:405)
==11057== by 0x13333E: tal_free (tal.c:504)
==11057== by 0x1123F1: queue_broadcast (broadcast.c:38)
==11057== by 0x111EB0: handle_node_announcement (routing.c:918)
==11057== by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057== by 0x10B76B: owner_msg_in (gossip.c:335)
==11057== by 0x12712E: next_plan (io.c:59)
==11057== by 0x127BD0: do_plan (io.c:376)
==11057== by 0x127C09: io_ready (io.c:386)
==11057== by 0x129461: io_loop (poll.c:295)
==11057== Block was alloc'd at
==11057== at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057== by 0x132AE7: allocate (tal.c:245)
==11057== by 0x1330A3: tal_alloc_ (tal.c:443)
==11057== by 0x1332A6: tal_alloc_arr_ (tal.c:491)
==11057== by 0x133FEC: tal_dup_ (tal.c:846)
==11057== by 0x112347: new_queued_message (broadcast.c:20)
==11057== by 0x11240B: queue_broadcast (broadcast.c:43)
==11057== by 0x111EB0: handle_node_announcement (routing.c:918)
==11057== by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057== by 0x10B76B: owner_msg_in (gossip.c:335)
==11057== by 0x12712E: next_plan (io.c:59)
==11057== by 0x127BD0: do_plan (io.c:376)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
wire_io: make a copy in io_write_wire (unless taken()).
I hit a corner case where gossipd freed a duplicate while it was being
sent out; this kind of thing doesn't happen if io_write_wire() makes
a copy by default.
We also do a memcheck() here; this gives us a caller in the backtrace
if there are uninitialized bytes, rather than waiting until the write
which happens later.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
eg:
test_routing_gossip (__main__.LightningDTests) ... ERROR
======================================================================
ERROR: test_routing_gossip (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_lightningd.py", line 150, in tearDown
err_count += self.printValgrindErrors(node)
File "tests/test_lightningd.py", line 137, in printValgrindErrors
errors, fname = self.getValgrindErrors(node)
File "tests/test_lightningd.py", line 132, in getValgrindErrors
with open(error_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lightning-l106st0a/test_routing_gossip/lightning-1/valgrind-errors'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now in sync with 8ee57b97738b1e9467a1342ca8373d40f0c4aca5.
Our tool doesn't need to convert them any more, but we actually had a
mis-typed field in the HSM which needed fixing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisorial announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
We alternated between using a sha256 and using a privkey, but there are
numerous places where we have a random 32 bytes which are neither.
This fixes many of them (plus, struct privkey is now defined in terms of
struct secret).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Under stress, the tests can mine blocks too soon, and the funding never
locks. This gives more of a chance, at least.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were getting an assert "!secp256k1_fe_is_zero(&ge->x)", because
an all-zero pubkey is invalid. We allow marshal/unmarshal of NULL for
now, and clean up the error handling.
1. Use status_failed if master sends a bad message.
2. Similarly, kill the gossip daemon if it gives a bad reply.
3. Use an array for returned pubkeys: 0 or 2.
4. Use type_to_string(trc, struct short_channel_id, &scid) for tracing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>