Commit Graph

2212 Commits

Author SHA1 Message Date
Rusty Russell
3f84ca1052 gossipd: really fix peer handoff.
954a3990fa had two errors:
1) We created the handoff message *before* we sent the final packet, meaning
   that the cryptostate was out-of-sync.
2) We called io_wait() on the output side of a duplex connection: it has
   to be io_wait_out().

This time, stress testing for 2 hours revealed no more problems.
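
To illustrate the first error, here is a minimal sketch (not the gossipd code) of why the handoff message has to be built after the final packet is sent. It assumes a toy cipher state that is just a send nonce; `CryptoState`, `Peer` and both handoff functions are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass
class CryptoState:
    """Toy stand-in for per-connection cipher state: just a send nonce."""
    send_nonce: int = 0


class Peer:
    def __init__(self):
        self.cs = CryptoState()
        self.wire = []  # (nonce, payload) pairs already written

    def send_packet(self, payload):
        # Every packet written consumes one nonce.
        self.wire.append((self.cs.send_nonce, payload))
        self.cs.send_nonce += 1

    def handoff_message(self):
        # Snapshot of the state the next owner will continue from.
        return replace(self.cs)


def handoff_too_early(peer, final_packet):
    msg = peer.handoff_message()   # snapshot taken first (the bug)
    peer.send_packet(final_packet)
    return msg


def handoff_after_send(peer, final_packet):
    peer.send_packet(final_packet)
    return peer.handoff_message()  # snapshot taken last


buggy = Peer()
stale = handoff_too_early(buggy, b"gossip")

fixed = Peer()
fresh = handoff_after_send(fixed, b"gossip")
```

In the buggy order the snapshot still carries nonce 0 although one packet has gone out, so the new owner would reuse a nonce and the remote end would fail to decrypt.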

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 13:03:51 +02:00
Rusty Russell
ac92138603 common: remove unused assert() headers.
Auditing for assert/abort in common/ code used by lightningd, this is all
that showed up.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
81db5896e1 common/json: remove asserts() which may trigger from user input.
They don't currently, since callers check, but be safe.  In addition,
handle NULL returns from these in the bitcoind code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
11b43a422b lightningd: close one possibly-reachable abort.
There are others, but they really are caused by bad failure.  We need a
parachute system for these.

Closes: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
3c6eec87e3 Add DEVELOPER flag, set by default.
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out.  This means we put #ifdefs in callers
in some places, but at least it's explicit.

We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.

See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
8d9818ff9c gossipd: receive global/local features the right way around
Fixes: #323
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:49:56 +02:00
Rusty Russell
954a3990fa gossipd: don't send a peer to master with half-written or half-read packet.
In this case, it was a gossip message half-sent, when we asked the peer
to be released.  Fix the problem in general by making send_peer_with_fds()
wait until after the next packet.

test_routing_gossip/lightning-4/log:
	b'lightning_openingd(8738): TRACE: First per_commit_point = 02e2ff759ed70c71f154695eade1983664a72546ebc552861f844bff5ea5b933bf'
	b'lightning_openingd(8738): TRACE: Failed hdr decrypt with rn=11'
	b'lightning_openingd(8738): STATUS_FAIL_PEER_IO: Reading accept_channel: Success'

test_routing_gossip/lightning-5/log:

	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightningd(8308): Failed to get netaddr for outgoing: Transport endpoint is not connected'

The problem occurs here on release, but could be on any place where we hand
a peer over when using ccan/io.  Note the other case (channel.c).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
4e14185961 cryptomsg: add helpers to determine if we're partway through msg read/write.
For message read, we do it as header then body, so we can have
io_plan_in_started(conn) false, but we're between header and body.
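
A minimal sketch of the idea, assuming a hypothetical 2-byte length header and a toy `MsgReader` class (neither is the cryptomsg API): once the header has been consumed there may be no partial read pending, yet the reader is still mid-message.

```python
import struct

HDR_LEN = 2  # hypothetical: 2-byte big-endian body length


class MsgReader:
    def __init__(self):
        self.buf = b""
        self.body_len = None  # set once a header has been parsed

    def feed(self, data):
        """Add received bytes; return the list of completed message bodies."""
        self.buf += data
        done = []
        while True:
            if self.body_len is None:
                if len(self.buf) < HDR_LEN:
                    break
                (self.body_len,) = struct.unpack(">H", self.buf[:HDR_LEN])
                self.buf = self.buf[HDR_LEN:]
            if len(self.buf) < self.body_len:
                break
            done.append(self.buf[:self.body_len])
            self.buf = self.buf[self.body_len:]
            self.body_len = None
        return done

    def read_in_progress(self):
        # Partway through if bytes are buffered, or if a header was
        # consumed and its body has not fully arrived yet.
        return bool(self.buf) or self.body_len is not None


reader = MsgReader()
after_header = reader.feed(b"\x00\x03")   # header only, no body bytes yet
between = reader.read_in_progress()
after_body = reader.feed(b"abc")
idle = reader.read_in_progress()
```

Checking only for buffered bytes would report "not started" in the `between` case, which is the gap the helpers are meant to close.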

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
a8f033f6ae ccan: update to get new ccan/io changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
a2dc71b0a1 lightningd: close a take() leak.
test_routing_gossip (__main__.LightningDTests) ... lightningd: Outstanding taken pointers: lightningd/peer_control.c:2352:towire_errorfmt(ld, ((void *)0), "Can't resolve your address")

This was caused by the other end closing due to the next bug.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
9e869e641a take: turn labels on.
Gives us meaningful errors when there's a take() leak.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
7d62de8632 lightningd: fix typo in fatal error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
b6a2b8c58b Add --rgb and --alias options.
And derive random ones from nodeid if they don't choose.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 09:16:14 +00:00
Rusty Russell
ebdecebb1a channeld: send channel_announce and initial update to master, not gossipd.
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails.  The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order.
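
A minimal sketch of why a single relay fixes the ordering, assuming a toy `Gossipd` that rejects a node_announce for a node with no known channel (the class, the message tuples and `deliver` are hypothetical):

```python
class Gossipd:
    """Toy gossipd: a node_announce is only accepted for a known channel."""

    def __init__(self):
        self.known_channels = set()
        self.accepted = []

    def handle(self, msg):
        kind, node_id = msg
        if kind == "channel_announce":
            self.known_channels.add(node_id)
        elif kind == "node_announce" and node_id not in self.known_channels:
            return False  # arrives too early: rejected
        self.accepted.append(kind)
        return True


def deliver(gossipd, msgs):
    """Deliver messages in the given order; return the per-message results."""
    return [gossipd.handle(m) for m in msgs]


chan = ("channel_announce", "node-A")
update = ("channel_update", "node-A")
node = ("node_announce", "node-A")

# Two independent senders: nothing stops node_announce from arriving first.
raced = deliver(Gossipd(), [node, chan, update])

# One sender (the master) relaying everything: order is preserved.
ordered = deliver(Gossipd(), [chan, update, node])
```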

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
66a0c55322 test_lightning.py: fix float insanity with values.
When is 0.01 != 0.01?  When there are floats involved!  Jenkins hit an
error once, I have no idea why.

This works around the following intermittent error:

ERROR: test_closing_negotiation_reconnect (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_lightningd.py", line 1601, in test_closing_negotiation_reconnect
    self.fund_channel(l1, l2, 10**6)
  File "tests/test_lightningd.py", line 241, in fund_channel
    raise ValueError("Can't find {} payment in {}".format(amount, tx))
ValueError: Can't find 1000000 payment in 02000000000101b0b27be92916faee59f197c263e1ae7b44c0e59acdf2c385c55b5b04670ac026010000001716001401fad90abcd66697e2592164722de4a95ebee165ffffffff02651d0f0000000000160014c2ccab171c2a5be9dab52ec41b825863024c546640420f00000000002200205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd02473044022017bd19a0ee85532f67a280c71ed02d1321f08975334bd281527478022265225702202ec448bf9c0890a31a26f0ef4f03d298c8ec16b277faff09a70ddd335df44b6e012103d745445c9362665f22e0d96e9e766f273f3260dea39c8a76bfa05dd2684ddccf00000000

For testing, that tx decodes to:
{
  "txid": "0165e92be762b352b665b76b9872d5189e1b2a8faf4918ab3cca7cd5d4b6a5fa",
  "hash": "8af9a36c79ee5243468c5cbed1c80f10238fba405f0ad957a0c2cfc46fb632f5",
  "version": 2,
  "size": 257,
  "vsize": 176,
  "locktime": 0,
  "vin": [
    {
      "txid": "26c00a67045b5bc585c3f2cd9ae5c0447baee163c297f159eefa1629e97bb2b0",
      "vout": 1,
      "scriptSig": {
        "asm": "001401fad90abcd66697e2592164722de4a95ebee165",
        "hex": "16001401fad90abcd66697e2592164722de4a95ebee165"
      },
      "txinwitness": [
        "3044022017bd19a0ee85532f67a280c71ed02d1321f08975334bd281527478022265225702202ec448bf9c0890a31a26f0ef4f03d298c8ec16b277faff09a70ddd335df44b6e01",
        "03d745445c9362665f22e0d96e9e766f273f3260dea39c8a76bfa05dd2684ddccf"
      ],
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.00990565,
      "n": 0,
      "scriptPubKey": {
        "asm": "0 c2ccab171c2a5be9dab52ec41b825863024c5466",
        "hex": "0014c2ccab171c2a5be9dab52ec41b825863024c5466",
        "type": "witness_v0_keyhash"
      }
    },
    {
      "value": 0.01000000,
      "n": 1,
      "scriptPubKey": {
        "asm": "0 5b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd",
        "hex": "00205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd",
        "type": "witness_v0_scripthash"
      }
    }
  ]
}
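
The change made by this commit is not shown here. As a minimal sketch of one way to avoid float equality when matching an output amount, the helper below converts the decoded BTC value to integer satoshis through `Decimal`; `btc_to_sat` and `find_output` are hypothetical names, not the test suite's helpers.

```python
from decimal import Decimal

SAT_PER_BTC = 10**8


def btc_to_sat(value):
    """Convert a decoded BTC amount to integer satoshis via Decimal."""
    return int(Decimal(str(value)) * SAT_PER_BTC)


def find_output(vouts, amount_sat):
    """Return the index 'n' of the output paying amount_sat, or None."""
    for out in vouts:
        if btc_to_sat(out["value"]) == amount_sat:
            return out["n"]
    return None


# The two output values from the decoded transaction above.
vouts = [{"value": 0.00990565, "n": 0}, {"value": 0.01000000, "n": 1}]

# Classic float pitfalls that motivate avoiding float equality.
sum_matches = (0.1 + 0.2 == 0.3)   # False
truncated = int(0.29 * 100)        # 28, not 29

funding_index = find_output(vouts, 10**6)
```

Comparing integers sidesteps both pitfalls: the 1000000 satoshi payment is found at output 1.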

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
7f38943956 options: show the default network setting in --help.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
7e022b522c gossipd: don't try to handle padding inside fromwire_ipaddr.
It makes it impossible to embed an ipaddr in another structure, since we
always try to skip over any zeroes, which may swallow a following field.

Do the skip specially for the case where we're parsing routing messages:
we never use padding for our own internal messages anyway.
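
A minimal sketch of the hazard, using a toy layout (1 type byte, 4 IPv4 bytes, 2 port bytes) rather than the real wire format. The reader functions are hypothetical, and the choice to skip zeroes after the address is an assumption made for illustration.

```python
import struct

ADDR_LEN = 7  # toy layout: 1 type byte, 4 IPv4 bytes, 2 port bytes


def read_ipaddr(buf, off):
    """Parse one address and return it with the offset just past it."""
    typ = buf[off]
    addr = ".".join(str(b) for b in buf[off + 1:off + 5])
    (port,) = struct.unpack_from(">H", buf, off + 5)
    return (typ, addr, port), off + ADDR_LEN


def read_ipaddr_skipping_zeroes(buf, off):
    """Same, but also consume any zero bytes that follow (padding skip)."""
    ipaddr, off = read_ipaddr(buf, off)
    while off < len(buf) and buf[off] == 0:
        off += 1
    return ipaddr, off


def read_u16(buf, off):
    (val,) = struct.unpack_from(">H", buf, off)
    return val, off + 2


# An address embedded in a larger structure, followed by a u16 field
# (value 5) and one trailing byte.
msg = (bytes([1, 127, 0, 0, 1]) + struct.pack(">H", 9735)
       + struct.pack(">H", 5) + b"\xff")

addr, off = read_ipaddr(msg, 0)
strict_field, _ = read_u16(msg, off)

_, off = read_ipaddr_skipping_zeroes(msg, 0)
greedy_field, _ = read_u16(msg, off)
```

The strict reader recovers the following field as 5. The padding-skipping reader eats that field's leading zero byte and misreads it as 0x05ff.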

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
c2dd0cb295 test_lightningd.py: return short channel id from fund_channel helper.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
79962b3588 lightningd: return transaction from fundchannel RPC.
Lets tests figure out the short channel name, for example.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
48cedef756 peer_control: remove unique_id field.
It's now completely useless.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
ffaa15c7da hsm: remove unique_id.
It was only for error messages, so replace it with pubkey.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
c3bed51b2d test_lightningd.py: make HSM seeds constant for tests.
Makes it easier to compare before/after failures.  Ideally, we should
run under Travis both with this option and with the seed based on the
entire tmp path (which is still reproducible with determination, but
not fixed every run like this is).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
4c9f7542b2 subd: Clarify description of subd_release_peer.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-22 16:24:10 +02:00
Rusty Russell
74e684cc0d is_all_channels: rename to channel_id_is_all
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-22 16:24:10 +02:00
Rusty Russell
1954844fbf lightningd: make peer_fail_permanent() only save the first error for peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
0b953b86fe subd: automatically detect if callback frees subd.
This involves a tricky callback internally, but is far less error-prone.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
5a256c724a subd: simplify and cleanup lifetime handling.
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones.  We can handle many callbacks internally now.

We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
a117d595a4 subd: allow callbacks to free sd.
We'll need this for the next patch; we'll be freeing the old subd whenever
peer->owner changes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
2374b54ef2 ccan: update to get io fix for duplex pipes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
cb82bf7aa2 onchaind: send message when peer's transactions are irrevocably committed.
We currently rely on a zero exit status.  That's the only difference between
onchain finished handling and other per-peer daemons, so instead we should
have an explicit "done" message.  This is both clearer, and allows us to
unify.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
f83ee6d5ea dev_disconnect: don't permfail more than once.
The coming tests trigger this latent bug under travis.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
ebba5f85a2 handshaked: remove.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
474887512d gossipd: rewrite to do the handshake internally.
Now the flow is much simpler from a lightningd POV:

1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
   and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
   features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
   it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.

Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
  connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
  a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
  messages, which changes the test_disconnect case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
2273ce783e dev_disconnect: support multiple disconnects in the same daemon.
We currently assume the daemon gives up; gossipd won't, and we want to
test it there too.

This reveals a bug (returning io_close() is bad if the call is to
duplex()), and breaks a test which now continues after dropping a
packet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
6ceec17943 dev_disconnect: make commit suppression a "-nocommit" modifier.
Useful if we want to drop & suppress, for example.  We change '=' to mean
do nothing to the packet.

We use this to clean up the test_reconnect_sender_add test.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
a88ac22711 gossipd: include ccan/io version of handshake code, with tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
98ad6b9231 lightningd: change connect RPC args.
We're going to make the ip/port optional, so they should go at the end.
In addition, using ip:port is nicer, for gethostbyaddr().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
c9828d146a contrib/lightning-open-channel: remove
It doesn't work on new lightningd anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
e11553fc55 lightningd: expose ipaddr parsing.
We don't do DNS lookups, but hack in localhost for the moment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
79ebb9dfd0 json: helper to parse pubkeys.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
8430e33f3b common/status: add status_tracev() for making status wrappers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
9b589fb5ba common/wire_error: helpers to create/parse WIRE_ERROR messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
871d0b1d74 lightningd: simplify peer destruction.
We have to do a dance when we get a reconnect in openingd, because we
don't normally expect to free both owner and peer.  It's a layering
violation: freeing a peer should clean up the owner's pointer to it,
to avoid a double free, and we can eliminate this dance.

The free order is now different, and the test_reconnect_openingd was
overprecise.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
61786b9c90 subd: don't leak fds if we fail to create subdaemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
4fa36c585d gossipd: receive hsm fd from master.
We'll need this soon.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
f172be71dc gossipd: fail peer for the master daemon.
This fixes the only case where the master currently has to write directly
to the peer: re-sending an error.  We make gossipd do it, by adding
a new gossipctl_fail_peer message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
2394c9a2e7 crypto_state: move to its own file.
In particular, the main daemon needs to pass it about (marshal/unmarshal)
but it won't need to actually use it after the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
399b5f61bc gossipd: rename fail_peer to drop_peer.
We don't actually send it a failure message, we just close it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
0969626918 close_tx: make version 1, not version 2.
Fixes: #311
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-18 13:00:29 +02:00
Rusty Russell
8f057f7fc7 Revert "gossip: send the *other* node's cltv_expiry_delta in channel_announce."
This reverts commit 297e278132.
2017-10-11 11:54:50 +02:00