Commit Graph

5779 Commits

Author SHA1 Message Date
Rusty Russell
b257b8960b test_lightning.py: more debugging for the tx decoding fail under travis.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
390bf6359e test_lightning.py: don't get confused by valgrind core files.
I run with ulimit -c unlimited, and valgrind leaves core files like
valgrind-errors.22114.core.22114 which test_lightning.py tries to
parse as log files.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
bc9918ad46 dev: option not to do backtracing.
It crashes under valgrind, causing a valgrind error: valgrind gives us a
backtrace anyway, so we don't need it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
4a06da8f78 wallet: fix wallet_update_output_status where oldstatus == output_state_any
"near \"AND\": syntax error"

This was caught by the "always keep errors for db_commit_transaction".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
21305c0d28 fatal: cause a backtrace.
Much nicer for debugging.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
45b92dc2b0 pytest: pass DEVELOPER flag through.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
ee1fb07c8e update-mocks: fix stubs generation for fatal().
We're going to need this, and the PRINTF_FMT(1,2) in front of it caused
mockup.sh to miss the declaration.

We also eliminate the obviously-unused fallback case (which referred
to daemon/*.h).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-31 04:14:33 +00:00
Rusty Russell
82f252c79a test_permfail_new_commit: fix intermittant failure.
Normally, we get an error as soon as we send WIRE_REVOKE_AND_ACK.  But if the
commit timer goes off, we get some extra cycles, during which the other side
can reconnect.  In this case, we simply kill the channeld before it fails,
and never check for the permfail string.

    b'lightning_channeld(18613): TRACE: dev_disconnect: -WIRE_REVOKE_AND_ACK'
    b'lightning_channeld(18613): TRACE: Trying commit'
    b'lightning_channeld(18613): TRACE: htlc 0: SENT_ADD_REVOCATION->SENT_ADD_ACK_COMMIT'
    b'lightning_channeld(18613): TRACE: htlc added REMOTE: local +0 remote -200000000'
    b'lightning_channeld(18613): TRACE: sending_commit: HTLC REMOTE 0 = SENT_ADD_ACK_COMMIT/RCVD_ADD_ACK_COMMIT'
    b'lightning_gossipd(18590): TRACE: Responder: Act 1'
    b'lightning_channeld(18613): TRACE: Derived key 034aab0b5cb755de836cffb34c053ba115fba6fe75414e8f56261e23c80eabb1fe from basepoint 03e0a7bb422b254f54bc954be05bd6823a7b7a4b996ff8d3079ca211590fb5df39, point 02f3bf525b6ca595bf85d63e89c95fc59c0fde3ae434b55c8093bbb5c64849da37'
    b'lightningd(18465): Connected json input'
    b'lightningd(18465):jcon fd 16: Success'
    b'lightningd(18465):jcon fd 16: Closing (Bad file descriptor)'
    b'lightning_gossipd(18590): TRACE: Responder: Act 2'
    b'lightning_gossipd(18590): TRACE: Responder: Act 3'
    b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
    b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
    b'lightningd(18465): peer 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518: Peer has reconnected, state CHANNELD_NORMAL'
    b'lightning_channeld(18613): Status closed, but not exited. Killing'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-28 13:33:00 +02:00
Rusty Russell
2aadd351f8 test_lightningd.py: more information when we fail to find payment.
Travis has been hitting this intermittantly, can't reproduce here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-28 13:33:00 +02:00
Rusty Russell
0c7ca9ab7c gossipd: call to return all connected peers.
And we report these through the getpeers JSON RPC again (carefully: in
our reconnect tests we can get duplicates which this patch now filters
out).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
a7d6326bef type_to_string: format wireaddr.
Good for printing, and removes some code from peer_control.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
78cd25d620 ipaddr: rename to wireaddr.
In future it will have TOR support, so the name will be awkward.

We collect the to/fromwire functions in common/wireaddr.c, and the
parsing functions in lightningd/netaddress.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
4bd0352951 lightningd: try to figure out our own IP automatically.
Most of the code is from bitcoind, to handle the weird different non-public
IP ranges.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
329269d9d0 lightningd: support multiple addresses.
Currently only ipv4 and ipv6.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
bd1cac34ce netaddr: remove.
We use ipaddr everywhere now, so we can remove this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
dfd60a2047 gossipd: tell the master the peer's address.
This will let us remove peer->netaddr.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
33bfc2326a gossipd: pass addr of peer though handshake.
We need to derive this from the fd when they connect in, but we already
know it if we're connecting out.

We want this so we can tell (in next few patches) master the peer's address.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 21:01:09 +00:00
Rusty Russell
3f84ca1052 gossipd: really fix peer handoff.
954a3990fa had two errors:
1) We created the handoff message *before* we sent the final packet, meaning
   that the cryptostate was out-of-sync.
2) We called io_wait() on the output side of a duplex connection: it has
   to be io_wait_out().

This time, stress testing for 2 hours revealed no more problems.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 13:03:51 +02:00
Rusty Russell
ac92138603 common: remove unused assert() headers.
Auditing for assert/abort in common/ code used by lightningd, this is all
that showed up.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
81db5896e1 common/json: remove asserts() which may trigger from user input.
They don't currently, since callers check, but be safe.  In addition,
handle NULL returns from these in the bitcoind code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
11b43a422b lightningd: close one possibly-reachable abort.
There are others, but they really are casued by bad failure.  We need a
parachute system for these.

Closes: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
3c6eec87e3 Add DEVELOPER flag, set by default.
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out.  This means we put #ifdefs in callers
in some places, but at least it's explicit.

We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.

See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell
8d9818ff9c gossipd: receive global/local features the right way around
Fixes: #323
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:49:56 +02:00
Rusty Russell
954a3990fa gossipd: don't send a peer to master with half-written or half-read packet.
In this case, it was a gossip message half-sent, when we asked the peer
to be released.  Fix the problem in general by making send_peer_with_fds()
wait until after the next packet.

test_routing_gossip/lightning-4/log:
	b'lightning_openingd(8738): TRACE: First per_commit_point = 02e2ff759ed70c71f154695eade1983664a72546ebc552861f844bff5ea5b933bf'
	b'lightning_openingd(8738): TRACE: Failed hdr decrypt with rn=11'
	b'lightning_openingd(8738): STATUS_FAIL_PEER_IO: Reading accept_channel: Success'

test_routing_gossip/lightning-5/log:

	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightning_gossipd(8461): UPDATE WIRE_GOSSIP_PEER_NONGOSSIP'
	b'lightningd(8308): Failed to get netaddr for outgoing: Transport endpoint is not connected'

The problem occurs here on release, but could be on any place where we hand
a peer over when using ccan/io.  Note the other case (channel.c).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
4e14185961 cryptomsg: add helpers to determine if we're partway through msg read/write.
For message read, we do it as header then body, so we can have
io_plan_in_started(conn) false, but we're between header and body.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
a8f033f6ae ccan: update to get new ccan/io changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
a2dc71b0a1 lightningd: close a take() leak.
test_routing_gossip (__main__.LightningDTests) ... lightningd: Outstanding taken pointers: lightningd/peer_control.c:2352:towire_errorfmt(ld, ((void *)0), "Can't resolve your address")

This caused by the other end closing due to the next bug.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
9e869e641a take: turn labels on.
Gives us meaningful errors when there's a take() leak.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
7d62de8632 lightningd: fix typo in fatal error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 18:34:35 +02:00
Rusty Russell
b6a2b8c58b Add --rgb and --alias options.
And derive random ones from nodeid if they don't choose.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-25 09:16:14 +00:00
Rusty Russell
ebdecebb1a channeld: send channel_announce and initial update to master, not gossipd.
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails.  The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
66a0c55322 test_lightning.py: fix float insanity with values.
When is 0.01 != 0.01?  When there are floats involved!  Jenkins hit an
error once, I have no idea why.

This works around the following intermittant error:

ERROR: test_closing_negotiation_reconnect (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_lightningd.py", line 1601, in test_closing_negotiation_reconnect
self.fund_channel(l1, l2, 10**6)
File "tests/test_lightningd.py", line 241, in fund_channel
raise ValueError("Can't find {} payment in {}".format(amount, tx))
ValueError: Can't find 1000000 payment in 02000000000101b0b27be92916faee59f197c263e1ae7b44c0e59acdf2c385c55b5b04670ac026010000001716001401fad90abcd66697e2592164722de4a95ebee165ffffffff02651d0f0000000000160014c2ccab171c2a5be9dab52ec41b825863024c546640420f00000000002200205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd02473044022017bd19a0ee85532f67a280c71ed02d1321f08975334bd281527478022265225702202ec448bf9c0890a31a26f0ef4f03d298c8ec16b277faff09a70ddd335df44b6e012103d745445c9362665f22e0d96e9e766f273f3260dea39c8a76bfa05dd2684ddccf00000000

For testing, that tx decodes to:
{
  "txid": "0165e92be762b352b665b76b9872d5189e1b2a8faf4918ab3cca7cd5d4b6a5fa",
  "hash": "8af9a36c79ee5243468c5cbed1c80f10238fba405f0ad957a0c2cfc46fb632f5",
  "version": 2,
  "size": 257,
  "vsize": 176,
  "locktime": 0,
  "vin": [
    {
      "txid": "26c00a67045b5bc585c3f2cd9ae5c0447baee163c297f159eefa1629e97bb2b0",
      "vout": 1,
      "scriptSig": {
        "asm": "001401fad90abcd66697e2592164722de4a95ebee165",
        "hex": "16001401fad90abcd66697e2592164722de4a95ebee165"
      },
      "txinwitness": [
        "3044022017bd19a0ee85532f67a280c71ed02d1321f08975334bd281527478022265225702202ec448bf9c0890a31a26f0ef4f03d298c8ec16b277faff09a70ddd335df44b6e01",
        "03d745445c9362665f22e0d96e9e766f273f3260dea39c8a76bfa05dd2684ddccf"
      ],
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.00990565,
      "n": 0,
      "scriptPubKey": {
        "asm": "0 c2ccab171c2a5be9dab52ec41b825863024c5466",
        "hex": "0014c2ccab171c2a5be9dab52ec41b825863024c5466",
        "type": "witness_v0_keyhash"
      }
    },
    {
      "value": 0.01000000,
      "n": 1,
      "scriptPubKey": {
        "asm": "0 5b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd",
        "hex": "00205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd",
        "type": "witness_v0_scripthash"
      }
    }
  ]
}

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
7f38943956 options: show the default network setting in --help.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
7e022b522c gossipd: don't try to handle padding inside fromwire_ipaddr.
It makes it impossible to embed an ipaddr in another structure, since we
always try to skip over any zeroes, which may swallow a following field.

Do the skip specially for the case where we're parsing routing messages:
we never use padding for our own internal messages anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
c2dd0cb295 test_lightningd.py: return short channel id from fund_channel helper.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
79962b3588 lightningd: return transaction from fundchannel RPC.
Lets tests figure out the short channel name, for example.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
48cedef756 peer_control: remove unique_id field.
It's now completely useless.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
ffaa15c7da hsm: remove unique_id.
It was only for error messages, so replace it with pubkey.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
c3bed51b2d test_lightningd.py: make HSM seeds constant for tests.
Makes it easier to compare before/after failures.  Ideally, we should
run under Travis both with this option and with the seed based on the
entire tmp path (which is still reproducible with determination, but
not fixed every run like this is).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-24 16:12:22 +02:00
Rusty Russell
4c9f7542b2 subd: Clarify description of subd_release_peer.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-22 16:24:10 +02:00
Rusty Russell
74e684cc0d is_all_channels: rename to channel_id_is_all
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-22 16:24:10 +02:00
Rusty Russell
1954844fbf lightningd: make peer_fail_permanent() only save the first error for peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
0b953b86fe subd: automatically detect if callback frees subd.
This involves a tricky callback internally, but far less error-prone.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
5a256c724a subd: simplify and cleanup lifetime handling.
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones.  We can handle many callbacks internally now.

We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
a117d595a4 subd: allow callbacks to free sd.
We'll need this for the next patch; we'll be freeing the old subd whenever
peer->owner changes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
2374b54ef2 ccan: update to get io fix for duplex pipes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
cb82bf7aa2 onchaind: send message when peer's transactions are irrevocably committed.
We currently rely on a zero exit status.  That's the only difference between
onchain finished handling and other per-peer daemons, so instead we should
have an explicit "done" message.  This is both clearer, and allows us to
unify.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
f83ee6d5ea dev_disconnect: don't permfail more than once.
The coming tests trigger this latent bug under travis.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
ebba5f85a2 handshaked: remove.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell
474887512d gossipd: rewrite to do the handshake internally.
Now the flow is much simpler from a lightningd POV:

1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
   and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
   features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
   it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.

Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
  connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
  a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
  messages, which changes the test_disconnect case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00