The JSON-RPC spec specifies that if the request is unparseable we
should return an error with a NULL id. This is a bit more friendly
than slamming the door in the face.
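
As a rough illustration (not the code in this patch), such a reply can
be built as below. The helper name `format_parse_error` is hypothetical;
the error code -32700 is the "Parse error" code from the JSON-RPC 2.0
specification.

```c
#include <stdio.h>

/* JSON-RPC 2.0 reserves -32700 for "Parse error". */
#define JSONRPC_PARSE_ERROR (-32700)

/* Hypothetical helper: write a parse-error response into buf.
 * The request could not be parsed, so its id is unknown and the
 * response carries a null id.
 * Returns what snprintf returns: the length the full reply needs. */
int format_parse_error(char *buf, size_t len)
{
	return snprintf(buf, len,
			"{\"jsonrpc\":\"2.0\","
			"\"error\":{\"code\":%d,\"message\":\"Parse error\"},"
			"\"id\":null}",
			JSONRPC_PARSE_ERROR);
}
```

A caller would check that the return value is smaller than `len` before
writing the buffer to the client.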
Signed-off-by: Christian Decker <decker.christian@gmail.com>
As reported by @practicalswift in #945 it is possible to inject
non-printable, or shell escape, characters in a json command, that
will fail to parse and then clear the shell.
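
One way to defuse such input before echoing it is sketched below. This
assumes an escape-on-output approach and a hypothetical helper name,
`escape_unprintable`; the actual fix may differ.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst, replacing every
 * non-printable byte with a \xNN escape so that raw terminal control
 * sequences never reach the log.  dst must hold at least
 * 4 * strlen(src) + 1 bytes. */
void escape_unprintable(char *dst, const char *src)
{
	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;
		if (isprint(c)) {
			*dst++ = (char)c;
		} else {
			/* Writes 4 chars plus a NUL we overwrite next. */
			sprintf(dst, "\\x%02x", (unsigned)c);
			dst += 4;
		}
	}
	*dst = '\0';
}
```

With this, an ESC byte (0x1b) is logged as the four characters `\x1b`
instead of being interpreted by the terminal.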
Reported-by: @practicalswift
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Now we have wirestring, this is much more natural. And with the
24M length limit, we needn't be so concerned about dumping 64k peer
messages in hex.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are now logically arrays of pointers. This is much more natural,
and gets rid of the horrible utxo array converters.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`activate_peer` does little more than wiring up some txwatches and
asking `gossipd` to reconnect to the peer. If the peer manages to
reconnect before we activate then we would crash.
This just changes the `assert` causing the crash into a conditional
on whether we need to reconnect.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Due to the broadcast failure quite a few users are reporting channels
stuck in awaiting lockin. This commit adds a `dev-forget-channel`
command that checks whether the funding outpoint is in the UTXO set, and
forgets the channel if not. The UTXO check can be overridden with the
`force` parameter, but that is dangerous.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We were sideloading it, which is awkward, now it's a field that we can
actually use in the code.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We currently don't handle LOG_IO properly, and we turn it into a string
before handing it to the ->print function, which makes it ugly for
the case where we're using copy_to_parent_log, and also means in
that case we lose *what peer* the IO is coming from.
Now, we handle the io as a separate arg, which is much neater.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Logging often gets called in error paths, so this is just good hygiene.
Also, log_io does this already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
libunwind does not accept a NULL parameter for the error callback. It
will simply call into the NULL pointer. So add an error callback.
This makes the crash output somewhat more sensible on FreeBSD, where
there is no libunwind stack trace available:
2018-02-05T20:24:50.598Z lightningd(75556): error getting backtrace: no stack trace because unwind library not available (0)
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Maintaining it was always fraught, since the command could go away
if the JSON RPC died. Most recently, it was broken again on shutdown
(see below).
In future we may allow pay commands to block on previous payments, so
it won't even be a 1:1 mapping. Generalize it: keep commands in a
simple list and do a lookup when a payment fails/succeeds.
Valgrind error file: valgrind-errors.5732
==5732== Invalid read of size 8
==5732== at 0x4149FD: remove_cmd_from_hout (pay.c:292)
==5732== by 0x468BAB: notify (tal.c:237)
==5732== by 0x469077: del_tree (tal.c:400)
==5732== by 0x4690C7: del_tree (tal.c:410)
==5732== by 0x46948A: tal_free (tal.c:509)
==5732== by 0x40F1EA: main (lightningd.c:362)
==5732== Address 0x69df148 is 1,512 bytes inside a block of size 1,544 free'd
==5732== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5732== by 0x469150: del_tree (tal.c:421)
==5732== by 0x46948A: tal_free (tal.c:509)
==5732== by 0x4198F2: free_htlcs (peer_control.c:1281)
==5732== by 0x40EBA9: shutdown_subdaemons (lightningd.c:209)
==5732== by 0x40F1DE: main (lightningd.c:360)
==5732== Block was alloc'd at
==5732== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5732== by 0x468C30: allocate (tal.c:250)
==5732== by 0x4691F7: tal_alloc_ (tal.c:448)
==5732== by 0x40A279: new_htlc_out (htlc_end.c:143)
==5732== by 0x41FD64: send_htlc_out (peer_htlcs.c:397)
==5732== by 0x41511C: send_payment (pay.c:388)
==5732== by 0x41589E: json_sendpay (pay.c:513)
==5732== by 0x40D9B1: parse_request (jsonrpc.c:600)
==5732== by 0x40DCAC: read_json (jsonrpc.c:667)
==5732== by 0x45C706: next_plan (io.c:59)
==5732== by 0x45D1DD: do_plan (io.c:387)
==5732== by 0x45D21B: io_ready (io.c:397)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a transitional patch so we can still close channels cleanly;
for want of a better option, I hooked it into --deprecated-apis.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We shouldn't fail negotiation just because they exceeded what we thought
fair: we're better off as long as it's actually <= final commitment fee.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We move it into jsonrpc where it belongs, and make it fail the command.
This means it can tell us exactly what was wrong.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With the new 'human-readable' mode of lightning-cli, this actually produces
a valid config file. It's a bit hacky though...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We may need to lookup UTXO entries for other reasons, so here we
disentangle it and make it into its own method.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Might help alleviate some of the issues of having to run a full-node
on the same machine as `lightningd`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Exception: Node /tmp/lightning-t5gxc6gs/test_closing_different_fees/lightning-2/ has memory leaks: [{'value': '0x55caa0a0b8d0', 'label': 'ccan/ccan/tal/str/str.c:90:char[]', 'backtrace': ['ccan/ccan/tal/tal.c:467 (tal_alloc_)', 'ccan/ccan/tal/tal.c:496 (tal_alloc_arr_)', 'ccan/ccan/tal/str/str.c:90 (tal_vfmt)', 'lightningd/log.c:131 (new_log)', 'lightningd/subd.c:632 (new_subd)', 'lightningd/subd.c:686 (new_peer_subd)', 'lightningd/peer_control.c:2487 (peer_accept_channel)', 'lightningd/peer_control.c:674 (peer_sent_nongossip)', 'lightningd/gossip_control.c:55 (peer_nongossip)', 'lightningd/gossip_control.c:142 (gossip_msg)', 'lightningd/subd.c:477 (sd_msg_read)', 'lightningd/subd.c:319 (read_fds)', 'ccan/ccan/io/io.c:59 (next_plan)', 'ccan/ccan/io/io.c:387 (do_plan)', 'ccan/ccan/io/io.c:397 (io_ready)', 'ccan/ccan/io/poll.c:305 (io_loop)', 'lightningd/lightningd.c:347 (main)', '(null):0 ((null))', '(null):0 ((null))', '(null):0 ((null))'], 'parents': ['lightningd/log.c:103:struct log_book', 'lightningd/lightningd.c:43:struct lightningd']}]
Technically, true, but we save more memory by sharing the prefix pointer
than we lose by leaking it.
However, we'd ideally refcount so it's freed if the log is freed and
all the entries using it are pruned from the log book.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This makes much more sense when you ask for a specific peer's log.
Also, we put the peerid rather than pid ().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We added code to allow a few spurious failures, but it didn't unmark
the request running.
IRC user 'mlz' (@molxyz) provided logs from his stuck-at-old-block lightningd:
lightningd(31981): Adding block 1261159: 00000000da3890ccd0f313a74fccfd4789654b496836da5c28a8d2ad28852264
lightningd(31981): Adding block 1261160: 00000000f70938a33aecbdd7b047cb5cf5b095ea4770c1335acf1859bad1e767
lightningd(31981): bitcoin-cli -testnet estimatesmartfee 2 CONSERVATIVE exited with status 1
Fixes: #749
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is an interaction between --ipaddr and --port, namely that the
default port is used when parsing --ipaddr if --port comes after the
--ipaddr, and --port is used if it comes before it. Adding a port to
--ipaddr still trumps everything else, but this way we correctly set
port in the address.
Reported-by: Wladimir J. van der Laan @laanwj
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The JSON connect command wouldn't terminate if peer reconnected
in a state CHANNELD_AWAITING_LOCKIN or above.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Such an htlc is invalid, and will be failed cleanly by our channeld
(which also checks that it meets the minimum amount), but it's
not the master's job to check it, and in fact, it asserts if we were
to try to pay or forward such a thing.
Fixes: #686
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The peer shouldn't try, and channeld won't try to add it if it does,
but we shouldn't trust it. And it would make our htlc_in_check() code
assert.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before this patch:
```
$ lightningd/lightningd
lightningd(PID): Creating lightningd dir /root/.lightning (because chdir gave No such file or directory)
lightningd(PID): Creating database
```
After this patch:
```
$ lightningd/lightningd
lightningd(PID): Creating lightningd dir /root/.lightning
lightningd(PID): Creating database
```
delinvoice was originally documented to only allow deletion of unpaid
invoices, but there might be reasons to delete paid ones or unexpired ones.
But we have to avoid the race where someone pays as it's deleted: the
easiest way is to have the caller tell us the status, and fail if
it's wrong.
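
The idea is a compare-then-delete, sketched here with hypothetical
types and names (`struct invoice`, `delete_invoice_if_status`); the
real invoice structures are not shown in this message.

```c
#include <stdbool.h>

/* Hypothetical invoice states for this sketch. */
enum invoice_status { UNPAID, PAID, EXPIRED };

struct invoice {
	enum invoice_status status;
	bool deleted;
};

/* Delete only if the invoice is still in the state the caller saw.
 * If a payment arrived in between, the status no longer matches and
 * we refuse, so a freshly paid invoice is never deleted by accident. */
bool delete_invoice_if_status(struct invoice *inv,
			      enum invoice_status expected)
{
	if (inv->status != expected)
		return false;
	inv->deleted = true;
	return true;
}
```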
Fixes: #477
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Error code is inverted (which makes sense: who returns 'true' on
error?), and anyway there's a leak if we do error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to have to support multiple channels per peer, even if only
when some are onchain. This would break the current listpeers, so
change it to an array (single element for now).
Other cleanups:
1. Only set connected true if daemon is not onchaind.
2. Only show netaddr if connected; don't make it an array, call it `address`
in comparison with `addresses` in listnodes.
3. Rename `channel` to `short_channel_id`
4. Add `funding_txid` field for voyeurism.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows us to add other fields, such as version information,
warnings or invoiceless payments, later.
(Note: the deprecated listinvoice is unchanged)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matches the other names, and also the return value is about to change.
This will be removed before release!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can be used for upgrades to make sure you're not using deprecated
options, JSON commands, JSON fields, etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For performance, we delay entering the 'wallet_payment' into the db
until we actually commit to the HTLC (when we have to touch the DB
anyway).
This opens a race where we can try to pay twice, and since it's not in
the database yet, we don't notice the duplicate.
So remove the temporary payment field from htlc_out, which was always
an uncomfortable hack, and make the wallet code abstract over the
deferred entry a little by maintaining a 'unstored_payments' list
and incorporating that in results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need these to decode any returned errors.
We remove it from struct pay_command too, and load directly from db
when we need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should be saving this, as it's our proof of payment. Also, we return
it if they try to pay again.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This, of course, should never be used. But it helps maintain connections
for the moment while we dig deeper into feerates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a common occurrence on pruned nodes. By calling the callback
upon failures, we communicate that we couldn't verify the txoutput. We
fail safe rejecting any channel we can't verify.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This means we print out the correct path with --debugger, which
can be vital if there are multiple binaries (eg. compiled vs installed).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_get_params does this for us.
Fixes: 78adf0b (pay: allow 'null' msatoshi field.)
Reported-by: ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pull up the save call from `peer_save_commitsig_received` into its
caller `peer_got_commitsig`, and add a call to
`peer_sending_commitsig`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Message buffer `why` is allocated in the `peer` context and also freed when peer is freed.
Only explicitly free the buffer when peer itself is not freed yet.
exit status is not enough to detect spent outputs. gettxout will return a
success exit code and 0 bytes.
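
The check therefore has to look at the output as well as the exit
status. A minimal sketch, with a hypothetical helper name and the
command's result reduced to its exit status and output length:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check on the result of running `gettxout`.
 * A spent output still yields exit status 0, but with an empty
 * reply, so the exit status alone cannot be trusted. */
bool txout_is_unspent(int exit_status, size_t output_len)
{
	if (exit_status != 0)
		return false;
	return output_len > 0;
}
```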
Signed-off-by: William Casarin <jb55@jb55.com>
We'll pass this down to gossip and make sure to re-announce/update
channels every so often. This is also used as a pruning timer, i.e.,
channels that have not been updated in 2 x channel-update-interval
will be pruned from the local view.
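
The pruning rule can be sketched as follows. The helper name and the
use of seconds are assumptions, as is treating exactly 2 x the interval
as not yet stale.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a channel is stale once its last update is
 * older than twice the configured update interval (in seconds). */
bool channel_is_stale(uint32_t now, uint32_t last_update,
		      uint32_t update_interval)
{
	/* An update stamped in the future is never stale. */
	if (last_update >= now)
		return false;
	return (uint64_t)(now - last_update) > 2 * (uint64_t)update_interval;
}
```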
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since most callers use positional arguments, we should allow a 'null'
literal where we require no value at all.
Also adds some more value tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paid invoices need to know how much was actually paid: both for the case
where no 'msatoshi' amount was specified, and for the normal case, where
clients are permitted to overpay in order to help them disguise their
payments.
While we migrate the db, we leave this field as 0 for old paid
invoices. This is unhelpful for accounting, but at least clearly
indicates what happened if we find this in the wild.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
'rhash' is the old terminology, but 'payment_preimage' and
'payment_hash' were decided on for the BOLTs, so we should fix that here.
We still use rhash internally, but that's much easier to fix.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Different commands (listinvoice, delinvoice, waitinvoice,
waitanyinvoice) returned different fields, as not all were updated.
This makes them uniform.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This reuses the same code internally, and also now means that we deal
correctly with "any" msatoshi invoices: the old code would return a
'msatoshi' of 0 in that case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The manfile and the online help use 'msatoshi', the returned
response uses 'msatoshi', nearly every invoice-related
monetary amount is labelled 'msatoshi' and not 'amount'.
It would be nice if bitcoind had an RPC to do this in one, but that's
a bit much to ask for. We could also hand around proofs, for lite nodes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. htlc->fail has been changed to a u8 *.
2. wallet_get_newindex saves to the db.
3. peer->next_htlc_id is saved to the db in peer_save_commitsig_sent() below.
4. We do store commit in peer_save_commitsig_received(peer, commitnum),
and the fixme below talks about HTLC sigs.
5. We do commit shachain and next_per_commit_point in wallet_shachain_add_hash
and update_per_commit_point respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All other users of json_get_params(...) check the return value:
```
lightningd/chaintopology.c: if (!json_get_params(buffer, params,
lightningd/chaintopology.c: if (!json_get_params(buffer, params,
lightningd/dev_ping.c: if (!json_get_params(buffer, params,
lightningd/gossip_control.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params, "label", &labeltok, NULL)) {
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/jsonrpc.c: if (!json_get_params(buffer, params,
lightningd/pay.c: if (!json_get_params(buffer, params,
lightningd/pay.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
wallet/walletrpc.c: if (!json_get_params(buffer, params,
wallet/walletrpc.c: if (!json_get_params(buffer, params, "tx", &txtok, NULL)) {
```
I've only seen this under travis, so I can't verify that this fixes it,
but it's certainly a bug which could cause that issue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is necessary to grab the their_unilateral/to-us outputs since
they aren't being harvested by `onchaind`
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the scriptpubkey that onchaind spends all funds to, except for
the their_unilateral/to-us case, so we better recognize that address.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the only case in which we don't respend to a simple keyindex'd
pubkey, so we need to handle this for future spends.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We always arm the funding_lockin_cb, even if we don't need to. If we
have a short_channel_id already from the db, this was replacing it
and leaking the old one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we panic when we see our root reorg out, even if we're not doing
anything yet, restoring the 100 block margin is the simplest fix.
Unfortunately this means adding a 100-block spacer in the tests, so things
don't get confused.
Fixes: #511
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is surprisingly simple. We set up the watches for funding tx
depth and the funding output, then if it's not onchain we ask gossipd
to reconnect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Load the first block we're possibly interested in, then load the peers so
we can restore the tx watches, then finally replay to the current tip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Eventually we want to save blockchain in db to avoid this scan, but
for the moment, we need to reload as far back as we may be interested in.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This gives us a lower bound on where funding tx could be.
In theory, it could be lower than this if we get a reorganization, but
in practice this is already a 1-block buffer (since we can't get into
current block, only the next one).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the normal (peer-to-peer) path, the HTLC state prevents us fulfilling
twice, but this goes out the window with onchain HTLCs.
The actual assert which caught it was lightningd/pay.c:70 (payment_succeeded)
in the test_htlc_in_timeout test, after the next commit.
So add an assert earlier (in fulfill_our_htlc_out) and check in the
one caller where it can be true.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to load the new tip and work backwards until we joined up with
the previous tip. That consumed quite a lot of memory if there were
many blocks.
Instead, just poll on blocknum+1, and grab it once that succeeds. If
prev is different from what we expect (reorg), we free the current tip
and try again.
We could theoretically miss a reorg which is the same length (2 block
reorg with more work due to difficulty adjustment), but even if that
happened we'd catch up on the next block.
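
The reorg test at the heart of this is just a prev-hash comparison. A
sketch with a hypothetical, trimmed-down `struct block` (the real
structure carries much more):

```c
#include <stdbool.h>
#include <string.h>

#define HASH_LEN 32

/* Hypothetical, trimmed-down block for this sketch. */
struct block {
	unsigned char hash[HASH_LEN];
	unsigned char prev_hash[HASH_LEN];
};

/* The block fetched at height tip+1 extends our chain only if its
 * prev hash is our tip's hash.  Otherwise our tip was reorged out:
 * the caller frees the current tip and tries again. */
bool block_extends_tip(const struct block *tip, const struct block *next)
{
	return memcmp(next->prev_hash, tip->hash, HASH_LEN) == 0;
}
```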
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It definitely changes when we get a block, but it also changes between
blocks as mempool fills. So put it on its own timer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_blkid_to_hex() so it's reversed as people expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_txid_to_hex() so it's reversed as people expect.
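
For illustration only, "reversed" here means the bytes are printed last
to first. The helper below is a hypothetical stand-in, not the body of
bitcoin_txid_to_hex():

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: hex-encode a hash with its bytes reversed,
 * the order in which Bitcoin tools conventionally display txids and
 * block ids.  out must hold at least 2 * len + 1 bytes. */
void hash_to_reversed_hex(const unsigned char *hash, size_t len, char *out)
{
	for (size_t i = 0; i < len; i++)
		sprintf(out + 2 * i, "%02x", (unsigned)hash[len - 1 - i]);
	out[2 * len] = '\0';
}
```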
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I prefer the typesafety of specific functions, rather than having the
caller know that txids are traditionally reversed in bitcoin.
And we already have a bitcoin_txid_to_hex() function for this.
Closes: #411
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Add port parsing support to parse_wireaddr. This is in preparation for storing
addresses in the peers table. This also makes parse_wireaddr a proper inverse of
fmt_wireaddr.
* Move parse_wireaddr to common/wireaddr.c; this seems like a better place for
it. I bring along parse_ip_port with it for convenience. This also fixes some
issues with the upcoming ip/port parsing tests.
Signed-off-by: William Casarin <jb55@jb55.com>
We set hout->key.id when channeld tells us what it is, but if channeld
dies before that we free the hout, and our destructor logs it:
Valgrind error file: valgrind-errors.20312
==20312== Use of uninitialised value of size 8
==20312== at 0x53ABC9B: _itoa_word (_itoa.c:179)
==20312== by 0x53B041F: vfprintf (vfprintf.c:1642)
==20312== by 0x53B17D5: buffered_vfprintf (vfprintf.c:2330)
==20312== by 0x53AEAA5: vfprintf (vfprintf.c:1301)
==20312== by 0x53B7D63: fprintf (fprintf.c:32)
==20312== by 0x128BAC: hout_subd_died (peer_htlcs.c:316)
==20312== by 0x16D8E0: notify (tal.c:240)
==20312== by 0x16DD95: del_tree (tal.c:400)
==20312== by 0x16DDE7: del_tree (tal.c:410)
==20312== by 0x16DDE7: del_tree (tal.c:410)
==20312== by 0x16E1B4: tal_free (tal.c:509)
==20312== by 0x162B5C: io_close (io.c:443)
==20312== by 0x12D563: sd_msg_read (subd.c:508)
==20312== by 0x161EA5: next_plan (io.c:59)
==20312== by 0x1629A2: do_plan (io.c:387)
==20312== by 0x1629E0: io_ready (io.c:397)
==20312== by 0x164319: io_loop (poll.c:305)
==20312== by 0x118E21: main (lightningd.c:334)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Accuracy improvements:
1. We assumed the output was a p2wpkh, but it can be user-supplied now.
2. We assumed we always had change; remove this for wallet_select_all.
Calculation out-by-one fixes:
1. We need to add 1 byte (4 sipa) for the input count.
2. We need to add 1 byte (4 sipa) for the output count.
3. We need to add 1 byte (4 sipa) for the output script length for each output.
4. We need to add 1 byte (4 sipa) for the input script length for each input.
5. We need to add 1 byte (4 sipa) for the PUSH opcode for each P2SH input.
The results are now a slight overestimate (due to guessing 73 bytes
for signature, whereas they're 71 or 72 in practice).
Fixes: #458
Reported-by: Jonas Nick @jonasnick
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Two changes:
- Fixed the function signature of noleak_ to match in both
configurations
- Added memleak.o to linker for tests
Generating the stubs for the unit tests doesn't really work since the
stubs are checked in and differ between the two configurations, so
adding memleak to the linker fixes that, by not requiring stubs to be
generated in the first place.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can call this multiple times. The best solution is to add and remove
the signature so it's always unsigned as we expect it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The pay command in particular, attaches a reasonable number of
temporaries to cmd, knowing they'll be freed once cmd is done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is called when we load from database: clearly our tests aren't thorough
enough because we were allocating and initializing `r` in an unused structure.
invs is also the owner already; functions which steal are a bit surprising
to callers, so we either document them, or just don't do it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have things which we don't keep a pointer to, but aren't leaks.
Some are simply eternal (eg. listening sockets), other cases are
io_conn tied to the lifetime of an fd, and timers which expire.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
memleak doesn't detect pointers to within an object, only pointers to their
exact address (it's simpler this way). Moving the linked list to the
top of the structure means it can follow the chain.
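
Why the position matters, sketched with hypothetical structures: C
guarantees that a pointer to a structure's first member equals a
pointer to the structure itself, so an exact-address scan only matches
when the list node comes first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node, standing in for an intrusive list. */
struct list_node {
	struct list_node *next, *prev;
};

/* List node first: a pointer to `list` is also the object's address. */
struct obj_list_first {
	struct list_node list;
	int value;
};

/* List node later: a pointer to `list` points inside the object. */
struct obj_list_later {
	int value;
	struct list_node list;
};

/* What an exact-address scanner can do: match only the start. */
bool points_at_start(const void *obj, const void *found)
{
	return obj == found;
}
```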
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
memleak doesn't detect pointers to within an object, only pointers to their
exact address (it's simpler this way). Moving the linked list to the
top of the structure means it can follow the chain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is not a child of cmd, since they have independent lifetimes, but
we don't want to noleak them all, since it's only the one currently in
progress (and its children) that we want to exclude.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the tal notifiers to attach a `backtrace` object on every
allocation.
This also means moving backtrace_state from log.c into lightningd.c, so
we can hand it to memleak_init().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a primitive mark-and-sweep-style garbage detector. The core is
in common/ for later use by subdaemons, but for now it's just lightningd.
We initialize it before most other allocations.
We walk the tal tree to get all the pointers, then search the `ld`
object for those pointers, recursing down. Some specific helpers are
required for hashtables (which stash bits in the unused pointer bits,
so won't be found).
There's `notleak()` for annotating things that aren't leaks: things
like globals and timers, and other semi-transients.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
jsonrpc handlers usually directly call command_success or
command_fail; not doing that implies they're waiting for something
async.
Put an explicit call (currently a noop) there, and add debugging
checks to make sure it's used.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Couldn't find a good place to put these messages, we probably want to
do the same capability based request routing that we did for the HSM,
but for now this just defines the message in the master messages file.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If send_htlc_out() fails, it doesn't initialize pc->out; that can
make us think it's still in progress.
Reported-by: Jonas Nick
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When gossipd sends a message, have a gossip_index. When it gets back a
peer, the current gossip_index is included, so it can know exactly where
it's up to.
Most of this is mechanical plumbing through openingd, channeld and closingd,
even though openingd and closingd don't (currently) read gossip, so their
gossip_index will be unchanged.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All peers come from gossipd, and maintain an fd to talk to it. Sometimes
we hand the peer back, but to avoid a race, we always recreated it.
The race was that a daemon closed the gossip_fd, which made gossipd
forget the peer, then master handed the peer back to gossipd. We stop
the race by never closing the gossipfd, but hand it back to gossipd
for closing.
Now gossipd has to accept two fds, but the handling of peers is far
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As demonstrated in the test at the end of this series, openingd dying
spontaneously causes the conn to be freed which causes the subd to be
destroyed, which fails the peer, which hits the db.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than using the destructor, hook up the cmd so we can close it.
peers are allocated off ld, so they are only destroyed explicitly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are still generating only char* style aliases, but the field is
defined to be unicode, which doesn't mix too well with char.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't use it yet, but now we'll decode correctly.
See: https://github.com/lightningnetwork/lightning-rfc/pull/317
lightning-rfc commit: ef053c09431442697ab46e83f9d3f86e3510a18e
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Change all calls to use the correct serialization and deserialization
functions, include the correct headers and remove the control
messages.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The master now hands channeld either an error code, and channeld
generates the error message, or an error message relayed from another
node to pass through.
This doesn't fill in the channel_update yet: we need to wire up gossipd
to give us that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently lightningd does this, but channeld is perfectly capable of doing it.
channeld is also in a far better position to add channel_updates to it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
estimatesmartfee 4 ECONOMICAL was too high for lnd, so drop it, with some
increased security risk.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The filter is being populated while initializing the daemon and by
adding new keys as they are being generated. The filter is then used
in connect_block to identify transactions of interest.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is mainly used to filter for transactions that may be of interest
to us, i.e., whether one of our keys is the recipient. It currently
does only simple scriptpubkey checks, but will eventually be extended
to use Bloom filters and add more sophisticated checks.
For now the goal is to speed up the processing of blocks during startup.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This addresses a performance regression introduced by
6ceb375650. We were storing it in an
otherwise empty DB transaction, which means that DB transaction was no
longer a no-op. Now we defer storing until we need to store the
corresponding HTLC anyway, so we can just piggyback on top of that
transaction.
This is also more consistent since we'd be forgetting the payment
anyway if we restart between adding the HTLC and committing to it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We only send them when we're not awaiting revoke_and_ack: our
simplified handling can't deal with multiple in flight.
Closes: #244
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The wire protocol uses this, in the assumption that we'll never see feerates
in excess of 4294967 satoshi per kiloweight.
So let's use that consistently internally as well.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Depending on what we're doing, we can want different ones. So use
IMMEDIATE (estimatesmartfee 2 CONSERVATIVE), NORMAL (estimatesmartfee
4 ECONOMICAL) and SLOW (estimatesmartfee 100 ECONOMICAL).
If one isn't available, we try making each one half the previous.
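
One reading of the fallback, as a sketch: the enum and function names
are hypothetical, and 0 is assumed to mean "no estimate available".

```c
#include <stdint.h>

enum feerate_level { FEERATE_IMMEDIATE, FEERATE_NORMAL, FEERATE_SLOW,
		     NUM_FEERATES };

/* rates[i] == 0 means there was no estimate for that level.
 * Fill each gap with half of the level before it.  If IMMEDIATE
 * itself is missing there is nothing to derive from, so it stays 0
 * (and so does anything derived from it). */
void fill_missing_feerates(uint32_t rates[NUM_FEERATES])
{
	for (int i = 1; i < NUM_FEERATES; i++) {
		if (rates[i] == 0)
			rates[i] = rates[i - 1] / 2;
	}
}
```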
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we convert it when retrieving from bitcoind; internally it's
always satoshi-per-1000-weight aka millisatoshi-per-weight.
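
The conversion, as a sketch: this assumes the estimate arrives as BTC
per 1000 virtual bytes, and the function name is hypothetical.

```c
#include <stdint.h>

/* Convert a fee estimate given in BTC per 1000 virtual bytes to
 * satoshi per 1000 weight units.  One virtual byte is four weight
 * units, and one BTC is 100,000,000 satoshi.  Rounds to nearest. */
uint32_t feerate_btc_per_kvb_to_sat_per_kw(double btc_per_kvb)
{
	double sat_per_kvb = btc_per_kvb * 100000000.0;
	return (uint32_t)(sat_per_kvb / 4.0 + 0.5);
}
```

For example, 0.0001 BTC per 1000 virtual bytes is 10000 satoshi per
1000 virtual bytes, which is 2500 satoshi per 1000 weight.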
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Test objects must be added to $(ALL_OBJS) so they correctly depend on
CCAN headers etc.
Also, each test in a subdir must depend on headers and src in the parent
directory, as it will often #include them directly.
Reported-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Our testsuite uses --dev-fail-on-subdaemon-fail, so I didn't notice this
until I turned that off to chase a bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All the callers need to pass it in: currently channeld and openingd just
fake it by copying the payment point.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There were two bugs: we weren't returning the next from the given
label but the one matching the label, and we were appending new
invoices to the head instead of the tail, which meant we'd be
traversing in the wrong order.
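
The intended behaviour can be sketched as follows. This is a simplified model
using a Python list rather than the real invoice list; the class and method
names are hypothetical.

```python
from typing import List, Optional


class InvoiceList:
    def __init__(self) -> None:
        self._labels: List[str] = []

    def add(self, label: str) -> None:
        # Append at the tail so traversal follows creation order.
        self._labels.append(label)

    def next_after(self, label: Optional[str]) -> Optional[str]:
        """Return the label following `label`, or the first one if None."""
        if label is None:
            return self._labels[0] if self._labels else None
        # Step past the matching entry instead of returning it again.
        idx = self._labels.index(label) + 1  # ValueError if label is unknown
        return self._labels[idx] if idx < len(self._labels) else None
```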
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Though we don't handle it at the moment, nodes can certainly have multiple
addresses, and we should display them all.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't track them accurately when in onchaind, but we don't want to:
onchaind can be restarted at any time.
Once it's all settled, we're clear to clean them up.
Before this, valgrind could complain about dereferencing hout->key.peer:
Valgrind error file: valgrind-errors.10876
==10876== Invalid read of size 4
==10876== at 0x41F8AF: peer_on_chain (peer_control.h:127)
==10876== by 0x42340D: notify_new_block (peer_htlcs.c:1461)
==10876== by 0x40A08D: connect_block (chaintopology.c:96)
==10876== by 0x40A96B: topology_changed (chaintopology.c:313)
==10876== by 0x40AC85: add_block (chaintopology.c:384)
==10876== by 0x40ABF0: gather_previous_blocks (chaintopology.c:363)
==10876== by 0x4051B3: process_rawblock (bitcoind.c:410)
==10876== by 0x4044DD: bcli_finished (bitcoind.c:155)
==10876== by 0x454665: destroy_conn (poll.c:183)
==10876== by 0x454685: destroy_conn_close_fd (poll.c:189)
==10876== by 0x45DF89: notify (tal.c:240)
==10876== by 0x45E43A: del_tree (tal.c:400)
==10876== Address 0x6929208 is 2,120 bytes inside a block of size 2,416 free'd
==10876== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10876== by 0x45E513: del_tree (tal.c:421)
==10876== by 0x45E849: tal_free (tal.c:509)
==10876== by 0x41A8E9: handle_irrevocably_resolved (peer_control.c:1172)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are announcing that we are willing to accept incoming payments with
current_height + min_final_cltv_expiry + slack, assuming that the
sender adds some slack. In particular we'd reject the payment if
slack=0 which is allowed by the spec.
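
One reading of the fixed check, as a sketch: the final hop accepts an HTLC
whose expiry is at least current_height + min_final_cltv_expiry, without
demanding extra slack from the sender. The function and parameter names are
illustrative.

```python
def final_hop_cltv_ok(htlc_cltv_expiry: int,
                      current_height: int,
                      min_final_cltv_expiry: int) -> bool:
    """Accept if the sender left at least min_final_cltv_expiry blocks.

    A sender using slack=0 sets the expiry to exactly
    current_height + min_final_cltv_expiry, which must still pass.
    """
    return htlc_cltv_expiry >= current_height + min_final_cltv_expiry
```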
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We save location where transaction was started, in case we try to nest.
There's now no error case; db_exec_mayfail() is the only one.
This means the tests need to override fatal() if they want to intercept
these errors.
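
A small model of the nesting check, assuming the saved location is a plain
"file:line" string. RuntimeError stands in for fatal() here, and the class is
hypothetical rather than the real db API.

```python
class Db:
    def __init__(self) -> None:
        # Location where the open transaction began, or None if there is none.
        self.in_transaction = None

    def begin_transaction(self, location: str) -> None:
        if self.in_transaction is not None:
            # Report where the outer transaction started to ease debugging.
            raise RuntimeError(
                f"Already in transaction from {self.in_transaction}")
        self.in_transaction = location

    def commit_transaction(self) -> None:
        assert self.in_transaction is not None, "no transaction to commit"
        self.in_transaction = None
```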
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a subset of a "bitcoind: wrap callbacks in transaction." from
the everything-in-transaction branch, but we need the ld pointer now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And nail "make check-source" to that specific version (which is a commit id,
not a branch name, so needs a different syntax for git).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These need to be different for testing the example in BOLT 11.
We also use the cltv_final instead of deadline_blocks in the final hop:
various tests assumed 5 was OK, so we tweak utils.py.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It crashes under valgrind, causing a valgrind error: valgrind gives us a
backtrace anyway, so we don't need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Normally, we get an error as soon as we send WIRE_REVOKE_AND_ACK. But if the
commit timer goes off, we get some extra cycles, during which the other side
can reconnect. In this case, we simply kill the channeld before it fails,
and never check for the permfail string.
b'lightning_channeld(18613): TRACE: dev_disconnect: -WIRE_REVOKE_AND_ACK'
b'lightning_channeld(18613): TRACE: Trying commit'
b'lightning_channeld(18613): TRACE: htlc 0: SENT_ADD_REVOCATION->SENT_ADD_ACK_COMMIT'
b'lightning_channeld(18613): TRACE: htlc added REMOTE: local +0 remote -200000000'
b'lightning_channeld(18613): TRACE: sending_commit: HTLC REMOTE 0 = SENT_ADD_ACK_COMMIT/RCVD_ADD_ACK_COMMIT'
b'lightning_gossipd(18590): TRACE: Responder: Act 1'
b'lightning_channeld(18613): TRACE: Derived key 034aab0b5cb755de836cffb34c053ba115fba6fe75414e8f56261e23c80eabb1fe from basepoint 03e0a7bb422b254f54bc954be05bd6823a7b7a4b996ff8d3079ca211590fb5df39, point 02f3bf525b6ca595bf85d63e89c95fc59c0fde3ae434b55c8093bbb5c64849da37'
b'lightningd(18465): Connected json input'
b'lightningd(18465):jcon fd 16: Success'
b'lightningd(18465):jcon fd 16: Closing (Bad file descriptor)'
b'lightning_gossipd(18590): TRACE: Responder: Act 2'
b'lightning_gossipd(18590): TRACE: Responder: Act 3'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightningd(18465): peer 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518: Peer has reconnected, state CHANNELD_NORMAL'
b'lightning_channeld(18613): Status closed, but not exited. Killing'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And we report these through the getpeers JSON RPC again (carefully: in
our reconnect tests we can get duplicates which this patch now filters
out).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In future it will have TOR support, so the name will be awkward.
We collect the to/fromwire functions in common/wireaddr.c, and the
parsing functions in lightningd/netaddress.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They don't currently, since callers check, but be safe. In addition,
handle NULL returns from these in the bitcoind code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are others, but they really are caused by bad failure. We need a
parachute system for these.
Closes: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out. This means we put #ifdefs in callers
in some places, but at least it's explicit.
We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.
See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
test_routing_gossip (__main__.LightningDTests) ... lightningd: Outstanding taken pointers: lightningd/peer_control.c:2352:towire_errorfmt(ld, ((void *)0), "Can't resolve your address")
This is caused by the other end closing due to the next bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails. The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It makes it impossible to embed an ipaddr in another structure, since we
always try to skip over any zeroes, which may swallow a following field.
Do the skip specially for the case where we're parsing routing messages:
we never use padding for our own internal messages anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Makes it easier to compare before/after failures. Ideally, we should
run under Travis both with this option and with the seed based on the
entire tmp path (which is still reproducible with determination, but
not fixed every run like this is).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones. We can handle many callbacks internally now.
We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently rely on a zero exit status. That's the only difference between
onchain finished handling and other per-peer daemons, so instead we should
have an explicit "done" message. This is both clearer, and allows us to
unify.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the flow is much simpler from a lightningd POV:
1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.
Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
messages, which changes the test_disconnect case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to make the ip/port optional, so they should go at the end.
In addition, using ip:port is nicer, for gethostbyaddr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have to do a dance when we get a reconnect in openingd, because we
don't normally expect to free both owner and peer. It's a layering
violation: freeing a peer should clean up the owner's pointer to it,
to avoid a double free, and we can eliminate this dance.
The free order is now different, and the test_reconnect_openingd was
overprecise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes the only case where the master currently has to write directly
to the peer: re-sending an error. We make gossipd do it, by adding
a new gossipctl_fail_peer message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, the main daemon needs to pass it about (marshal/unmarshal)
but it won't need to actually use it after the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We pull them from the database on-demand, where we're storing them
anyway. No need to keep them in memory as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
No idea why we were iterating over the list of stubs and then passing
in the index instead of a pointer to the stub directly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This wires in the loading of `struct htlc_stub`s on-demand when
starting `onchaind` so that we don't need to keep them in memory.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far we were tracking the status by including it either in the paid
or the unpaid list. This refactor makes the state explicit, which
matches the planned DB schema much better.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
While loading HTLCs from the database we might not yet have all the
incoming HTLCs loaded when loading a dependent htlc_out. So we defer
the wiring of the HTLCs until we are sure we have them loaded.
This is also the first step towards keeping that association only in
the database, since otherwise we cannot selectively load channels from
DB.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Especially when testing we might want to disable the automatic
reconnection logic in order not to mask bugs that disappear when
reconnecting.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Seems to go out to lunch on reorgs:
+136792.168286138 lightningd(9465):BROKEN: bitcoin-cli getchaintips exited 28: 'error code: -28
error message:
Rewinding blocks...
Closes: #286
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't hit this in testing, since we wait for startup already. Hacking
tests to avoid that, I tested this code by hand.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Using pc after free in the pay_command_destroyed destructor, so
we just steal cmd onto pc so free order is the one we want.
[ Edit: expanded comment, split commit ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far only happens during normal shutdown, but it may happen in other
cases as well. We simply define a new destructor that unregisters the
`cmd` from the `jcon`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
These were fun to hunt down. The jcon and the conn are allocated off
of ld, so the free order is unspecified; if they are freed in the wrong
order, the finish_jcon destructor triggers a use-after-free.
[ Edit: split commit, modified to use a destructor directly on jcon,
which is more robust than relying on it only being freed via conn --RR ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
peer_fail_permanent() frees peer->owner, but for bad_peer() we're
being called by the sd->badpeercb(), which then goes on to
io_close(conn) which is a child of sd.
We need to detach the two for this case, so neither tries to free the
other.
This leads to a corner case when the subd exits after the peer is gone:
subd->peer is NULL, so we have to handle that too.
Fixes: #282
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a race where we start onchaind, but state is unchanged, so checks
like peer_control.c's:
    peer_ready = (peer->owner && peer->state == CHANNELD_AWAITING_LOCKIN);
    if (!peer_ready) {
        log_unusual(peer->log,
                    "Funding tx confirmed, but peer state %s %s",
                    peer_state_name(peer->state),
                    peer->owner ? peer->owner->name : "unowned");
    } else {
        subd_send_msg(peer->owner,
                      take(towire_channel_funding_locked(peer,
                                                         peer->scid)));
    }
Can send to the wrong daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were sending a channeld message to onchaind, which was v. confusing
due to overlap. We make all the numbers distinct, which means we can
also add an assert() that it's valid for that daemon, which catches
such errors immediately.
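
The idea can be sketched like this, assuming each daemon owns a disjoint range
of message numbers. The ranges below are made up for illustration and are not
the numbers used on the wire.

```python
# Hypothetical, non-overlapping message number ranges per daemon.
DAEMON_MSG_RANGES = {
    "channeld": range(1000, 2000),
    "onchaind": range(5000, 6000),
}


def msg_valid_for(daemon: str, msg_type: int) -> bool:
    """True if msg_type belongs to the range owned by `daemon`."""
    return msg_type in DAEMON_MSG_RANGES[daemon]


def send_msg(daemon: str, msg_type: int) -> int:
    # Because ranges are distinct, a message meant for another daemon
    # is caught here instead of being misparsed by the receiver.
    assert msg_valid_for(daemon, msg_type), (
        f"message {msg_type} is not valid for {daemon}")
    return msg_type
```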
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We re-use the value for reasonable_depth given by the master, and we
tell it when our timeout transactions reach that depth.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we see an offered HTLC onchain, we need to use the preimage if we
know it. So we dump all the known HTLC preimages at startup, and send
new ones as we discover them.
This doesn't cover preimages we know because we're the final
recipient; that can happen if an HTLC hasn't been irrevocably
committed yet. We'll do that in a followup patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the HSM is slow it might happen that the timestamp has changed the
second time we come around, so we generate the timestamp externally
and pass it in so we're sure it won't change between calls.
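
A sketch of the pattern, assuming the message is built once before signing and
once after. The helper names and the byte layout are hypothetical; the point is
only that the timestamp is sampled once by the caller.

```python
import time
from typing import Optional, Tuple


def build_update(channel_id: int, timestamp: int) -> bytes:
    """Serialize a toy message; the layout is illustrative only."""
    return channel_id.to_bytes(8, "big") + timestamp.to_bytes(4, "big")


def build_twice(channel_id: int, now: Optional[int] = None) -> Tuple[bytes, bytes]:
    # Sample the clock exactly once, outside the builder.
    timestamp = int(time.time()) if now is None else now
    unsigned = build_update(channel_id, timestamp)
    # ... a slow signing round trip could happen here ...
    rebuilt = build_update(channel_id, timestamp)
    return unsigned, rebuilt
```

Both results are byte-identical regardless of how long the step in between
takes, so a signature over the first still matches the second.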
Reported-by: Rusty Russell
Signed-off-by: Christian Decker <decker.christian@gmail.com>
lightningd can crash on shutdown if it's in the middle of getchaintips;
we free the conn, the finished callback is called (process_chaintips),
and it reports that it received an empty result.
The simplest fix is to set a flag in the struct bitcoind destructor,
and avoid the callback.
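
As a rough model of the fix, with a hypothetical stub class standing in for
struct bitcoind and an explicit destroy() standing in for the destructor:

```python
class BitcoindStub:
    def __init__(self) -> None:
        self.shutdown = False

    def destroy(self) -> None:
        # Set the flag first; pending commands may still "finish" after this.
        self.shutdown = True

    def command_finished(self, output, callback) -> bool:
        """Run callback(output) unless we are shutting down."""
        if self.shutdown:
            return False  # skip: the output is likely empty or truncated
        callback(output)
        return True
```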
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either when it exits with a signal, or sends an error status message.
Then we make test_lightningd.py use it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.
It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Remove reference to old $(LIGHTNINGD_OLD_LIB_OBJS) var (in handshaked too).
2. Make check depend directly on unit tests, instead of weird lightningd/tests
variable.
3. check-source-bolt and check-whitespace are automatic for $(ALL_TEST_PROGRAMS)
so we don't need them here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the step where we broadcast the transaction to the network and
a nice place to extract the change from the transaction.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
It's no longer used and we definitely do not want to run with an
outdated or future db, so we'll terminate if we can't upgrade or
the version is newer than what we understand.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far we were always using the deadline in the announcements, that's
obviously not good, so this introduces the parameter as per spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't killing it. Eventually it would die, and peer_owner_finished()
would access subd->peer->owner, but that peer was freed already.
Closes: #261
Reported-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To reproduce the next bug, I had to ensure that one node keeps thinking it's
disconnected, then the other node reconnects, then the first node realizes
it's disconnected.
This code does that, adding a '0' dev-disconnect modifier. That means
we fork off a process which (due to pipebuf) will accept a little
data, but when the dev_disconnect file is truncated (a hacky, but
effective, signalling mechanism) will exit, as if the socket finally
realized it's not connected any more.
The python tests hang waiting for the daemon to terminate if you leave
the blackhole around; to give a clue as to what's happening in this
case I moved the log dump to before killing the daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this case, we unset the old subd->peer, then freed subd.
peer_owner_finished dereferenced subd->peer->owner, and boom:
test_disconnect_funder (__main__.LightningDTests) ... Fatal signal 11. Log dumped in crash.log
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.2882
==2882== Invalid read of size 8
==2882== at 0x413F74: peer_owner_finished (peer_control.c:679)
==2882== by 0x41EA2C: destroy_subd (subd.c:381)
==2882== by 0x459700: notify (tal.c:240)
==2882== by 0x459BB1: del_tree (tal.c:400)
==2882== by 0x459FC0: tal_free (tal.c:509)
==2882== by 0x413796: peer_reconnected (peer_control.c:493)
==2882== by 0x413A6A: add_peer (peer_control.c:592)
==2882== by 0x40ED1F: handshake_succeeded (new_connection.c:186)
==2882== by 0x41E3DD: sd_msg_reply (subd.c:262)
==2882== by 0x41E6BB: sd_msg_read (subd.c:318)
==2882== by 0x41E4E6: read_fds (subd.c:283)
==2882== by 0x44DEB4: next_plan (io.c:59)
==2882== Address 0x838 is not stack'd, malloc'd or (recently) free'd
==2882==
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The logic of dispatching the announcement_signatures message was
distributed over several places and daemons. This aims to simplify it
by moving it all into `channeld`, making peer_control only report
announcement depth to `channeld`, which then takes care of the
rest. We also do not reuse the funding_locked tx watcher since it is
easier to just fire off a new watcher with the specific purpose of
waiting for the announcement_depth.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
1. The code to skip over padding didn't take into account max.
2. It also didn't use symbolic names.
3. We are not supposed to fail on unknown addresses, just stop parsing.
4. We don't use the read_ip/write_ip code, so get rid of it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use a negative timestamp as the flag for this, making the test simple.
This allows valgrind to detect that we're accessing them prematurely,
including across the wire on gossip_getchannels_entry.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
jl777 reported a crash when we try to pay past reserve. Fix that (and
a whole class of related bugs) and add tests.
In test_lightning.py I had to make the non-async path for sendpay() non-threaded
to get the exception passed through for testing.
Closes: #236
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
You will want to 'make distclean' after this.
I also removed libsecp; we use the one in libwally anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some fields were redundant, some are simply moved into 'struct lightningd'.
All routines updated to hand 'struct lightningd *ld' now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also, we split the more sophisticated json_add helpers to avoid pulling in
everything into lightning-cli, and unify the routines to print struct
short_channel_id (it's ':', not '/' too).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To avoid pulling all the HTLC code into the opening daemon, we
split the channel and commit_tx routines into initial_channel and
initial_commit_tx (no HTLC support) and move the full HTLC-supporting versions
into channeld.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Other places require the flags and states, but the structure is
only needed in channeld, and even then we can remove several fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I was hoping to defer HTLC updates until we actually store HTLCs, but
we need to flush to DB whenever balances update as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We're very simple about it: if there's a reorganization, we restart. Otherwise
we tell it about everything.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's in the shachain, so storing it is completely redundant. We leave
it in for the moment so we can assert() that nothing has changed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When loading from DB the list of htlcs was not being initialized which
caused a segfault when the first commit came around, this fixes it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The peer->seed needs to be unique for each channel, since bitcoin
pubkeys and the shachain are generated from it. However we also need
to guarantee that the same seed is generated for a given channel every
time, e.g., upon a restart. The DB channel ID is guaranteed to be
unique, and will not change throughout the lifetime of a channel, so
we simply mix it in, instead of a separate increasing counter.
We also needed to make sure to store in the DB before deriving the
seed, in order to get an ID assigned by the DB.
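
To illustrate the property we want (stable across restarts, unique per
channel), here is a toy derivation. The real construction is not spelled out
in this message; plain SHA-256 over the concatenation and all names below are
assumptions for illustration, not something to use for real keys.

```python
import hashlib


def derive_channel_seed(node_seed: bytes, peer_id: bytes,
                        channel_dbid: int) -> bytes:
    """Mix the DB channel id into the seed (illustrative only)."""
    h = hashlib.sha256()
    h.update(node_seed)
    h.update(peer_id)
    # The DB id is unique and never changes for a channel, so the same
    # inputs give the same seed after a restart.
    h.update(channel_dbid.to_bytes(8, "big"))
    return h.digest()
```

This also shows why the channel has to be stored first: channel_dbid only
exists once the DB has assigned it.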
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the big one, and it's completely anticlimactic: it loads all
channels that have reached opening and are not marked as
closingd_complete into memory, that's it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was supposed to be a temporary solution anyway, and I had a
rather annoying mixup between peer_id and unique_id, the latter of
which is actually a connection identifier.
Add the channel to the peer on the two open paths (fundee and funder)
and store it into the database. Currently fails when opening a channel
to a known peer after loading from DB because we attempt to insert a
new peer with the same node_id. Will fix later.
As per lightning-rfc change 956e8809d9d1ee87e31b855923579b96943d5e63
"BOLT 7: add chain_hashes values to channel_update and channel_announcment"
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This brings us up to 955e874acc535ab2c74c1cf0eab61896ea4224ff in
https://github.com/lightningnetwork/lightning-rfc
This doesn't actually change anything; the only actual change is held back
for the next commit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is required for onchaind: we want to watch all descendants by default,
as to do otherwise would be racy, which means we need to traverse the outputs
when a tx appears.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And store in peer->last_tx/peer->last_sig like all other places,
that way we broadcast it if we need to.
Note: the removal of tmpctx in funder_channel() is needed because we
use txs[0], which was allocated off tmpctx.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to check if we exit after sending a revoke_and_ack, otherwise
channeld ends up getting the closing_signed packet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tal_strdup() doesn't set tal_count(), so we end up sending an ERROR
packet with an empty message. Wrap this and get it right.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There was a race condition that would cause an assertion to segfault
if a call to release_peer was interleaved with a fail_peer. The
release_peer was making the peer non-local, which was then causing the
assertion in fail_peer to fail. Now we just have 3 cases: not found,
local, and non-local.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
After quite some back and forth we seem to finally agree on bit
3 (mask 0x08) to signal optional initial_routing_sync.
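
For reference, the bit and mask relate as below. The features value is treated
as a plain integer for simplicity and the names are illustrative.

```python
INITIAL_ROUTING_SYNC_BIT = 3
INITIAL_ROUTING_SYNC_MASK = 1 << INITIAL_ROUTING_SYNC_BIT  # 0x08


def wants_initial_routing_sync(local_features: int) -> bool:
    """True if the peer set the optional initial_routing_sync bit."""
    return bool(local_features & INITIAL_ROUTING_SYNC_MASK)
```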
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was causing failures on testnet where confirmations are not
immediate.
Reported-by: Fabrice Drouin @sstone
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We support a number of features already, so failing connections
whenever we see an even bit set is not a good idea. This turned out to
kill our connections to eclair.
Also, the spec says that the LSB / bit 0 is to be counted as index 0, and
therefore even. So we need to check the lower of each 2-bit-tuple not
the higher one.
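
A sketch of the corrected check, treating the feature field as an integer with
bit 0 as the least significant bit. The function name and the `supported`
argument are illustrative.

```python
from typing import List


def unsupported_even_bits(features: int, supported: int) -> List[int]:
    """List even-numbered feature bits that are set but not supported.

    Bit 0 is the LSB and counts as even, so in each pair of bits the
    lower one is the even (compulsory) bit.
    """
    unknown = features & ~supported
    return [bit for bit in range(unknown.bit_length())
            if (unknown >> bit) & 1 and bit % 2 == 0]
```

Only a non-empty result is a reason to fail the connection; unknown odd bits
and even bits we do support are fine.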
We were using the bitcoin genesis blockhash for all networks, which is
not correct, and would result in the open being aborted when talking
to other implementations.
Reported-by: @sstone and @pm47
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't registering reconnecting peers for broadcasts. Just
starting a timer is enough. Also added an integration test to check
that the gossip sync is being resumed.
test_closing_negotiation_reconnect (__main__.LightningDTests) ... peer state CLOSINGD_COMPLETE should be CLOSINGD_SIGEXCHANGE
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a transitional state, while we're waiting to see the
closing tx onchain (which is To Be Implemented).
The simplest way to do re-transmission is to re-use closingd, and just
disallow any updates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I made the mistake of thinking it was a [NUM_SIDES] array, but
it's actually our balance, and it's in millisatoshi. Rename
for clarity.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As tracked down by Christian; by setting up the master conn first,
we make the master fd async. This means that the synchronous read
(in init_channel) can fail with -EAGAIN, and indeed, Christian
saw this when not running under valgrind.
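
The failure mode is easy to reproduce outside the daemon. This sketch assumes a
Unix-like system and uses a socketpair in place of the master fd:

```python
import errno
import socket


def sync_read_fails_with_eagain() -> bool:
    """Read from a non-blocking socket that has no data waiting."""
    ours, theirs = socket.socketpair()
    try:
        ours.setblocking(False)  # what setting up the conn first does
        try:
            ours.recv(1)  # a "synchronous" read, but nothing has arrived
        except BlockingIOError as e:
            return e.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return False
    finally:
        ours.close()
        theirs.close()
```

Whether the read fails depends on whether the data has already arrived, which
would explain why it only showed up in faster runs without valgrind.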
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually don't need to transition if we're reconnecting, and the logic
to go to CHANNELD_NORMAL was wrong: we checked that we'd seen funding tx
locked, but not that we'd received a msg from the remote peer.
We need to fix the tests now we no longer double-transition, too.
Fixes: #188
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Important: a non-standard one can make the closing tx not propagate.
Drive-by cut&paste message fix, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is what it actually is, and makes it clearer when we refer to the
spec. It's the commitment we're currently updating, which is the next
commitment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently have the problem that the master can send new HTLCs before
we've processed the incoming reestablish message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patch includes wire/peer_wire.h and causes a compile error
as lightningd/gossip_control.c defined its own gossip_msg function.
New names are clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can happen even without a protocol violation, if the incoming
update_add_htlc crosses over our outgoing shutdown.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We keep the scriptpubkey to send until after a commitment_signed (or,
in the corner case, if there's no pending commitment). When we
receive a shutdown from the peer, we pass it up to the master.
It's up to the master not to add any more HTLCs, which works because
we move from CHANNELD_NORMAL to CHANNELD_SHUTTING_DOWN.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't need to keep this around any more: by handing it to
subdaemons we ensure we'll close it if the peer disconnects, and we
also add code to get a new one on reconnection.
Because getting a gossip_fd is async, we re-check the peer state after
it gets back. This is kind of annoying: perhaps if we were to hand
the reconnected peer through gossipd (with a flag to immediately
return it) we could get the gossip fd that way and unify the paths?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
At the moment, master simply keeps the gossip fd open when peer
disconnects. That's inefficient, and wrong anyway (it may want a
complete new sync, or may not, but we'll currently send all the
messages including stale ones).
This interface will be required for restart anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we're always sync, just use an fd. Put the hsm_sync_read() helper
here, too, and do HSM init sync which makes things much simpler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With no async calls left, we can just use a stack variable for the fd.
And we're now *always* in the hands of some daemon, unless we're
disconnected, so owner is only NULL in that case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had a terrible hack in gossip when a peer didn't exist. Formalize
a pattern when code+200 is a failure (with no fds passed), and use it
here.
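
A sketch of the convention, assuming the reply type is compared against the
type of the outstanding request. Only the +200 offset comes from this message;
the names and how a successful reply is numbered are left open here.

```python
REPLYFAIL_OFFSET = 200  # request code + 200 signals failure


def is_failure_reply(request_type: int, reply_type: int) -> bool:
    return reply_type == request_type + REPLYFAIL_OFFSET


def expected_fds(request_type: int, reply_type: int,
                 fds_on_success: int) -> int:
    """Number of fds to read after the reply; failures carry none."""
    if is_failure_reply(request_type, reply_type):
        return 0
    return fds_on_success
```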
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means there's no GETTING_HSMFD state at all any more. We
temporarily play games with the hsm fd; those will go away once we're
done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means there's no GETTING_SIG_FROM_HSM state at all any more. We
temporarily play games with the hsm fd; those will go away once we're
done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wallet should really be the container for anything bip32 related, so
I'd like to slowly wean off of `ld->bip32_base` in favor of
`ld->wallet->bip32_base`
We'll re-use them a few times so having them at a central location is
nice. We also fix a bug that was unreserving UTXO entries upon free,
instead of promoting them to being spent.
So far we always needed to know the public key, which was not the case
for addresses that we don't own. Moving the hashing outside of the
script construction allows us to send to arbitrary addresses. I also
added the hash computation to the pubkey primitives.
When we get a fail/fulfill on an outgoing HTLC, we tell the corresponding
incoming HTLC about it. But if that peer is disconnected, we don't.
The better solution is to copy the preimage/malformed/failmessage and mark
the incoming HTLC as resolved. This is done most simply by marking it
SENT_REMOVE_HTLC, which will work in the database case as well.
channeld now re-transmits appropriately when it gets started with an HTLC
in that state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matches what the master does: increments commit index when we send
commit_sig. Thus if we restart at that point, we match.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matters in one case: channeld receiving a bad message is a
permanent failure, whereas losing a connection is transient.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need the old remote per_commitment_point so we can validate the
per_commitment_secret when we get it.
We unify this housekeeping in the master daemon using
update_per_commit_point().
This patch also saves whether remote funding is locked, and disallows
doing that twice (channeld should ignore it).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a bit tricky since we want to hand more verbose errors to the local
case, but the locally-created and forwarded paths had diverged (the local
one missing some things).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are two ways we can do retransmission on reconnect: re-derive
what we would have sent, or remember it and simply re-send. The
rederivation is difficult: unwinding state depends on whether we sent
a revoke_and_ack before or after the commitment_signed, and unwinding
a revoke_and_ack would require us to remember HTLCs we would have
normally forgotten at this point.
So we simply tell the master to remember the old signatures for us,
and hand them back in case we need to re-send.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the case where we can't decrypt the onion, we can't fail it in the
normal way (which is encrypted using the onion shared secret); instead
we need to respond with an update_fail_malformed_htlc message.
Moreover, we need to remember this for persistence. This means that
we really have three conclusions for an HTLC: fulfilled, failed,
malformed. Fix up the logic everywhere which assumed failed or
fulfilled.
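
The three conclusions above could be modelled roughly as follows. This is
only a sketch: `enum htlc_outcome`, `struct htlc_result`, `classify_htlc`
and `removal_message` are hypothetical names, not the types used in the
tree. The message names are the ones from the peer protocol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; not the daemon's actual types. */
enum htlc_outcome {
    HTLC_FULFILLED,   /* we have the payment preimage */
    HTLC_FAILED,      /* failure encrypted using the onion shared secret */
    HTLC_MALFORMED    /* onion could not be decrypted: no shared secret */
};

struct htlc_result {
    bool have_preimage;
    bool onion_decrypted;
};

/* Decide which of the three conclusions applies to a resolved HTLC. */
static enum htlc_outcome classify_htlc(const struct htlc_result *r)
{
    assert(r);
    if (!r->onion_decrypted)
        return HTLC_MALFORMED;
    return r->have_preimage ? HTLC_FULFILLED : HTLC_FAILED;
}

/* Name of the peer message that removes the HTLC for each outcome. */
static const char *removal_message(enum htlc_outcome o)
{
    switch (o) {
    case HTLC_FULFILLED: return "update_fulfill_htlc";
    case HTLC_FAILED:    return "update_fail_htlc";
    case HTLC_MALFORMED: return "update_fail_malformed_htlc";
    }
    return NULL;
}
```

Switching on a three-valued enum, rather than testing a single
failed-or-fulfilled flag, lets the compiler warn about any site that
forgets the malformed case.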
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's easiest to have the master keep the last commit we sent, for
re-transmission. We could recalculate it, but it's made more difficult
by the before/after revoke case.
And because revoke_and_ack changes the channel state, we need to
remember which order we sent them in for re-transmission.
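
A minimal sketch of why the order has to be stored, assuming we know on
reconnect which of the two messages the peer is still missing. All names
here (`enum send_order`, `struct retransmit_state`, `retransmit_order`) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: the two messages we may have to re-send. */
enum msg_type { MSG_COMMITMENT_SIGNED, MSG_REVOKE_AND_ACK };

/* Which of the two we sent first before the disconnect. */
enum send_order { COMMIT_THEN_REVOKE, REVOKE_THEN_COMMIT };

struct retransmit_state {
    enum send_order order;
    bool need_commit;   /* peer is missing our last commitment_signed */
    bool need_revoke;   /* peer is missing our last revoke_and_ack */
};

/* Fill `out` with the messages to re-send, in the original order.
 * Returns how many entries were written (0, 1 or 2). */
static size_t retransmit_order(const struct retransmit_state *s,
                               enum msg_type out[2])
{
    size_t n = 0;

    assert(s && out);
    if (s->order == REVOKE_THEN_COMMIT) {
        if (s->need_revoke)
            out[n++] = MSG_REVOKE_AND_ACK;
        if (s->need_commit)
            out[n++] = MSG_COMMITMENT_SIGNED;
    } else {
        if (s->need_commit)
            out[n++] = MSG_COMMITMENT_SIGNED;
        if (s->need_revoke)
            out[n++] = MSG_REVOKE_AND_ACK;
    }
    return n;
}
```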
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need this for reestablishing a channel.
(Note: this patch changes quite a bit in this series, but reshuffling was
tedious).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently it's fairly ad-hoc, but we need to tell it to channeld when
it restarts, so we define it as the non-HTLC balance.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It needs to save them to the db in case of restart; this means we tell
it about funding_locked, as well as the next_per_commit_point given
in revoke_and_ack.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The channel daemon gets the shared secrets from the HSM to save
the master daemon some work. It used to hand these over at
revoke_and_ack receive, which is when the master daemon needs them.
However, it's a bit simpler to hand them over when we first tell
the master about the incoming HTLC (the first commitsig).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They share some fields, but they're basically different, and it's clearest
to treat them differently in most places.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When adding their HTLCs, it needs all the information. When failing,
it needs the id as key and the failure reason. When fulfilling, it
needs the id and payment preimage.
It also needs to know when we have received a revoke_and_ack or a
commitment_signed, to place in the database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to change to a batch interface, where we tell the master
before we send certain packets (eg. commit, revoke). We need to wait
for it to respond before doing anything else, but it might cross-over
and be sending us commands at the same time.
This queues those requests until we're ready.
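
The queueing idea can be sketched like this, using integer request ids and a
fixed-size array in place of real messages. The struct and function names
are hypothetical and the actual implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical sketch: hold master requests while we await its reply. */
struct deferred {
    bool awaiting_reply;        /* set after we notify the master */
    int pending[MAX_PENDING];   /* queued request ids, oldest first */
    size_t num_pending;
    int handled[MAX_PENDING];   /* record of handled ids, for the demo */
    size_t num_handled;
};

/* Stand-in for actually processing a request. */
static void handle_now(struct deferred *d, int req)
{
    assert(d->num_handled < MAX_PENDING);
    d->handled[d->num_handled++] = req;
}

/* A request arrives from the master: defer it if we are mid-exchange. */
static void master_request(struct deferred *d, int req)
{
    if (d->awaiting_reply) {
        assert(d->num_pending < MAX_PENDING);
        d->pending[d->num_pending++] = req;
        return;
    }
    handle_now(d, req);
}

/* The master's reply arrived: drain the queue in arrival order. */
static void master_replied(struct deferred *d)
{
    d->awaiting_reply = false;
    for (size_t i = 0; i < d->num_pending; i++)
        handle_now(d, d->pending[i]);
    d->num_pending = 0;
}
```

Draining in arrival order keeps the master's commands sequenced exactly as
it issued them, even when they crossed with our own notification.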
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This prepares us for handlers turning off peer I/O, rather than assuming
we always want to handle the next incoming message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>