Annotating the htlc in `listpeers` with their current status, and
which plugin is currently holding on to them with their
`htlc_accepted` hook can help us debug where plugins may go wrong.
Changelog-Added: jsonrpc: HTLCs in `listpeers` are now annotated with a status if they are waiting on an `htlc_accepted` hook of a plugin.
We set the timeout on first HTLC, but didn't clear it if that HTLC failed.
It's saner to have a per-HTLC timeout (since that's what it is!) and
also our timer infra is specially coded to scale approximately infinitely so
trying to optimize this is vastly premature.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: We would sometimes gratuitously disconnect 30 seconds after an HTLC failed.
Still asserts that it's the standard size, but makes it a dynamic
member. For simpliciy, changes the parse_onionpacket API (it must be
a tal object now, so we might as well allocate it here to catch all
the callers).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Found in tests/test_connection.py::test_restart_many_payments:
`lightningd: outstanding taken(): lightningd/peer_htlcs.c:532:towire_temporary_channel_failure(((void *)0), ((void *)0))`
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This adds a `state_change` 'cause' to a channel.
A 'cause' is some initial 'reason' a channel was created or closed by:
/* Anything other than the reasons below. Should not happen. */
REASON_UNKNOWN,
/* Unconscious internal reasons, e.g. dev fail of a channel. */
REASON_LOCAL,
/* The operator or a plugin opened or closed a channel by intention. */
REASON_USER,
/* The remote closed or funded a channel with us by intention. */
REASON_REMOTE,
/* E.g. We need to close a channel because of bad signatures and such. */
REASON_PROTOCOL,
/* A channel was closed onchain, while we were offline. */
/* Note: This is very likely a conscious remote decision. */
REASON_ONCHAIN
If a 'cause' is known and a subsequent state change is made with
`REASON_UNKNOWN` the preceding cause will be used as reason, since a lot
(all `REASON_UNKNOWN`) state changes are a subsequent consequences of a prior
cause: local, user, remote, protocol or onchain.
Changelog-Added: Plugins: Channel closure resaon/cause to channel_state_changed notification
Great report from whitslack on this crash at startup:
```
2020-10-07T13:03:21.419Z **BROKEN** lightningd: FATAL SIGNAL 6 (version 0.9.1)
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: common/daemon.c:51 (crashdump) 0x559fb67bcc76
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: /var/tmp/portage/sys-libs/glibc-2.32-r2/work/glibc-2.32/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 ((null)) 0x7f61cdca8baf
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: ../sysdeps/unix/sysv/linux/raise.c:50 (__GI_raise) 0x7f61cdca8b31
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: /var/tmp/portage/sys-libs/glibc-2.32-r2/work/glibc-2.32/stdlib/abort.c:79 (__GI_abort) 0x7f61cdc92535
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: /var/tmp/portage/sys-libs/glibc-2.32-r2/work/glibc-2.32/assert/assert.c:92 (__assert_fail_base) 0x7f61cdc9241e
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: /var/tmp/portage/sys-libs/glibc-2.32-r2/work/glibc-2.32/assert/assert.c:101 (__GI___assert_fail) 0x7f61cdca1241
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/subd.c:750 (subd_send_msg) 0x559fb67a1c31
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/subd.c:745 (subd_send_msg) 0x559fb67a1c31
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/peer_htlcs.c:252 (local_fail_in_htlc) 0x559fb6798f77
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/peer_htlcs.c:1441 (onchain_failed_our_htlc) 0x559fb6798f77
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/onchain_control.c:339 (handle_missing_htlc_output) 0x559fb6786b9d
2020-10-07T13:03:21.419Z **BROKEN** lightningd: backtrace: lightningd/onchain_control.c:455 (onchain_msg) 0x559fb6786b9d
```
The problem is a channel with an onchaind can be in state FUNDING_STATE_SEEN,
because onchaind has started but not responded to init yet (which it does once it
has analyzed the commitment tx).
Channel B is onchain, and its onchaind fails the HTLC, and we try to send a msg
to channel A's onchaind as if it were channeld.
Explicitly check if it's channeld, rather than trying to see if it's onchaind.
Fixes: #4114
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: crash: assertion fail at restart when source and destination channels of an HTLC are both onchain.
This avoids overwriting the ones in git, and generally makes things neater.
We have convenience headers wire/peer_wire.h and wire/onion_wire.h to
avoid most #ifdefs: simply include those.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that other directories were explicitly depending on the generated
file, instead of relying on their (already existing) dependency on
$(LIGHTNINGD_HSM_CLIENT_OBJS), so we remove that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means some files get renamed, and I took the opportunity to clarify
our naming (the *d* is important!)
1. channeld/channel_wire.csv -> channeld/channeld_wire.csv
2. channeld/gen_channel_wire.h -> channeld/channeld_wiregen.h
3. enum channel_wire_type -> enum channeld_wire
4. WIRE_CHANNEL_FUNDING_DEPTH -> WIRE_CHANNELD_FUNDING_DEPTH.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As per https://github.com/lightningnetwork/lightning-rfc/pull/785
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: config: the default CLTV expiry is now 34 blocks, and final expiry 18 blocks as per new BOLT recommendations.
This is best done by passing `struct bitcoin_signature` around instead
of raw signatures. We still save raw sigs to the db, and of course the
wire protocol uses them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
On node start we replay onchaind's transactions from the database/from
our loaded htlc table. To keep things tidy, we shouldn't notify the
ledger about these, so we wrap pretty much everything in a flag that
tells us whether or not this is a replay.
There's a very small corner case where dust transactions will get missed
if the node crashes after the htlc has been added to the database but
before we've successfully notified onchaind about it.
Notably, most of the obtrusive updates to onchaind wrappings are due to
the fact that we record dust (ignored outputs) before we receive
confirmation of its confirmation.
HTLCs trigger a coin movement only when their final form (state) is
reached. This prevents us from needing to concern ourselves with
retries, as well as being the absolutely most correct in terms of
answering the question 'when has the money irrevocably changed hands'.
All coin movements should pass this bar, for ultimate accounting
correctness
The current plan for coin movements involves tagging
origination/destination htlc's with a separate tag from 'routed' htlcs
(which pass through our node). In order to do this, we need a persistent flag on
incoming htlcs as to whether or not we are the final destination.
I noticed the following in logs for tests/test_connection.py::test_feerate_stress:
```
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Failing HTLC 18446744073709551615 due to peer death
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: local_routing_failure: 8194 (WIRE_TEMPORARY_NODE_FAILURE)
```
This is because it reports the (transient) node_failure error, because
our channel_failure message is incomplete. Fix this wart up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously we've used the term 'funder' to refer to the peer
paying the fees for a transaction; v2 of openchannel will make
this no longer true. Instead we rename this to 'opener', or the
peer sending the 'open_channel' message, since this will be universally
true in a dual-funding world.
The plugin can basically return whatever it thinks the preimage is, but we
weren't handling the case in which it doesn't actually match the hash. If it
doesn't match now we just return an error claiming we don't have any matching
invoice.
One is called on every plugin return, and tells us whether to continue;
the other is only called if every plugin says ok.
This works for things like payload replacement, where we need to process
the results from each plugin, not just the final one!
We should probably turn everything into a chained callback next
release.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They callback must take ownership of the payload (almost all do, but
now it's explicit).
And since the payload and cb_arg arguments to plugin_hook_call_() are
always identical, make them a single parameter.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that it's channeld which calculates the shared secret, too. This
minimizes the work that lightningd has to do, at cost of passing this
through.
We also don't yet save the blinding field(s) to the database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This requires us to call ecdh() in the corner case where the blinding seed
is in the TLV itself (which is the case for the start of a blinded route).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happened on my testnet node because I've been failing to reconnect to
a node which created a channel and never exchanged announcement sigs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently abuse the added_htlc and failed_htlc messages to tell channeld
about existing htlcs when it restarts. It's clearer to have an explicit
'existing_htlc' type which contains all the information for this case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For messages, we use the onion but payload lengths 0 and 1 aren't special.
Create a flag to disable that logic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>