Looks like #4394 treated a symptom but not the root cause. We were
actually sending the message framed with the WIRE_CUSTOMMSG_OUT and
the length prefix over the encrypted connection to the peer. It just
happened to be a valid custommsg...
This fixes the issue, and this time I made sure we actually send the
raw message over the wire. However for backward compatibility we
needed to imitate the faulty behavior which is 90% of this patch :-)
Changelog-Fixed: plugin: `dev-sendcustommsg` included the type and length prefix when sending a message.
We move over to the new "warning" paradigm, instead of using
an "rbf_fail" message.
Every failure is either a warning or an error; on warnings we
hang up and reconnect later, effectively resetting the state.
And make all the callers choose which one. In general, I prefer warn,
which lets them reconnect and try again, however some places are either
stated that they must be errors in the spec itself, or in openingd
where we abandon the channel when we close the connection anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we now send warning messages and close the connection, except on unrecoverable errors.
This takes from the draft spec at https://github.com/lightningnetwork/lightning-rfc/pull/834
Note that if this draft does not get included, the peer will simply
ignore the warning message (we always close the connection afterwards
anyway).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we now report the new (draft) warning message.
We should actually be including this (as it may define _GNU_SOURCE
etc) before any system headers. But where we include <assert.h> we
often didn't, because check-includes would complain that the headers
included it too.
Weaken that check, and include config.h in C files before assert.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Don't include exp directly, use an ifdef in common/bolt12
(like we do for peer and onion wiregen files).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a case where a dropped funding_locked will result in the peer
moving onto channeld, while we stay in dualopend. As we haven't
received their funding_locked, we retransmit tx_sigs, which channeld
will need to handle.
With the patch the peer drops it on the floor; the peer will resend
funding_locked on reconnect, which will correctly advance us to
channeld and CHANNELD_NORMAL
We used this for dual funded opens, to track the receipt of signatures.
We're moving all of this over to dualopend now, however, so we no longer
need the PSBT in channeld.
The previous onion_message code required a confirmed, not-shutting-down
channel, not just a connection. That's overkill; plus before widespread
adoption we will want to connect directly as a last resort.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I saw this message:
```
lightning_channeld: outstanding taken(): channeld/channeld.c:3087:blinding
lightning_channeld: outstanding taken(): channeld/channeld.c:3087:blinding
lightning_channeld: outstanding taken(): channeld/channeld.c:3087:blinding
```
The caller does take(blinding), but blinding can be NULL. We should
move the code around to do the take() earlier anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Still asserts that it's the standard size, but makes it a dynamic
member. For simpliciy, changes the parse_onionpacket API (it must be
a tal object now, so we might as well allocate it here to catch all
the callers).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
While debugging a hanging channel with a user I noticed that they
called `close` on a channel, resulting in the channel showing
`CHANNELD_SHUTTING_DOWN`, but the billboard seemed to show the
information the wrong way around:
```json
{
"peers": [
{
"connected": true,
// ...
"channels": [
{
"state": "CHANNELD_SHUTTING_DOWN",
// ...
"status": [
"CHANNELD_SHUTTING_DOWN:Reconnected, and reestablished.",
"CHANNELD_SHUTTING_DOWN:Funding transaction locked. They need our announcement signatures. They've sent shutdown, waiting for ours"
],
// ...
}
]
}
]
}
```
Aside from the hung channel, the switch in direction of the status
seemed weird. Checking the billboard code seems to have the status
switched as well:
ff8830876d/channeld/channeld.c (L223-L226)
We set `shutdown_sent[LOCAL]` when we send the shutdown:
ff8830876d/channeld/channeld.c (L823-L839)
And we set `shutdown_sent[REMOTE]` when we receive the shutdown:
ff8830876d/channeld/channeld.c (L1730-L1781)
So I think the billboard code just needs to be switched around.
Changelog-Fixed: JSON-RPC: The status of the shutdown meesages being exchanged is now displayed correctly.
This is vital for calculating merkle trees; I previously used
towire+fromwire to get this!
Requires generation change so we can magic the ARRAY_SIZE var (the C
pre-processor can't uppercase things).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Required to determine if this msg used expected reply path.
Also remove FIXME (om->enctlv is handled above).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
channeld/channeld.c:237:2: error: ‘shutdown_status’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
```
Reported-by: az0re on IRC
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Avoids much cut & paste. Some tests don't need any of it, but most
want at least some of this infrastructure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
libwally has a quirk where the finalize method will fail to 'completely'
finalize an input's parts if either the final_scriptsig or
final_redeemscript fields are set
since we manually set the final_witness stack here, we also need to
fully finalize the redeemscript -> final_scriptsig here as well.
Note that check-whitespace and check-bolt already do this, so we
can eliminate redundant lines in common/Makefile and bitcoin/Makefile.
We also include the plugin headers in ALL_C_HEADERS so they get
checked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to send our tx_sigs before we got to channeld existing. We
changed how this worked so that multifundchannel could live, but failed
to clean up the logic of what "having a psbt around" means wrt channeld
and messaging our peer.
The general idea is that we want to send `tx_signatures` to our peer on
reconnect until they've sent us `funding_locked`.
Note that it's an error to
- send `funding_locked` without having sent `tx_signatures`
- send `tx_signatures` after sending `funding_locked`
We use the 'finalized' state of the peer's inputs/outputs to help signal
where we are in receiving their sigs -- but this doesn't work at all for
opens where the peer doesn't contribute inputs at all.
This isn't really a huge deal, but it does mean that if we receive a
peer's `tx_sigs` more than once (will happen for a reconnect before
`funding_locked`), then we'll issue a notification about receiving their
sigs multiple times. /shrug
Prior to this patch update, we expected a client to call
`openchannel_signed` before checking for peer's tx-sigs messages on the
wire.
When moving to a 'multifundchannel' approach, we'll need to be able to
collect sigs from our peers before sending our tx_sigs message. There's
no strict ordering on when tx-sigs messages are sent/received, so this
is fine.
To do this, we go ahead and start up channeld as soon as
commitment_sigs are secured, so that we process incoming tx-sigs from
our peers as soon as we get them.
We're about to totally upset the order that sigs are set on our PSBTs
for new channel opens, making it such that our peer's sigs may arrive
before ours do.
We can no longer rely on the 'set witness means this is our input' since
there's no guarantee that our input sigs have been added yet, so we
check the serial_id and only set the stack on their (odd) inputs.
Instead of a boutique message, use a "real" channel_announcement for
private channels (with fake sigs and pubkeys). This makes it far
easier for gossmap to handle local channels.
Backwards compatible update, since we update old stores.
We also fix devtools/dump-gossipstore to know about the tombstone markers.
Since we increment our channel_announce count for local channels now,
the stats in the tests changed too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a few structs/wire calls that only exist under experimental features.
These were in a common file that was shared/used a bunch of places but
this causes problems. Here we move one of the problematic methods back
into `openingd`, as it's only used locally and then isolate the
references to the `witness_stack` in a new `common/psbt_internal` file.
This lets us remove the iff EXP_FEATURES inclusion switches in most of
the Makefiles.
...right time.
We re-send the tx_sigs on start/init/reconnect until we've gotten a
funding_locked from our peer. We also build it in channeld now, instead
of in dualopend, and don't pass in a message for them anymore
We need the PSBT to create the finalized tx from once the peer's
tx_signatures are received. Since we're passing the PSBT, we no longer
need the secondary message to be passed, as it was derived from the
PSBT.
Also removes now unused witness serialization code
We had one report of this, and then Eugene and Roasbeef of Lightning
Labs confirmed it; they saw misordered HTLCs on reconnection too.
Since we didn't enforce this when we receive HTLCs, we never noticed :(
Fixes: #3920
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: fixed retransmission order of multiple new HTLCs (causing channel close with LND)
We didn't care, but other implementations (particularly lnd) do. And it
does violate the spec.
(We need to use skip not xfail on the test which catches this, since
xfail doesn't seem to stop errors reported by cleanup)
(Includes Christian's typo fix!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>