Changelog-Added: hsmd: Added hsmd_forget_channel to enable explicit channel deletion. ([#6987])
Motivation: Previously, a signer prematurely forgetting a channel led
to failures in unresolved channel requests. This update introduces
hsmd_forget_channel, allowing nodes to explicitly notify signers when
a channel is irrevocably resolved and can be safely forgotten. This
ensures synchronized channel cleanup between nodes and signers.
This change maintains backward and forward compatibility. Nodes
explicitly check whether a signer has `WIRE_HSMD_FORGET_CHANNEL`
capability before sending the message. Nodes without
`WIRE_HSMD_FORGET_CHANNEL` capability won't send this message. Signers
capable of handling this message but not receiving it will continue to
use conservative pruning methods.
Fixes#6987
Rename the offending functions from wally_foo to cln_wally_foo.
For the sake of a minimal diff, only calls which conflict with wally
v1.0.0 have been changed. However it is bad form to use the wally_
function namespace; the remaining such calls should also be renamed.
Changelog-None
Signed-off-by: Jon Griffiths <jon_p_griffiths@yahoo.com>
The integration with opentelemetry was sub-optimal: it was generating
jaeger-style traces, with short traceIds and we were considering the
entire lifetime as a single trace. This PR changes that to a trace for
startup and then a trace for any event that doesn't already have a
parent.
We also allow using the `CLN_TRACEPARENT` envvar to attach the startup
to a remote / external trace, potentially by whatever started the main
process. This is useful to see the startup trace in the wider context
of whatever tooling is built around it.
Changelog-Added: tracing: It is now possible to inject a parent for the startup trace by setting the `CLN_TRACEPARENT` envvar
As reported by @wtogami, LND nodes are using a default
min_final_cltv_expiry_delta of 9, which makes them unable to pay invoices
using the modern spec default of 18. Forcing inclusion of the c field
allows interoperability until broader support of the 18 block default.
Fixes: #6956
Changelog-Fixed: Default bolt11 invoices are payable by LND nodes.
On July 18th, @jgriffiths wrote:
> You need to set this to NULL after freeing it, otherwise if line 72 returns you have a dangling pointer and potential later use-after-free here. Alternately use wally_psbt_set_input_final_witness(NULL) which will free any existing witness and set the value to NULL.
Reported-By: @jgriffiths
This is more thorough than the minimal one required for getroute(), including the feerates
and cltv deltas.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Shahana Farooqui
Changelog-Fixed: JSON-RPC: Plugin notification `msat` fields in `invoice_payment` and `invoice_created` hooks now a number, not a string with "msat" suffix.
Changelog-Fixed: JSON-RPC: Plugin hook `payment` `msat` field is now a number, not a string with "msat" suffix.
Adds tests for when the connection fails during
1) splice tx_signature
2) splice commitment_signed
Fleshed out the reestablish flow for these two cases and implemented the fixes to make these reestablish flows work.
Part of this work required changing commit process for splices: Now we send a single commit_part for the splice where previously we sent all commits, and accordingly, we no longer revoke in response.
Changelog-Fixed: Implemented splicing restart logic for tx_signature and commitment_signed. Splice commitments are reworked in a manner incompatible with the last version.
We don't actually use this internal to this method? Weird.
Anyway, if we don't want/need it allow the caller to signal that by
passing in NULL, if desired.
In general, a validating signer may be under a different operational
environment than the node, and therefore may have a different
source of on-chain data. The signer may therefore temporarily disagree
on whether a funding or splice transaction is locked (buried).
We would like to ensure agreement between the signer and the
node on how to progress a channel's state.
The following message are added to provide a solution:
- `check_outpoint(outpoint) -> bool` - check if the signer agrees that a funding candidate outpoint is buried
- `lock_outpoint(outpoint)` - change the funding/splice state to locked
Link: https://github.com/ElementsProject/lightning/issues/6722
Suggested-by: @devrandom
Co-Developed-by: Ken Sedgwick <ken@bonsai.com>
Changelog-Added: hsmd protocol: Added hsmd_check_outpoint and hsmd_lock_outpoint
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Changelog-Added: JSON-RPC: `recover` command to force (unused) lightningd node to restart with `--recover` flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We often want to do more parameter checks after param(), so allow a
new param_check(), with the proviso that the caller needs to also return
command_check_done() after other checks if command_check_only(cmd) is true.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During the changeset calculation after the `openchannel2_sign`
hook.
So this commit patch the problem with the following change:
- Addressed an issue where `psbt_get_changeset` was modifying the original PSBT unnecessarily.
- This modification led to problems with a different hsmd, as referenced in [Issue #6672](https://github.com/ElementsProject/lightning/issues/6672).
- Noted a potential optimization where only a subpart of the PSBT
needs to be cloned, as the mutation is specific to inputs.
Link: https://github.com/ElementsProject/lightning/issues/6672
Reported-by: @devrandom
Suggested-by: Ken Sedgwick <ken@bonsai.com>
Co-Developed-by: Ken Sedgwick <ken@bonsai.com>
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Originally VLS used hsmd_ready_channel as an early call during channel
setup, but later the BOLT-2 spec changed the name of funding_locked to channel_ready.
This is very confusing because the hsmd_ready_channel is not directly
related to the new channel_ready.
This commit is renaming the hsmd_ready_channel to hsmd_setup_channel.
Link: https://github.com/ElementsProject/lightning/issues/6717
Suggested-by: Ken Sedgwick
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
We generalize the current df-only "aborted" flag (and invert it) to a
"disconnected" flag in the peer status message.
We convert it back to the aborted flag for now inside subd.c, but that's
next.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than crashing the entire node on invalid pubkey, check the
validity of the pubkey in decode_n, and return an error if invalid.
Detected by libFuzzer:
==265599== ERROR: libFuzzer: deadly signal
#7 abort
#8 bolt11_decode common/bolt11.c:999:4
Invalid recovery IDs cause
secp256k1_ecdsa_recoverable_signature_parse_compact to abort, which
crashes the entire node. We should return an error instead.
Detected by libFuzzer:
[libsecp256k1] illegal argument: recid >= 0 && recid <= 3
Remove the assertion so that an error is returned for invalid bech32.
An error is preferable to crashing the entire node if there's an extra
"lightning:" prefix:
$ lightning-cli pay "lightning:lightning:"
Node log:
pay: common/bolt11.c:718: bolt11_decode_nosig: Assertion `!has_lightning_prefix(str)' failed.
pay: FATAL SIGNAL 6
...
INFO plugin-pay: Killing plugin: exited during normal operation
**BROKEN** plugin-pay: Plugin marked as important, shutting down lightningd
If both databits and *data_len are 0, pull_uint return uninitialized
stack memory in *val.
Detected by valgrind and UBSan.
valgrind:
==173904== Use of uninitialised value of size 8
==173904== __sanitizer_cov_trace_cmp8
==173904== decode_c (bolt11.c:292)
==173904== bolt11_decode_nosig (bolt11.c:877)
UBSan:
common/bolt11.c:79:29: runtime error: shift exponent 64 is too large for 64-bit type 'uint64_t' (aka 'unsigned long')
Corpus input e6f7b9744a7d79b2aa4f7c477707bdd3483f40fa triggers the UBSan
report, but we didn't previously realize this because UBSan has been
disabled in the CI run. We rename the input to indicate its usefulness
as a permanent regression test.
Otherwise, if pull_all fails, we attempt to create a script from NULL,
causing a UBSan report:
bitcoin/script.c:29:28: runtime error: null pointer passed as argument 2, which is declared to never be null
Corpus input bf703c2c20c0818af70a8c4caad6e6fd8cfd1ac6 triggers the UBSan
report, but we didn't previously realize this because UBSan has been
disabled in the CI run. We rename the input to indicate its usefulness
as a permanent regression test.
During our development, we modified the way
we report backtraces.
On a minimal configuration in OpenBSD, it seems that we
no longer compile from commit a9f26b7d07 because
our conditional code is buggy.
With the following compiler
vultr# cc -v
OpenBSD clang version 13.0.0
Target: amd64-unknown-openbsd7.3
Thread model: posix
InstalledDir: /usr/bin
We have the following error
cc common/channel_id.c
cc common/daemon.c
common/daemon.c:218:2: error: implicit declaration of function 'add_steal_notifiers' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
add_steal_notifiers(NULL);
^
1 error generated.
gmake: *** [Makefile:298: common/daemon.o] Error 1
Reported-by: @grubles
Fixes a9f26b7d07
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
We show where the pointer was allocated, but it can be confusing if it
was later stolen onto another context. Save and report those too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This makes it easier to use outside simple subds, and now lightningd can
simply dump to log rather than returning JSON.
JSON formatting was a lot of work, and we only did it for lightningd, not for
subdaemons. Easier to use the logs in all cases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We usually hand times by copy, not by pointer (and if we did, they should
be const!). I noticed this particularly for the state changed code, but
it goes down to to json_add_timeiso, so I fixed that too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If you previously configured with `--enable-developer` we turn that into `--enable-debugbuild`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: build: `--enable-developer` arg to configure (and DEVELOPER variables): use `./configure --enable-debugbuild` and `developer` setting at runtime.
Also requires us to expose memleak when !DEVELOPER, however we only
ever used the memleak tracking when the LIGHTNINGD_DEV_MEMLEAK
environment variable was set, so keep that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently it just defaults to the DEVELOPER compile option, but we'll
move over to this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Config: `--developer` enables developer options and changes default to be "disable deprecated APIs".
Added a test for splicing out that exposed some behavior and code glitches that are addressed in this commit.
Added test for splice gossip.
Also added documentation for how to do a splice out.
ChangeLog-Fixed: Added docs, testing, and some fixes related to splicing out, insufficent balance handling, and restarting during a splice.
json_add_timeabs only printed in milliseconds and json_add_time outputs a string which is weird
Changelog-Changed: JSON-RPC time fields now have full nanosecond precision (i.e. 9 decimals not 3): `listfowards` `received_time` `resolved_time` `listpays`/`listsendpays` `created_at`.
Since we changed the default, it used to be required to set it. That was a while ago, though, so we can make it optional again.
Changelog-Changed: Protocol: `invoice` no longer explicitly encodes `c` if it's the default (18)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were allowed to, but the spec removed that. So we handle warnings
differently from errors now.
This also means the LND "internal error" workaround is done in
lightningd (we still disconnect, but we don't want to close channel).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we no longer disconnect every time we receive a warning message.
Doesn't happen on x86, but struct gossmap_chan defines:
```
u32 private: 1;
u32 plus_scid_off: 31;
```
And complains when we initialize plus_scid_off and access it later:
```
VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all plugins/renepay/test/run-mcf > /dev/null
==186886== Conditional jump or move depends on uninitialised value(s)
==186886== at 0x10076388: chan_iter (gossmap.c:1098)
==186886== by 0x100797F3: gossmap_next_chan (gossmap.c:1112)
==186886== by 0x1008C5AF: main (run-mcf.c:309)
==186886== Uninitialised value was created by a heap allocation
==186886== at 0x40F0A44: malloc (vg_replace_malloc.c:431)
==186886== by 0x10072BAF: allocate (tal.c:256)
==186886== by 0x100737A7: tal_alloc_ (tal.c:463)
==186886== by 0x100738DF: tal_alloc_arr_ (tal.c:506)
==186886== by 0x10079507: load_gossip_store (gossmap.c:690)
==186886== by 0x10079667: gossmap_load (gossmap.c:978)
==186886== by 0x1008C4AF: main (run-mcf.c:295)
```
Reported-by: @grubles
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: #6557
As side-effect, getroute(0) is special too.
Reported-by: MiddleW4y in Discord
Fixes: #6577
Changelog-Fixed: `pay` will still use an invoice routehint if path to it doesn't take 1-msat payments.
This should provide the default help message and exit, but was
resulting in a segmentation fault from freeing pointers passed to
the default config.
Changelog-Fixed: lightning-cli properly returns help without argument
This is actually a valid complaint (though this is a sanity check for
things we make ourselves, still!).
```
In file included from common/test/run-blindedpath_onion.c:9:
common/test/../sphinx.c: In function ‘sphinx_add_hop_has_length’:
common/test/../sphinx.c:117:12: error: ‘prepended_len’ may be used uninitialized [-Werror=maybe-uninitialized]
117 | if (lenlen + prepended_len != tal_bytelen(payload))
| ^
common/test/../sphinx.c:109:27: note: ‘prepended_len’ was declared here
109 | bigsize_t lenlen, prepended_len;
| ^~~~~~~~~~~~~
cc1: all warnings being treated as errors
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Avoids a gratuitous "ctx" field, and the simplified declaration
is now understood by `make update-mocks`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was recommended by @t-bast: if the final spec commits to something
compatible, we can simply advertize and accept both features, but if it
does change in incompatible ways we won't cause problems for nodes
who implement the official spec.
(I split this, so first, we remove the OPT_SPLICE entirely, to make
sure we caught them all. --RR)
Suggested-by: @t-bast
Changelog-None
I obviously like the word "capabilities" since I reused it to refer
to the HSM's overall features :(
Suggested-by: @ksedgwic
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was strongly recommended by Russell O'Connor: the "ms" implies that
it's a BIP-32 master secret, and this is CLN specific.
If we changed the hrp to "cln" it would be better, but apparently that
means we no longer fit in a "standard billfold metal wallet" (and
our code assumes a 2-byte prefix anyway).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Update the lightningd <-> channeld interface with lots of new commands to needed to facilitate spicing.
Implement the channeld splicing protocol leveraging the interactivetx protocol.
Implement lightningd’s channel_control to support channeld in its splicing efforts.
Changelog-Added: Added the features to enable splicing & resizing of active channels.
New daemon process means we don’t have to deal with gossip, so that gets removed along with error cleanup and a refactoring of how we calculating PDBT diffs.
Update gossip routiens and various other hecks on the channel state to consider AWAITING_SPLICE to be routable and treated similar to CHANNELD_NORMAL.
Small updates to psbt interface
Changelog-None
Firstly, I wanted the results easier to use:
1. Make them always lower case, even if the string was UPPER.
2. Decode the payload for them.
3. Don't give the user any fields they don't need, and make
the field sizes explicit.
Secondly, I wanted to avoid the pattern of "check in one place, assume
in another", in favour of "check on use".
So, I changed the code to lower the string if it needs to at the start,
and then changed the pull functions so we always use them to get data:
this way we should fail clearly and gracefully if we don't have enough data.
I made all the checks explicit, where we assign the fields.
I also addressed the FIXME: I think the array is *often* one shorter,
but not always, so I trim the last byte at the end if needed.
[ Aditya modified the tests to work ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Nothing major here:
1. size_t for lengths.
2. pass engine to checksum_verify, as caller wants ->len (avoid repeating 13/15 magic numbers).
3. Use x.member instesad of (&x)->member.
4. Return memcmp result directly instead of if.
5. Spacing removal, `;;` removal.
6. codexl is a bool `true`/`false` not 0/1 (it's the same, but clearer)
7. Make sanity_check assign *fail directly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Clean restart of daemon after a tx-abort is a nice way to work around
the 'persistent' disconnect that we t-bast noticed.
Changelog-Fixed: `dualopend`: Fix behavior for tx-aborts. No longer hangs, appropriately continues re-init of RBF requests without reconnction msg exchange.
For non-v0 witness programs we weren't stripping the data push byte
before writing into the fallback address.
According to BIP14, all witness scripts will be data pushes (up to 40-bytes)
so trimming the datapush byte should be kosher.
From BIP141:
A scriptPubKey (or redeemScript as defined in BIP16/P2SH) that
consists of a 1-byte push opcode (for 0 to 16) followed by a
data push between 2 and 40 bytes gets a new special meaning.
The value of the first push is called the "version byte". The
following byte vector pushed is called the "witness program".
Changelog-Fixed: Adding a >0 version witness program to a fallback address now is *just* the witness program, as per bolt11 spec
And do them on the first run (where we check parameters), instead
of every time. Might as well do them in non-developer mode too,
since they're simply programmer correctness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This extracts the core checking functionality for a rune, so they can
easily be used more widely than just commando.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`struct log` becomes `struct logger`, and the member which points to the
`struct log_book` becomes `->log_book` not `->lr`.
Also, we don't need to keep the log_book in struct plugin, since it has
access to ld's log_book.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously, our code checked for the presence of the `lightning:`
prefix while decoding a bolt11 string. Although this prefix is valid
and accepted by the core lightning pay command, it was causing issues
with how we managed invoices. Specifically, we were skipping the prefix
when creating a copy of the invoice string and storing the raw invoice
(including the prefix) in the database, which caused inconsistencies
in the user experience.
To address this issue, we need to strip the `lightning:` prefix before
calling each core lightning command. In addition, we should
modify the invstring inside the db with the canonical one.
This commit fixes the issue by stripping the `lightning:` prefix
from the `listsendpays` function, which will improve the
user experience and ensure consistency in our invoice management (see
next commit).
Reported-by: @johngribbin
Link: ElementsProject#6207
Fixes: debbdc0
Changelog-Fixes: trim the `lightning:` prefix from invoice everywhere.
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>