There's a race under CI, where a channel is deleted then we see the
channel_update in the gossip store. We assumed this wouldn't happen,
but it can!
```
[gw1] [ 95%] FAILED tests/test_connection.py::test_multichan
[gw1] [ 95%] ERROR tests/test_connection.py::test_multichan
...
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E - lightningd-3: had BROKEN messages
E - lightningd-3: Node exited with return code 1
E Global errors:
...
lightningd-3: 2022-03-28T00:11:42.160Z DEBUG wallet: Owning output 0 100000sat (SEGWIT) txid 30616903feba1839a3834e2b3b6123759ce1fe0d76414ca77e2dbc17414772e0 CONFIRMED
lightningd-3: 2022-03-28T00:11:42.392Z DEBUG hsmd: Client: Received message 5 from client
lightningd-3: 2022-03-28T00:11:42.393Z DEBUG hsmd: new_client: 2
lightningd-3: 2022-03-28T00:11:42.398Z INFO plugin-topology: Killing plugin: exited during normal operation
lightningd-3: 2022-03-28T00:11:42.400Z **BROKEN** plugin-topology: Plugin marked as important, shutting down lightningd!
...
----------------------------- Captured stderr call -----------------------------
topology: update for channel 105x1x1 not found!
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Means that field is now optional in JSON output.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `delinvoice` has a new parameter `desconly` to remove description.
LNURL wants this so they can include images etc in descriptions.
Replaces: #4892
Changelog-Added: JSON-RPC: `invoice` has a new parameter `deschashonly` to put hash of description in bolt11.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was tlv_fields_valid that wanted a non-const: now that's gone, we
can make this correctly const.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Callers were supposed to call "tlv_fields_valid" after fromwire_tlv,
but few did. Make this the default, and call the underlying function
directly where we want to be more flexible (one place).
This loses the ability to allow misordered fields, or to pass through
*any* even fields. We restore that for special cases in the next
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Requiring the caller to allocate them is ugly, and differs from
other types.
This means we need a context arg if we don't have one already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No more "towire_offer", but "towire_tlv_offer".
This means we double-up on the unfortunately-named `tlv_payload` inside
the onion, but we should rename that in the spec when we remove
old payloads.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, this changes the name of a field in invoice_request:
`payer_signature` becomes simply `signature`. So we allow both for
now, and send the old one unless deprecated_apis is disabled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we did fill them in, we filled them in wrong: the offset should be
the offset in the message, not the field number!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also has to fix up tests.
Changelog-Fixed: cli doesn't required anymore to confirm the password if the `hsm_secret` is already encrypted.
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Things allocated by libwally all get the tal_name "wally_tal",
which cost me a few hours trying to find a leak.
In the case where we're making one of the allocations the parent
of the others (e.g. a wally_psbt), we can do better: supply a name
for the tal_wally_end().
So I add a new tal_wally_end_onto() which does the standard
tal_steal() trick, and also changes the (typechecked!) name.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As per proposal in https://github.com/lightning/bolts/pull/962
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: protocol: support for legacy onion format removed, since everyone supports the new one.
We often hand an exclude pointer (usually the current command) to
memleak. But when we encountered this we would stop iterating, rather
than just ignore it: this means we would often ignore significant siblings.
In particular, fixing this (which has always been there) reveals many
previously-undetected leaks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We allocate the default, then callback allocates over the top. Mark
params with a default, so we can free that when it's called.
(We can't do this generally, since not all param args are actually
pointers to pointers, though opt_param_def has to be).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It looks like decode_c doesn't set have_c unlike the other decode_
methods. At the start of the function, decode_c checks have_c to see if
it's set, but it is never set. It seems like this could allow for
duplicate c tags, which is probably not intended.
Signed-off-by: William Casarin <jb55@jb55.com>
And in particular, fix onchaind grinding code which used the
actual number of inputs and outputs (which already includes the
fee output); that breaks with the next patch which fixes other
calculations.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The blockheight is zero though, since these aren't included in a block
yet.
We also don't issue an 'external' deposit event if we can tell that the
address you're sending to actually belongs to our wallet (we'll issue a
deposit event when it gets included in a block)
If a coin move concerns an external account, it's really useful to know
which 'internal' account initiated the transfer.
We're about to add a notification for withdrawals, so we can use this to
track wallet pushes to outside addresses
Changelog-Added: JSONRPC: `coin_movement` to 'external' accounts now include an 'originating_account' field
connectd does this internally now using ccan/io, with appropriate
credit for ZmnSCPxj who wrote this code in the first place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.
It also conflicts with https://github.com/lightning/bolts/pull/932
which suggests you *should* fail when you receive an error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the case where the peer sends an error (and hangs up) immediately
after init, connectd *doesn't actually read the error* (even after all the
previous fixes so it actually receives the error!).
This is because to tried to first write WIRE_CHANNEL_REESTABLISH, and
that fails, so it never tries to read. Generally, we should ignore
write failures; we'll find out if the socket is closed when we read
nothing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
msg_queue was originally designed for inter-daemon comms, and so it has
a special mechanism to mark that we're trying to send an fd. Unfortunately,
a peer could also send such a message, confusing us!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
dev_blackhole_fd was a hack, and doesn't work well now we are async
(it worked for sync comms in per-peer daemons, but now we could sneak
through a read before we get to the next write).
So, make explicit flags and use them. This is much easier now we
have all peer comms in one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now let gossipd do it.
This also means there's nothing left in 'struct per_peer_state' to
send across the wire (the fds are sent separately), so that gets
removed from wire messages too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually intercept the gossip_timestamp_filter, so the gossip_store
mechanism inside the per-peer daemon never kicks off for normal connections.
The gossipwith tool doesn't set OPT_GOSSIP_QUERIES, so it gets both, but
that only effects one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
channeld can't do it any more: it's using local sockets. Connectd
can do it, and simply does it by type.
Amazingly, on my machine the timing change *always* caused
test_channel_receivable() to fail, due to a latent race.
Includes feedback from @cdecker.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As connectd handles more packets itself, or diverts them to/from gossipd,
it's the only place we can implement the dev_disconnect logic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now connectd is doing the crypto, we can use normal wire io. We
create helper functions to clearly differentiate between "peer" comms
and intra-daemon comms though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We temporarily hack to sync_crypto_write/sync_crypto_read functions to
not do any crypto, and do it all in connectd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. tal_strndup(.., str, strlen(str)) == tal_strdup()
2. tal_strdup also takes(), so document that.
3. Avoid passing 'struct sha256' on the stack: use ptr.
4. Generally, structures shouldn't keep pointers to things they don't own.
In this case, mvt->node_id.
5. Make payment_hash a pointer, since NULL is more natural than an all-zero
hash.
And add NON_NULL_ARGS() to the functions; it's cumbersome, but make it
fairly clear what params are optional.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is doing this, we can't close as soon as we send,
and in fact we can't do 'fail write' either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The "read until closed" trick doesn't work if the other end doesn't
close (as found in the next patch, where we use DEV_DISCONNECT_DISABLE_AFTER).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
connectd is going to end up using this do demux; make it fast and complete.
Fixing this reveals a problem in openingd: it now extracts the channel_id
from funding_signed (which is where we transition off the temporary), and
gets upset. So fix that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to stash/save the amount of the lease fees on a leased channel,
we do this by re-using the 'push' amount field on channel (which is
technically correct, since we're essentially pushing the fee amount to
the peer).
Also updates a bit of how the pushes are accounted for (pushed to now
has an event; their channel will open at zero but then they'll
immediately register a push event).
Leases fees are treated exactly the same as pushes, except labeled
differently.
Required adding a 'lease_fee' field to the inflights so we keep track of
the fee for the lease until the open happens.
We record the amount of fees collected for a routed payment. For
simplicity's sake on the data agg side, we record the fee payment on
*BOTH* the incoming htlc and the outgoing htlc. Note that this results
in double counting if you add up the fees from both an in-routed and
out-routed payment.
Get rid of the 'movement_idx', since we replay events now.
Since we're removing a field from the 'coin_movement' event emission, we
bump the version type.
Changelog-Updated: `coin_movements` events have been revamped and are now on version 2.
The old model of coin movements attempted to compute fees etc and log
amounts, not utxos. This is not as robust, as multi-party opens and dual
funded channels make it hard to account for fees etc correctly.
Instead, we move towards a 'utxo' view of the onchain events. Every
event is either the creation or 'destruction' of a utxo. For cases where
the value of the utxo is not (fully) debited/credited to our account, we
also record the output_value. E.g. channel closings spend a utxo who's
entire value we may not own.
Since we're now tracking UTXOs onchain, we can now do more complex
assertions about the onchain footprint of them. The integration tests
have been updated to now use more 'chain aware' assertions about the
ending state.
we're pivoting from a txid based world to a outpoint based world. every
coin movement (onchain) will correspond with a outpoint; only the spend
of an outpoint will have a tx_txid
We're not going to do 'spend tracks' any more; instead we'll emit an
event whenever an output is included in a broadcast tx
(even if the broadcast fails!!)
Suggested-by: Rusty Russell
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Changelog-Changed: Support hsm specific error error code in lightning-cli
And turn "" includes into full-path (which makes it easier to put
config.h first, and finds some cases check-includes.sh missed
previously).
config.h sets _GNU_SOURCE which really needs to be done before any
'#includes': we mainly got away with it with glibc, but other platforms
like Alpine may have stricter requirements.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Various unit tests were creating temporary files unconditionally in /tmp
and were not cleaning up after themselves. Introduce a new variant of
mkstemp(3p) that respects the TMPDIR environment variable, and use it in
the offending unit tests. This allows each test run to use a dedicated
TMPDIR that can be cleaned up after the run.
Changelog-None
Signed-off-by: Matt Whitlock <c-lightning@mattwhitlock.name>
As of 2b923a0367c5f9154fcec706e3302cc4658dd889.
Recurrence quotes need to be marked separately, since they're no longer
in offers main bolt.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This builds on the enctlv vectors, but actually goes all the way
to creating a modern onionmessage.
Thanks to Thomas H for corrections!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is from 6e99c5feaf60cb797507d181fe583224309318e9
We renamed the enctlv field to encrypted_recipient_data in the spec, and the
new onion_message is message 513. We don't handle it until the next patch.
Two renames:
1. blinding_seed -> blinding_point.
2. enctlv -> encrypted_recipient_data.
We don't do a compat cycle for our JSON APIs for these experimental
features only used by our own plugins, we just rename.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Temporarily disable sendpay_blinding test which uses obsolete onionmsg;
there's still some debate on the PR about how blinded HTLCs will work.
Changelog-EXPERIMENTAL: onionmessage: removed support for v0.10.1 onion messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No idea why TCSAFLUSH was used, could not find anything in PR comments.
Also cannot explain exactly what causes the problem, but the hang can be reproduced
*with* TCSAFLUSH and not with TCSANOW.
According to termios doc:
TCSANOW
the change occurs immediately.
TCSAFLUSH
the change occurs after all output written to the object referred by fd has been
transmitted, and all input that has been received but not read will be discarded
before the change is made.
This happened in my tal_dump(), and I couldn't see how we ended up
with object having more than one "backtrace". Adding asserts that we
never added a second backtrace didn't trigger.
Finally I wondered if we were tal_steal() backtraces, and sure enough
we do that blinding in one place: libwally wrapping. So fix that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
valgrind locally complains about the allocations in autodata leaking:
```
==138200== 16 bytes in 1 blocks are still reachable in loss record 1 of 2
==138200== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==138200== by 0x10D41A: autodata_register_ (autodata.c:20)
==138200== by 0x10E7B8: register_autotype_type_to_string (type_to_string.h:79)
==138200== by 0x10F5CA: register_one_type_to_string0 (block.c:259)
==138200== by 0x19734C: __libc_csu_init (in /home/rusty/devel/cvs/lightning/common/test/run-route-specific)
==138200== by 0x4A3D03F: (below main) (libc-start.c:264)
==138200==
==138200== 176 bytes in 1 blocks are still reachable in loss record 2 of 2
==138200== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==138200== by 0x10D472: autodata_register_ (autodata.c:26)
==138200== by 0x122D37: register_autotype_type_to_string (type_to_string.h:79)
==138200== by 0x122F1F: register_one_type_to_string0 (node_id.c:50)
==138200== by 0x19734C: __libc_csu_init (in /home/rusty/devel/cvs/lightning/common/test/run-route-specific)
==138200== by 0x4A3D03F: (below main) (libc-start.c:264)
==138200==
make: *** [Makefile:638: unittest/common/test/run-route-specific] Error 7
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
blob[] is really a string from the commandline; leave it as a char.
And parsing is much simpler than this code makes it seem!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This surprised me, since the CHANGELOG for [0.8.2] said:
We now announce multiple addresses of the same type, if given. ([3609](https://github.com/ElementsProject/lightning/pull/3609))
But it lied!
Changelog-Fixed: We really do allow providing multiple addresses of the same type.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
October was the date Torv2 is no longer supported by the Tor Project;
it will probably not work at all by next release, so we should remove
it now even though it's not quite the 6 months we prefer for
deprecation cycles.
I still see 110 nodes advertizing Torv2 (vs 10,292 Torv3); we still
parse and display it, we just don't advertize or connect to it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
for every new added htlc, check that adding it won't go over our 'dust
budget' (which assumes a slightly higher than current feerate, as this
prevents sudden feerate changes from overshooting our dust budget)
note that if the feerate changes surpass the limits we've set, we
immediately fail the channel.
If we're over the dust limit, we fail it immediatey *after* commiting
it, but we need a way to signal this throughout the lifecycle, so we add
it to htlc_in struct and persist it through to the database.
If it's supposed to be failed, we fail after the commit cycle is
completed.
To reduce the surface area of amount of a channel balance that can be
eaten up as htlc dust, we introduce a new config
'--max-dust-htlc-exposure-msat', which sets the max amount that any
channel's balance can be added as dust
Changelog-Added: config: new option --max-dust-htlc-exposure-msat, which limits the total amount of sats to be allowed as dust on a channel
This also inadvertently fixes a latent bug: before this patch, in the
`subd` function in `lightningd/subd.c`, we would close `execfail[1]`
*before* doing an `exec`.
We use an EOF on `execfail[1]` as a signal that `exec` succeeded (the
fd is marked CLOEXEC), and otherwise use it to pump `errno` to the
parent.
The intent is that this fd should be kept open until `exec`, at which
point CLOEXEC triggers and close that fd and sends the EOF, *or* if
`exec` fails we can send the `errno` to the parent process vua that
pipe-end.
However, in the previous version, we end up closing that fd *before*
reaching `exec`, either in the loop which `dup2`s passed-in fds (by
overwriting `execfail[1]` with a `dup2`) or in the "close everything"
loop, which does not guard against `execfail[1]`, only
`dev_disconnect_fd`.
It's probably not worth fixing for the other daemons.
Changelog-Changed: JSON-RPC: `ping` now only works if we have a channel with the peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We keep the now-removed chains field, and in deprecated mode, we set it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: bolt12: `chains` in invoice_request and invoice is deprecated, `chain` is used instead.
Main changes are:
1. Uses point32 instead of pubkey32.
2. Uses issuer instead of vendor.
3. Uses byte instead of u8.
4. blinded_path num_hops is now a byte, not u16 (we don't use that yet!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: bolt12: `vendor` is deprecated: the field is now called `issuer`.
The latest ones use lno, not lni (this unit tests loads from
../lightning-rfc, silently exiting if it doesn't have the test
vector).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
By popular merge-hell demand.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Build: Python is now required to build, as generated files are no longer checked into the repository.