This cause of cascading failure was pointed out by @t-bast: if fees spike and
you don't time out an outgoing onchain HTLC, you should nonetheless fail the incoming HTLC
because otherwise the incoming peer will close on you.
Of course, there's a risk of losing funds, but that only happens if you weren't going to get the HTLC spend confirmed in time anyway. It also catches any other reason the downstream onchain resolution goes wrong, containing the damage.
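A hedged sketch of the policy described above (names and the "long enough" margin are illustrative, not lightningd's actual code):
```python
# Illustrative only: fail the incoming HTLC once the outgoing HTLC has been
# stuck onchain well past its expiry, since otherwise the incoming peer will
# close on us, and we only lose funds if we weren't going to get our spend
# confirmed in time anyway.

STUCK_BLOCKS = 3  # assumed margin past the outgoing HTLC's expiry


def should_fail_incoming_early(outgoing_resolved_onchain: bool,
                               outgoing_cltv_expiry: int,
                               current_blockheight: int,
                               stuck_blocks: int = STUCK_BLOCKS) -> bool:
    if outgoing_resolved_onchain:
        return False
    return current_blockheight >= outgoing_cltv_expiry + stuck_blocks
```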
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: @t-bast
Changelog-Fixed: Protocol: We will close incoming HTLCs early if the outgoing HTLC is stuck onchain long enough, to avoid cascading failure.
The test actually triggers this:
1. We don't get our commitment tx mined at all (we block it).
2. By the time the peer does, the HTLC is expired.
3. We have the preimage but we don't even try, since it's expired.
We should at least *try* to collect the HTLC in this case.
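A minimal sketch of the change in behaviour (illustrative, not onchaind's real logic):
```python
# Holding the preimage, we no longer give up just because the HTLC looks
# expired by the time the peer's commitment tx finally confirms.

def attempt_htlc_success(have_preimage: bool,
                         htlc_cltv_expiry: int,
                         current_blockheight: int) -> bool:
    if not have_preimage:
        return False
    if current_blockheight >= htlc_cltv_expiry:
        # Previously we returned False here; now we still try, since the
        # worst case is losing the race to the peer's timeout spend.
        pass
    return True
```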
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Caught by leak detection, we just re-assigned this when we retried: sure,
it's temporary, but it's technically a leak.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I shut down bitcoind during a test, and bcli leak reports flooded in.
They're all temporary, but this fixes them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This update incorporates the proposed version of lnprototest from
the patch [1], which includes the following fixes:
- Corrects the `ExpectError` event and updates the BOLT 7 tests to expect a
warning instead of an error.
- Implements a new test for when the runner sends a bad signature
within the announcement_signatures message.
[1] https://github.com/rustyrussell/lnprototest/pull/100
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
This commit improves the resilience of Core Lightning when receiving
announcement_signatures messages.
According to BOLT 7 (The announcement_signatures Message),
if the node_signature OR the bitcoin_signature is NOT correct,
it is recommended to either send a warning and close the connection or send an error and fail the channel.
In this commit, we take the strict approach: if either signature is
invalid, we send an error and fail the channel.
This is because announcing the channel is optional, so if a peer sends
announcement_signatures at all, we assume it must be correct.
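A hedged sketch of that policy (not channeld's actual code; `verify_sig` stands in for the real secp256k1 check over the channel_announcement hash):
```python
from typing import Callable


def check_announcement_signatures(verify_sig: Callable[[bytes, bytes, bytes], bool],
                                  ann_hash: bytes,
                                  node_signature: bytes, node_pubkey: bytes,
                                  bitcoin_signature: bytes, bitcoin_pubkey: bytes) -> None:
    """BOLT 7 allows warn-and-close or error-and-fail; we take the strict
    option and fail the channel on any bad signature."""
    if not verify_sig(ann_hash, node_signature, node_pubkey):
        raise ValueError("Failed to verify node_signature.")
    if not verify_sig(ann_hash, bitcoin_signature, bitcoin_pubkey):
        raise ValueError("Failed to verify bitcoin_signature.")
```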
lnprototest at commit dea47c29b5541dbfe7fe53cc2598330e897fa4f4 now reports
the following error:
```
2023-07-06T21:03:20.930Z DEBUG hsmd: Shutting down
ERROR root:helpers.py:170 Traceback (most recent call last):
File "/home/vincent/Github/lightning/external/lnprototest/tests/helpers.py", line 167, in run_runner
runner.run(test)
File "/home/vincent/Github/lightning/external/lnprototest/lnprototest/runner.py", line 99, in run
all_done = sequence.action(self)
^^^^^^^^^^^^^^^^^^^^^
File "/home/vincent/Github/lightning/external/lnprototest/lnprototest/structure.py", line 55, in action
all_done &= e.action(runner)
^^^^^^^^^^^^^^^^
File "/home/vincent/Github/lightning/external/lnprototest/lnprototest/event.py", line 365, in action
raise EventError(self, "{}: message was {}".format(err, msg.to_str()))
lnprototest.errors.EventError: `Expected msgtype-warning, got msgtype-error: message was error channel_id=a37362839b13f61cfe82d35bd397b1264c389b245847cfb6111b38892546dc77 data=4661696c656420746f20766572696679206e6f64655f7369676e61747572652e` on event [{"event": "ExpectMsg", "file": "test_bolt2-01-close_channel.py", "pos": "157"},]
============================================================================================================================================================== short test summary info ===============================================================================================================================================================
FAILED tests/test_bolt2-01-close_channel.py::test_close_channel_shutdown_msg_normal_case_receiver_side - AssertionError: `Expected msgtype-shutdown, got msgtype-error: message was error channel_id=a37362839b13f61cfe82d35bd397b1264c389b245847cfb6111b38892546dc77 data=4661696c656420746f20766572696679206e6f64655f7369676e61747572652e` on event [{"event": "ExpectMsg", "file": "test_bolt2-01-close_channel.py", "pos": "75"},]
FAILED tests/test_bolt2-01-close_channel.py::test_close_channel_shutdown_msg_wrong_script_pubkey_receiver_side - AssertionError: `Expected msgtype-warning, got msgtype-error: message was error channel_id=a37362839b13f61cfe82d35bd397b1264c389b245847cfb6111b38892546dc77 data=4661696c656420746f20766572696679206e6f64655f7369676e61747572652e` on event [{"event": "ExpectMsg", "file": "test_bolt2-01-close_channel.py", "pos": "157"},]
```
Changelog-Fixed: channeld: Verify the signature sent in announcement_signatures by the counterparty
Reported-by: lnprototest
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
This is the simplest solution, not the best, but there's significant risk in trying to remove the "we have a path" assumption in the pay code.
Includes removing a `tal_steal` which was incorrect: the buffer has the same lifetime as the plugin, so if we steal it then things get messy when we free the struct payment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Plugins: `pay` will now pay your own invoices if you try.
Previously, the "payment" and "invoice" paths were completely separate, but this now calls both. It bypasses htlc_sets (and thus, cannot do MPP), and bypasses the hook too: the former is tied closely to HTLCs, and the hook is also very htlc-centric.
Includes finishing unfinished sentence in sendpay man page, as a bonus.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Plugins: `sendpay` now allows self-payment of invoices, by specifying an empty route.
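A rough usage sketch, assuming a pyln-client LightningRpc handle `rpc`; argument names follow the usual invoice/sendpay RPC fields, but treat the details as illustrative rather than a reference:
```python
def pay_own_invoice(rpc, amount_msat: int = 1000):
    inv = rpc.invoice(amount_msat, "self-pay-example", "paying myself")
    # An empty route now means "this invoice is mine: settle it locally".
    rpc.sendpay(route=[], payment_hash=inv["payment_hash"],
                amount_msat=amount_msat,
                payment_secret=inv["payment_secret"])
    return rpc.waitsendpay(inv["payment_hash"])
```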
Clean these up: they were debug logs, but we want to pass this information
back for self-payments.
Also fixes "Attept" typo which altered tests!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For non-v0 witness programs we weren't stripping the data push byte
before writing into the fallback address.
According to BIP141, all witness programs are data pushes (of up to 40 bytes),
so trimming the data-push byte should be kosher.
From BIP141:
A scriptPubKey (or redeemScript as defined in BIP16/P2SH) that
consists of a 1-byte push opcode (for 0 to 16) followed by a
data push between 2 and 40 bytes gets a new special meaning.
The value of the first push is called the "version byte". The
following byte vector pushed is called the "witness program".
Changelog-Fixed: Adding a >0 version witness program to a fallback address now is *just* the witness program, as per bolt11 spec
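A small sketch of the rule quoted above (illustrative, not the actual bolt11 fallback encoder): take the version from the first opcode, then drop the length byte of the following data push so only the witness program itself lands in the fallback field.
```python
OP_1 = 0x51


def witness_script_to_fallback(script: bytes) -> tuple[int, bytes]:
    """Parse a segwit scriptPubKey into (version, witness program)."""
    if len(script) < 4 or len(script) > 42:
        raise ValueError("not a witness program")
    version_op, push_len = script[0], script[1]
    if version_op == 0x00:
        version = 0
    elif OP_1 <= version_op <= 0x60:          # OP_1 .. OP_16
        version = version_op - OP_1 + 1
    else:
        raise ValueError("bad version byte")
    if push_len != len(script) - 2 or not (2 <= push_len <= 40):
        raise ValueError("bad witness program push")
    # Strip the push-length byte: the fallback carries version + program only.
    return version, script[2:]


assert witness_script_to_fallback(bytes([0x51, 0x02, 0xaa, 0xbb])) == (1, b"\xaa\xbb")
```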
Pointed out by @ShahanaFarooqui, we leave a single unused entry in the datastore,
so we should clean that up too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If they have invalid runes, we bail, but if they have runes which used
a different master secret (old commando.py allowed you to override
secret), we just complain and delete them.
Note that this requires more mocks in wallet/test/run-db.c...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The wallet_datastore_first() SELECT statement only iterates from the
given key (if any), relying on the caller to notice when the key no
longer applies. (e.g. startkey = ["foo", "bar"] will return key
["foo", "bar"] then ["foo", "bar", "child" ], then ["foo", "baz"]).
The only caller (listdatastore) would notice the keychange and stop
looping, but reallly wallet_datastore_next() should do this. When I
tried to use it for migrations, I got very confused!
Also, several places want a simple "wallet_datastore_get()" function,
so provide that.
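An illustrative version of the "does this key still belong to the requested subtree?" check that wallet_datastore_next() should perform itself (the real code is C in the wallet layer):
```python
def key_has_prefix(key: list[str], startkey: list[str]) -> bool:
    return key[:len(startkey)] == startkey


# With startkey = ["foo", "bar"]:
assert key_has_prefix(["foo", "bar"], ["foo", "bar"])
assert key_has_prefix(["foo", "bar", "child"], ["foo", "bar"])
assert not key_has_prefix(["foo", "baz"], ["foo", "bar"])  # iteration stops here
```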
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During migrations, wallet doesn't exist yet, so we use raw db. Split
functions into lower-level ones and make the public API a simple wrapper.
Unfortunately, this means db_datastore_next needs to precede db_datastore_first
since they're now static (and first calls next); plus, I fixed some weird indents,
so the diff is bigger than expected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to access this in db migrations, which happen very early, but
runes_init needs the db, creating a circular dependency which must be
split.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means (temporarily) that blacklisting won't work (fix later), and
means that old-style (commando.py) master-secret-override doesn't work.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Plugins: `commando` no longer allows datastore ['commando', 'secret'] to override master secret (re-issue runes if you were using that!).
In preparation for going async:
1. Split try_command's tail into a new function called execute_command() after
the rune checks have succeeded.
2. Put all the info execute_command() needs into struct cond_info, to make it
a simple callback style.
So we create new_cond_info() which dynamically allocates `struct cond_info`
and sets the destructor.
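Roughly the shape of the change, sketched in Python (the actual commando plugin does this in C with a tal-allocated `struct cond_info` and a destructor):
```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CondInfo:
    peer: str                      # who sent the command
    method: str                    # RPC method being requested
    params: dict                   # parameters for that method
    reply: Callable[[Any], None]   # how to send the result back


def execute_command(info: CondInfo) -> None:
    """Invoked only after the rune checks succeed, possibly later
    (asynchronously), since everything it needs lives in `info`."""
    info.reply({"method": info.method, "params": info.params})
```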
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would create a `struct commando` to marshal our incoming messages,
then try_command would create a *new* one. We can simply reuse it, but
when I did, I noticed a trick: the new one was not in the `incomings`
array, so it didn't count towards the ratelimit. So we need to remove it
from `incomings` in `try_command`, but at least it's now explicit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to activate on the first rune creation, but we're no longer in charge
of runes, so we can't make that call.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And do them on the first run (where we check parameters), instead
of every time. Might as well do them in non-developer mode too,
since they simply check programmer correctness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The uid isn't enough: it could be someone else's rune. This is tested
in the command rune list tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make it a bit easier to see what's going on; use common
variables for the paths etc.
Also make the final `rm` step recursive, same as the above find step
which patches the files.
Files are in `contrib/pyln-grpc-proto/pyln/grpc`, but the find command
above acts one directory level up.
At cold start, if your node is behind the blocktip and you've sent your
peer a blockheight counter from the future, we shouldn't confuse ourselves
with our rollback/replay.
Should fix flakes in CI that were spotting BROKEN blockheight updates.
Logs below are from a previous CI failure (edited for relative clarity).
The line that says `{ SENT_ADD_ACK_REVOCATION:111 }, our current 108` is
the tell; the last line is the node finally catching up to the tip.
In the test we get into this state by stopping and restarting the node.
```
2023-07-22T11:24:28.2754533Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#2: Already have funding locked in
2023-07-22T11:24:28.2755486Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#2: attempting update blockheight a5b23dff5177badd6df725c
efeb83ceccbfc52dc64a16b38894a41f0ad8fa181
2023-07-22T11:24:28.2755778Z lightningd-1 2023-07-22T11:19:34.188Z DEBUG lightningd: update_blockheight: height = 108
2023-07-22T11:24:28.2766210Z lightningd-1 2023-07-22T11:19:34.210Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: init LOCAL: remote_per_commit = 029563e7c898
5d8b95bdfe19e47e494bb8ec8d53ff4edb93f156be57667bfee8c9, old_remote_per_commit = 02bf3117c149d324361f0b418db8984b1e29af70c773eb2865a41ff7f583c7c9ed next_idx_local = 3 next_idx_remote = 3 revocations_recei
ved = 2 feerates { SENT_ADD_ACK_REVOCATION:3750 } range 253-150000 blockheights { SENT_ADD_ACK_REVOCATION:111 }, our current 108
2023-07-22T11:24:28.2768866Z lightningd-1 2023-07-22T11:19:34.211Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: peer_out WIRE_CHANNEL_REESTABLISH
2023-07-22T11:24:28.2769416Z lightningd-1 2023-07-22T11:19:34.211Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: billboard: Sent reestablish, waiting for the
irs
2023-07-22T11:24:28.2771115Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: peer_in WIRE_CHANNEL_REESTABLISH
2023-07-22T11:24:28.2774150Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Got reestablish commit=3 revoke=2
2023-07-22T11:24:28.2776056Z lightningd-1 2023-07-22T11:19:34.212Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: next_revocation_number = 2
2023-07-22T11:24:28.2805639Z lightningd-1 2023-07-22T11:19:34.239Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2823960Z lightningd-1 2023-07-22T11:19:34.240Z DEBUG lightningd: Adding block 109: 5f67b6e110eb3c3457bea4fcf0d04ce9be90efeee5df8e083ed4266074ca911f
2023-07-22T11:24:28.2833154Z lightningd-1 2023-07-22T11:19:34.251Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2833630Z lightningd-1 2023-07-22T11:19:34.252Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Trying commit
2023-07-22T11:24:28.2834165Z lightningd-1 2023-07-22T11:19:34.252Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: current blockheight 108 less than last 111
2023-07-22T11:24:28.2835070Z lightningd-1 2023-07-22T11:19:34.252Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#2: Can't send commit: nothing to send, feechange not wanted ({ SENT_ADD_ACK_REVOCATION:3750 }) blockheight not wanted ({ SENT_ADD_ACK_REVOCATION:111 })
2023-07-22T11:24:28.2835516Z lightningd-1 2023-07-22T11:19:34.350Z DEBUG lightningd: Adding block 110: 5f43f3ac9d808e3a309720d1b0727a00d5a3d3ddca71d97401e233637e87639c
2023-07-22T11:24:28.2835962Z lightningd-1 2023-07-22T11:19:34.476Z DEBUG lightningd: Adding block 111: 55b0d1e0a08ff6233e186e6735cb1cbec33e2b0a6e7d08f2622e8c1db30b54b9
```
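A hedged sketch of the check this implies (illustrative names, not channeld's code): a committed blockheight ahead of our current tip right after a restart just means we're still catching up, so don't treat it as broken state or propose an update until we've passed it.
```python
# `last_committed_height` is the blockheight counter already committed with
# the peer (111 in the log above); `current_height` is our own tip after
# restart (108).

def blockheight_update_wanted(last_committed_height: int,
                              current_height: int) -> bool:
    if current_height < last_committed_height:
        # Behind the tip (e.g. just restarted): not an error, just wait
        # until we catch up instead of complaining about a "rollback".
        return False
    return current_height > last_committed_height
```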
This looked like a test flake, but was real:
```
l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent")
# We won't gossip the dead channel any more (but we still propagate node_announcement). But connectd is not explicitly synced, so wait for "a bit".
time.sleep(1)
> assert len(get_gossip(l1)) == 2
E assert 4 == 2
```
We can see that two channel_updates come in *after* we mark it dying:
```
gossipd: channel 103x1x0 closing soon due to the funding outpoint being spent
gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/0 now DISABLED
022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/1 now DISABLED
```
We should mark these late channel_updates the same way.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>