Closes: #4860
ChangeLog-Added: With `sqlite3` db backend we now use a 60-second busy timer, to allow backup processes like `litestream` to operate safely.
```
[gw1] [ 98%] PASSED tests/test_wallet.py::test_hsmtool_dump_descriptors
tests/test_wallet.py::test_fundchannel_listtransaction
[gw0] [ 98%] PASSED tests/test_plugin.py::test_channel_opened_notification
tests/test_wallet.py::test_hsmtool_generatehsm
[gw0] [ 98%] PASSED tests/test_wallet.py::test_hsmtool_generatehsm
tests/test_wallet.py::test_withdraw_nlocktime_fuzz
[gw1] [ 98%] ERROR tests/test_wallet.py::test_fundchannel_listtransaction
tests/test_wallet.py::test_fundchannel_listtransaction
tests/test_wallet.py::test_withdraw_nlocktime_fuzz
tests/test_wallet.py::test_fundchannel_listtransaction
[gw0] [ 99%] ERROR tests/test_wallet.py::test_withdraw_nlocktime_fuzz
tests/test_wallet.py::test_multiwithdraw_simple
[gw1] [ 99%] ERROR tests/test_wallet.py::test_fundchannel_listtransaction
tests/test_wallet.py::test_withdraw_nlocktime
tests/test_wallet.py::test_multiwithdraw_simple
tests/test_wallet.py::test_withdraw_nlocktime
tests/test_wallet.py::test_multiwithdraw_simple
tests/test_wallet.py::test_withdraw_nlocktime
[gw0] [ 99%] ERROR tests/test_wallet.py::test_multiwithdraw_simple
tests/test_wallet.py::test_repro_4258
[gw1] [ 99%] ERROR tests/test_wallet.py::test_withdraw_nlocktime
...
2021-10-12 06:36:09.203 UTC [224552] STATEMENT: SELECT version FROM version LIMIT 1
2021-10-12 06:36:09.566 UTC [224523] PANIC: could not write to file "pg_wal/xlogtemp.224523": No space left on device
2021-10-12 06:36:09.566 UTC [224523] STATEMENT: VACUUM FULL;
Error vacuuming db: BEGIN command failed: PANIC: could not write to file "pg_wal/xlogtemp.224523": No space left on device
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
```
This makes init a two-stage, and causes some code hoisting.
And we can now send all the HTLCs in a single message, since we have
an 128MB limit and each HTLC is 37 bytes.
This breaks the onchaind stresstest, which uses canned internal messages.
It's time to finally delete that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is particularly useful after our recent field deletion:
before: 362,573,824 bytes
after: 124,190,720 bytes
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: db: removal of old HTLC information and vacuuming shrinks large lightningd.sqlite3 by a factor of 2-3.
And initialize max to current height max when htlcs are already dead.
Turns out (thanks CI!) that MAX() of multiple columns is GREATEST() in
Postgres. That's clearer (MAX is used elsewhere for single columns),
so translate on the sqlite3 side.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
sendonionmessage can fail when sending a reply, either because
the reply had a bad first peer, or because it went offline. The
latter happens in CI, which is how I found this.
Also fixed typo "onio" -> "onion".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes#4482Fixes#4481
Changelog-Added: pay: Payment attempts are now grouped by the pay command that initiated them
Changelog-Fixed: pay: `listpays` returns payments orderd by their creation date
Changelog-Fixed: pay: `listpays` no longer groups attempts from multiple attempts to pay an invoice
This re-establishes the prior behavior where a `sendpay` or
`sendonion` that'd match a prior payment would cause the prior payment
to be deleted. While we no longer delete prior attempts we now avoid a
primary key collision by incrementing once. This helps us not having
to touch all existing tests, and likely avoids breaking other users
too.
One of the fundamental constraints of the payment groups idea is that
there may only ever be one group in flight at any point in time, so if
we find a group that is in flight, any new `sendpay` or `sendonion`
must match its `groupid`.
This was the main cause of the pay states flip-flopping, since we
reset the status on each attempt any final status is not really
final. Let's keep them around, and provide a stable history.
So far we've always been deferring the deletion, retry and early abort
logic to `sendonion` and `sendpay` which do not have the context to
decide if a call is legitimate or not (they were mostly based on
heuristics). By calling `listsendpays` for the invoice's
`payment_hash` we can identify what our `groupid` should be, but more
importantly we can also abort if another payment is pending or a prior
attempt has already succeeded.
When doing things like `waitsendpay` without specifying the `groupid`
we likely want to use the latest `groupid` we created, since that's
the one in flight. This adds a function to quickly retrieve that.
channeld may have sent a ping before shutdown, now closingd sees it.
```
lightningd-2: 2021-10-11T20:55:18.892Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-closingd-chan#1: peer_out WIRE_WARNING
lightningd-2: 2021-10-11T20:55:18.949Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-closingd-chan#1: billboard perm: Expected closing_signed: 0013000166
lightningd-2: 2021-10-11T20:55:18.983Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#1: Peer transient failure in CLOSINGD_SIGEXCHANGE: closingd WARNING: Expected closing_signed: 0013000166
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were really strict with the version check, requiring an exact
match. We now check that we are running at least with the version we
were compiled with (distro upgrades continue to work, and repro builds
are built off of an unupdated installation matching this minimum
requirement), and a major version match (since major versions can and
will introduce breaking changes).
Changelog-Fixed: sqlite3: Relaxed the version match requirements to be at least a minimum version and a major version match
If it reconnects by itself, it will get a warning message:
```
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard: Sent reestablish, waiting for theirs
lightningd-2: 2021-10-08T01:40:42.446Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: peer_in WIRE_ERROR
lightningd-2: 2021-10-08T01:40:42.447Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#3: billboard perm: Received error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
lightningd-2: 2021-10-08T01:40:42.448Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#3: Peer transient failure in CHANNELD_NORMAL: channeld WARNING: error channel 0a6220a3e904d17e72b5c3499928dc8a65720063c6395c999a129a0ff0b06afb: Forcibly closed by `close` command timeout
```
And this will make CI complain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>