We have a primary key that is spanning the `in_channel_id` and the
`in_htcl_id`. The latter gets set to NULL when the HTLC and channel
gets deleted, so we coalesce with a random large number that is
unlikely to collide for the primary key.
We have to allow them (as otherwise `fees_collected_msat` in getinfo breaks),
but it means that actually, in_htlc_id might be missing in listforwards
(also, out_htlc_id might be missing, which we didn't catch before).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: #5628
This adds a new "chan_dying" message to the gossip_store, but since we
already changed the minor version in this PR, we don't bump it again.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: We now delay forgetting funding-spent channels for 12 blocks (as per latest BOLTs, to support splicing in future).
Normally, we'd use the delete_columns function to remove the old
`short_channel_id` string field, *but* we can't do that for sqlite, as
there are other tables with references to it. So add a FIXME to do
it once everyone has upgraded to an sqlite3 which has native support
for column deletion.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a good sanity check that users understand that if they upgrade
to master mid-cycle they can't go back!
Suggested-by: @wtogami
Changelog-Added: Config: `--database-upgrade=true` required if a non-release version wants to (irrevocably!) upgrade the db.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some tests need to inspect it, but most don't, and I suspect I'm missing some
error messages due to this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If you get the wrong hsm_secret, your node_id will change, and
peers won't know who you are, bitcoind will reject your transaction
signatures, and other madness.
Catch this as soon as it happens, by storing our node_id in the db.
Suggested-by: @cdecker, @fiatjaf
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Config: `lightningd` will refuse to start with the wrong node_id (i.e. hsm_secret changes).
Connectd already does this when we *receive* an error or warning, but
now do it on send. This causes some slight behavior change: we don't
disconnect when we close a channel, for example (our behaviour here
has been inconsistent across versions, depending on the code).
When connectd is told to disconnect, it now does so immediately, and
doesn't wait for subds to drain etc. That simplifies the manual
disconnect case, which now cleans up as it would from any other
disconnection when connectd says it's disconnected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ChangeLog-Added: With the `sqlite3://` scheme for `--wallet` option, you can now specify a second file path for real-time database backup by separating it from the main file path with a `:` character.
v2 channel open uses a different method to derive the channel_id, so now
we save it to the database so that we dont have to remember how to
derive it for each.
includes a migration for existing channels
the way we use PSBTs to sign things requires that we have the
scriptpubkey available on the utxo so we can populate the witness-utxo
field with it.
this causes problems if we don't already have the scriptpubkey cached in
the database, as in *some* cases we require a round trip to the HSM to
populate them
to get over this hump, we backfill any and all missing scriptpubkey
information for the utxo's that we hold in our wallet.
this will allow us to clean up the NULL handling of missing
scriptpubkeys.
We erase peer data after the last channel close transaction for that
peer is 100 blocks deep. We were failing to finish the migration because
the peer_id lookup on these was failing.
Now we ignore any channel with a null peer_id.
Fixes#3768
We use a database snapshot with 3 channels -- two of which have HTLCs
dangling and one is an initial open channel tx in the 'old' tx hex
format in last_tx and confirm that they are successfully updated to PSBT
format on start.
Valgrind doesn't really like crashes if compiled without DEVELOPER since that
seems to compile out the debug symbols, resulting in the following error:
```
Optimistic lock on the database failed. There may be a concurrent access to the database. Aborting since concurrent access is unsafe.
lightningd: FATAL SIGNAL 6 (version 0.0.99)
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd: FATAL SIGNAL 11 (version 0.0.99)
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
lightningd/lightningd: libbacktrace: no debug info in ELF executable
2020-01-07 15:26:03.539 EST [11583] LOG: unexpected EOF on client connection with an open transaction
--------------------------- Captured stdout teardown ---------------------------
DEBUG:root:Calling stop with payload None
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.11409
==11409== Jump to the invalid address stated on the next line
==11409== at 0x0: ???
==11409== by 0x1C00A8: backtrace_full (backtrace.c:127)
==11409== by 0x147B0A: send_backtrace (daemon.c:46)
==11409== by 0x147B55: crashdump (daemon.c:54)
==11409== by 0x6071F1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
==11409== by 0x6071E96: __libc_signal_restore_set (nptl-signals.h:80)
==11409== by 0x6071E96: raise (raise.c:48)
==11409== by 0x6073800: abort (abort.c:79)
==11409== by 0x12B2FF: fatal (log.c:819)
==11409== by 0x16FA3B: db_data_version_incr (db.c:826)
==11409== by 0x16FA9E: db_commit_transaction (db.c:841)
==11409== by 0x124D20: io_loop_with_timers (io_loop_with_timers.c:34)
==11409== by 0x129260: main (lightningd.c:860)
==11409== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==11409==
==11409==
==11409== Process terminating with default action of signal 11 (SIGSEGV)
==11409== Bad permissions for mapped region at address 0x0
==11409== at 0x0: ???
==11409== by 0x1C00A8: backtrace_full (backtrace.c:127)
==11409== by 0x147B0A: send_backtrace (daemon.c:46)
==11409== by 0x147B55: crashdump (daemon.c:54)
==11409== by 0x6071F1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
--------------------------------------------------------------------------------
```
The optimistic lock prevents multiple instances of c-lightning making
concurrent modifications to the database. That would be unsafe as it messes up
the state in the DB. The optimistic lock is implemented by checking whether a
gated update on the previous value of the `data_version` actually results in
an update. If that's not the case the DB has been changed under our feet.
The lock provides linearizability of DB modifications: if a database is
changed under the feet of a running process that process will `abort()`, which
from a global point of view is as if it had crashed right after the last
successful commit. Any process that also changed the DB must've started
between the last successful commit and the unsuccessful one since otherwise
its counters would not have matched (which would also have aborted that
transaction). So this reduces all the possible timelines to an equivalent
where the first process died, and the second process recovered from the DB.
This is not that interesting for `sqlite3` where we are also protected via the
PID file, but when running on multiple hosts against the same DB, e.g., with
`postgres`, this protection becomes important.
Changelog-Added: DB: Optimistic logging prevents instances from running concurrently against the same database, providing linear consistency to changes.
This leads to all sorts of problems; in particular it's incredibly
slow (days, weeks!) if bitcoind is a long way back. This also changes
the behaviour of a rescan argument referring to a future block: we will
also refuse to start in that case, which I think is the correct behavior.
We already ignore bitcoind if it goes backwards while we're running.
Also cover a false positive memleak.
Changelog-Fixed: If bitcoind goes backwards (e.g. reindex) refuse to start (unless forced with --rescan).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes the l3 seeker asks for scids, and the reply contains the
channel which is then closed by the time it checks, so it considers
the updates bad gossip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was the initial issue that was addressed by #2756 and now we just test
that all is working as expected.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is just the test that we use to verify block backfilling below the wallet
birth height is working correctly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>