wallet/db_sqlite3.c: Support direct replication of SQLITE3 backends.

ChangeLog-Added: With the `sqlite3://` scheme for the `--wallet` option, you can now specify a second file path for real-time database backup by separating it from the main file path with a `:` character.
ZmnSCPxj jxPCSnmZ 2021-10-27 17:49:26 +08:00 committed by Rusty Russell
parent 6c34e522dd
commit a294683675
4 changed files with 324 additions and 18 deletions


@@ -82,6 +82,95 @@ any in-channel funds.
To recover in-channel funds, you need to use one or more of the other
backup strategies below.
## SQLITE3 `--wallet=${main}:${backup}` And Remote NFS Mount
`/!\` WHO SHOULD DO THIS: Casual users.
`/!\` **CAUTION** `/!\` This technique is only supported on 0.10.3
or later.
On earlier versions, the `:` character is not special and will be
considered part of the path of the database file.
When using the SQLITE3 backend (the default), you can specify a
second database file to replicate to, by separating the second
file with a single `:` character in the `--wallet` option, after
the main database filename.
For example, if the user running `lightningd` is named `user`, and
you are on the Bitcoin mainnet with the default `${LIGHTNINGDIR}`, you
can specify in your `config` file:
wallet=sqlite3:///home/user/.lightning/bitcoin/lightningd.sqlite3:/my/backup/lightningd.sqlite3
Or via command line:
lightningd --wallet=sqlite3:///home/user/.lightning/bitcoin/lightningd.sqlite3:/my/backup/lightningd.sqlite3
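The `:` split described above can be sketched in a few lines of Python. This is only an illustration, not the code `lightningd` runs: the function name `split_wallet_dsn` is hypothetical, and the sketch assumes (as the `strchr()` call in this commit's `db_sqlite3_setup()` suggests) that the first `:` after the `sqlite3://` scheme is the separator.

```python
from typing import Optional, Tuple

SCHEME = "sqlite3://"


def split_wallet_dsn(dsn: str) -> Tuple[str, Optional[str]]:
    """Split a sqlite3:// DSN into (main_path, backup_path or None).

    Hypothetical helper for illustration only.
    """
    if not dsn.startswith(SCHEME):
        raise ValueError(f"Could not parse the wallet DSN: {dsn}")
    rest = dsn[len(SCHEME):]
    # Assumption: the first ':' after the scheme separates main from backup.
    main, sep, backup = rest.partition(":")
    return main, (backup if sep else None)
```

Under that assumption, a main database path that itself contains a `:` cannot be expressed, since everything after the first `:` would be read as the backup path.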
If the second database file does not exist but the directory that would
contain it does exist, the file is created.
If the directory of the second database file does not exist, `lightningd` will
fail at startup.
If the second database file already exists, on startup it will be overwritten
with the main database, unless its stored `data_version` already matches that
of the main database.
During operation, all database updates will be done on both databases.
The main and backup files will **not** be identical at every byte, but they
will still contain the same data.
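The startup copy and the write mirroring described above can be sketched with Python's standard `sqlite3` module. This is a rough model under stated assumptions, not the C implementation: the helper names `data_version`, `sync_backup` and `write_both` are hypothetical, the `vars` table query is borrowed from this commit's `have_same_data_version()`, and the sketch passes bound parameters to both connections where the C code replays the expanded SQL text.

```python
import sqlite3

QUERY = "SELECT intval FROM vars WHERE name = 'data_version'"


def data_version(conn):
    """Return the stored data_version, or None if it cannot be read."""
    try:
        rows = conn.execute(QUERY).fetchall()
    except sqlite3.OperationalError:
        # For example, the `vars` table does not exist yet.
        return None
    return rows[0][0] if rows else None


def sync_backup(main, backup):
    """Copy main over backup unless both report the same data_version.

    Returns True if a copy was made.
    """
    a, b = data_version(main), data_version(backup)
    if a is not None and a == b:
        return False
    # Connection.backup() copies *from* `main` *into* `backup`.
    main.backup(backup)
    return True


def write_both(main, backup, sql, params=()):
    """Apply one write statement to the main database, then to the replica."""
    main.execute(sql, params)
    backup.execute(sql, params)
```

Both connections are expected to be opened with `isolation_level=None` so that no implicit transaction is left open when the copy runs.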
It is recommended that you use **the same filename** for both files, just in
different directories.
Compared to the `backup` plugin below, this technique has the advantage of
requiring exactly the same amount of space on both the main and backup
storage, whereas the `backup` plugin will take more space on the backup than
on the main storage.
It has the disadvantage that it will only work with the SQLITE3 backend and
is not supported by the PostgreSQL backend, and is unlikely to be supported
on any future database backends.
You can only specify *one* replica.
It is recommended that you use a network-mounted filesystem for the backup
destination, for example a NAS that you can access remotely.
At the minimum, set the backup to a different storage device.
This is no better than just using RAID-1 (and the RAID-1 will probably be
faster) but this is easier to set up --- just plug in a commodity USB
flash disk (with metal casing, since a lot of writes are done and you need
to dissipate the heat quickly) and use it as the backup location, without
repartitioning your OS disk, for example.
Do note that files are not stored encrypted, so you should really not do
this with rented space ("cloud storage").
To recover, simply get **all** the backup database files.
Note that SQLITE3 will sometimes create a `-journal` or `-wal` file, which
is necessary to ensure correct recovery of the backup; you need to copy
those too, with corresponding renames if you use a different filename for
the backup database.
For example, if you named the backup `backup.sqlite3` and, when you recover,
you find `backup.sqlite3` and `backup.sqlite3-journal` files, rename
`backup.sqlite3` to `lightningd.sqlite3` and `backup.sqlite3-journal` to
`lightningd.sqlite3-journal`.
The `-journal` or `-wal` file may or may not exist, but if it *does*, you
*must* recover it as well.
In WAL mode there can also be a `-shm` file, but it is unnecessary: SQLITE3
uses it only as a hack for portable shared memory, it contains no useful
data, and SQLITE3 will always ignore its contents.
It is recommended that you use **the same filename** for both main and
backup databases (just in different directories), and put the backup in
its own directory, so that you can just recover all the files in that
directory without worrying about missing any needed files or renaming
them correctly.
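When the backup does use a different filename, the copy-and-rename step can be sketched as below. The helper name `recover_backup` is hypothetical, and the sketch assumes the destination directory holds no leftover `-journal` or `-wal` file from an older database. It copies the database file plus whichever of the `-journal` and `-wal` files exist, and deliberately skips `-shm`.

```python
import shutil
from pathlib import Path

# Side files that must travel with the database if they exist.
SIDE_SUFFIXES = ("-journal", "-wal")


def recover_backup(backup_db, main_db):
    """Copy backup_db (and its side files) to main_db, renaming as needed.

    Hypothetical helper; returns the names of the files written.
    """
    copied = []
    for suffix in ("",) + SIDE_SUFFIXES:
        src = Path(str(backup_db) + suffix)
        if not src.exists():
            if suffix == "":
                # The database file itself is mandatory.
                raise FileNotFoundError(src)
            continue
        dst = Path(str(main_db) + suffix)
        shutil.copyfile(src, dst)
        copied.append(dst.name)
    return copied
```

For instance, `recover_backup("/my/backup/backup.sqlite3", "/home/user/.lightning/bitcoin/lightningd.sqlite3")` would write `lightningd.sqlite3` and, if present in the backup, `lightningd.sqlite3-journal` or `lightningd.sqlite3-wal`. Run it only while `lightningd` is stopped.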
If your backup destination is a network-mounted filesystem that is in a
remote location, then even loss of all hardware in one location will allow
you to still recover your Lightning funds.
However, if instead you are just replicating the database on another
storage device in a single location, you remain vulnerable to disasters
like fire or computer confiscation.
## `backup` Plugin And Remote NFS Mount
`/!\` WHO SHOULD DO THIS: Casual users.


@@ -193,6 +193,13 @@ The default wallet corresponds to the following DSN:
--wallet=sqlite3://$HOME/.lightning/bitcoin/lightningd.sqlite3
```
For the `sqlite3` scheme, you can specify a single backup database file
by separating it with a `:` character, like so:
```
--wallet=sqlite3://$HOME/.lightning/bitcoin/lightningd.sqlite3:/backup/lightningd.sqlite3
```
The following is an example of a postgresql wallet DSN:
```


@@ -7,6 +7,7 @@ from utils import wait_for, sync_blockheight, COMPAT, VALGRIND, DEVELOPER, only_
import base64
import os
import pytest
import shutil
import time
import unittest
@@ -379,3 +380,36 @@ def test_local_basepoints_cache(bitcoind, node_factory):
    # after we verified.
    l1.restart()
    l2.restart()


@unittest.skipIf(os.getenv('TEST_DB_PROVIDER', 'sqlite3') != 'sqlite3', "Tests a feature unique to SQLITE3 backend")
def test_sqlite3_builtin_backup(bitcoind, node_factory):
    l1 = node_factory.get_node(start=False)

    # Figure out the path to the actual db.
    main_db_file = l1.db.path
    # Create a backup copy in the same location with the suffix .bak
    backup_db_file = main_db_file + ".bak"

    # Provide the --wallet option and start.
    l1.daemon.opts['wallet'] = "sqlite3://" + main_db_file + ':' + backup_db_file
    l1.start()

    # Get an address and put some funds.
    addr = l1.rpc.newaddr()['bech32']
    bitcoind.rpc.sendtoaddress(addr, 1)
    bitcoind.generate_block(1)
    wait_for(lambda: len(l1.rpc.listfunds()['outputs']) == 1)

    # Stop the node.
    l1.stop()

    # Copy the backup over the main db file.
    shutil.copyfile(backup_db_file, main_db_file)

    # Remove the --wallet option and start.
    del l1.daemon.opts['wallet']
    l1.start()

    # Should still see the funds.
    assert(len(l1.rpc.listfunds()['outputs']) == 1)


@@ -6,33 +6,136 @@
#if HAVE_SQLITE3
#include <sqlite3.h>
struct db_sqlite3 {
	/* The actual db connection. */
	sqlite3 *conn;
	/* A replica db connection, if requested, or NULL otherwise. */
	sqlite3 *backup_conn;
};

/**
 * @param conn: The db->conn void * pointer.
 *
 * @return the actual sqlite3 connection.
 */
static inline
sqlite3 *conn2sql(void *conn)
{
	struct db_sqlite3 *wrapper = (struct db_sqlite3 *) conn;
	return wrapper->conn;
}

static void replicate_statement(struct db_sqlite3 *wrapper,
				const char *qry)
{
	sqlite3_stmt *stmt;
	int err;

	if (!wrapper->backup_conn)
		return;

	sqlite3_prepare_v2(wrapper->backup_conn,
			   qry, -1, &stmt, NULL);
	err = sqlite3_step(stmt);
	sqlite3_finalize(stmt);

	if (err != SQLITE_DONE)
		db_fatal("Failed to replicate query: %s: %s: %s",
			 sqlite3_errstr(err),
			 sqlite3_errmsg(wrapper->backup_conn),
			 qry);
}

static void db_sqlite3_changes_add(struct db_sqlite3 *wrapper,
				   struct db_stmt *stmt,
				   const char *qry)
{
	replicate_statement(wrapper, qry);
	db_changes_add(stmt, qry);
}

/* Check if both sqlite3 databases have a data_version variable,
 * *and* are the same.
 */
static bool have_same_data_version(sqlite3 *a, sqlite3 *b)
{
	sqlite3_stmt *stmt;
	const char *qry = "SELECT intval FROM vars"
			  " WHERE name = 'data_version';";
	int err;
	u64 version_a;
	u64 version_b;

	sqlite3_prepare_v2(a, qry, -1, &stmt, NULL);
	err = sqlite3_step(stmt);
	if (err != SQLITE_ROW) {
		sqlite3_finalize(stmt);
		return false;
	}
	version_a = sqlite3_column_int64(stmt, 0);
	sqlite3_finalize(stmt);

	sqlite3_prepare_v2(b, qry, -1, &stmt, NULL);
	err = sqlite3_step(stmt);
	if (err != SQLITE_ROW) {
		sqlite3_finalize(stmt);
		return false;
	}
	version_b = sqlite3_column_int64(stmt, 0);
	sqlite3_finalize(stmt);

	return version_a == version_b;
}

#if !HAVE_SQLITE3_EXPANDED_SQL
/* Prior to sqlite3 v3.14, we have to use tracing to dump statements */
struct db_sqlite3_trace {
struct db_sqlite3 *wrapper;
struct db_stmt *stmt;
};
static void trace_sqlite3(void *stmtv, const char *stmt)
{
struct db_stmt *s = (struct db_stmt*)stmtv;
db_changes_add(s, stmt);
struct db_sqlite3_trace *trace = (struct db_sqlite3_trace *)stmtv;
struct db_sqlite3 *wrapper = trace->wrapper;
struct db_stmt *s = trace->stmt;
db_sqlite3_changes_add(wrapper, s, stmt);
}
#endif
static const char *db_sqlite3_fmt_error(struct db_stmt *stmt)
{
return tal_fmt(stmt, "%s: %s: %s", stmt->location, stmt->query->query,
sqlite3_errmsg(stmt->db->conn));
sqlite3_errmsg(conn2sql(stmt->db->conn)));
}
static bool db_sqlite3_setup(struct db *db)
{
char *filename;
char *sep;
char *backup_filename = NULL;
sqlite3_stmt *stmt;
sqlite3 *sql;
int err, flags = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE;
struct db_sqlite3 *wrapper;
if (!strstarts(db->filename, "sqlite3://") || strlen(db->filename) < 10)
db_fatal("Could not parse the wallet DSN: %s", db->filename);
/* Strip the scheme from the dsn. */
filename = db->filename + strlen("sqlite3://");
/* Look for a replica specification. */
sep = strchr(filename, ':');
if (sep) {
/* Split at ':'. */
filename = tal_strndup(db, filename, sep - filename);
backup_filename = tal_strdup(db, sep + 1);
}
wrapper = tal(db, struct db_sqlite3);
db->conn = wrapper;
err = sqlite3_open_v2(filename, &sql, flags, NULL);
@@ -40,7 +143,55 @@ static bool db_sqlite3_setup(struct db *db)
db_fatal("failed to open database %s: %s", filename,
sqlite3_errstr(err));
}
db->conn = sql;
wrapper->conn = sql;

	if (!backup_filename)
		wrapper->backup_conn = NULL;
	else {
		err = sqlite3_open_v2(backup_filename,
				      &wrapper->backup_conn,
				      flags, NULL);
		if (err != SQLITE_OK) {
			db_fatal("failed to open backup database %s: %s",
				 backup_filename,
				 sqlite3_errstr(err));
		}

		sqlite3_prepare_v2(wrapper->backup_conn,
				   "PRAGMA foreign_keys = ON;", -1, &stmt,
				   NULL);
		err = sqlite3_step(stmt);
		sqlite3_finalize(stmt);
		if (err != SQLITE_DONE) {
			db_fatal("failed to use backup database %s: %s",
				 backup_filename,
				 sqlite3_errstr(err));
		}
	}

	/* If we have a backup db, but it does not have a matching
	 * data_version, copy over the main database. */
	if (wrapper->backup_conn &&
	    !have_same_data_version(wrapper->conn, wrapper->backup_conn)) {
		/* Copy the main database over the backup database. */
		sqlite3_backup *copier = sqlite3_backup_init(wrapper->backup_conn,
							     "main",
							     wrapper->conn,
							     "main");
		if (!copier) {
			db_fatal("failed to initiate copy to %s: %s",
				 backup_filename,
				 sqlite3_errmsg(wrapper->backup_conn));
		}
		err = sqlite3_backup_step(copier, -1);
		if (err != SQLITE_DONE) {
			db_fatal("failed to copy database to %s: %s",
				 backup_filename,
				 sqlite3_errstr(err));
		}
		sqlite3_backup_finish(copier);
	}

/* In case another process (litestream?) grabs a lock, we don't
* want to return SQLITE_BUSY immediately (which will cause a
@@ -48,9 +199,10 @@ static bool db_sqlite3_setup(struct db *db)
* We *could* make this an option, but surely the user prefers a
* long timeout over an outright crash.
*/
sqlite3_busy_timeout(db->conn, 60000);
sqlite3_busy_timeout(conn2sql(db->conn), 60000);
sqlite3_prepare_v2(db->conn, "PRAGMA foreign_keys = ON;", -1, &stmt, NULL);
sqlite3_prepare_v2(conn2sql(db->conn),
"PRAGMA foreign_keys = ON;", -1, &stmt, NULL);
err = sqlite3_step(stmt);
sqlite3_finalize(stmt);
return err == SQLITE_DONE;
@@ -59,7 +211,7 @@ static bool db_sqlite3_setup(struct db *db)
static bool db_sqlite3_query(struct db_stmt *stmt)
{
sqlite3_stmt *s;
sqlite3 *conn = (sqlite3*)stmt->db->conn;
sqlite3 *conn = conn2sql(stmt->db->conn);
int err;
err = sqlite3_prepare_v2(conn, stmt->query->query, -1, &s, NULL);
@@ -108,10 +260,15 @@ static bool db_sqlite3_exec(struct db_stmt *stmt)
{
int err;
bool success;
struct db_sqlite3 *wrapper = (struct db_sqlite3 *) stmt->db->conn;
#if !HAVE_SQLITE3_EXPANDED_SQL
/* Register the tracing function if we don't have an explicit way of
* expanding the statement. */
sqlite3_trace(stmt->db->conn, trace_sqlite3, stmt);
struct db_sqlite3_trace trace;
trace.wrapper = wrapper;
trace.stmt = stmt;
sqlite3_trace(conn2sql(stmt->db->conn), trace_sqlite3, &trace);
#endif
if (!db_sqlite3_query(stmt)) {
@@ -132,7 +289,7 @@ static bool db_sqlite3_exec(struct db_stmt *stmt)
/* Manually expand and call the callback */
char *expanded_sql;
expanded_sql = sqlite3_expanded_sql(stmt->inner_stmt);
db_changes_add(stmt, expanded_sql);
db_sqlite3_changes_add(wrapper, stmt, expanded_sql);
sqlite3_free(expanded_sql);
#endif
success = true;
@@ -141,7 +298,7 @@ done:
#if !HAVE_SQLITE3_EXPANDED_SQL
/* Unregister the trace callback to avoid it accessing the potentially
* stale pointer to stmt */
sqlite3_trace(stmt->db->conn, NULL, NULL);
sqlite3_trace(conn2sql(stmt->db->conn), NULL, NULL);
#endif
return success;
@@ -157,11 +314,16 @@ static bool db_sqlite3_begin_tx(struct db *db)
{
int err;
char *errmsg;
err = sqlite3_exec(db->conn, "BEGIN TRANSACTION;", NULL, NULL, &errmsg);
struct db_sqlite3 *wrapper = (struct db_sqlite3 *) db->conn;
err = sqlite3_exec(conn2sql(db->conn),
"BEGIN TRANSACTION;", NULL, NULL, &errmsg);
if (err != SQLITE_OK) {
db->error = tal_fmt(db, "Failed to begin a transaction: %s", errmsg);
return false;
}
replicate_statement(wrapper, "BEGIN TRANSACTION;");
return true;
}
@@ -169,11 +331,16 @@ static bool db_sqlite3_commit_tx(struct db *db)
{
int err;
char *errmsg;
err = sqlite3_exec(db->conn, "COMMIT;", NULL, NULL, &errmsg);
struct db_sqlite3 *wrapper = (struct db_sqlite3 *) db->conn;
err = sqlite3_exec(conn2sql(db->conn),
"COMMIT;", NULL, NULL, &errmsg);
if (err != SQLITE_OK) {
db->error = tal_fmt(db, "Failed to commit a transaction: %s", errmsg);
return false;
}
replicate_statement(wrapper, "COMMIT;");
return true;
}
@@ -222,19 +389,24 @@ static void db_sqlite3_stmt_free(struct db_stmt *stmt)
static size_t db_sqlite3_count_changes(struct db_stmt *stmt)
{
sqlite3 *s = stmt->db->conn;
sqlite3 *s = conn2sql(stmt->db->conn);
return sqlite3_changes(s);
}
static void db_sqlite3_close(struct db *db)
{
sqlite3_close(db->conn);
db->conn = NULL;
struct db_sqlite3 *wrapper = (struct db_sqlite3 *) db->conn;
if (wrapper->backup_conn)
sqlite3_close(wrapper->backup_conn);
sqlite3_close(wrapper->conn);
db->conn = tal_free(db->conn);
}
static u64 db_sqlite3_last_insert_id(struct db_stmt *stmt)
{
sqlite3 *s = stmt->db->conn;
sqlite3 *s = conn2sql(stmt->db->conn);
return sqlite3_last_insert_rowid(s);
}
@@ -243,11 +415,15 @@ static bool db_sqlite3_vacuum(struct db *db)
int err;
sqlite3_stmt *stmt;
sqlite3_prepare_v2(db->conn, "VACUUM;", -1, &stmt, NULL);
struct db_sqlite3 *wrapper = (struct db_sqlite3 *) db->conn;
sqlite3_prepare_v2(conn2sql(db->conn), "VACUUM;", -1, &stmt, NULL);
err = sqlite3_step(stmt);
if (err != SQLITE_DONE)
db->error = tal_fmt(db, "%s", sqlite3_errmsg(db->conn));
db->error = tal_fmt(db, "%s",
sqlite3_errmsg(conn2sql(db->conn)));
sqlite3_finalize(stmt);
replicate_statement(wrapper, "VACUUM;");
return err == SQLITE_DONE;
}