Commit Graph

361 Commits

Author SHA1 Message Date
Rusty Russell
375215a141 lightningd: more graceful shutdown.
Be more graceful in shutting down: this should fix the issue where
bookkeeper gets upset that its commands are rejected during shutdown,
and generally make things more graceful.

1. Stop any new RPC connections.
2. Stop any per-peer daemons (channeld, etc).
3. Shut down plugins.
4. Stop all existing RPC connections.
5. Stop global daemons.
6. Free up peer, chanen HTLC datastructures.
7. Close database.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: RPC operations are now still available during shutdown.
2022-09-12 14:00:41 +02:00
Rusty Russell
80a6d9b58e lightningd: set the channel_type feature.
AFAICT we should have been doing this since we started sending and
receiving it, but didn't.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we now advertize the `option_channel_type` feature (which we actually supported since v0.10.2)
2022-08-08 11:49:56 -05:00
niftynei
7bbfef5054 tests: flake fix; l1 was waiting too long to reconnect
We were waiting too long for the reconnect to happen (60s default),
which caused this test to timeout.

When testing, let's speed up the reconnect.

L2 tried to reconnect but didn't have connection information in its
gossip -- is there a way to ask/save connection data from a node you're
making a channel with that doesn't rely on their node_announcement?
2022-07-25 16:28:09 +09:30
Rusty Russell
5979a7778f lightningd: expand exit codes for various failures.
Most unexpected ones are still 1, but there are a few recognizable error codes
worth documenting.

Rename the HSM ones to put ERRCODE_ at the front, since we have non-HSM ones
too now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-20 19:28:33 +09:30
Rusty Russell
e96eb07ef4 lightningd: test that hsm_secret is as expected, at startup.
If you get the wrong hsm_secret, your node_id will change, and
peers won't know who you are, bitcoind will reject your transaction
signatures, and other madness.

Catch this as soon as it happens, by storing our node_id in the db.

Suggested-by: @cdecker, @fiatjaf
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Config: `lightningd` will refuse to start with the wrong node_id (i.e. hsm_secret changes).
2022-07-20 19:28:33 +09:30
Rusty Russell
bdefbabbef lightningd: re-transmit all closing transactions on startup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-14 12:40:57 -05:00
Christian Decker
e4511452ac bolt: Reflect the zeroconf featurebits in code 2022-07-04 22:14:06 +02:00
Vincenzo Palazzo
7ff62b4a00 lightnind: removeDEFAULT_PORT global definition
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
2022-06-28 06:09:01 +09:30
Vincenzo Palazzo
cc7a405ca4 lightningd: use the standard port derivation in connect command
Complete implementation of BOLT1 port derivation proposal https://github.com/lightning/bolts/pull/968

Changelog-Added: rpc: use the standard port derivation in connect command when the port is not specified.

Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
2022-06-28 06:09:01 +09:30
Rusty Russell
3f98cf3fce lightningd: track weird CI crash in test_important_plugin
Looks like we woke one of the startup io_loops early, and thus
we thought we'd finished connectd_activate and we hadn't.  This
caused us to use an uninitialized ld->announceable array, and
finally caused an assert fail in the main loop.

Make *every* loop assert that it was exited for the correct reason,
so if it happens again, we can maybe figure out what part of
the code to look at.

```
lightningd: lightningd/lightningd.c:1186: main: Assertion `io_loop_ret == ld' failed.
lightningd: FATAL SIGNAL 6 (version 4df66fa)
...
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.895509
==895509== Conditional jump or move depends on uninitialised value(s)
==895509==    at 0x22C58E: to_tal_hdr_or_null (tal.c:184)
==895509==    by 0x22D531: tal_bytelen (tal.c:637)
==895509==    by 0x1F10B6: towire_gossipd_init (gossipd_wiregen.c:100)
==895509==    by 0x13AC6E: gossip_init (gossip_control.c:254)
==895509==    by 0x1497EC: main (lightningd.c:1090)
==895509== 
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-06-27 17:21:35 +09:30
Rusty Russell
56dde2cb77 lightningd: multiple log-file options allow more than one log output.
I've wanted this for a while: the ability to log to multiple places
at once.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: lightningd: `log-file` option specified multiple times opens multiple log files.
2022-06-27 17:21:35 +09:30
Rusty Russell
836c1b805b doc: update c-lightning to Core Lightning almost everywhere.
Mostly comments and docs: some places are actually paths, which
I have avoided changing.  We may migrate them slowly, particularly
when they're user-visible.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-04-07 06:53:26 +09:30
Rusty Russell
e01abf0b34 bolt11: support payment_metadata.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-04-02 09:40:18 +10:30
Rusty Russell
ea7120a313 lightningd: add --dev-no-ping-timer to avoid ping response timeouts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-31 13:40:27 +10:30
Rusty Russell
20392ae526 connectd: restore obs2 onion support.
I removed these prematurely: we *haven't* had a release since
introducing them!

This consists of reverting d15d629b8b
"plugins/fetchinvoice: remove obsolete string-based API." and
plugins/fetchinvoice: remove obsolete string-based
API. "onion_messages: remove obs2 support."

Some minor changes due to updated fromwire_tlv API since they
were removed, but not much.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: REVERT: Removed backwards compat with onion messages from v0.10.1.
2022-03-29 10:55:12 +10:30
Rusty Russell
7829f2eb06 onion_messages: remove obs2 support.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: Removed backwards compat with onion messages from v0.10.1.
2022-03-25 13:55:44 +10:30
Rusty Russell
33abf93ec1 lightningd: rename activate_peers() to setup_peers().
Activate means a specific thing now (connectd said something), so avoid
confusing it with this function.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-23 13:20:12 +10:30
Rusty Russell
deecedb033 connectd: tell lightningd when disconnect is complete.
This avoids races in our tests where we assume it's sync (and is kind
of nicer).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-23 13:20:12 +10:30
Rusty Russell
0db05f6e9c lightningd: opt_var_onion is now a compulsory feature.
We're about to drop support for legacy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-18 09:20:11 +10:30
Michael Schmoock
28b4e57974 lightningd: store recently reported remote_addr 2022-03-11 16:42:45 +10:30
niftynei
ce12d2b8a9 database: pull out database code into a new module
We're going to reuse the database controllers for the accounting plugin
2022-03-05 15:03:34 +10:30
Rusty Russell
0c24334738 lightningd: clean up subds before freeing HTLCs.
Otherwise we get weird effects, as htlcs are being freed:

```
2022-01-26T05:07:37.8774610Z lightningd-1: 2022-01-26T04:47:48.770Z DEBUG   030eeb52087b9dbb27b7aec79ca5249369f6ce7b20a5684ce38d9f4595a21c2fda-chan#8: Failing HTLC 18446744073709551615 due to peer death
2022-01-26T05:07:37.8775287Z lightningd-1: 2022-01-26T04:47:48.770Z **BROKEN** 030eeb52087b9dbb27b7aec79ca5249369f6ce7b20a5684ce38d9f4595a21c2fda-chan#8: Neither origin nor in?
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell
ca08f27d54 connectd: remove second gossip fd.
Now we only send and receive gossip messages on this fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell
bba468a51c connectd: temporarily have two fds to gossipd.
We want to stream gossip through this, but currently connectd treats the
fd as synchronous.  While we work on getting rid of that, it's easiest to
have two fds.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell
5065bd6fc2 lightningd: use our cached channel_update for errors instead of asking gossipd.
We also no longer strip the type off: everyone handles both forms, and
Eclair doesn't strip (and it's easier!).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell
a8bed542f7 lightningd: start gossipd after channels are loaded.
This is in preparation for gossipd feeding us the latest channel_updates,
rather than having lightningd and channeld query gossipd when it wants
to send an onion error with an update included.

This means gossipd will start telling us the updates, so we need the
channels loaded first.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell
4a4f85dd3f subd: fix waitpid properly.
lightningd would race with the subd destructor to do the waitpid(),
resulting in UNUSUAL log messages, but also us missing if a plugin
was killed via a signal.

We can also get rid of the gratuitous waitpid() in test_subdaemons.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-25 06:26:52 +10:30
Rusty Russell
beed4fcccc lightningd: fix backwards test in sigchld.
This would never trigger, since test was backward.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-25 06:26:52 +10:30
Rusty Russell
71f736678f lightningd: make sure gossipd knows before we update blockheight.
Without this, we can get spurious failures from lnprototest, which
is waiting for lightningd to acknowledge the blockheight in getinfo:

```
2021-12-21T00:56:21.460Z DEBUG   lightningd: Adding block 109: 57a7bd3ade3a88a899e5b442075e9722ccec07e0205f9a913b304a3e2e038e26
2021-12-21T00:56:21.470Z DEBUG   lightningd: Adding block 110: 11a280eb76f588e92e20c39999be9d2baff016c3c6bac1837b649a270570b7dd
2021-12-21T00:56:21.479Z DEBUG   lightningd: Adding block 111: 02977fc9529b2ab4e0a805c4bc1bcfbff5a4e6577a8b31266341d22e204a1d27
2021-12-21T00:56:21.487Z DEBUG   lightningd: Adding block 112: 2402f31c5ddfc9e847e8bbfb7df084d29a5d5d936a4358c109f2f4cf9ea8d828
2021-12-21T00:56:21.496Z DEBUG   lightningd: Adding block 113: 5a561fe9423b4df33f004fc09985ee3ef38364d692a56a8b27ecbc6098a16d39
2021-12-21T00:56:21.505Z DEBUG   lightningd: Adding block 114: 4502f5ec23c89177872846848848322e8fa6c3fb6f5eb361194e4cd47596dfe9
2021-12-21T00:56:21.511Z DEBUG   02f9308a019258c31049344f85f89d5229b531c845836f99b08601f113bce036f9-gossipd: Ignoring future channel_announcment for 109x1x0 (current block 108)
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30
Rusty Russell
70ed47d77a channeld: add dev-disable-commit-after instead of dev-disconnect -nocommit
It was always a hack, but an impossible one once connectd will be
interpreting dev-disconnect!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-12-30 09:50:40 +10:30
niftynei
4506f639fa coin_moves: remove 'index' for moves; bump version
Get rid of the 'movement_idx', since we replay events now.

Since we're removing a field from the 'coin_movement' event emission, we
bump the version type.

Changelog-Updated: `coin_movements` events have been revamped and are now on version 2.
2021-12-28 04:42:42 +10:30
Anand Suresh
0c487a61ba Fix comment to reflect the code
The comment mentions a 100MB log book, but the code was modified 15 months ago to allocate a 10MB log book to preserve memory.
2021-12-21 10:59:25 +10:30
Simon Vrouwe
426ff0abff lightningd: cleanup obsolete plugins->shutdown flag
After leaving the main event loop, the only path to destroy_plugin
goes via shutdown_plugins.
2021-12-14 09:33:10 +10:30
Simon Vrouwe
f936fa926f plugins: simplify shutdown loop, simply close the db
The only thing that needs ld->wallet after this is destroy_invoices_waiter (off jsonrpc)
Could not find any other destructors (destroy_*) that need wallet or db access after this.
Any db access would now segfault.
2021-12-14 09:33:10 +10:30
Jan Sarenik
96f28323bd Set default port according to network
The idea is to have different default ports for different networks.

Current default port is `9735` for everything. Let's use it for
the mainnet and reuse the difference added to the default port
from `rpc_port` values in `bitcoin/chainstate.c`.

Testnet would be `19735` (adding rpc_port - 8332 = `10000`).

Signet would be `39735` (adding rpc_port - 8332 = `30000`).

Regtest would be `19846` (adding rpc_port - 8332 = `10111`).

With Vincenzo's kind pair-programming help over tmate.

Two other commits were squashed into this one so that bisecting
never ends up in half-baked state:

1. chainparams: Fix regtest default rpc_port

   bitcoind -help says this:

    -rpcport=<port>
         Listen for JSON-RPC connections on <port> (default: 8332, testnet:
         18332, signet: 38332, regtest: 18443)

2. test_gossip: Default port for regtest

   hex: 2607 is now .... (could be 4d86 but Elements uses another port)
   dec: 9735 is now any port (could be 19846 ^^ but now is for any port)

   The lines which were binding to default port were removed as the
   default port is different on each network.

NOTE: Remember not to modify gossip_store tests which loads everything raw
      including the checksums.

Changelog-Changed: If the port is unspecified, the default port is chosen according to used network similarly to Bitcoin Core.
2021-12-06 17:10:08 +10:30
Rusty Russell
4ffda340d3 check: make sure all files outside contrib/ include "config.h" first.
And turn "" includes into full-path (which makes it easier to put
config.h first, and finds some cases check-includes.sh missed
previously).

config.h sets _GNU_SOURCE which really needs to be done before any
'#includes': we mainly got away with it with glibc, but other platforms
like Alpine may have stricter requirements.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-12-06 10:05:39 +10:30
Simon Vrouwe
ae4623c21a lightingd: removal of sigchild_handler in shutdown now also closes its pipe
Setting SIGCHLD back to default (i.e. ignored) makes waitpid hang on an
old SIGCHLD that was still in the pipe?

This happens running test_important_plugin with developer=1:
(or with dev=0 and build-in plugins subscribed to "shutdown")

0  0x00007ff8336b6437 in __GI___waitpid (pid=-1, stat_loc=0x0, options=1) at ../sysdeps/unix/sysv/linux/waitpid.c:30
1  0x000055fb468f733a in sigchld_rfd_in (conn=0x55fb47c7cfc8, ld=0x55fb47bdce58) at lightningd/lightningd.c:785
2  0x000055fb469bcc6b in next_plan (conn=0x55fb47c7cfc8, plan=0x55fb47c7cfe8) at ccan/ccan/io/io.c:59
3  0x000055fb469bd80b in do_plan (conn=0x55fb47c7cfc8, plan=0x55fb47c7cfe8, idle_on_epipe=false) at ccan/ccan/io/io.c:407
4  0x000055fb469bd849 in io_ready (conn=0x55fb47c7cfc8, pollflags=1) at ccan/ccan/io/io.c:417
5  0x000055fb469bfa26 in io_loop (timers=0x55fb47c41198, expired=0x7ffdf4be9028) at ccan/ccan/io/poll.c:453
6  0x000055fb468f1be9 in io_loop_with_timers (ld=0x55fb47bdce58) at lightningd/io_loop_with_timers.c:21
7  0x000055fb46924817 in shutdown_plugins (ld=0x55fb47bdce58) at lightningd/plugin.c:2114
8  0x000055fb468f7c92 in main (argc=22, argv=0x7ffdf4be9228) at lightningd/lightningd.c:1195
2021-11-30 13:34:44 +10:30
Simon Vrouwe
0388314ef8 JSON RPC: Calls made in shutdown loop receive error code -5: LIGHTNINGD_SHUTDOWN 2021-11-30 13:34:44 +10:30
Simon Vrouwe
63bd569bf6 lightningd: cleanup, freeing jsonrpc in shutdown cannot trigger db write's anymore
since PR #3867 utxos are unreserved by height, destroy_utxos and
related functions are not used anymore so clean them up also

However free(ld->jsonrpc) still needs to happen before free(ld) because its
destructors need list_head pointers from ld
2021-11-30 13:34:44 +10:30
Simon Vrouwe
5f69674faa lightningd: shutdown plugins after subdaemons and assert no write access to db
because:
    - shutdown_subdaemons can trigger db write, comments in that function say so at least
    - resurrecting the main event loop with subdaemons still running is counter productive
      in shutting down activity (such as htlc's, hook_calls etc.)
    - custom behavior injected by plugins via hooks should be consistent, see test
      in previous commmit

    IDEA:

    in shutdown_plugins, when starting new io_loop:

    - A plugin that is still running can return a jsonrpc_request response, this triggers
      response_cb, which cannot be handled because subdaemons are gone -> so any response_cb should be blocked/aborted

    - jsonrpc is still there, so users (such as plugins) can make new jsonrpc_request's which
      cannot be handled because subdaemons are gone -> so new rpc_request should also be blocked

    - But we do want to send/receive notifications and log messages (handled in jsonrpc as jsonrpc_notification)
      as these do not trigger subdaemon calls or db_write's
      Log messages and notifications do not have "id" field, where jsonrpc_request *do* have an "id" field

    PLAN (hypothesis):
    - hack into plugin_read_json_one OR plugin_response_handle to filter-out json with
      an "id" field, this should
      block/abandon any jsonrpc_request responses (and new jsonrpc_requests for plugins?)

  Q. Can internal (so not via plugin) jsonrpc_requests called in the main io_loop return/revive in
     the shutdown io_loop?
  A. No. All code under lightningd/ returning command_still_pending depends on either a subdaemon, timer or
     plugin. In shutdown loop the subdaemons are dead, timer struct cleared and plugins will be taken
     care of (in next commits).

 fixup: we can only io_break the main io_loop once
2021-11-30 13:34:44 +10:30
Rusty Russell
65bb989cf1 pytest: don't checksum plugins on startup in VALGRIND developer mode.
This loads up 20MB of plugins temporarily; we seem to be getting OOM
killed under CI and I wonder if this is contributing.

Doesn't significantly reduce runtime here, but I have lots of memory.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-11-14 18:49:46 +01:00
ZmnSCPxj jxPCSnmZ
e733fdf62e lightningd/lightningd.c: Only impose fd limit if absolutely needed.
Fixes: #4868

ChangeLog-Fixed: We now no longer self-limit the number of file descriptors (which limits the number of channels) in sufficiently modern systems, or where we can access `/proc` or `/dev/fd`.  We still self-limit on old systems where we cannot find the list of open files on `/proc` or `/dev/fd`, so if you need > ~4000 channels, upgrade or mount `/proc`.
2021-10-22 13:17:37 +02:00
ZmnSCPxj jxPCSnmZ
5356267f15 *: Use new closefrom module from ccan.
This also inadvertently fixes a latent bug: before this patch, in the
`subd` function in `lightningd/subd.c`, we would close `execfail[1]`
*before* doing an `exec`.
We use an EOF on `execfail[1]` as a signal that `exec` succeeded (the
fd is marked CLOEXEC), and otherwise use it to pump `errno` to the
parent.
The intent is that this fd should be kept open until `exec`, at which
point CLOEXEC triggers and close that fd and sends the EOF, *or* if
`exec` fails we can send the `errno` to the parent process vua that
pipe-end.

However, in the previous version, we end up closing that fd *before*
reaching `exec`, either in the loop which `dup2`s passed-in fds (by
overwriting `execfail[1]` with a `dup2`) or in the "close everything"
loop, which does not guard against `execfail[1]`, only
`dev_disconnect_fd`.
2021-10-22 13:17:37 +02:00
Rusty Russell
ed6eaf9171 experimental-websocket-port: option to create a WebSocket port.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-10-22 11:56:30 +02:00
Rusty Russell
55dbe82162 features: EXPERIMENTAL_FEATURES: advertize option_quiesce
The latest draft has a feature bit here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-10-08 16:07:21 +02:00
Rusty Russell
09c2fef4a4 onion_message: dev options to ignore obsolete/modern onions.
This lets us test that both work, as expected.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-10-04 11:58:31 +02:00
Jan Sarenik
284ad2bade Fix for Alpine Linux ond OpenBSD
After recent header files clean-up it was not possible to
build c-lightning 7401b2682. This patch fixes it both for
Alpine Linux and OpenBSD.

Proposed-by: nathanael <nathanael@dalliard.ch>

Changelog-None
2021-09-20 14:44:27 +02:00
Rusty Russell
7401b26824 cleanup: remove unneeded includes in C files.
Before:
 Ten builds, laptop -j5, no ccache:

```
real	0m36.686000-38.956000(38.608+/-0.65)s
user	2m32.864000-42.253000(40.7545+/-2.7)s
sys	0m16.618000-18.316000(17.8531+/-0.48)s
```

 Ten builds, laptop -j5, ccache (warm):

```
real	0m8.212000-8.577000(8.39989+/-0.13)s
user	0m12.731000-13.212000(12.9751+/-0.17)s
sys	0m3.697000-3.902000(3.83722+/-0.064)s
```

After:
 Ten builds, laptop -j5, no ccache: 8% faster

```
real	0m33.802000-35.773000(35.468+/-0.54)s
user	2m19.073000-27.754000(26.2542+/-2.3)s
sys	0m15.784000-17.173000(16.7165+/-0.37)s
```

 Ten builds, laptop -j5, ccache (warm): 1% faster

```
real	0m8.200000-8.485000(8.30138+/-0.097)s
user	0m12.485000-13.100000(12.7344+/-0.19)s
sys	0m3.702000-3.889000(3.78787+/-0.056)s
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-09-17 09:43:22 +09:30
Rusty Russell
ea30c34d82 cleanup: remove unneeded includes in header files.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-09-17 09:43:22 +09:30
Rusty Russell
84957be410 close: spec is final, it's not experimental.
That was quick!

We remove the 50% test, since the default is now to use quickclose.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: We now perform quick-close if the peer supports it.
2021-09-09 12:04:48 +09:30