Not all plugins depended on their headers. Keep it simple: all
plugins depend on all plugin headers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: #4785Fixes: #4883
Changelog-Changed: Plugins: `shutdown` notification is now send when lightningd is almost completely shutdown, RPC calls then fail with error code -5.
No idea why TCSAFLUSH was used, could not find anything in PR comments.
Also cannot explain exactly what causes the problem, but the hang can be reproduced
*with* TCSAFLUSH and not with TCSANOW.
According to termios doc:
TCSANOW
the change occurs immediately.
TCSAFLUSH
the change occurs after all output written to the object referred by fd has been
transmitted, and all input that has been received but not read will be discarded
before the change is made.
Setting SIGCHLD back to default (i.e. ignored) makes waitpid hang on an
old SIGCHLD that was still in the pipe?
This happens running test_important_plugin with developer=1:
(or with dev=0 and build-in plugins subscribed to "shutdown")
0 0x00007ff8336b6437 in __GI___waitpid (pid=-1, stat_loc=0x0, options=1) at ../sysdeps/unix/sysv/linux/waitpid.c:30
1 0x000055fb468f733a in sigchld_rfd_in (conn=0x55fb47c7cfc8, ld=0x55fb47bdce58) at lightningd/lightningd.c:785
2 0x000055fb469bcc6b in next_plan (conn=0x55fb47c7cfc8, plan=0x55fb47c7cfe8) at ccan/ccan/io/io.c:59
3 0x000055fb469bd80b in do_plan (conn=0x55fb47c7cfc8, plan=0x55fb47c7cfe8, idle_on_epipe=false) at ccan/ccan/io/io.c:407
4 0x000055fb469bd849 in io_ready (conn=0x55fb47c7cfc8, pollflags=1) at ccan/ccan/io/io.c:417
5 0x000055fb469bfa26 in io_loop (timers=0x55fb47c41198, expired=0x7ffdf4be9028) at ccan/ccan/io/poll.c:453
6 0x000055fb468f1be9 in io_loop_with_timers (ld=0x55fb47bdce58) at lightningd/io_loop_with_timers.c:21
7 0x000055fb46924817 in shutdown_plugins (ld=0x55fb47bdce58) at lightningd/plugin.c:2114
8 0x000055fb468f7c92 in main (argc=22, argv=0x7ffdf4be9228) at lightningd/lightningd.c:1195
shutdown_subdaemons frees the channel and calls destroy_close_command_on_channel_destroy, see gdb:
0 destroy_close_command_on_channel_destroy (_=0x55db6ca38e18, cc=0x55db6ca43338) at lightningd/closing_control.c:94
1 0x000055db6a8181b5 in notify (ctx=0x55db6ca38df0, type=TAL_NOTIFY_FREE, info=0x55db6ca38e18, saved_errno=0) at ccan/ccan/tal/tal.c:237
2 0x000055db6a8186bb in del_tree (t=0x55db6ca38df0, orig=0x55db6ca38e18, saved_errno=0) at ccan/ccan/tal/tal.c:402
3 0x000055db6a818a47 in tal_free (ctx=0x55db6ca38e18) at ccan/ccan/tal/tal.c:486
4 0x000055db6a73fffa in shutdown_subdaemons (ld=0x55db6c8b4ca8) at lightningd/lightningd.c:543
5 0x000055db6a741098 in main (argc=21, argv=0x7ffffa3e8048) at lightningd/lightningd.c:1192
Before this PR, there was no io_loop after shutdown_subdaemons and client side raised a
general `Connection to RPC server lost.`
Now we test the more specific `Channel forgotten before proper close.`, which is good!
BTW, this test was added recently in PR #4599.
since PR #3867 utxos are unreserved by height, destroy_utxos and
related functions are not used anymore so clean them up also
However free(ld->jsonrpc) still needs to happen before free(ld) because its
destructors need list_head pointers from ld
because:
- shutdown_subdaemons can trigger db write, comments in that function say so at least
- resurrecting the main event loop with subdaemons still running is counter productive
in shutting down activity (such as htlc's, hook_calls etc.)
- custom behavior injected by plugins via hooks should be consistent, see test
in previous commmit
IDEA:
in shutdown_plugins, when starting new io_loop:
- A plugin that is still running can return a jsonrpc_request response, this triggers
response_cb, which cannot be handled because subdaemons are gone -> so any response_cb should be blocked/aborted
- jsonrpc is still there, so users (such as plugins) can make new jsonrpc_request's which
cannot be handled because subdaemons are gone -> so new rpc_request should also be blocked
- But we do want to send/receive notifications and log messages (handled in jsonrpc as jsonrpc_notification)
as these do not trigger subdaemon calls or db_write's
Log messages and notifications do not have "id" field, where jsonrpc_request *do* have an "id" field
PLAN (hypothesis):
- hack into plugin_read_json_one OR plugin_response_handle to filter-out json with
an "id" field, this should
block/abandon any jsonrpc_request responses (and new jsonrpc_requests for plugins?)
Q. Can internal (so not via plugin) jsonrpc_requests called in the main io_loop return/revive in
the shutdown io_loop?
A. No. All code under lightningd/ returning command_still_pending depends on either a subdaemon, timer or
plugin. In shutdown loop the subdaemons are dead, timer struct cleared and plugins will be taken
care of (in next commits).
fixup: we can only io_break the main io_loop once
Incase the error handling happening after the quted line is non-critical:
```
return tal_fmt(NULL, "Unable to parse address '%s': %s", arg, err_msg);
```
we should not expand the proposed_listen_announce array without adding
a proposed_wireaddr. So we move the expand of proposed_listen_announce
to the location where we also expand the proposed_wireaddr.
Changelog-None
This include the following commits:
- review 1/2: move from tab to space, and remove the exp. prop from doc;
- review 2/2: remove experimental features;
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
It runs 6 nodes: under valgrind this ends up consuming 5.3 GB RSS. By
stopping nodes between, we peak about 1G RSS.
Measured using:
(while true; do echo $(for i in 4 5 6; do ps uh | tr -s ' ' | cut -d\ -f$i | tally; done); sleep 5; done)&
(Which measures my other processes as well, but that's only about 100M).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happened in my tal_dump(), and I couldn't see how we ended up
with object having more than one "backtrace". Adding asserts that we
never added a second backtrace didn't trigger.
Finally I wondered if we were tal_steal() backtraces, and sure enough
we do that blinding in one place: libwally wrapping. So fix that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Do proper refcounting on log prefixes; previously we kept them around,
which is fine, but the extra notleak() and backtrace in developer mode
could get quite heavy (I ended up with 1G of backtraces!). This is
mainly due to creating one on every JSONRPC command, and running
clboss.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: lightningd: remove slow memory leak in DEVELOPER builds.
I noticed that `wallet/db_postgres.c` never actually calls `db_changes_add`.
PostgreSQL arguably has a better replication system (a PostgreSQL cluster)
than what `db_write` hook can offer, so rather than make `db_write` work on
PostgreSQL, just document that it does not actually work there.
ChangeLog-none
Because db->conn is a void *, changing it (from a direct pointer to
a pointer to a pair of pointers) did not break compile if one place hadn't
been update.
The result was a confusing failure: sqlite3 complaining about API misuse,
since the db->conn pointer was not a valid db handle any more.
This is one case where avoiding a void * is hard: we might not even
have the postgresql types, since it might not be installed. But a union
would have been superior here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
cc lightningd/subd.c
lightningd/subd.c:216:7: error: expected identifier or '('
int stdout = STDOUT_FILENO, stderr = STDERR_FILENO;
^
/usr/include/stdio.h:198:17: note: expanded from macro 'stdout'
^
lightningd/subd.c:216:7: error: expected ')'
/usr/include/stdio.h:198:17: note: expanded from macro 'stdout'
^
lightningd/subd.c:216:7: note: to match this '('
/usr/include/stdio.h:198:16: note: expanded from macro 'stdout'
^
lightningd/subd.c:224:12: error: cannot take the address of an rvalue of type 'FILE *' (aka 'struct __sFILE *')
fds[1] = &stdout;
^~~~~~~
lightningd/subd.c:225:12: error: cannot take the address of an rvalue of type 'FILE *' (aka 'struct __sFILE *')
fds[2] = &stderr;
^~~~~~~
4 errors generated.
gmake: *** [Makefile:279: lightningd/subd.o] Error 1
```
Changelog-None: introduced since last release.
Fixes: #4914
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> ```
ChangeLog-Added: With the `sqlite3://` scheme for `--wallet` option, you can now specify a second file path for real-time database backup by separating it from the main file path with a `:` character.