Be more graceful in shutting down: this should fix the issue where
bookkeeper gets upset that its commands are rejected during shutdown,
and generally make things more graceful.
1. Stop any new RPC connections.
2. Stop any per-peer daemons (channeld, etc).
3. Shut down plugins.
4. Stop all existing RPC connections.
5. Stop global daemons.
6. Free up peer, chanen HTLC datastructures.
7. Close database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: RPC operations are now still available during shutdown.
Changelog-Removed: JSONRPC: RPC framework now requires the `"jsonrpc"` property inside the request (deprecated in v0.10.2)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Somehow we missed this deprecation, found by grep.
Changelog-Removed: JSON API: Removed double wrapping of `rpc_command` payload in `rpc_command` JSON field (deprecated v0.8.2)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than a generic "add member", provide two routines: one which
doesn't quote, and one which does.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have them split over common/param.c, common/json.c,
common/json_helpers.c, common/json_tok.c and common/json_stream.c.
Change that to:
* common/json_parse (all the json_to_xxx routines)
* common/json_parse_simple (simplest the json parsing routines, for cli too)
* common/json_stream (all the json_add_xxx routines)
* common/json_param (all the param and param_xxx routines)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Looks like we woke one of the startup io_loops early, and thus
we thought we'd finished connectd_activate and we hadn't. This
caused us to use an uninitialized ld->announceable array, and
finally caused an assert fail in the main loop.
Make *every* loop assert that it was exited for the correct reason,
so if it happens again, we can maybe figure out what part of
the code to look at.
```
lightningd: lightningd/lightningd.c:1186: main: Assertion `io_loop_ret == ld' failed.
lightningd: FATAL SIGNAL 6 (version 4df66fa)
...
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.895509
==895509== Conditional jump or move depends on uninitialised value(s)
==895509== at 0x22C58E: to_tal_hdr_or_null (tal.c:184)
==895509== by 0x22D531: tal_bytelen (tal.c:637)
==895509== by 0x1F10B6: towire_gossipd_init (gossipd_wiregen.c:100)
==895509== by 0x13AC6E: gossip_init (gossip_control.c:254)
==895509== by 0x1497EC: main (lightningd.c:1090)
==895509==
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Mostly comments and docs: some places are actually paths, which
I have avoided changing. We may migrate them slowly, particularly
when they're user-visible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And turn "" includes into full-path (which makes it easier to put
config.h first, and finds some cases check-includes.sh missed
previously).
config.h sets _GNU_SOURCE which really needs to be done before any
'#includes': we mainly got away with it with glibc, but other platforms
like Alpine may have stricter requirements.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
because:
- shutdown_subdaemons can trigger db write, comments in that function say so at least
- resurrecting the main event loop with subdaemons still running is counter productive
in shutting down activity (such as htlc's, hook_calls etc.)
- custom behavior injected by plugins via hooks should be consistent, see test
in previous commmit
IDEA:
in shutdown_plugins, when starting new io_loop:
- A plugin that is still running can return a jsonrpc_request response, this triggers
response_cb, which cannot be handled because subdaemons are gone -> so any response_cb should be blocked/aborted
- jsonrpc is still there, so users (such as plugins) can make new jsonrpc_request's which
cannot be handled because subdaemons are gone -> so new rpc_request should also be blocked
- But we do want to send/receive notifications and log messages (handled in jsonrpc as jsonrpc_notification)
as these do not trigger subdaemon calls or db_write's
Log messages and notifications do not have "id" field, where jsonrpc_request *do* have an "id" field
PLAN (hypothesis):
- hack into plugin_read_json_one OR plugin_response_handle to filter-out json with
an "id" field, this should
block/abandon any jsonrpc_request responses (and new jsonrpc_requests for plugins?)
Q. Can internal (so not via plugin) jsonrpc_requests called in the main io_loop return/revive in
the shutdown io_loop?
A. No. All code under lightningd/ returning command_still_pending depends on either a subdaemon, timer or
plugin. In shutdown loop the subdaemons are dead, timer struct cleared and plugins will be taken
care of (in next commits).
fixup: we can only io_break the main io_loop once
Before:
Ten builds, laptop -j5, no ccache:
```
real 0m36.686000-38.956000(38.608+/-0.65)s
user 2m32.864000-42.253000(40.7545+/-2.7)s
sys 0m16.618000-18.316000(17.8531+/-0.48)s
```
Ten builds, laptop -j5, ccache (warm):
```
real 0m8.212000-8.577000(8.39989+/-0.13)s
user 0m12.731000-13.212000(12.9751+/-0.17)s
sys 0m3.697000-3.902000(3.83722+/-0.064)s
```
After:
Ten builds, laptop -j5, no ccache: 8% faster
```
real 0m33.802000-35.773000(35.468+/-0.54)s
user 2m19.073000-27.754000(26.2542+/-2.3)s
sys 0m15.784000-17.173000(16.7165+/-0.37)s
```
Ten builds, laptop -j5, ccache (warm): 1% faster
```
real 0m8.200000-8.485000(8.30138+/-0.097)s
user 0m12.485000-13.100000(12.7344+/-0.19)s
sys 0m3.702000-3.889000(3.78787+/-0.056)s
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Changelog-Deprecated: RPC framwork now require the "jsonrpc" propriety inside the request.
Changelog-Fixed: RPC framwork now required the "jsonrpc" propriety to be specified inside each request.
We were masquerading errors when parsing the request by reporting only
a bogus malformed `id` field in the response, when the real issue was
that we were unable to parse the request in the first place (which
caused the null-id error to be returned).
Fixes#4238
This lets callers enable notifications; we won't send any if they don't.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `notifications` command to enable notifications.
time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null
Before:
real 0m42.741s
user 0m0.149s
sys 0m0.016s
After:
real 0m13.674s
user 0m0.131s
sys 0m0.024s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: significant speedups for plugins which create large JSON replies (e.g. listpays on large nodes).
The jsmn parser is a beautiful piece of code. In particular, you can parse
part of a string, then continue where you left off.
We don't take advantage of this, however, meaning for large JSON objects
we parse them multiple times before finally having enough to complete.
Expose the parser state and tokens through the API, so the caller can pass
them in repeatedly. For the moment, every caller is allocates each time
(except the unit tests).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If allow-deprecated-apis=false, don't mention them at all (we already
disallow calling them in that case). Otherwise, note that they're
deprecated in the help msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we know whether the command completed or not, we can correctly
call command_still_pending() if it didn't complete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Commit 9aedb0c61f changed this from allocating off `c` to allocating
off NULL, knowing that it's tal_steal() in the callback. But before
that, it can be detected as a mem leak:
```
@pytest.fixture
def teardown_checks(request):
"""A simple fixture to collect errors during teardown.
We need to collect the errors and raise them as the very last step in the
fixture tree, otherwise some fixtures may not be cleaned up
correctly. Require this fixture in all other fixtures that need to either
cleanup before reporting an error or want to add an error that is to be
reported.
"""
errors = TeardownErrors()
yield errors
if errors.has_errors():
# Format a nice list of everything that went wrong and raise an exception
request.node.has_errors = True
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E Global errors:
E - Node /tmp/ltests-iz9y1chb/test_hsmtool_secret_decryption_1/lightning-1/ has memory leaks: [
E {
E "backtrace": [
E "ccan/ccan/tal/tal.c:442 (tal_alloc_)",
E "lightningd/jsonrpc.c:848 (parse_request)",
E "lightningd/jsonrpc.c:941 (read_json)",
E "ccan/ccan/io/io.c:59 (next_plan)",
E "ccan/ccan/io/io.c:407 (do_plan)",
E avis/build/ElementsProject/lightning/lightningd/../plugins/pay
```
Reported-by: @niftynei
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
One is called on every plugin return, and tells us whether to continue;
the other is only called if every plugin says ok.
This works for things like payload replacement, where we need to process
the results from each plugin, not just the final one!
We should probably turn everything into a chained callback next
release.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They callback must take ownership of the payload (almost all do, but
now it's explicit).
And since the payload and cb_arg arguments to plugin_hook_call_() are
always identical, make them a single parameter.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were nesting like the following:
```json
{"params": {
"rpc_command": {
"rpc_command": {
}
}
}
```
This is really excessive, so we unwrap once, and now have the following:
```json
{"params": {
"rpc_command": {
}
}
```
Still more wrapping than necessary (the method is repeated in the `params`
object), but it's getting closer.
Changelog-Deprecated: JSON-RPC: Removed double wrapping of `rpc_command` payload in `rpc_command` JSON field.
Suggested-by: @fiatjaf
Signed-off-by: Christian Decker <@cdecker>
This is a common thing to do, so create a macro.
Unfortunately, it still needs the type arg, because the paramter may
be const, and the return cannot be, and C doesn't have a general
"(-const)" cast.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I reproduced this by putting a sleep(60) in the pay plugin, then
'lightning-cli pay', 'lightning-cli plugin stop pay' and then ^C
the `lightning-cli pay`:
2020-02-14T00:33:11.217Z INFO plugin-pay: Killing plugin: pay stopped by lightningd via RPC
2020-02-14T00:33:15.250Z DEBUG lightningd: Still waiting for initial block download
==5157== Invalid read of size 8
==5157== at 0x12A29C: destroy_jcon (jsonrpc.c:149)
==5157== by 0x1C6F2A: notify (tal.c:235)
==5157== by 0x1C7441: del_tree (tal.c:397)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x1B7380: io_close (io.c:450)
==5157== by 0x1B71A7: do_plan (io.c:401)
==5157== by 0x1B7214: io_ready (io.c:417)
==5157== by 0x1B94AC: io_loop (poll.c:445)
==5157== by 0x1291C9: io_loop_with_timers (io_loop_with_timers.c:24)
==5157== by 0x12EC7E: main (lightningd.c:928)
==5157== Address 0x4ebab98 is 40 bytes inside a block of size 88 free'd
==5157== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C750F: del_tree (tal.c:416)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x153856: clear_plugin (plugin_control.c:209)
==5157== by 0x1538FF: plugin_dynamic_stop (plugin_control.c:225)
==5157== by 0x153C51: json_plugin_control (plugin_control.c:295)
==5157== by 0x12B4EC: command_exec (jsonrpc.c:588)
==5157== by 0x12B8AB: rpc_command_hook_callback (jsonrpc.c:679)
==5157== by 0x154575: plugin_hook_call_ (plugin_hook.c:170)
==5157== by 0x12BCD3: plugin_hook_call_rpc_command (jsonrpc.c:756)
==5157== by 0x12BD04: call_rpc_command_hook (jsonrpc.c:764)
==5157== Block was alloc'd at
==5157== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C6F98: allocate (tal.c:245)
==5157== by 0x1C7559: tal_alloc_ (tal.c:423)
==5157== by 0x15135A: plugin_rpcmethod_add (plugin.c:706)
==5157== by 0x151600: plugin_rpcmethods_add (plugin.c:756)
==5157== by 0x151BDD: plugin_parse_getmanifest_response (plugin.c:893)
==5157== by 0x151C9C: plugin_manifest_cb (plugin.c:915)
==5157== by 0x14FFB9: plugin_response_handle (plugin.c:258)
==5157== by 0x150165: plugin_read_json_one (plugin.c:356)
==5157== by 0x1502BC: plugin_read_json (plugin.c:388)
==5157== by 0x1B65ED: next_plan (io.c:59)
==5157== by 0x1B71D2: do_plan (io.c:407)
Fixes: #3509
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The newly introduced type is used to determine what the call semantics of the
hook are. We have `single` corresponding to the old behavior, as well as
`chain` which allows multiple plugins to register for the hook, and they are
then called sequentially (if all plugins return `{"result": "continue"}`) or
exit the chain if the hook event was handled.
Changelog-Changed: The hooks `db_write`, `invoice_payment`, and `rpc_command` now accept `{ "result": "continue" }` to mean "do default action", in addition to `true` (`db_write`), `{}` (`invoice_payment`), and `{"continue": true}` (`rpc_command`). The older "default" indicators are now deprecated and are now recognized only if `--deprecated-apis` is set.
Before this patch we used `int` for error codes. The problem with
`int` is that we try to pass it to/from wire and the size of `int` is
not defined by the standard. So a sender with 4-byte `int` would write
4 bytes to the wire and a receiver with 2-byte `int` (for example) would
read just 2 bytes from the wire.
To resolve this:
* Introduce an error code type with a known size:
`typedef s32 errcode_t`.
* Change all error code macros to constants of type `errcode_t`.
Constants also play better with gdb - it would visualize the name of
the constant instead of the numeric value.
* Change all functions that take error codes to take the new type
`errcode_t` instead of `int`.
* Introduce towire / fromwire functions to send / receive the newly added
type `errcode_t` and use it instead of `towire_int()`.
In addition:
* Remove the now unneeded `towire_int()`.
* Replace a hardcoded error code `-2` with a new constant
`INVOICE_EXPIRED_DURING_WAIT` (903).
Changelog-Changed: The waitinvoice command would now return error code 903 to designate that the invoice expired during wait, instead of the previous -2
In particular:
1. It must redirect to an existing command.
2. It must contain method, params and id.
And update the docs to show the id, which is vital.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lightning-cli is going to need to know what network we're on, so
it will need to parse the config files. Move the code which does
the initial bootstrap parsing into common, as well as the config
file parsing core.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>