Looks like we woke one of the startup io_loops early, and thus
we thought we'd finished connectd_activate and we hadn't. This
caused us to use an uninitialized ld->announceable array, and
finally caused an assert fail in the main loop.
Make *every* loop assert that it was exited for the correct reason,
so if it happens again, we can maybe figure out what part of
the code to look at.
```
lightningd: lightningd/lightningd.c:1186: main: Assertion `io_loop_ret == ld' failed.
lightningd: FATAL SIGNAL 6 (version 4df66fa)
...
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.895509
==895509== Conditional jump or move depends on uninitialised value(s)
==895509== at 0x22C58E: to_tal_hdr_or_null (tal.c:184)
==895509== by 0x22D531: tal_bytelen (tal.c:637)
==895509== by 0x1F10B6: towire_gossipd_init (gossipd_wiregen.c:100)
==895509== by 0x13AC6E: gossip_init (gossip_control.c:254)
==895509== by 0x1497EC: main (lightningd.c:1090)
==895509==
```
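A minimal sketch of the "assert the exit reason" idea, using a toy loop
rather than the real io_loop (the `toy_loop` struct and all function names
here are hypothetical, invented for illustration). The loop returns whatever
pointer it was broken with, and the caller insists it is the one it was
waiting for:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event loop; not the ccan/io API. */
struct toy_loop {
	void *break_reason;   /* set by toy_break(); NULL while running */
	int steps_left;       /* iterations before the simulated handler breaks */
	void *reason_to_use;  /* what the simulated handler passes to toy_break() */
};

static void toy_break(struct toy_loop *loop, void *reason)
{
	loop->break_reason = reason;
}

/* Runs until something calls toy_break(), then returns the reason given. */
static void *toy_loop_run(struct toy_loop *loop)
{
	loop->break_reason = NULL;
	while (loop->break_reason == NULL) {
		if (--loop->steps_left <= 0)
			toy_break(loop, loop->reason_to_use);
	}
	return loop->break_reason;
}

/* What each call site would do: run the loop, then check why it exited. */
static void run_loop_expecting(struct toy_loop *loop, void *expected)
{
	void *ret = toy_loop_run(loop);

	assert(ret == expected);
	(void)ret; /* avoid an unused-variable warning under NDEBUG */
}
```

With a check like this at every startup loop, an early wakeup would fail at
the loop that was woken, not later in main.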
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Mostly comments and docs: some places are actually paths, which
I have avoided changing. We may migrate them slowly, particularly
when they're user-visible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And turn "" includes into full-path (which makes it easier to put
config.h first, and finds some cases check-includes.sh missed
previously).
config.h sets _GNU_SOURCE which really needs to be done before any
'#includes': we mainly got away with it with glibc, but other platforms
like Alpine may have stricter requirements.
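A rough illustration of the ordering requirement. The `#define` below stands
in for what including config.h first would do, and `describe_build()` is a
hypothetical helper; asprintf() is the kind of extension whose declaration
can depend on the feature macro being set before the first system header:

```c
/* Stand-in for '#include "config.h"': must come before any system header. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc()ed description, or NULL on allocation failure.
 * asprintf() is a GNU/BSD extension, not ISO C. */
static char *describe_build(const char *version, int jobs)
{
	char *out;

	if (asprintf(&out, "version %s (-j%d)", version, jobs) < 0)
		return NULL;
	return out;
}
```

If the define came after `<stdio.h>`, a stricter libc might not declare
asprintf() at all, which is the failure mode this ordering avoids.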
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
because:
- shutdown_subdaemons can trigger db write, comments in that function say so at least
- resurrecting the main event loop with subdaemons still running is counterproductive
in shutting down activity (such as htlc's, hook_calls etc.)
- custom behavior injected by plugins via hooks should be consistent, see test
in previous commit
IDEA:
in shutdown_plugins, when starting new io_loop:
- A plugin that is still running can return a jsonrpc_request response, this triggers
response_cb, which cannot be handled because subdaemons are gone -> so any response_cb should be blocked/aborted
- jsonrpc is still there, so users (such as plugins) can make new jsonrpc_request's which
cannot be handled because subdaemons are gone -> so new rpc_request should also be blocked
- But we do want to send/receive notifications and log messages (handled in jsonrpc as jsonrpc_notification)
as these do not trigger subdaemon calls or db_write's
Log messages and notifications do not have "id" field, where jsonrpc_request *do* have an "id" field
PLAN (hypothesis):
- hack into plugin_read_json_one OR plugin_response_handle to filter-out json with
an "id" field, this should
block/abandon any jsonrpc_request responses (and new jsonrpc_requests for plugins?)
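A sketch of the "id" test from the PLAN, assuming well-formed JSON text and
not using the real jsmn-based helpers (`has_toplevel_id()` is a hypothetical
name). It only reports an "id" key at the top level of the object, so an
"id" nested inside a notification's params does not count:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the top-level JSON object has an "id" member.
 * Assumes the input is syntactically valid JSON. */
static bool has_toplevel_id(const char *json)
{
	int depth = 0;

	for (const char *p = json; *p; p++) {
		if (*p == '"') {
			const char *start = p + 1;

			/* Skip to the closing quote, honoring backslash escapes. */
			for (p++; *p && *p != '"'; p++)
				if (*p == '\\' && p[1])
					p++;
			if (!*p)
				return false; /* unterminated string */

			size_t len = (size_t)(p - start);
			if (depth == 1 && len == 2 && memcmp(start, "id", 2) == 0) {
				const char *q = p + 1;

				while (*q == ' ' || *q == '\t' || *q == '\n' || *q == '\r')
					q++;
				if (*q == ':')
					return true; /* a key, not a string value */
			}
		} else if (*p == '{' || *p == '[') {
			depth++;
		} else if (*p == '}' || *p == ']') {
			depth--;
		}
	}
	return false;
}
```

In the shutdown loop, messages for which this returns true would be dropped,
while notifications and log messages would still be processed.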
Q. Can internal (so not via plugin) jsonrpc_requests called in the main io_loop return/revive in
the shutdown io_loop?
A. No. All code under lightningd/ returning command_still_pending depends on either a subdaemon, timer or
plugin. In shutdown loop the subdaemons are dead, timer struct cleared and plugins will be taken
care of (in next commits).
fixup: we can only io_break the main io_loop once
Before:
Ten builds, laptop -j5, no ccache:
```
real 0m36.686000-38.956000(38.608+/-0.65)s
user 2m32.864000-42.253000(40.7545+/-2.7)s
sys 0m16.618000-18.316000(17.8531+/-0.48)s
```
Ten builds, laptop -j5, ccache (warm):
```
real 0m8.212000-8.577000(8.39989+/-0.13)s
user 0m12.731000-13.212000(12.9751+/-0.17)s
sys 0m3.697000-3.902000(3.83722+/-0.064)s
```
After:
Ten builds, laptop -j5, no ccache: 8% faster
```
real 0m33.802000-35.773000(35.468+/-0.54)s
user 2m19.073000-27.754000(26.2542+/-2.3)s
sys 0m15.784000-17.173000(16.7165+/-0.37)s
```
Ten builds, laptop -j5, ccache (warm): 1% faster
```
real 0m8.200000-8.485000(8.30138+/-0.097)s
user 0m12.485000-13.100000(12.7344+/-0.19)s
sys 0m3.702000-3.889000(3.78787+/-0.056)s
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Changelog-Deprecated: RPC framework now requires the "jsonrpc" property inside the request.
Changelog-Fixed: RPC framework now requires the "jsonrpc" property to be specified inside each request.
We were masquerading errors when parsing the request by reporting only
a bogus malformed `id` field in the response, when the real issue was
that we were unable to parse the request in the first place (which
caused the null-id error to be returned).
Fixes #4238
This lets callers enable notifications; we won't send any if they don't.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `notifications` command to enable notifications.
time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null
Before:
real 0m42.741s
user 0m0.149s
sys 0m0.016s
After:
real 0m13.674s
user 0m0.131s
sys 0m0.024s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: significant speedups for plugins which create large JSON replies (e.g. listpays on large nodes).
The jsmn parser is a beautiful piece of code. In particular, you can parse
part of a string, then continue where you left off.
We don't take advantage of this, however, meaning for large JSON objects
we parse them multiple times before finally having enough to complete.
Expose the parser state and tokens through the API, so the caller can pass
them in repeatedly. For the moment, every caller allocates each time
(except the unit tests).
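To illustrate why keeping parser state helps, here is a toy resumable
scanner. It is not jsmn: `struct scan_state` and `scan_resume()` are
hypothetical and only track brace depth and string state. The caller owns
the state and passes it back in as the buffer grows, so each byte is
examined once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Caller-owned state, kept between calls as more input arrives. */
struct scan_state {
	size_t pos;      /* next unread offset in the buffer */
	int depth;       /* current '{' nesting depth */
	bool in_string;  /* inside a "..." string */
	bool escaped;    /* previous character was a backslash in a string */
};

/* Scans buf[st->pos .. len) and returns true once a complete top-level
 * object has been seen. On false, call again with a longer buffer. */
static bool scan_resume(struct scan_state *st, const char *buf, size_t len)
{
	for (; st->pos < len; st->pos++) {
		char c = buf[st->pos];

		if (st->in_string) {
			if (st->escaped)
				st->escaped = false;
			else if (c == '\\')
				st->escaped = true;
			else if (c == '"')
				st->in_string = false;
		} else if (c == '"') {
			st->in_string = true;
		} else if (c == '{') {
			st->depth++;
		} else if (c == '}') {
			if (--st->depth == 0) {
				st->pos++; /* consume the closing brace */
				return true;
			}
		}
	}
	return false;
}
```

A caller that threw the state away and rescanned from offset 0 on every read
would do quadratic work on a large reply, which is the cost being avoided.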
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If allow-deprecated-apis=false, don't mention them at all (we already
disallow calling them in that case). Otherwise, note that they're
deprecated in the help msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we know whether the command completed or not, we can correctly
call command_still_pending() if it didn't complete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Commit 9aedb0c61f changed this from allocating off `c` to allocating
off NULL, knowing that it's tal_steal()ed in the callback. But before
that happens, it can be detected as a mem leak:
```
@pytest.fixture
def teardown_checks(request):
"""A simple fixture to collect errors during teardown.
We need to collect the errors and raise them as the very last step in the
fixture tree, otherwise some fixtures may not be cleaned up
correctly. Require this fixture in all other fixtures that need to either
cleanup before reporting an error or want to add an error that is to be
reported.
"""
errors = TeardownErrors()
yield errors
if errors.has_errors():
# Format a nice list of everything that went wrong and raise an exception
request.node.has_errors = True
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E Global errors:
E - Node /tmp/ltests-iz9y1chb/test_hsmtool_secret_decryption_1/lightning-1/ has memory leaks: [
E {
E "backtrace": [
E "ccan/ccan/tal/tal.c:442 (tal_alloc_)",
E "lightningd/jsonrpc.c:848 (parse_request)",
E "lightningd/jsonrpc.c:941 (read_json)",
E "ccan/ccan/io/io.c:59 (next_plan)",
E "ccan/ccan/io/io.c:407 (do_plan)",
E avis/build/ElementsProject/lightning/lightningd/../plugins/pay
```
Reported-by: @niftynei
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
One is called on every plugin return, and tells us whether to continue;
the other is only called if every plugin says ok.
This works for things like payload replacement, where we need to process
the results from each plugin, not just the final one!
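The two-callback shape can be sketched like this. It is a simplified model,
not the real plugin_hook API: replies are plain strings, and
`struct chain_result`, `on_each_reply()`, `on_all_continued()` and
`run_chain()` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct chain_result {
	size_t plugins_seen;  /* replies the per-plugin callback processed */
	bool finished;        /* set only by the final callback */
};

/* Called on every plugin return; the result says whether to continue. */
static bool on_each_reply(const char *reply, struct chain_result *res)
{
	res->plugins_seen++;
	return strcmp(reply, "continue") == 0;
}

/* Called only if every plugin said to continue. */
static void on_all_continued(struct chain_result *res)
{
	res->finished = true;
}

/* Walks the chain in order, stopping at the first plugin that handles it. */
static struct chain_result run_chain(const char *const *replies, size_t n)
{
	struct chain_result res = { 0, false };

	for (size_t i = 0; i < n; i++)
		if (!on_each_reply(replies[i], &res))
			return res;
	on_all_continued(&res);
	return res;
}
```

Because the per-plugin callback sees every reply, it can accumulate state
such as a replaced payload before the final callback runs.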
We should probably turn everything into a chained callback next
release.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The callback must take ownership of the payload (almost all do, but
now it's explicit).
And since the payload and cb_arg arguments to plugin_hook_call_() are
always identical, make them a single parameter.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were nesting like the following:
```json
{"params": {
"rpc_command": {
"rpc_command": {
}
}
}
```
This is really excessive, so we unwrap once, and now have the following:
```json
{"params": {
"rpc_command": {
}
}
```
Still more wrapping than necessary (the method is repeated in the `params`
object), but it's getting closer.
Changelog-Deprecated: JSON-RPC: Removed double wrapping of `rpc_command` payload in `rpc_command` JSON field.
Suggested-by: @fiatjaf
Signed-off-by: Christian Decker <@cdecker>
This is a common thing to do, so create a macro.
Unfortunately, it still needs the type arg, because the parameter may
be const, and the return cannot be, and C doesn't have a general
"(-const)" cast.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I reproduced this by putting a sleep(60) in the pay plugin, then
'lightning-cli pay', 'lightning-cli plugin stop pay' and then ^C
the `lightning-cli pay`:
2020-02-14T00:33:11.217Z INFO plugin-pay: Killing plugin: pay stopped by lightningd via RPC
2020-02-14T00:33:15.250Z DEBUG lightningd: Still waiting for initial block download
==5157== Invalid read of size 8
==5157== at 0x12A29C: destroy_jcon (jsonrpc.c:149)
==5157== by 0x1C6F2A: notify (tal.c:235)
==5157== by 0x1C7441: del_tree (tal.c:397)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x1B7380: io_close (io.c:450)
==5157== by 0x1B71A7: do_plan (io.c:401)
==5157== by 0x1B7214: io_ready (io.c:417)
==5157== by 0x1B94AC: io_loop (poll.c:445)
==5157== by 0x1291C9: io_loop_with_timers (io_loop_with_timers.c:24)
==5157== by 0x12EC7E: main (lightningd.c:928)
==5157== Address 0x4ebab98 is 40 bytes inside a block of size 88 free'd
==5157== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C750F: del_tree (tal.c:416)
==5157== by 0x1C7493: del_tree (tal.c:407)
==5157== by 0x1C77DD: tal_free (tal.c:481)
==5157== by 0x153856: clear_plugin (plugin_control.c:209)
==5157== by 0x1538FF: plugin_dynamic_stop (plugin_control.c:225)
==5157== by 0x153C51: json_plugin_control (plugin_control.c:295)
==5157== by 0x12B4EC: command_exec (jsonrpc.c:588)
==5157== by 0x12B8AB: rpc_command_hook_callback (jsonrpc.c:679)
==5157== by 0x154575: plugin_hook_call_ (plugin_hook.c:170)
==5157== by 0x12BCD3: plugin_hook_call_rpc_command (jsonrpc.c:756)
==5157== by 0x12BD04: call_rpc_command_hook (jsonrpc.c:764)
==5157== Block was alloc'd at
==5157== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5157== by 0x1C6F98: allocate (tal.c:245)
==5157== by 0x1C7559: tal_alloc_ (tal.c:423)
==5157== by 0x15135A: plugin_rpcmethod_add (plugin.c:706)
==5157== by 0x151600: plugin_rpcmethods_add (plugin.c:756)
==5157== by 0x151BDD: plugin_parse_getmanifest_response (plugin.c:893)
==5157== by 0x151C9C: plugin_manifest_cb (plugin.c:915)
==5157== by 0x14FFB9: plugin_response_handle (plugin.c:258)
==5157== by 0x150165: plugin_read_json_one (plugin.c:356)
==5157== by 0x1502BC: plugin_read_json (plugin.c:388)
==5157== by 0x1B65ED: next_plan (io.c:59)
==5157== by 0x1B71D2: do_plan (io.c:407)
Fixes: #3509
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The newly introduced type is used to determine what the call semantics of the
hook are. We have `single` corresponding to the old behavior, as well as
`chain` which allows multiple plugins to register for the hook, and they are
then called sequentially (if all plugins return `{"result": "continue"}`) or
exit the chain if the hook event was handled.
Changelog-Changed: The hooks `db_write`, `invoice_payment`, and `rpc_command` now accept `{ "result": "continue" }` to mean "do default action", in addition to `true` (`db_write`), `{}` (`invoice_payment`), and `{"continue": true}` (`rpc_command`). The older "default" indicators are now deprecated and are recognized only if `--allow-deprecated-apis` is set.
Before this patch we used `int` for error codes. The problem with
`int` is that we try to pass it to/from wire and the size of `int` is
not defined by the standard. So a sender with 4-byte `int` would write
4 bytes to the wire and a receiver with 2-byte `int` (for example) would
read just 2 bytes from the wire.
To resolve this:
* Introduce an error code type with a known size:
`typedef s32 errcode_t`.
* Change all error code macros to constants of type `errcode_t`.
Constants also play better with gdb: it shows the name of
the constant instead of the numeric value.
* Change all functions that take error codes to take the new type
`errcode_t` instead of `int`.
* Introduce towire / fromwire functions to send / receive the newly added
type `errcode_t` and use it instead of `towire_int()`.
In addition:
* Remove the now unneeded `towire_int()`.
* Replace a hardcoded error code `-2` with a new constant
`INVOICE_EXPIRED_DURING_WAIT` (903).
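A simplified sketch of a fixed-width encoding. It assumes `s32` is a 32-bit
signed integer and that the value is written as four big-endian bytes; the
helper names `errcode_to_wire()` and `errcode_from_wire()` and their
signatures are hypothetical, not the real towire / fromwire functions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: s32 corresponds to int32_t. */
typedef int32_t errcode_t;

static const errcode_t INVOICE_EXPIRED_DURING_WAIT = 903;

/* Always writes exactly 4 bytes, whatever sizeof(int) is on the sender. */
static void errcode_to_wire(uint8_t out[4], errcode_t code)
{
	uint32_t u = (uint32_t)code;

	out[0] = (uint8_t)(u >> 24);
	out[1] = (uint8_t)(u >> 16);
	out[2] = (uint8_t)(u >> 8);
	out[3] = (uint8_t)u;
}

/* Always reads exactly 4 bytes, whatever sizeof(int) is on the receiver. */
static errcode_t errcode_from_wire(const uint8_t in[4])
{
	uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
		| ((uint32_t)in[2] << 8) | (uint32_t)in[3];

	/* Conversion back to signed wraps on two's-complement platforms. */
	return (errcode_t)u;
}
```

Both sides now agree on the byte count by construction, so negative codes
such as the old -2 and positive ones such as 903 round-trip identically.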
Changelog-Changed: The waitinvoice command now returns error code 903 to designate that the invoice expired during wait, instead of the previous -2
In particular:
1. It must redirect to an existing command.
2. It must contain method, params and id.
And update the docs to show the id, which is vital.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lightning-cli is going to need to know what network we're on, so
it will need to parse the config files. Move the code which does
the initial bootstrap parsing into common, as well as the config
file parsing core.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Spaces just make life a little harder for everyone.
(Plus, fix documentation: it's 'jsonrpc' not 'json' subsystem).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Printed form is always "[<nodeid>-]<prefix>: <string>"
2. "jcon fd %i" becomes "jsonrpc #%i".
3. "jsonrpc" log is only used once, and is removed.
4. "database" log prefix is used for db accesses.
5. "lightningd(%i)" becomes simply "lightningd" without the pid.
6. The "lightning_" prefix is stripped from subd log prefixes, and pid removed.
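As an illustration of rule 1, the printed form could be produced by
something like the following. `format_log_line()` is a hypothetical helper,
not the real logging code; a NULL node id means the optional
"<nodeid>-" part is omitted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "[<nodeid>-]<prefix>: <string>" into buf.
 * Returns what snprintf() returns (length that would have been written). */
static int format_log_line(char *buf, size_t len, const char *nodeid,
			   const char *prefix, const char *msg)
{
	if (nodeid)
		return snprintf(buf, len, "%s-%s: %s", nodeid, prefix, msg);
	return snprintf(buf, len, "%s: %s", prefix, msg);
}
```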
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Logging: formatting made uniform: [NODEID-]SUBSYSTEM: MESSAGE
Changelog-Removed: `lightning_` prefixes removed from subdaemon names, including in listpeers `owner` field.
We had a separate logbook for each peer, and copy log entries above
the printable log level into the master logbook. This didn't always
work well, since we didn't dump it on crash for example.
Keep a single global logbook instead, and remove this infrastructure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A log can have a default node_id, which can be overridden on a per-entry
basis. This changes the format of logging, so some tests need rework.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The 'rpc_command' hook allows a plugin to take over any RPC command.
It sends the complete JSONRPC request to the plugin, which can then respond
with:
- {'continue'}: executes the command normally
- {'replace': {a_jsonrpc_request}}: replaces the request made
- {'return': {'result': {}}}: send a custom response
- {'return': {'error': {}}}: send a custom error
This way, a plugin can modify (/reimplement) or restrict the usage of
any of `lightningd`'s commands.
Changelog-Added: Plugin: A new plugin hook, `rpc_command` allows a plugin to take over `lightningd` for any RPC command.
Rather than reaching into data structures, let them register their own
callbacks. This avoids us having to expose "memleak_remove_xxx"
functions, and call them manually.
Under the hood, this is done by having a specially-named tal child of
the thing we want to assist, containing the callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I suspect multiple plugins trying to connect at the same
time are overrunning the 1-deep listen queue:
From man listen(2):
The backlog argument defines the maximum length to which the queue of
pending connections for sockfd may grow. If a connection request arrives
when the queue is full, the client may receive an error with an
indication of ECONNREFUSED
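A minimal sketch of a UNIX-domain listener with a configurable backlog.
`listen_unix()` is a hypothetical helper, not the function changed in this
commit, and the right backlog value is left to the caller:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Binds a stream socket at 'path' and starts listening with the given
 * backlog. Returns the fd, or -1 on any failure. */
static int listen_unix(const char *path, int backlog)
{
	struct sockaddr_un addr;
	int fd;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1; /* path would not fit in sun_path */

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	unlink(path); /* remove any stale socket file; errors ignored */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0
	    || listen(fd, backlog) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With a backlog of 1, several plugins connecting at once can overflow the
queue before accept() runs; a larger value gives them room to wait.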
Fixes: #2922
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>