The main change which affects us is that 2016 blocks to forget a channel
is a fixed number in the spec; we make this clear by renaming the
(developer-only) max_funding_unconfirmed to dev_max_funding_unconfirmed
and making it compile DEVELOPER only.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were reporting the failure immediately but still continuing with
the startup. This could happen if an important plugin ends up in a
race with another plugin (important or not) for a contended
resource (CLI option or RPC method name). We would eventually notice
that we were supposed to abort, but at that point we already processed
a couple of blocks, loaded the entire state, etc.
This just aborts early with a sane error message.
Changelog-Added: plugin: If there is a misconfiguration with important plugins we now abort early with a more descriptive error message.
Reported-by: PsySc0rpi0n
Reported-by: Ján Sáreník <@jsarenik>
We use waitpid() manually for subdaemons, so we need to step
around that (otherwise we could simply ignore them).
We could destroy subdaemons only once they've exited, but
that works badly with the sd->conn, which will be freed
when error (i.e. close) is detected, so the current code
is probably the best compromise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since this all stays in dualopend/dual_open_control, we can hold
onto the openchannel_signed command to wait for a response here locally.
Previously we were splitting across the channeld/openingd boundary.
This avoids duplication of both logic and error-prone values, such as
the salt. Grouping all hsm encryption logic into a public API will also
allow us to fuzz it.
Signed-off-by: Antoine Poinsot <darosior@protonmail.com>
`openchannel_signed` commands hang out across the openingd/channeld
boundary -- we don't return until we've successfully broadcast the
transaction (or timed out waiting for them to send a tx_sigs back).
The 100MB log buffer has been the biggest memory footprint for the daemon.
Keeping 10MB for emergency log dumps seems sufficient.
This has been mentioned in the last developer meeting.
Changelog-Changed: In-memory log buffer reduced from 100MB to 10MB
We're going to want this for bolt13 formation as well.
As a result of reworking the logic into "candidate selection" then
"route hint selection", we need to change the way round-robin works.
We use a simple incrementing index now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This removes the reservation cleanup at startup, too, now they're all
using 'reserved_til'.
This changes test_withdraw, since it asserted that outputs were marked
spent as soon as we broadcast a transaction: now they're reserved until
it's mined. Similarly, test_addfunds_from_block assumed we'd see funds
as soon as we broadcast the tx.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON-RPC: `withdraw` now randomizes input and output order, not BIP69.
we're about to add a migration that requires access to the bip32_key
in order to calculate missing scriptpubkeys.
prior to this patch, we don't have access to the bip32 key in the db
migration, as it's set on the wallet but after the db migrations are
run.
here we patch it through so that every migration can access it
Reduces VALGRIND=1 node_factory.line_graph(5) time on my laptop from 42s to 36s.
This is simply because forking all the subdaemons just to check the
version is very expensive under valgrind.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to use the hsm for a migration, so we need to set up the HSM
before we get to the wallet migration code.
All that this requires is removing the places in HSM init that we touch
the database struct -- easy enough to accomplish by passing the required
field back out from init, and then associating it onto the wallet after
it's been initialized.
This will allow the dynamic starting code to use them too.
Also lets us move dev_debug_subprocess under #if DEVELOPER.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the new function `plugins_free` to define the correct deallocation
order on shutdown, since under normal operation the allocation tree is
organized to allow plugins to terminate and automatically free all dependent
resources. During shutdown the deallocation order is under-defined since
siblings may get freed in any order, but we implicitly rely on them staying
around.
common/onion is going to need to use this for the case where it finds a blinding
seed inside the TLV. But how it does ecdh is daemon-specific.
We already had this problem for devtools/gossipwith, which supplied a
special hsm_do_ecdh(). This just makes it more general.
So we create a generic ecdh() interface, with a specific implementation
which subdaemons and lightningd can use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's almost always "their_features" and "our_features" respectively, so
make those names clear.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Turns out that unnecessary: all callers can access the feature_set,
so make it much more like a normal primitive.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is to prepare for dynamic features, including making plugins first
class citizens at setting them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of saving a stripped_update, we use the new
local_fail_in_htlc_needs_update.
One minor change: we return the more correct
towire_temporary_channel_failure when the node is still syncing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We make the current state of `lightningd` explicit so we don't have to
identify a shutdown by its side-effects. We then use this in order to prevent
the killing and freeing of plugins to continue down the chain of registered
plugins.
Changelog-Added: lightningd: Added --subdaemon command to allow alternate subdaemons.
[ Wow, that was mammoth; 44 comments over 12 commits. Feels almost unfair to squash it into one commit, so I wanted to note @ksedgwic's perseverence here! --RR ]
This is needed to fully implement handling of blockheight disagreements
between us and payee.
If payee believes the blockheight is higher than ours, then `pay`
should wait for our node to achieve that blockheight.
Changelog-Add: Implement `waitblockheight` to wait for a specific blockheight.
Thanks to @t-bast, who made this possible by interop testing with Eclair!
Changelog-Added: Protocol: can now send and receive TLV-style onion messages.
Changelog-Added: Protocol: can now send and receive BOLT11 payment_secrets.
Changelog-Added: Protocol: can now receive basic multi-part payments.
Changelog-Added: RPC: low-level commands sendpay and waitsendpay can now be used to manually send multi-part payments.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
AFAICT this only "worked" previously because replay htlc simply failed
them all (no peers are currently connected). With upcoming changes
(foreshadowed by the comment) this is no longer true:
Attempting to prepare a db_stmt outside of a transaction: wallet/invoices.c:373
lightningd: FATAL SIGNAL 6 (version v0.7.3-188-g45b0af4-modded)
0x55b475590a73 send_backtrace
common/daemon.c:41
0x55b475590b1d crashdump
common/daemon.c:54
0x7f16c557b46f ???
???:0
0x7f16c557b3eb ???
???:0
0x7f16c555a898 ???
???:0
0x55b475564c8f fatal
lightningd/log.c:814
0x55b4755c3ed5 db_prepare_v2_
wallet/db.c:605
0x55b4755c76b5 invoices_find_unpaid
wallet/invoices.c:373
0x55b4755ce91c wallet_invoice_find_unpaid
wallet/wallet.c:1990
0x55b47555861f invoice_check_payment
lightningd/invoice.c:257
0x55b475557a7c htlc_add_set
lightningd/htlc_set.c:112
0x55b47557b294 handle_localpay
lightningd/peer_htlcs.c:332
0x55b47557c63c htlc_accepted_hook_callback
lightningd/peer_htlcs.c:857
0x55b475585573 plugin_hook_call_
lightningd/plugin_hook.c:118
0x55b47557c747 plugin_hook_call_htlc_accepted
lightningd/peer_htlcs.c:882
0x55b47557ca3e peer_accepted_htlc
lightningd/peer_htlcs.c:991
0x55b47557ffb9 htlcs_resubmit
lightningd/peer_htlcs.c:2131
0x55b4755620f7 main
lightningd/lightningd.c:801
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This isn't plumbed in yet, but the idea is that every htlc gets put
into a "set" and then we process them once the set is satisfied. For
the !EXPERIMENTAL_FEATURES, the set is simply always size 1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a transient field, so rework things so we don't leave it in
struct htlc_out. Instead, load htlc_in first and connect htlc_out to
them as we go.
This also changes one place where we use it instead of the am_origin
flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lightning-cli is going to need to know what network we're on, so
it will need to parse the config files. Move the code which does
the initial bootstrap parsing into common, as well as the config
file parsing core.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows finegrained logging control of particular subdaemons or
subsystems.
To do this, we defer setting the logging levels for each log object
until after early argument parsing (since e.g. "bitcoind" log object
is created early).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: Options: log-level can now specify different levels for different subsystems.
1. Printed form is always "[<nodeid>-]<prefix>: <string>"
2. "jcon fd %i" becomes "jsonrpc #%i".
3. "jsonrpc" log is only used once, and is removed.
4. "database" log prefix is use for db accesses.
5. "lightningd(%i)" becomes simply "lightningd" without the pid.
6. The "lightningd_" prefix is stripped from subd log prefixes, and pid removed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: Logging: formatting made uniform: [NODEID-]SUBSYSTEM: MESSAGE
Changelog-removed: `lightning_` prefixes removed from subdaemon names, including in listpeers `owner` field.
We had a separate logbook for each peer, and copy log entries above
the printable log level into the master logbook. This didn't always
work well, since we didn't dump it on crash for example.
Keep a single global logbook instead, and remove this infrastructure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A log can have a default node_id, which can be overridden on a per-entry
basis. This changes the format of logging, so some tests need rework.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
==1310== Conditional jump or move depends on uninitialised value(s)
==1310== at 0x127C7F: io_loop_with_timers (io_loop_with_timers.c:30)
==1310== by 0x14F0E1: plugins_init (plugin.c:1019)
==1310== by 0x12E4B1: main (lightningd.c:694)
==1310==
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
--dev-force-tmp-channel-id flag takes a 64-character hex string
to use as the temporary channel id. Useful for spec tests
[ Fixed crash in non-DEVELOPER mode --RR ]
Changelog-None
This splits maybe_create_hsm_secret() in two parts (either encrypted
or in clear) for clarity, and adds an encryption detection in load_hsm().
There are actually three cases if an encryption key is passed:
- There is no hsm_secret => just create it and store the encrypted seed
- There is an encrypted hsm_secret => the provided key should be able to
decrypt the seed, if the wrong key is passed libsodium will nicely error
and hsmd will exit() to not throw a backtrace (using status_failed() as for
other errors) at the face of an user who mistyped its password.
- There is a non-encrypted hsm_secret => load the seed, delete the
hsm_secret, create the hsm_secret, store the encrypted seed.
Add a new startup option which will, if set, prompt the user for a
password to derive a key from. This key will later be used to encrypt
and/or decrypt `hsm_secret`.
This was made a noarg option even if it would have been preferable to
let the user the choice of how to specify the password. Since we have
to chose, better to not let the password in the commands history.
I was seeing some accidental pruning under load / Travis, and in
particular we stopped accepting channel_updates because they were 103
seconds old. But making it too long makes the prune test untenable,
so restore a separate flag that this test can use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should never open more than 1024 file descriptors anyway, and under some
situations, namely running as root or in docker, would give us huge
allowances. This then results in a huge, unneeded, cleanup for subprocesses,
which we use a lot.
Fixes#2977
This will let gossipd be more intelligent about gossiping before we're
synced, and also it might know how far behind we are.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's generally clearer to have simple hardcoded numbers with an
#if DEVELOPER around it, than apparent variables which aren't, really.
Interestingly, our pruning test was always kinda broken: we have to pass
two cycles, since l2 will refresh the channel once to avoid pruning.
Do the more obvious thing, and cut the network in half and check that
l1 and l3 time out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now have a much stronger consistency check from the combination of
transaction wrapping, tal memory leak detection. Tramsaction wrapping ensures
that each statement is executed before the transaction is committed. The
commit is also driven by the `io_loop`, which means that it is no longer
possible for us to have statements outside of transactions and transactions
are guaranteed to commit at the round's end.
By adding the tal-awareness we can also get a much better indication as to
whether we have un-freed statements flying around, which we can test at the
end of the round as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was causing `--help` to fail if we already had a `lightningd` running
with the same `--lightning-dir`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
1. Now checking the pid file really does precede touching the db and
starting plugins, which is far safer.
2. Crashlog is now activated just after daemon parent release, and just
before the main loop, which means no "crash" on startup if we call fatal().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Dumb programs which have a --daemon option call fork() early. This is
terrible UX since startup errors get lost: the program exits with
"success" immediately then you discover via the logs that it didn't
start at all.
However, forking late introduced a heap of problems with changing
pids. Instead, fork early but keep stderr and the parent around: if
we fail early on, the parent fails with us. We release our parent
with an explicit action just before the main loop.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that we move adding the plugin to the plugins list to the end, otherwise
the hook from logging can examine the (uninitialized) plugin.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This adds a 'configured' boolean member to the plugin struct so that we can add plugins to ld->plugins' list and differenciate fresh plugins.
This also adds 'plugins_start' so that new plugins can be started without calling 'plugins_init' and running an io loop
I noticed that --network=regtest didn't override 'network=bitcoin' in
the config file.
Normally we parse the config file first, then the commandline (so the cmdline
wins). But for early options, we do cmdline first so we can find the config
file. That was fine when the only early option was the location of the
config file, but now it includes plugins and the network setting.
So do a boutique cmdline parse *just* to find the config file, then parse
the config file early options, then the cmdline early options.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7ff02889063e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10c63e)
#1 0x555ce2ad8d2e in htable_default_alloc ccan/ccan/htable/htable.c:19
#2 0x555ce2ad9698 in double_table ccan/ccan/htable/htable.c:226
#3 0x555ce2ad9b62 in htable_add_ ccan/ccan/htable/htable.c:331
#4 0x555ce2a638e4 in htlc_in_map_add lightningd/htlc_end.h:113
#5 0x555ce2a63beb in connect_htlc_in lightningd/htlc_end.c:39
#6 0x555ce2a85cbc in channel_added_their_htlc lightningd/peer_htlcs.c:1382
#7 0x555ce2a860e1 in peer_got_commitsig lightningd/peer_htlcs.c:1466
#8 0x555ce2a5db04 in channel_msg lightningd/channel_control.c:228
#9 0x555ce2a8d393 in sd_msg_read lightningd/subd.c:474
#10 0x555ce2ada157 in next_plan ccan/ccan/io/io.c:59
#11 0x555ce2adacd4 in do_plan ccan/ccan/io/io.c:407
#12 0x555ce2adad12 in io_ready ccan/ccan/io/io.c:417
#13 0x555ce2adcd67 in io_loop ccan/ccan/io/poll.c:445
#14 0x555ce2a67c66 in io_loop_with_timers lightningd/io_loop_with_timers.c:24
#15 0x555ce2a6e56b in main lightningd/lightningd.c:822
#16 0x7ff028242b6a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x26b6a)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Direct leak of 1024 byte(s) in 2 object(s) allocated from:
#0 0x7f4c84ce4448 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10c448)
#1 0x55d11b782c96 in timer_default_alloc ccan/ccan/timer/timer.c:16
#2 0x55d11b7832b7 in add_level ccan/ccan/timer/timer.c:166
#3 0x55d11b783864 in timer_fast_forward ccan/ccan/timer/timer.c:334
#4 0x55d11b78396a in timers_expire ccan/ccan/timer/timer.c:359
#5 0x55d11b774993 in io_loop ccan/ccan/io/poll.c:395
#6 0x55d11b72322f in plugins_init lightningd/plugin.c:1013
#7 0x55d11b7060ea in main lightningd/lightningd.c:664
#8 0x7f4c84696b6a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x26b6a)
To fix this, we actually make 'ld->timers' a pointer, so we can clean
it up last of all. We can't free it before ld, because that causes
timers to be destroyed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is an old bug, where a plugin can get called while we're shutting
down (and have freed plugins), but it's triggered more reliably by the
new warning notification hook.
For good measure, we also make freeing a plugin self-delete.
Valgrind error file: valgrind-errors.16763
==16886== Invalid read of size 8
==16886== at 0x422919: plugins_notify (plugin.c:1096)
==16886== by 0x413919: notify_warning (notification.c:61)
==16886== by 0x412BDE: logv (log.c:251)
==16886== by 0x412A98: log_ (log.c:311)
==16886== by 0x4044BE: bcli_finished (bitcoind.c:178)
==16886== by 0x459480: destroy_conn (poll.c:244)
==16886== by 0x459499: destroy_conn_close_fd (poll.c:250)
==16886== by 0x4619E1: notify (tal.c:235)
==16886== by 0x461A7E: del_tree (tal.c:397)
==16886== by 0x461AB5: del_tree (tal.c:407)
==16886== by 0x461AB5: del_tree (tal.c:407)
==16886== by 0x461AB5: del_tree (tal.c:407)
==16886== Address 0x634a578 is 40 bytes inside a block of size 352 free'd
==16886== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16886== by 0x461AFD: del_tree (tal.c:416)
==16886== by 0x461FB7: tal_free (tal.c:481)
==16886== by 0x411E0A: main (lightningd.c:841)
==16886== Block was alloc'd at
==16886== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16886== by 0x4617CE: allocate (tal.c:245)
==16886== by 0x461E4C: tal_alloc_ (tal.c:423)
==16886== by 0x42255E: plugins_new (plugin.c:106)
==16886== by 0x41133D: new_lightningd (lightningd.c:218)
==16886== by 0x411AD4: main (lightningd.c:649)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a painpoint with testing, that there's a noticable delay between
"Shutting down" from lightning-cli and being able to restart lightningd.
This fixes that by creating a canned response for this case, which is
simply written out immediately before exit. At this point, the pidfile
has been deleted, the sockets have been closed, and the database
has been closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are generalized from our internal implementations.
The main difference is that 'struct json_escaped' is now 'struct
json_escape', so we replace that immediately.
The difference between lightningd's json-writing ringbuffer and the
more generic ccan/json_out is that the latter has a better API and
handles escaping transparently if something slips through (though
it does offer direct accessors so you can mess things up yourself!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- Related Changes for `warning` notification
Add a `bool` type parameter in `log_()` and `lov()`, this `bool` flag
indicates if we should call `warning` notifier.
1) The process of copying `log_book` of every peer to the `log_book` of
`ld` is usually included in `log_()` and `lov()`, and it may lead to
repeated `warning` notification. So a `bool`, which explicitly indicates
if the `warning` notification is disabled during this call, is necessary
.
2) The `LOG_INFO` and `LOG_DEBUG` level don't need to call
warning, so set that `bool` paramater as `FALSE` for these log level and
only set it as `TRUE` for `LOG_UNUAUSL`/`LOG_BROKEN`. As for `LOG_IO`,
it use `log_io()` to log, so we needn't think about notifier for it.
We reserve inputs when we're going to send a transaction, but we don't
unreserve them if we crash. This is most graphically demonstrated by
the txprepare case, which makes it easier to trigger.
Instead, we should query bitcoind to see whether the tx made it out or
not, as we would do manually with dev-rescan-outputs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now have a test blockchain for MCP which has the correct channels,
so this is not needed.
Also fix a benchmark script bug where 'mv "$DIR"/log
"$DIR"/log.old.$$' would fail if you log didn't exist from a previous run.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tried to just do gossipd, but it was uncontainable, so this ended up being
a complete sweep.
We didn't get much space saving in gossipd, even though we should save
24 bytes per node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us benchmark without a valid blockchain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Header from folded patch 'fixup!_gossipd__dev_option_to_allow_unknown_channels.patch':
fixup! gossipd: dev option to allow unknown channels.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will make the logger write 4 newlines to re-attached logfiles.
The newlines wont appear on logfiles that are just created.
Additionally the server prints 50 '-' dashes before printing his
startup message, which also help increase readability on logfile.
This was inspired by the way Bitcoin Core handles logfiles.
Next patch will call commands to get usage inside jsonrpc_new(): to do
this it will need access to ld->jsonrpc, so we can't use the current
pattern.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we are planning to release a bug fix release, and the plugin
subsystem is not yet complete, it is better to make plugin support
opt-in while we continue testing.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The processes that were used to test the subdaemon versions were not
reaped correctly keeping some resources bound.
Signed-off-by: Christian Decker <decker.christian@gmail.com>