9358 Commits

Author SHA1 Message Date
Rusty Russell
82ed71d621 connectd: don't crash if connect() fails immediately.
Took me a while (stressing under valgrind) to reproduce this,
then longer to figure out how it happened.

Turns out io_new_conn() can fail if the init function fails.
In our case, this can happen if connect() immediately returns
an error (inside io_connect).  But we've already set the finish
function, which (if this was the last address), will free connect,
making the assignment `connect->conn = ...` write to a freed address.

Either way, if it fails, try_connect_one_addr() has taken care to
update connect->conn, or free connect, and the caller should not do it.

Here's the valgrind trace:
```
==384981== Invalid write of size 8
==384981==    at 0x11127C: try_connect_one_addr (connectd.c:880)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1400B3: do_plan (io.c:401)
==384981==    by 0x140134: io_ready (io.c:423)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
==384981==  Address 0x4d67020 is 64 bytes inside a block of size 160 free'd
==384981==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x14A020: del_tree (tal.c:421)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x1110C5: try_connect_one_addr (connectd.c:806)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1405DC: io_connect_ (io.c:345)
==384981==  Block was alloc'd at
==384981==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x149CF1: allocate (tal.c:250)
==384981==    by 0x14A3C6: tal_alloc_ (tal.c:428)
==384981==    by 0x1114F2: try_connect_peer (connectd.c:1526)
==384981==    by 0x111717: connect_to_peer (connectd.c:1558)
==384981==    by 0x1124F5: recv_req (connectd.c:1627)
==384981==    by 0x1188B2: handle_read (daemon_conn.c:31)
==384981==    by 0x13FBCB: next_plan (io.c:59)
==384981==    by 0x140076: do_plan (io.c:407)
==384981==    by 0x140113: io_ready (io.c:417)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Occasional crash in connectd due to use-after-free
Fixes: #4343
2021-02-01 21:01:06 +01:00
Rusty Russell
0056dd7557 lightningd: disallow --daemon without --log-file.
From #clightning:

    (11:24:10) andytoshi: hiya, i'm trying to set up a new lightningd node, and when i run lightningd --network=bitcoin --log-level=debug --daemon
    (11:24:17) andytoshi: i get errors of the form fetchinvoice: Malformed JSON reply '2021-01-25T00:51:16.655Z DEBUG   plugin-offers: disabled itself at init: offers not enabled in config
    (11:24:43) andytoshi: there are a couple variants of this, but always some form of "something: failed to parse <a log line> as json"

Indeed, we close stdout, and it ends up being reused for some plugin.
But the real problem is that we log to stdout by default, which doesn't
make sense.  If they really want to discard logs, they can use
--log-file=/dev/null.
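
The descriptor reuse described above can be reproduced in isolation. The following is a minimal sketch, not lightningd code: it assumes a POSIX-style system where a newly opened descriptor gets the lowest free number, and the helper name `reopen_after_close` is hypothetical. A pipe end stands in for stdout so that the demonstration does not disturb the real one.

```python
import os


def reopen_after_close():
    """Close a descriptor, open something unrelated, and report both numbers."""
    # Stand-in for stdout: a fresh descriptor we are free to close.
    victim, other = os.pipe()
    os.close(victim)
    # The lowest free descriptor number is handed out next, so this
    # unrelated open() is expected to land on the number just closed.
    reused = os.open(os.devnull, os.O_WRONLY)
    os.close(reused)
    os.close(other)
    return victim, reused


closed_fd, new_fd = reopen_after_close()
print(f"closed fd {closed_fd}, next open() returned fd {new_fd}")
```

Anything that still writes to the old number (log output, in the case above) then ends up in whatever was opened afterwards.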

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON failures when --daemon is used without --log-file.
2021-02-01 09:57:54 +10:30
Rusty Russell
5eb209f57a bitcoind: remove v0.9.0-compat for rejecting sendrawtransaction arg.
Changelog-Removed: `bcli` replacements must allow `allowhighfees` argument (deprecated 0.9.1).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-01 09:57:35 +10:30
Rusty Russell
406eb37717 listsendpays: remove deprecated "null" amount_msat.
Changelog-Removed: `listsendpays` will no longer add `amount_msat` `null` (deprecated 0.9.1).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-01 09:57:35 +10:30
Michael Schmoock
9eeb290637 chore: cleanup some nits
Rearranges the `peer_connected_hook_payload` definition to the location
where it is used in the file.

Fixes certain blank lines and line breaks to make the code look nicer.
2021-02-01 09:57:15 +10:30
Michael Schmoock
bc40287ade pytest: peer_connected chainable tests 2021-02-01 09:57:15 +10:30
Michael Schmoock
91bdb6d2d9 feat: make peer_connected hook chainable
Changelog-Changed: peer_connected hook is now chainable
2021-02-01 09:57:15 +10:30
Michael Schmoock
54675546ab doc: openchannel note close_to can only be set once 2021-02-01 09:57:15 +10:30
Michael Schmoock
7106349eab doc: document peer_connected hook chainable 2021-02-01 09:57:15 +10:30
Michael Schmoock
bdf0d60fd6 chore: fix typo in openchannel hook log
Nit: The underscore in "openchannel_hook" is wrong, because the name of
the hook is just "openchannel". The "_hook" suffix implied that it was
part of the name.

Changelog-None
2021-02-01 09:57:15 +10:30
Michael Schmoock
3a0b1c5b1d pytest: improve test_openchannel_hook_chaining
The current test was not checking for the output of the first plugin in
the chain.
2021-01-29 13:37:42 +10:30
Michael Schmoock
67f2939540 pytest: custommsg chainable tests 2021-01-29 13:37:42 +10:30
Michael Schmoock
4e8d3f395b doc: document custommsg hook now chainable 2021-01-29 13:37:42 +10:30
Michael Schmoock
8e71c7a1f1 feat: make custommsg hook chainable
Changelog-Changed: custommsg hook is now chainable
2021-01-29 13:37:42 +10:30
Christian Decker
a5f16ab5b1 pyln: Catch OSError when cleaning up test directories 2021-01-29 10:29:09 +10:30
Christian Decker
fc071c6784 travis: Goodbye Travis, hello github actions 2021-01-29 10:29:09 +10:30
Christian Decker
2834aaced0 gci: Stabilize test_forward_event_notification 2021-01-29 10:29:09 +10:30
Christian Decker
04df500f8e gci: Add the JSON report plugin to the ci configuration 2021-01-29 10:29:09 +10:30
Christian Decker
ea67710e01 pyln: Pretty print RPC calls in the testing framework
We were printing `repr(obj)`, which is not pretty-printed, is hard to
read, and can't even be copied into JSON tools for inspection. We now
print the JSONified and indented calls and responses for easier
debugging based solely on the logs (useful for CI!).
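
As an illustration of the difference, here is a small sketch using only the standard library. The request payload and the `format_rpc` helper are hypothetical and are not taken from pyln-testing.

```python
import json

# Hypothetical RPC request, standing in for what the test framework logs.
request = {"id": 7, "method": "listpeers", "params": {"level": "debug"}}


def format_rpc(label, obj):
    """Render an RPC payload as indented JSON under a short label."""
    return f"{label}:\n{json.dumps(obj, indent=2)}"


# repr() yields a single-line Python literal with single quotes,
# which JSON tools will not accept.
print(repr(request))

# json.dumps(..., indent=2) yields multi-line, valid JSON.
print(format_rpc("Calling listpeers", request))
```

The indented form can be pasted straight into a JSON viewer or `jq`, which is the point made in the commit message.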

Changelog-Added: pyln-testing: The RPC client will now pretty-print requests and responses to facilitate log-based debugging.
2021-01-29 10:29:09 +10:30
Christian Decker
3ff82216c5 gci: Give each configuration a CFG value to identify them later
We don't have a good way of referring to the configuration that
failed, so let's give them a numeric ID. Particularly useful for the
artifacts that'd be overwritten otherwise.
2021-01-29 10:29:09 +10:30
Christian Decker
80ef684d25 gci: Pin mypy to version 0.790 since 0.800 gives strange errors 2021-01-29 10:29:09 +10:30
Christian Decker
bb15d7f042 gci: Upload the junit.xml report
Can be used in a second stage to generate stats and detect flaky
tests.
2021-01-29 10:29:09 +10:30
Christian Decker
14b64ecc7e gci: Switch to the flaky plugin
rerunfailures keeps not working.
2021-01-29 10:29:09 +10:30
Christian Decker
62cb1c3fbc pytest: Stabilize test_forward_stats 2021-01-29 10:29:09 +10:30
Christian Decker
da2e956538 pytest: Stabilize test_routing_gossip
openchannel internally generates blocks, which may cause nodes to be
out of sync and ignore "future" channel announcements, resulting in
bad gossip.
2021-01-29 10:29:09 +10:30
Christian Decker
5ecaff65ee pytest: Give each run of the hsmtool its own pty 2021-01-29 10:29:09 +10:30
Christian Decker
1463797c61 pytest: Stabilize test_funding_close_upfront
Reconnections and unsynchronized states were causing us some issues.
2021-01-29 10:29:09 +10:30
Christian Decker
bbdf35c6fe pytest: Stabilize test_closing_negotiation_reconnect
The test was not considering that concurrent sendrawtx of the same tx
is not stable, and either endpoint may submit it first. Now just
checking state transitions and the mempool.
2021-01-29 10:29:09 +10:30
Christian Decker
52e82b76b6 pytest: Stabilize test_bad_onion 2021-01-29 10:29:09 +10:30
Christian Decker
3d4c111721 pytest: Stabilize test_multiple_channels
If we're quick (or the node is slow) we end up reconnecting before our
counterparty has realized the state transition, resulting in an
unexpected re-establish.
2021-01-29 10:29:09 +10:30
Christian Decker
8c94d1a358 make: Remove hardcoded timeout to pytest
This should really be set by the environment by creating either a
pytest.ini or setting PYTEST_OPTS envvar.
2021-01-29 10:29:09 +10:30
Christian Decker
1ea0dd6af2 gci: Add pytest.ini in order to randomize the groups
We have a couple of very heavy tests bunched together, randomization
could potentially lessen the peak load.
2021-01-29 10:29:09 +10:30
Christian Decker
6119f1f5fb gci: Format the build script
We were using a lot of docker conventions, which are not necessary in
the script itself.
2021-01-29 10:29:09 +10:30
Christian Decker
cd9aa267b4 pyln: Adjust maximum load allowed by the throttler 2021-01-29 10:29:09 +10:30
Christian Decker
542f3225e3 pytest: Parameterize process waits for hsmtool calls
We were sometimes waiting only 5 seconds, which is way too short on a
heavily loaded machine such as CI. Making it 30 seconds and collecting
it in a single place so we can adjust more easily.
2021-01-29 10:29:09 +10:30
Christian Decker
4c3ee04bb7 pyln: Use a fair FS lock to throttle node startups
We were getting a couple of starvations, so we need a fair filelock. I
also wasn't too happy with the lock as is, so I hand-coded it quickly.

Should be correct, but the overall timeout will tell us how well we
are doing on CI.
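
The commit does not show the lock itself, so the following is only a sketch of one way to get first-come-first-served behaviour from the filesystem, assuming a Unix system with `fcntl.flock`. The name `fair_lock`, the ticket-file layout, and the timeout values are all hypothetical. Each waiter takes a numbered ticket and proceeds only when its ticket is the lowest one left.

```python
import fcntl
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def fair_lock(lock_dir, timeout=60.0, poll_interval=0.01):
    """Hypothetical FIFO lock: take a ticket, wait until it is the lowest."""
    os.makedirs(lock_dir, exist_ok=True)
    counter_path = os.path.join(lock_dir, "counter")

    # Hand out ticket numbers under a short exclusive flock, so each number
    # is unique and a lower ticket is on disk before a higher one is issued.
    fd = os.open(counter_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as counter:
        fcntl.flock(counter, fcntl.LOCK_EX)
        ticket = int(counter.read() or 0)
        ticket_name = "ticket-%012d" % ticket
        ticket_path = os.path.join(lock_dir, ticket_name)
        open(ticket_path, "w").close()
        counter.seek(0)
        counter.truncate()
        counter.write(str(ticket + 1))
        # Leaving the block closes the file, which also drops the flock.

    deadline = time.monotonic() + timeout
    try:
        while True:
            # Zero-padded names sort in ticket order.
            waiting = sorted(
                n for n in os.listdir(lock_dir) if n.startswith("ticket-")
            )
            if waiting[0] == ticket_name:
                break
            if time.monotonic() > deadline:
                raise TimeoutError("gave up waiting behind %s" % waiting[0])
            time.sleep(poll_interval)
        yield ticket
    finally:
        # Releasing (or giving up) means removing our ticket.
        os.remove(ticket_path)


demo_dir = tempfile.mkdtemp()
with fair_lock(demo_dir) as my_ticket:
    print("holding the lock with ticket", my_ticket)
```

One limitation of this sketch: a holder that crashes leaves its ticket behind, and later waiters only notice through the timeout.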
2021-01-29 10:29:09 +10:30
Christian Decker
8cc62d76e4 pytest: Stabilize test_channel_{spendable,receivable}
They were using TIMEOUT / 2 which may be way too long (hit against
test timeout), so we use a still ludicrous 30 seconds instead.
2021-01-29 10:29:09 +10:30
Christian Decker
6384cadd69 pytest: Stabilize the negotiation tests
We also make the logic a bit nicer to read. The failure was due to
more than one status message being present if we look at the wrong
time:

```
arr = ['CLOSINGD_SIGEXCHANGE:We agreed on a closing fee of 20334 satoshi for tx:17f1e9d377840edf79d8b6f1ed0faba59bb307463461...9b98', 'CLOSINGD_SIGEXCHANGE:Waiting for another closing fee offer: ours was 20334 satoshi, theirs was 20332 satoshi,']

     def only_one(arr):
         """Many JSON RPC calls return an array; often we only expect a single entry
         """
 >       assert len(arr) == 1
 E       AssertionError
```
2021-01-29 10:29:09 +10:30
Christian Decker
04ed93f5f8 pytest: Stabilize test_funding_external_wallet_corners 2021-01-29 10:29:09 +10:30
Christian Decker
18483ca582 pytest: Disable test_v2_open if not developer
It requires `--dev-force-features` which isn't available without
`DEVELOPER=1`
2021-01-29 10:29:09 +10:30
Christian Decker
7962db821c pytest: Stabilize test_channel_state_changed_bilateral 2021-01-29 10:29:09 +10:30
Christian Decker
07f5054700 pytest: Stabilize test_setchannelfee_state
Synching with the blockchain was slower than our timeout...
2021-01-29 10:29:09 +10:30
Christian Decker
03449e3cf0 pytest: Stabilize test_gossip_persistence
We weren't waiting for the `dev_fail` transaction to hit the mempool,
throwing the results off.
2021-01-29 10:29:09 +10:30
Christian Decker
ae40c10bcb pytest: Stabilize test_onchain_timeout
The timeout on the pay future was too short under valgrind.
2021-01-29 10:29:09 +10:30
Christian Decker
fc677e331a pyln: Update dependencies for all pyln packages
We were getting a number of incompatibility warnings due to the
dependencies being expressed too rigidly. This loosens the requirement
definitions to being compatible with a known good version, and while
we're at it we also bump all outdated requirements.
2021-01-29 10:29:09 +10:30
Christian Decker
c564f165fa pytest: Stabilize test_penalty_htlc_tx_timeout
We weren't waiting for the transactions to enter the mempool which
could cause all of our fine-tuned block counts to be off. Now just
waiting for the expected number of txs.
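
The "wait for the expected number of txs" pattern can be sketched as a polling helper. This is an illustration under assumptions, not the test's actual code: `wait_for` here is a self-contained stand-in, and `FakeBitcoind` is a hypothetical object whose mempool fills by one transaction per poll.

```python
import time


def wait_for(predicate, timeout=30.0, interval=0.05):
    """Poll `predicate` until it returns true or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)


class FakeBitcoind:
    """Hypothetical stand-in: one pending tx reaches the mempool per poll."""

    def __init__(self, pending):
        self.pending = list(pending)
        self.mempool = []

    def getrawmempool(self):
        if self.pending:
            self.mempool.append(self.pending.pop(0))
        return list(self.mempool)


bitcoind = FakeBitcoind(["penalty_tx", "htlc_timeout_tx"])

# Only generate blocks once both expected transactions are visible;
# otherwise the block counts asserted later would be off.
wait_for(lambda: len(bitcoind.getrawmempool()) == 2)
print("mempool now holds", len(bitcoind.mempool), "transactions")
```

Waiting on the observable condition rather than sleeping for a fixed time keeps the test fast on an idle machine and correct on a loaded one.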
2021-01-29 10:29:09 +10:30
Christian Decker
c0f06f2779 pytest: Simplify and stabilize test_reconnect_no_update 2021-01-29 10:29:09 +10:30
Christian Decker
2b12cac31e pytest: Skip hsm encryption test if we don't have a TTY 2021-01-29 10:29:09 +10:30
Christian Decker
be77cd7669 gci: Expand matrix to include all CI configurations 2021-01-29 10:29:09 +10:30
Christian Decker
b447944285 gci: Add a tester Dockerfile 2021-01-29 10:29:09 +10:30