Commit Graph

9676 Commits

Author SHA1 Message Date
Rusty Russell
6b11cc8b8c common: disallow NULL channel_id to peer_failed_err.
No more sending "all-channel" errors; in particular, gossipd now only
sends warnings (which make us hang up), not errors, and peer_connected
rejections are warnings (and disconnect), not errors.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: `peer_connected` rejections now send a warning, not an error, to the peer.
2021-02-04 12:02:52 +10:30
Rusty Russell
f4ee41a989 common: remove peer_failed in favor of peer_failed_warn/peer_failed_err
And make all the callers choose which one.  In general, I prefer warn,
which lets them reconnect and try again, however some places are either
stated that they must be errors in the spec itself, or in openingd
where we abandon the channel when we close the connection anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we now send warning messages and close the connection, except on unrecoverable errors.
2021-02-04 12:02:52 +10:30
Rusty Russell
e6cefc4b00 common: infrastructure to construct warning messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:52 +10:30
Rusty Russell
d14e273b04 common: treat all "all-channels" errors as if they were warnings.
This is in line with the warnings draft, where all-zeroes in a
channel_id is no longer special (i.e. it will be ignored).

But gossipd would send these if it got upset with us, so it's best
practice to ignore them for now anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we treat error messages from peer which refer to "all channels" as warnings, not errors.
2021-02-04 12:02:52 +10:30
Rusty Russell
a7c5a1f1d2 lightningd: implement receiving warnings.
This takes from the draft spec at https://github.com/lightningnetwork/lightning-rfc/pull/834

Note that if this draft does not get included, the peer will simply
ignore the warning message (we always close the connection afterwards
anyway).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we now report the new (draft) warning message.
2021-02-04 12:02:52 +10:30
Rusty Russell
f0659d0ab0 pytest: fix false bad gossip issue in test_forward
We construct the route manually so we may not have the
channel_announcement yet.  But we can get an update from the error
packet, which can lead to:

```
2021-01-29T01:38:23.4767334Z            ValueError: 
2021-01-29T01:38:23.4767987Z            Node errors:
2021-01-29T01:38:23.4768767Z             - lightningd-1: had bad gossip messages
2021-01-29T01:38:23.4769512Z            Global errors:
2021-01-29T01:38:23.4770300Z 
2021-01-29T01:38:23.4771109Z contrib/pyln-testing/pyln/testing/fixtures.py:197: ValueError
...
2021-01-29T01:38:23.7820197Z lightningd-1: 2021-01-29T01:26:57.460Z DEBUG   gossipd: Extracted channel_update 01027217b3086ad9f3dee1fa55b94c5fd2a4b0637bec70ba727ba4151a8de5173ddc749db3502d41ab0ae164addc8fd013d2088b6a12a2f478ae0affa94d76d8845c06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f000067000001000160136459010000060000000000000000000000010000000a000000003b023380 from onionreply 100d0000007500887217b3086ad9f3dee1fa55b94c5fd2a4b0637bec70ba727ba4151a8de5173ddc749db3502d41ab0ae164addc8fd013d2088b6a12a2f478ae0affa94d76d8845c06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f000067000001000160136459010000060000000000000000000000010000000a000000003b023380
2021-01-29T01:38:23.7837450Z lightningd-1: 2021-01-29T01:26:57.461Z DEBUG   gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 103x1x1/0
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:52 +10:30
Rusty Russell
c5bd518d2f gossmap: change local API.
Now we create a separate set of local mods, and apply and unapply it.
This is more efficient than the previous approach, since we can do
some work up-front.  It's also more graceful (and well-defined) when a
local modification overlaps an existing one.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:36 +10:30
Rusty Russell
1490ae4cf6 common: support transitory local map additions to gossmap.
This will let us add routehints to the map and use dijkstra etc like
normal.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:36 +10:30
Rusty Russell
3fc8b1cd20 common: fix gossmap min/max rounding.
They were backwards.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:36 +10:30
Rusty Russell
cc4ea9420a common: extract fp16 routines into their own file.
We might want to use them elsewhere.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:36 +10:30
Rusty Russell
06a54606a3 check-includes: allow redundant "config.h"
We should actually be including this (as it may define _GNU_SOURCE
etc) before any system headers.  But where we include <assert.h> we
often didn't, because check-includes would complain that the headers
included it too.

Weaken that check, and include config.h in C files before assert.h.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-04 12:02:36 +10:30
Michael Schmoock
c8198d9820 pyln-testing: add scids param to wait_for_htlcs 2021-02-04 12:02:15 +10:30
Michael Schmoock
d76ca6ed35 pyln-testing: improve wait_for a bit
This 'fixes' the `wait_for` helper by removing a pointless final
`time.sleep()`, thus potentially making the method return quicker.

The old code could have had three final states:
 - success() := True
 - Timeout and success() := True
 - Timeout and success() := False

The new code has just two final state:
 - success() := True
 - Timeout and success() := False

It ensures the final `time.sleep()` is just the right amount before timeout.
And more importantly making it more readable :-)

Changelog-None
2021-02-04 12:02:15 +10:30
Rusty Russell
9da191e034 docs: document addgossip API.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `addgossip` allows direct injection of network gossip messages.
2021-02-02 13:44:01 +01:00
Rusty Russell
1a85edd207 lightningd: no longer forward failures to gossipd, let caller do it.
We fix up the test by using pay, instead of sendpay (and making pay log
the expected message).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: sendpay no longer extracts updates from errors, the caller should do it from the `raw_message`.
2021-02-02 13:44:01 +01:00
Rusty Russell
780fc25413 pay: inject channel updates from errors ourselves.
Cut & paste from gossipd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-02 13:44:01 +01:00
Rusty Russell
3c5502426b lightningd: addgossip API to inject gossip messages.
Importantly, this is synchronous, so pay will be able to use it
reliably.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-02 13:44:01 +01:00
Rusty Russell
f1c599516e gossipd: add an internal flag to force a channel update
This overcomes the internal spam filter on updates, which can be useful
if we're actually trying to send through such a node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: always accept channel_updates from errors, even they'd otherwise be rejected as spam.
Fixes: #4300
2021-02-02 13:44:01 +01:00
Rusty Russell
82ed71d621 connectd: don't crash if connect() fails immediately.
Took me a while (stressing under valgrind) to reproduce this,
then longer to figure out how it happened.

Turns out io_new_conn() can fail if the init function fails.
In our case, this can happen if connect() immediately returns
an error (inside io_connect).  But we've already set the finish
function, which (if this was the last address), will free connect,
making the assignment `connect->conn = ...` write to a freed address.

Either way, if it fails, try_connect_one_addr() has taken care to
update connect->conn, or free connect, and the caller should not do it.

Here's the valgrind trace:
```
==384981== Invalid write of size 8
==384981==    at 0x11127C: try_connect_one_addr (connectd.c:880)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1400B3: do_plan (io.c:401)
==384981==    by 0x140134: io_ready (io.c:423)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
==384981==  Address 0x4d67020 is 64 bytes inside a block of size 160 free'd
==384981==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x14A020: del_tree (tal.c:421)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x1110C5: try_connect_one_addr (connectd.c:806)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1405DC: io_connect_ (io.c:345)
==384981==  Block was alloc'd at
==384981==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x149CF1: allocate (tal.c:250)
==384981==    by 0x14A3C6: tal_alloc_ (tal.c:428)
==384981==    by 0x1114F2: try_connect_peer (connectd.c:1526)
==384981==    by 0x111717: connect_to_peer (connectd.c:1558)
==384981==    by 0x1124F5: recv_req (connectd.c:1627)
==384981==    by 0x1188B2: handle_read (daemon_conn.c:31)
==384981==    by 0x13FBCB: next_plan (io.c:59)
==384981==    by 0x140076: do_plan (io.c:407)
==384981==    by 0x140113: io_ready (io.c:417)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Occasional crash in connectd due to use-after-free
Fixes: #4343
2021-02-01 21:01:06 +01:00
Rusty Russell
0056dd7557 lightningd: disallow --daemon without --log-file.
From #clightning:

    (11:24:10) andytoshi: hiya, i'm trying to set up a new lightningd node, and when i run lightningd --network=bitcoin --log-level=debug --daemon
    (11:24:17) andytoshi: i get errors of the form fetchinvoice: Malformed JSON reply '2021-01-25T00:51:16.655Z DEBUG   plugin-offers: disabled itself at init: offers not enabled in config
    (11:24:43) andytoshi: there are a couple variants of this, but always some form of "something: failed to parse <a log line> as json"

Indeed, we close stdout, and it ends up being reused for some plugin.
But the real problem is that we log to stdout by default, which doesn't
make sense.  If they really want to discard logs, they can use
--log-file=/dev/null.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON failures when --daemon is used without --log-file.
2021-02-01 09:57:54 +10:30
Rusty Russell
5eb209f57a bitcoind: remove v0.9.0-compat for rejecting sendrawtransaction arg.
Changelog-Removed: `bcli` replacements must allow `allowhighfees` argument (deprecated 0.9.1).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-01 09:57:35 +10:30
Rusty Russell
406eb37717 listsendpays: remove deprecated "null" amount_msat.
Changelog-Removed: `listsendpays` will no longer add `amount_msat` `null` (deprecated 0.9.1).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-02-01 09:57:35 +10:30
Michael Schmoock
9eeb290637 chore: cleanup some nits
rearranges the`peer_connected_hook_payload` definition to the location
where this is used in the file.

Fixes certain blanklines and linebreaks to make the code look nicer.
2021-02-01 09:57:15 +10:30
Michael Schmoock
bc40287ade pytest: peer_connected chainable tests 2021-02-01 09:57:15 +10:30
Michael Schmoock
91bdb6d2d9 feat: make peer_connected hook chainable
Changelog-Changed: peer_connected hook is now chainable
2021-02-01 09:57:15 +10:30
Michael Schmoock
54675546ab doc: openchannel note close_to can only be set once 2021-02-01 09:57:15 +10:30
Michael Schmoock
7106349eab doc: document peer_connected hook chainable 2021-02-01 09:57:15 +10:30
Michael Schmoock
bdf0d60fd6 chore: fix typo in openchannel hook log
Nit: The underscore in "openchannel_hook" is wrong, bcause the name of
the hook is just "openchannel". The "_hook" implied this to be part of
the name.

Changelog-None
2021-02-01 09:57:15 +10:30
Michael Schmoock
3a0b1c5b1d pytest: improve test_openchannel_hook_chaining
The current test was not checking for the output of the first plugin in
the chain.
2021-01-29 13:37:42 +10:30
Michael Schmoock
67f2939540 pytest: custommsg chainable tests 2021-01-29 13:37:42 +10:30
Michael Schmoock
4e8d3f395b doc: document custommsg hook now chainable 2021-01-29 13:37:42 +10:30
Michael Schmoock
8e71c7a1f1 feat: make custommsg hook chainable
Changelog-Changed: custommsg hook is now chainable
2021-01-29 13:37:42 +10:30
Christian Decker
a5f16ab5b1 pyln: Catch OSError when cleaning up test directories 2021-01-29 10:29:09 +10:30
Christian Decker
fc071c6784 travis: Goodbye Travis, hello github actions 2021-01-29 10:29:09 +10:30
Christian Decker
2834aaced0 gci: Stabilize test_forward_event_notification 2021-01-29 10:29:09 +10:30
Christian Decker
04df500f8e gci: Add the JSON report plugin to the ci configuration 2021-01-29 10:29:09 +10:30
Christian Decker
ea67710e01 pyln: Pretty print RPC calls in the testing framework
We are printing `repr(obj)` which is not pretty-printed, hard to read,
and can't even be copied and inspected to JSON tools. We now print the
JSONified and indented calls and responses for easier debugging based
on solely the logs (useful for CI!).

Changelog-Added: pyln-testing: The RPC client will now pretty-print requests and responses to facilitate log-based debugging.
2021-01-29 10:29:09 +10:30
Christian Decker
3ff82216c5 gci: Give each configuration an CFG value to identify them later
We don't have a good way of referring to the configuration that
failed, so let's give them a numberic ID. Particularly useful for the
artifacts that'd be overwritten otherwise.
2021-01-29 10:29:09 +10:30
Christian Decker
80ef684d25 gci: Pin mypy to version 0.790 since 0.800 gives strange errors 2021-01-29 10:29:09 +10:30
Christian Decker
bb15d7f042 gci: Upload the junit.xml report
Can be used in a second stage to generate stats and detect flaky
tests.
2021-01-29 10:29:09 +10:30
Christian Decker
14b64ecc7e gci: Switch to the flaky plugin
rerunfailures keeps not working.
2021-01-29 10:29:09 +10:30
Christian Decker
62cb1c3fbc pytest: Stabilize test_forward_stats 2021-01-29 10:29:09 +10:30
Christian Decker
da2e956538 pytest: Stabilize test_routing_gossip
openchannel internally generates blocks, which may cause nodes to be
out of sync and ignore "future" channel announcements, resulting in
bad gossip.
2021-01-29 10:29:09 +10:30
Christian Decker
5ecaff65ee pytest: Give each run of the hsmtool its own pty 2021-01-29 10:29:09 +10:30
Christian Decker
1463797c61 pytest: Stabilize test_funding_close_upfront
Reconnections and unsynchronized states where causing us some issues.
2021-01-29 10:29:09 +10:30
Christian Decker
bbdf35c6fe pytest: Stabilize test_closing_negotiation_reconnect
The test was not considering that concurrent sendrawtx of the same tx
is not stable, and either endpoint will submit it first. Now just
checking state transitions and the mempool.
2021-01-29 10:29:09 +10:30
Christian Decker
52e82b76b6 pytest: Stabilize test_bad_onion 2021-01-29 10:29:09 +10:30
Christian Decker
3d4c111721 pytest: Stabilize test_multiple_channels
If we're quick (or the node is slow) we end up reconnecting before our
counterparty has realized the state transition, resulting in an
unexpected re-establish.
2021-01-29 10:29:09 +10:30
Christian Decker
8c94d1a358 make: Remove hardcoded timeout to pytest
This should really be set by the environment by creating either a
pytest.ini or setting PYTEST_OPTS envvar.
2021-01-29 10:29:09 +10:30
Christian Decker
1ea0dd6af2 gci: Add pytest.ini in order to randomize the groups
We have a couple of very heavy tests bunched together, randomization
could potentially lessen the peak load.
2021-01-29 10:29:09 +10:30