When compiled without DEVELOPER this will now filter out `remote_addr` values
that come from localhost. The testcase checks for DEVELOPER so it can still
test that `remote_addr` functions correctly.
Also, I renamed "test_connect" to "test_connect_basic" so it can be run on its
own, without all the other tests in that file whose names start with "test_connect".
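The filtering described above can be sketched roughly as follows. This is only an illustration in Python, not the daemon's actual code: the helper name `filter_remote_addr` and its `developer` flag are hypothetical, and the loopback check uses the standard `ipaddress` module.

```python
import ipaddress
from typing import Optional


def filter_remote_addr(addr: str, developer: bool = False) -> Optional[str]:
    """Return addr, or None if it should be filtered out.

    Hypothetical helper: `developer` stands in for a DEVELOPER build.
    """
    if developer:
        # DEVELOPER builds keep localhost so tests can still see remote_addr.
        return addr
    if ipaddress.ip_address(addr).is_loopback:
        # Loopback addresses (127.0.0.0/8, ::1) are dropped.
        return None
    return addr
```

With this sketch, `filter_remote_addr("127.0.0.1")` returns `None`, while the same call with `developer=True` returns the address unchanged.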
For now hooks are treated identically to rpcmethods, with the
exception of not being returned in the `getmanifest` call. Later on we
can add typed handlers as well.
Having a list of very targeted suppressions allows us to still run the
majority of tests with valgrind checking, and not fail when Rust does
some trickery. This is for example the case with the `std::sync::Once`
initializer in `num_cpus`, which calls out to the cgroups subsystem, sometimes
with a null path.
Suggested-by: Rusty Russell <@rustyrussell>
`valgrind` seems to flag some memory accesses in the Rust standard
library that are actually OK, and which we can consider false positives
for our purposes:
```
Valgrind error file: valgrind-errors.69147
==69147== Syscall param statx(file_name) points to unaddressable byte(s)
==69147== at 0x4B049FE: statx (statx.c:29)
==69147== by 0x2E2DA0: std::sys::unix::fs::try_statx (weak.rs:139)
==69147== by 0x2D7BD5: <std::fs::File as std::io::Read>::read_to_string (fs.rs:784)
==69147== by 0x2632CE: num_cpus::linux::Cgroup::param (linux.rs:214)
==69147== by 0x263179: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==69147== by 0x263002: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==69147== by 0x262C01: num_cpus::linux::load_cgroups (linux.rs:149)
==69147== by 0x26289D: num_cpus::linux::init_cgroups (linux.rs:129)
==69147== by 0x26BD88: core::ops::function::FnOnce::call_once (function.rs:227)
==69147== by 0x26B749: std::sync::once::Once::call_once::{{closure}} (once.rs:262)
==69147== by 0x139717: std::sync::once::Once::call_inner (once.rs:419)
==69147== by 0x26B6D5: std::sync::once::Once::call_once (once.rs:262)
==69147== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==69147==
==69147== Syscall param statx(buf) points to unaddressable byte(s)
==69147== at 0x4B049FE: statx (statx.c:29)
==69147== by 0x2E2DA0: std::sys::unix::fs::try_statx (weak.rs:139)
==69147== by 0x2D7BD5: <std::fs::File as std::io::Read>::read_to_string (fs.rs:784)
==69147== by 0x2632CE: num_cpus::linux::Cgroup::param (linux.rs:214)
==69147== by 0x263179: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==69147== by 0x263002: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==69147== by 0x262C01: num_cpus::linux::load_cgroups (linux.rs:149)
==69147== by 0x26289D: num_cpus::linux::init_cgroups (linux.rs:129)
==69147== by 0x26BD88: core::ops::function::FnOnce::call_once (function.rs:227)
==69147== by 0x26B749: std::sync::once::Once::call_once::{{closure}} (once.rs:262)
==69147== by 0x139717: std::sync::once::Once::call_inner (once.rs:419)
==69147== by 0x26B6D5: std::sync::once::Once::call_once (once.rs:262)
==69147== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==69147==
```
Only shows up on delayed-to-us outputs, but nice to have anyway.
It's missing for channel-index-destined deposits; maybe nice to add at
some point in the future. Right now you can figure out which close a
wallet deposit comes from via the channel close txid.
Changelog-Experimental: option `--lease-fee-base-msat` renamed to `--lease-fee-base-sat`
Changelog-Experimental: option `--lease-fee-base-msat` deprecated and will be removed next release
These tests:
a) are very expensive, as they spin up many nodes and perform long setup
b) do not test anything specific; they just fuzz functionality
that is already tested otherwise
c) have not helped pinpoint any issues in living memory
d) are very flaky, making for really bad signal-to-noise, so much so
that devs usually just restart without even looking at the logs
e) cannot be reproduced even if we were to look at the logs,
due to the inherent randomness involved in these tests
f) are really noisy neighbors, causing other tests to flake as well,
further muddying the water
All in all, these tests are a waste of time, and a source of
frustration.
[ Cleaned up python unused imports --RR ]
Changelog-None
This restores the behaviour prior to `lightningd: use our cached
channel_update for errors instead of asking gossipd.`, where gossipd
would refuse to give us channel_updates for unannounced channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
But this requires a watch-only wallet, and python-bitcoinlib doesn't support
multiple wallets, so we need to unload the original one. We then need to
generate a block, but with the original wallet unloaded that can't generate
a new address, so we need an address arg to generate_block.
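A minimal sketch of the idea, assuming a helper shaped roughly like this. The `FakeBitcoinRpc` class is a hypothetical stand-in that only records calls; `getnewaddress` and `generatetoaddress` mirror the bitcoind RPCs of the same names, and the addresses are placeholders.

```python
from typing import List, Optional, Tuple


class FakeBitcoinRpc:
    """Hypothetical stand-in for a bitcoind RPC proxy; records calls only."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def getnewaddress(self) -> str:
        # Needs a loaded wallet on a real bitcoind.
        self.calls.append(("getnewaddress",))
        return "bcrt1qfakeaddress"

    def generatetoaddress(self, numblocks: int, address: str) -> List[str]:
        self.calls.append(("generatetoaddress", numblocks, address))
        return ["00" * 32] * numblocks


def generate_block(rpc, numblocks: int = 1, address: Optional[str] = None):
    """Mine numblocks, only asking the wallet for an address if none is given."""
    if address is None:
        address = rpc.getnewaddress()
    return rpc.generatetoaddress(numblocks, address)
```

A test that has unloaded the original wallet can save an address beforehand and pass it as `address=...`, so no wallet call is needed while mining.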
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We really need our own lnprototest tests for packet-based stuff;
these message-based tests are inherently delicate and awkward.
In particular, connectd now does dev-disconnect, so the socket is not
immediately closed after a dev-disconnect command. In this case, the
WIRE_SHUTDOWN has often already been written from connectd to channeld.
But it sometimes works, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the HTLCs are completely negotiated, we can get a channel break when
we mine a pile of blocks. This is mainly seen with Postgres, due to the db
speed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we call update_channel_from_inflight *twice* with the same inflight, we
will get bad results. Using tal_steal() here was a premature optimization:
```
Valgrind error file: valgrind-errors.496395
==496395== Invalid read of size 8
==496395== at 0x22A9D3: to_tal_hdr (tal.c:174)
==496395== by 0x22B4B5: tal_steal_ (tal.c:498)
==496395== by 0x16A13D: update_channel_from_inflight (peer_control.c:1225)
==496395== by 0x16A4C7: funding_depth_cb (peer_control.c:1299)
==496395== by 0x182807: txw_fire (watch.c:232)
==496395== by 0x182AA9: watch_topology_changed (watch.c:300)
==496395== by 0x1290ED: updates_complete (chaintopology.c:624)
==496395== by 0x129BF4: get_new_block (chaintopology.c:835)
==496395== by 0x125EEF: getrawblockbyheight_callback (bitcoind.c:362)
==496395== by 0x176ECC: plugin_response_handle (plugin.c:584)
==496395== by 0x1770F5: plugin_read_json_one (plugin.c:690)
==496395== by 0x1772D9: plugin_read_json (plugin.c:735)
==496395== Address 0x89fbb08 is 24 bytes inside a block of size 104 free'd
==496395== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==496395== by 0x22B193: del_tree (tal.c:421)
==496395== by 0x22B461: tal_free (tal.c:486)
==496395== by 0x16A123: update_channel_from_inflight (peer_control.c:1223)
==496395== by 0x16A4C7: funding_depth_cb (peer_control.c:1299)
==496395== by 0x182807: txw_fire (watch.c:232)
==496395== by 0x182AA9: watch_topology_changed (watch.c:300)
==496395== by 0x1290ED: updates_complete (chaintopology.c:624)
==496395== by 0x129BF4: get_new_block (chaintopology.c:835)
==496395== by 0x125EEF: getrawblockbyheight_callback (bitcoind.c:362)
==496395== by 0x176ECC: plugin_response_handle (plugin.c:584)
==496395== by 0x1770F5: plugin_read_json_one (plugin.c:690)
==496395== Block was alloc'd at
==496395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==496395== by 0x22AC1C: allocate (tal.c:250)
==496395== by 0x22B1DD: tal_alloc_ (tal.c:428)
==496395== by 0x22B3A6: tal_alloc_arr_ (tal.c:471)
==496395== by 0x22C094: tal_dup_ (tal.c:805)
==496395== by 0x12B274: new_inflight (channel.c:187)
==496395== by 0x136D4C: wallet_commit_channel (dual_open_control.c:1260)
==496395== by 0x13B084: handle_commit_received (dual_open_control.c:2839)
==496395== by 0x13B6AF: dual_opend_msg (dual_open_control.c:2976)
==496395== by 0x1809FF: sd_msg_read (subd.c:553)
==496395== by 0x218F5D: next_plan (io.c:59)
==496395== by 0x219B65: do_plan (io.c:407)
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we fund a channel between two nodes, then mine all the blocks to
announce it, any other nodes may see the announcement before the
blocks, causing CI to complain about "bad gossip":
```
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Ignoring future channel_announcment for 113x1x1 (current block 112)
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/0
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_CHANNEL_UPDATE before announcement 113x1x1/1
lightningd-4: 2022-01-25T22:33:25.468Z DEBUG 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e-gossipd: Bad gossip order: WIRE_NODE_ANNOUNCEMENT before announcement 032cf15d1ad9c4a08d26eab1918f732d8ef8fdc6abb9640bf3db174372c491304e
```
Add a new helper for this case, and use it where there are more than 2 nodes.
Cleans up test_routing_gossip and a few other places which did this manually.
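One way such a helper can work is sketched below, under the assumption that the fix is to let every node catch up before the final block is mined. `FakeChain`, `FakeNode` and the function names are hypothetical stand-ins, not the test framework's real classes.

```python
import time
from typing import List


class FakeChain:
    """Hypothetical stand-in for the test bitcoind wrapper."""

    def __init__(self, height: int = 100) -> None:
        self.height = height

    def generate_block(self, numblocks: int = 1) -> None:
        self.height += numblocks


class FakeNode:
    """Hypothetical node that only learns about new blocks when polled."""

    def __init__(self, chain: FakeChain) -> None:
        self.chain = chain
        self.blockheight = chain.height

    def poll_chain(self) -> None:
        self.blockheight = self.chain.height


def sync_blockheight(chain: FakeChain, nodes: List[FakeNode],
                     timeout: float = 10.0) -> None:
    """Wait until every node has seen the chain tip."""
    deadline = time.monotonic() + timeout
    while any(n.blockheight != chain.height for n in nodes):
        if time.monotonic() > deadline:
            raise TimeoutError("nodes did not catch up with the chain")
        for n in nodes:
            n.poll_chain()
        time.sleep(0.01)


def mine_to_announce(chain: FakeChain, nodes: List[FakeNode],
                     num_blocks: int) -> None:
    """Mine all but the last block, let everyone sync, then mine the last one."""
    chain.generate_block(num_blocks - 1)
    sync_blockheight(chain, nodes)
    # Only this final block makes the channel announceable, so no node is
    # more than one block behind when the announcement goes out.
    chain.generate_block(1)
```

The design choice here is simply ordering: by syncing before the last block, a bystander node can no longer be several blocks behind when the gossip arrives.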
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were relying on the fee update to create an additional tx. That's
ugly; do an actual payment and make sure we definitely complete a new
tx by waiting for that, *then* for both revoke_and_ack.
(Without this, we could get a unilateral close instead of a penalty).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is neater than what we had before, and slightly more general.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON_RPC: `sendcustommsg` now works with any connected peer, even when shutting down a channel.
Next patch starts a timeout ping, which can interfere with results.
In theory, we should reply, but in practice (so far!) we seem to get enough
time that it doesn't hang up on us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We also no longer strip the type off: everyone handles both forms, and
Eclair doesn't strip (and it's easier!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Even if we're deferring putting them in the store and broadcasting them,
we tell lightningd so it will use them in any error messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes lightningd's chronic weight underestimate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: closingd: more accurate weight estimation helps mutual closing near min/max feerates.
The blockheight is zero though, since these aren't included in a block
yet.
We also don't issue an 'external' deposit event if we can tell that the
address you're sending to actually belongs to our wallet (we'll issue a
deposit event when it gets included in a block)
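The rule above can be sketched as a small decision function. This is an illustration only: the function name and the event field names are hypothetical, chosen to mirror the wording of this message rather than any real event schema.

```python
from typing import Optional, Set


def external_deposit_event(address: str, amount_msat: int,
                           our_addresses: Set[str]) -> Optional[dict]:
    """Return an 'external' deposit event, or None if the address is ours."""
    if address in our_addresses:
        # Our own wallet: the deposit event is issued later, once the
        # output is included in a block.
        return None
    return {
        "account": "external",
        "tag": "deposit",
        "amount_msat": amount_msat,
        # Not included in a block yet, so the blockheight is zero.
        "blockheight": 0,
    }
```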
```
l1.rpc.disconnect(l2.info['id'], force=True)
l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
> l1.daemon.wait_for_log('option_static_remotekey enabled at 2/2')
tests/test_connection.py:3653:
```
If l2's channeld gets killed (due to reconnect) before it tells
lightningd it got the revoke_and_ack, it will need a retransmission
*again*.
This makes the test more robust, and does more checks too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
OK, now this test makes more sense! Now that we don't ignore errors, we
*will* drop to chain if we reconnect after one side has dropped to
chain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's actually a bug in our closing tx size estimation; I'll do
a separate patch for this, though.
Seems this used to be flaky; now that we always flush queues, it's
more reliably caught.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We seem to hit a race between manual reconnect (with address hint) and an automatic
reconnection attempt which fails:
```
> l4.rpc.connect(l3.info['id'], 'localhost', l3.port)
...
E pyln.client.lightning.RpcError: RPC call failed: method: connect, payload: {'id': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d', 'host': 'localhost', 'port': 41285}, error: {'code': 401, 'message': 'All addresses failed: 127.0.0.1:36678: Connection establishment: Connection refused. '}
```
See how it didn't even try the given address?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
l1 might slip in a commitment_signed before it notices the disconnect, and this test fails:
```
        for i in range(0, len(disconnects)):
            with pytest.raises(RpcError):
                l1.rpc.sendpay(route, rhash, payment_secret=inv['payment_secret'])
>               l1.rpc.waitsendpay(rhash)
E               Failed: DID NOT RAISE <class 'pyln.client.lightning.RpcError'>
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now let gossipd do it.
This also means there's nothing left in 'struct per_peer_state' to
send across the wire (the fds are sent separately), so that gets
removed from wire messages too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually intercept the gossip_timestamp_filter, so the gossip_store
mechanism inside the per-peer daemon never kicks off for normal connections.
The gossipwith tool doesn't set OPT_GOSSIP_QUERIES, so it gets both, but
that only affects one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
channeld can't do it any more: it's using local sockets. Connectd
can do it, and simply does it by type.
Amazingly, on my machine the timing change *always* caused
test_channel_receivable() to fail, due to a latent race.
Includes feedback from @cdecker.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>