We locally generate an update with our local alias and receive one from
the peer with the remote alias, so we need to add them both. We only do
so if we are using the alias in the first place, though.
`alias_local` is generated locally and sent to the peer so it knows
what we're calling the channel, while `alias_remote` is received from
the peer so we know what to include in routehints when generating
invoices.
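For illustration, a minimal sketch of reading both sides over the Python RPC client. This is an assumption, not the implementation: it presumes `listpeers` exposes the negotiated aliases as an `alias` object with `local`/`remote` fields.

```python
# Hypothetical sketch: inspect the negotiated aliases on a channel.
# Assumes listpeers channels carry an `alias` object with local/remote fields.
from pyln.client import LightningRpc

rpc = LightningRpc("/run/lightning/lightning-rpc")  # adjust socket path

for peer in rpc.listpeers()["peers"]:
    for channel in peer.get("channels", []):
        alias = channel.get("alias", {})
        # `local` is what we generated and sent to the peer;
        # `remote` is what the peer sent us, used in invoice routehints.
        print(peer["id"], alias.get("local"), alias.get("remote"))
```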
This is the counterpart of the `mindepth` parameter in `fundchannel`
and friends. It allows plugins to look up the `node_id` dynamically and
selectively opt into `option_zeroconf` being used.
Changelog-Added: plugin: The `openchannel` hook may return a `mindepth` indicating how many confirmations are required.
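As a hedged sketch of how a plugin might use this: the hook registration and the `result: continue` convention follow the standard plugin hook mechanism, the `mindepth` return field is the one added here, and the trusted-peer set and node id are placeholders.

```python
#!/usr/bin/env python3
# Sketch of an `openchannel` hook returning `mindepth`: zeroconf (0) for
# trusted peers, otherwise require a few confirmations. Placeholder ids.
from pyln.client import Plugin

plugin = Plugin()

TRUSTED = {"02aabb..."}  # placeholder: peers we accept zeroconf channels from


@plugin.hook("openchannel")
def on_openchannel(openchannel, plugin, **kwargs):
    # The hook payload carries the opener's node id under "id".
    if openchannel["id"] in TRUSTED:
        return {"result": "continue", "mindepth": 0}
    return {"result": "continue", "mindepth": 3}


plugin.run()
```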
With `option_zeroconf` we may now send `channel_ready` at any time we
want, rendering the `mindepth` parameter a mere heads-up. We ignore it
in favor of our own value, since we plan to trigger the release of
`channel_ready` once we reach our own depth.
This will eventually enable us to specify 0 for zeroconf channels.
Changelog-Added: JSON-RPC: Added `mindepth` argument to specify the number of confirmations we require for `fundchannel` and `multifundchannel`
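A hedged usage sketch, using the Python client's generic `call` method so no assumptions are made about its `fundchannel` keyword signature; node id and amount are placeholders.

```python
# Sketch: open a channel requiring zero confirmations (zeroconf) by passing
# mindepth=0; pass e.g. mindepth=6 to insist on six confirmations instead.
from pyln.client import LightningRpc

rpc = LightningRpc("/run/lightning/lightning-rpc")  # adjust socket path
peer_id = "02aabb..."  # placeholder node id

result = rpc.call("fundchannel", {"id": peer_id,
                                  "amount": 100000,
                                  "mindepth": 0})
print(result["txid"])
```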
OK, this should fix the following stack trace:
```
2022-06-29T14:19:41.183Z DEBUG lightningd: Command returned result after jcon close
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.55581
==55581== Syscall param statx(file_name) points to unaddressable byte(s)
==55581== at 0x4B0188E: statx (statx.c:29)
==55581== by 0x1133481: std::sys::unix::fs::try_statx (weak.rs:178)
==55581== by 0x11265E0: std::fs::buffer_capacity_required (fs.rs:851)
==55581== by 0x112675B: <std::fs::File as std::io::Read>::read_to_string (fs.rs:644)
==55581== by 0x10DACA8: num_cpus::linux::Cgroup::param (linux.rs:214)
==55581== by 0x10DAB39: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==55581== by 0x10DA9C2: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==55581== by 0x10DA5A1: num_cpus::linux::load_cgroups (linux.rs:149)
==55581== by 0x10DA23D: num_cpus::linux::init_cgroups (linux.rs:129)
==55581== by 0x10DCDC8: core::ops::function::FnOnce::call_once (function.rs:227)
==55581== by 0x10DC749: std::sync::once::Once::call_once::{{closure}} (once.rs:276)
==55581== by 0x21EE89: std::sync::once::Once::call_inner (once.rs:434)
==55581== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==55581==
==55581== Syscall param statx(buf) points to unaddressable byte(s)
==55581== at 0x4B0188E: statx (statx.c:29)
==55581== by 0x1133481: std::sys::unix::fs::try_statx (weak.rs:178)
==55581== by 0x11265E0: std::fs::buffer_capacity_required (fs.rs:851)
==55581== by 0x112675B: <std::fs::File as std::io::Read>::read_to_string (fs.rs:644)
==55581== by 0x10DACA8: num_cpus::linux::Cgroup::param (linux.rs:214)
==55581== by 0x10DAB39: num_cpus::linux::Cgroup::quota_us (linux.rs:203)
==55581== by 0x10DA9C2: num_cpus::linux::Cgroup::cpu_quota (linux.rs:188)
==55581== by 0x10DA5A1: num_cpus::linux::load_cgroups (linux.rs:149)
==55581== by 0x10DA23D: num_cpus::linux::init_cgroups (linux.rs:129)
==55581== by 0x10DCDC8: core::ops::function::FnOnce::call_once (function.rs:227)
==55581== by 0x10DC749: std::sync::once::Once::call_once::{{closure}} (once.rs:276)
==55581== by 0x21EE89: std::sync::once::Once::call_inner (once.rs:434)
==55581== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==55581==
--------------------------------------------------------------------------------
Leaving base_dir /tmp/ltests-hzt9ppqp intact, it still has test sub-directories with failure details: ['test_peers_1']
```
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
```
VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all connectd/test/run-netaddress > /dev/null
==2483395== 16 bytes in 1 blocks are still reachable in loss record 1 of 15
==2483395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==2483395== by 0x10D59A: autodata_register_ (autodata.c:20)
==2483395== by 0x10EB26: register_autotype_type_to_string (type_to_string.h:77)
==2483395== by 0x10EB6B: register_one_type_to_string0 (type_to_string.c:8)
==2483395== by 0x188C0C: __libc_csu_init (in /home/rusty/devel/cvs/lightning/connectd/test/run-netaddress)
==2483395== by 0x4A3A00F: (below main) (libc-start.c:264)
==2483395==
==2483395== 40 bytes in 1 blocks are still reachable in loss record 2 of 15
==2483395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
...
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was fixed in 1c495ca5a8 ("connectd:
fix accidental handling of old reconnections.") and then reverted by
the rework in "connectd: avoid use-after-free upon multiple
reconnections by a peer".
The latter made the race much less likely, since we cleaned up the
reconnecting struct once the connection was hung up by the remote
node, but it's still theoretically possible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`peer_reconnected` was freeing a `struct peer_reconnected` instance
while a pointer to that instance was registered to be passed as an
argument to the `retry_peer_connected` callback function. This caused a
use-after-free crash when `retry_peer_connected` attempted to reparent
the instance to the temporary context.
Instead, never have `peer_reconnected` free a `struct peer_reconnected`
instance, and only ever allow such an instance to be freed after the
`retry_peer_connected` callback has finished with it. To ensure that the
instance is freed even if the connection is closed before the callback
can be invoked, parent the instance to the connection rather than to the
daemon.
Absent the need to free `struct peer_reconnected` instances outside of
the `retry_peer_connected` callback, there is no use for the
`reconnected` hashtable, so remove it as well.
See: https://github.com/ElementsProject/lightning/issues/5282#issuecomment-1141454255
Fixes: #5282
Fixes: #5284
Changelog-Fixed: connectd no longer crashes when peers reconnect.
The crash below from @zerofeerouting left me confused. The invalid
value in fmt_wireaddr_internal is a telltale sign of use-after-free.
This backtrace shows us destroying the conn *twice*: what's happening?
Well, tal carefully protects against destroying twice: it's not that
unusual to free something in a destructor which has already been freed.
So this indicates that there are *two* io_conn hanging off one
struct connecting, which isn't supposed to happen! We deliberately
call try_connect_one_addr() initially, then inside the io_conn destructor.
But due to races in connectd vs lightningd connection state, we added
a fix which allows a connect command to sit around while the peer is
cleaning up (6cc9f37cab) and get fired
off when it's done.
But what if, in the chaos, we are already connecting again? Now we'll
end up with *two* connections.
Fortunately, we have a `conn` pointer inside struct connecting, which
(with a bit of additional care) we can ensure is only non-NULL while
we're actually trying to connect. This lets us check that before
firing off a new connection attempt in peer_conn_closed.
```
lightning_connectd: FATAL SIGNAL 6 (version v0.11.2rc2-2-g8f7e939)
0x5614a4915ae8 send_backtrace
common/daemon.c:33
0x5614a4915b72 crashdump
common/daemon.c:46
0x7ffa14fcd72f ???
???:0
0x7ffa14dc87bb ???
???:0
0x7ffa14db3534 ???
???:0
0x5614a491fc71 fmt_wireaddr_internal
common/wireaddr.c:255
0x5614a491fc7a fmt_wireaddr_internal_
common/wireaddr.c:257
0x5614a491ea6b type_to_string_
common/type_to_string.c:32
0x5614a490beaa destroy_io_conn
connectd/connectd.c:754
0x5614a494a2f1 destroy_conn
ccan/ccan/io/poll.c:246
0x5614a494a313 destroy_conn_close_fd
ccan/ccan/io/poll.c:252
0x5614a4953804 notify
ccan/ccan/tal/tal.c:240
0x5614a49538d6 del_tree
ccan/ccan/tal/tal.c:402
0x5614a4953928 del_tree
ccan/ccan/tal/tal.c:412
0x5614a4953e07 tal_free
ccan/ccan/tal/tal.c:486
0x5614a4908b7a try_connect_one_addr
connectd/connectd.c:870
0x5614a490bef1 destroy_io_conn
connectd/connectd.c:759
0x5614a494a2f1 destroy_conn
ccan/ccan/io/poll.c:246
0x5614a494a313 destroy_conn_close_fd
ccan/ccan/io/poll.c:252
0x5614a4953804 notify
ccan/ccan/tal/tal.c:240
0x5614a49538d6 del_tree
ccan/ccan/tal/tal.c:402
0x5614a4953e07 tal_free
ccan/ccan/tal/tal.c:486
0x5614a4948f08 io_close
ccan/ccan/io/io.c:450
0x5614a4948f59 do_plan
ccan/ccan/io/io.c:401
0x5614a4948fe1 io_ready
ccan/ccan/io/io.c:417
0x5614a494a8e6 io_loop
ccan/ccan/io/poll.c:453
0x5614a490c12f main
connectd/connectd.c:2164
0x7ffa14db509a ???
???:0
0x5614a4904e99 ???
???:0
0xffffffffffffffff ???
???:0
```
Fixes: #5339
Changelog-Fixed: connectd: occasional crash when we reconnect to a peer quickly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Complete implementation of BOLT1 port derivation proposal https://github.com/lightning/bolts/pull/968
Changelog-Added: rpc: Use the standard BOLT1 port derivation in the `connect` command when the port is not specified.
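For example (a sketch; the node id and host are placeholders): once the port is omitted, the standard port for the network is used, e.g. 9735 on mainnet.

```python
# Sketch: connect without an explicit port; the network's standard port
# (e.g. 9735 on mainnet) is derived per the BOLT1 proposal.
from pyln.client import LightningRpc

rpc = LightningRpc("/run/lightning/lightning-rpc")  # adjust socket path
rpc.connect("02aabb...", "node.example.com")  # placeholder id and host, no port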
Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
There's a 1-in-256 chance that our signature on the transaction is 70,
not 71, bytes long. This changes the feerate. So fix up the weight in
this case to be the expected weight.
```
@unittest.skipIf(TEST_NETWORK == 'liquid-regtest', "Fees on elements are different")
@pytest.mark.developer("uses dev-fail")
@pytest.mark.openchannel('v1') # v2 the weight calculation is off by 3
def test_multifunding_feerates(node_factory, bitcoind):
    '''
    Test feerate parameters for multifundchannel
    '''
    funding_tx_feerate = '10000perkw'
    commitment_tx_feerate_int = 2000
    commitment_tx_feerate = str(commitment_tx_feerate_int) + 'perkw'
    l1, l2, l3 = node_factory.get_nodes(3, opts={'log-level': 'debug'})
    l1.fundwallet(1 << 26)

    def _connect_str(node):
        return '{}@localhost:{}'.format(node.info['id'], node.port)

    destinations = [{"id": _connect_str(l2), 'amount': 50000}]
    res = l1.rpc.multifundchannel(destinations, feerate=funding_tx_feerate,
                                  commitment_feerate=commitment_tx_feerate)
    entry = bitcoind.rpc.getmempoolentry(res['txid'])
    weight = entry['weight']
    expected_fee = int(funding_tx_feerate[:-5]) * weight // 1000
> assert expected_fee == entry['fees']['base'] * 10 ** 8
E AssertionError: assert 7000 == (Decimal('0.00007010') * (10 ** 8))
tests/test_connection.py:1982: AssertionError
```
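The fix-up is just arithmetic: each witness byte counts one weight unit, so a 70-byte signature makes the transaction one weight unit lighter than expected. A small illustrative sketch (not the actual C code) of the adjustment:

```python
# Sketch: normalize the measured weight to the expected-length signature so
# the fee/feerate check isn't thrown off by a 70-byte (1-in-256) signature.
EXPECTED_SIG_LEN = 71


def adjusted_weight(measured_weight: int, sig_len: int) -> int:
    # Witness bytes count one weight unit each.
    return measured_weight + (EXPECTED_SIG_LEN - sig_len)


assert adjusted_weight(700, 71) == 700  # expected-length sig: unchanged
assert adjusted_weight(699, 70) == 700  # short sig: add the missing unit back
```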
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Again, our new behaviour of sending our own gossip even before
they ask can confuse our gossip query tests.
In this case, simply eliminate duplicates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Again, our new behaviour of sending our own gossip even before
they ask can confuse our gossip query tests.
Create a new node, attach it, and perform queries on it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Even though we generally wait until a node has seen the gossip,
that doesn't mean that connectd has processed it! This means that when
we connect it may still send us "old" gossip.
So we set the OPT_GOSSIP_QUERIES bit, which means it doesn't send
gossip until we ask. But now it sends us WIRE_QUERY_CHANNEL_RANGE, so
everyone needs to filter that out.
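Illustratively, the filtering amounts to dropping messages whose two-byte type prefix is `query_channel_range` (type 263 in BOLT 7); a sketch, assuming raw messages arrive as type-prefixed byte strings:

```python
# Sketch: drop query_channel_range (type 263, BOLT 7) from a list of raw,
# type-prefixed gossip messages before asserting on what remains.
import struct

WIRE_QUERY_CHANNEL_RANGE = 263


def without_query_channel_range(msgs):
    kept = []
    for msg in msgs:
        (msgtype,) = struct.unpack("!H", msg[:2])  # big-endian u16 type
        if msgtype != WIRE_QUERY_CHANNEL_RANGE:
            kept.append(msg)
    return kept
```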
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Looks like we woke one of the startup io_loops early, and thus
we thought we'd finished connectd_activate and we hadn't. This
caused us to use an uninitialized ld->announceable array, and
finally caused an assert fail in the main loop.
Make *every* loop assert that it was exited for the correct reason,
so if it happens again, we can maybe figure out what part of
the code to look at.
```
lightningd: lightningd/lightningd.c:1186: main: Assertion `io_loop_ret == ld' failed.
lightningd: FATAL SIGNAL 6 (version 4df66fa)
...
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.895509
==895509== Conditional jump or move depends on uninitialised value(s)
==895509== at 0x22C58E: to_tal_hdr_or_null (tal.c:184)
==895509== by 0x22D531: tal_bytelen (tal.c:637)
==895509== by 0x1F10B6: towire_gossipd_init (gossipd_wiregen.c:100)
==895509== by 0x13AC6E: gossip_init (gossip_control.c:254)
==895509== by 0x1497EC: main (lightningd.c:1090)
==895509==
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I saw another "only_one()" failure on alias checking: it's not
entirely clear to me that, with the more aggressive sending of our
own gossip, we necessarily process it in order, so we might not have
actually seen all channels just because we saw the farthest one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make sure bitcoind gets tx before mining blocks!
```
# l1<->l2 mutual close should work
chan = l1.get_channel_scid(l2)
l2.rpc.connect(l1.info['id'], 'localhost', l1.port)
l1.rpc.close(chan)
l2.daemon.wait_for_log('State changed from CLOSINGD_SIGEXCHANGE to CLOSINGD_COMPLETE')
bitcoind.generate_block(2)
sync_blockheight(bitcoind, [l1, l2])
> l1.daemon.wait_for_log('Resolved FUNDING_TRANSACTION/FUNDING_OUTPUT by MUTUAL_CLOSE')
tests/test_closing.py:851:
```
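A sketch of the missing synchronization: poll bitcoind's mempool for the close transaction before mining. The helper is generic; it reuses the `bitcoind` fixture from the excerpt above.

```python
# Sketch: wait until bitcoind's mempool holds the expected transactions,
# then it is safe to mine the blocks that should confirm them.
import time


def wait_for_mempool(bitcoind, numtxs, timeout=60):
    start = time.time()
    while len(bitcoind.rpc.getrawmempool()) < numtxs:
        if time.time() - start > timeout:
            raise TimeoutError("tx never reached the mempool")
        time.sleep(0.25)


# In the test above: wait_for_mempool(bitcoind, 1) between l1.rpc.close(chan)
# and bitcoind.generate_block(2), so the mutual-close tx can't be missed.
```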
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
> raise ValueError(str(errors))
E ValueError:
E Node errors:
E Global errors:
E - Node /tmp/ltests-x5sfpiwp/test_openchannel_hook_chaining_1/lightning-2/ has memory leaks: [
E {
E "backtrace": [
E "ccan/ccan/tal/tal.c:442 (tal_alloc_)",
E "ccan/ccan/io/io.c:91 (io_new_conn_)",
E "lightningd/subd.c:773 (new_subd)",
E "lightningd/subd.c:827 (new_channel_subd_)",
E "lightningd/opening_control.c:870 (peer_start_openingd)",
E "lightningd/peer_control.c:1307 (peer_active)",
E "lightningd/connect_control.c:457 (connectd_msg)",
E "lightningd/subd.c:556 (sd_msg_read)",
E "lightningd/subd.c:357 (read_fds)",
E "ccan/ccan/io/io.c:59 (next_plan)",
E "ccan/ccan/io/io.c:407 (do_plan)",
E "ccan/ccan/io/io.c:417 (io_ready)",
E "ccan/ccan/io/poll.c:453 (io_loop)",
E "lightningd/io_loop_with_timers.c:22 (io_loop_with_timers)",
E "lightningd/lightningd.c:1181 (main)",
E "../csu/libc-start.c:308 (__libc_start_main)"
E ],
E "label": "ccan/ccan/io/io.c:91:struct io_conn",
E "parents": [
E "lightningd/lightningd.c:107:struct lightningd"
E ],
E "value": "0x2b5a898"
E }
E ]
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were waiting for the startup to time out while waiting for the
"public key" message. Instead, start it manually.
Before, this took 97 seconds (with TIMEOUT=30); after, 8 seconds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Get it to log directly to stdout, so we see what's happening *as it
happens* rather than as we read it. We could restore the thread we
were using before, but that added more problems than it solved.
This means that we need the hsm password prompts in the log though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some flakes are caused by weird races in this code. Plus, if we
get things to write straight to files, we might see things in
there on post-mortem which happen after the python runner exits.
It's a bit less efficient, but much simpler. Let's see if it helps!
Some tests need a rework now, since we don't get a failure (except
eventual timeout), but they're simpler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>