Everyone understands gossip_queries now, but peers leave it unset to indicate
they have nothing useful to say.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently, anything which doesn't have a live channel is considered transient.
We free this first under stress, and also if they're still connecting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we don't find one searching from our random spot in the peer table,
we're supposed to wrap, not crash!
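
A minimal sketch of the intended wrap-around scan, in Python rather than the C of connectd. The names `find_from_random_start`, `items` and `wanted` are hypothetical and the real peer table is a hash table, not a list:

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def find_from_random_start(items: Sequence[T],
                           wanted: Callable[[T], bool],
                           start: Optional[int] = None) -> Optional[T]:
    """Visit every slot exactly once, starting at `start` and wrapping to 0."""
    n = len(items)
    if n == 0:
        return None
    if start is None:
        start = random.randrange(n)
    for step in range(n):
        # The modulo is the "wrap": running off the end continues at slot 0.
        candidate = items[(start + step) % n]
        if wanted(candidate):
            return candidate
    # Nothing matched anywhere: report that instead of crashing.
    return None
```

Returning None when the whole table has been scanned lets the caller decide what to do when there is no candidate.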
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't actually support it yet, but this threads through the type change,
puts it in "decode" etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use a crude heuristic: if we were trying to contact them, it's a
"deliberate" connection, and should be preserved.
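
Roughly, the selection could look like the sketch below. This is Python for illustration only: `Peer`, `is_important` and `pick_peer_to_drop` are hypothetical names, and combining "has a live channel" with "we initiated" in this way is an assumption about how the two heuristics fit together:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Peer:
    node_id: str
    has_live_channel: bool
    we_initiated: bool  # True if we were trying to contact them


def is_important(peer: Peer) -> bool:
    # Assumption: either property is enough to protect the connection.
    return peer.has_live_channel or peer.we_initiated


def pick_peer_to_drop(peers: Iterable[Peer]) -> Optional[Peer]:
    """Return the first unimportant peer, or None if every peer is important."""
    for peer in peers:
        if not is_important(peer):
            return peer
    return None
```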
Changelog-Changed: connectd: prioritize peers with channels (and log!) if we run low on file descriptors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I thought I was going to want to have a convenient way of counting
these, but it turns out unnecessary. Still, this is slightly more
efficient and simple, so I am including it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can happen if we're totally out of fds, but previously we gave
no log message indicating this!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This has the benefit of being shorter, as well as more reliable (you
will get a link error if we can't print it, not a runtime one!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This code was trying to check that the address type is not one of the ADDR_TYPE_TOR*
types, but the is_toraddr() function checks a domain name! The cast should have been
a clue that this was wrong!
Anyway, wireaddr_to_addrinfo() aborts on these cases already, so the asserts here are
superfluous.
Found in unrelated CI run:
```
Valgrind error file: valgrind-errors.20610
==20610== Conditional jump or move depends on uninitialised value(s)
==20610==    at 0x484ED28: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==20610==    by 0x138FA3: is_toraddr (wireaddr.c:344)
==20610==    by 0x11499B: conn_init (connectd.c:729)
==20610==    by 0x28FD73: next_plan (io.c:59)
==20610==    by 0x28FF94: io_new_conn_ (io.c:116)
==20610==    by 0x11531B: try_connect_one_addr (connectd.c:927)
==20610==    by 0x1182A8: try_connect_peer (connectd.c:1781)
==20610==    by 0x11834E: connect_to_peer (connectd.c:1797)
==20610==    by 0x119241: recv_req (connectd.c:2074)
==20610==    by 0x12836F: handle_read (daemon_conn.c:35)
==20610==    by 0x28FD73: next_plan (io.c:59)
==20610==    by 0x2909A8: do_plan (io.c:407)
==20610==
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happens if:
1. The peer sets a timestamp filter to non-zero, and
2. We have a channel_announcement without a channel_update.
The timestamp is 0 as a placeholder as part of the recent gossip rework
(we used to hold these channel_announcement in memory, which was complex).
But this means we won't send it in this case, and if we later send the
channel_update, CI will complain about 'Bad gossip order'.
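
The failure mode can be reproduced with a toy filter. This is a Python sketch, not gossipd code: `StoreEntry`, `in_filter` and `naive_stream` are hypothetical names, and the range check follows the gossip_timestamp_filter rule (first_timestamp <= timestamp < first_timestamp + timestamp_range):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreEntry:
    kind: str
    timestamp: int  # 0 is the placeholder for a channel_announcement


def in_filter(ts: int, first_timestamp: int, timestamp_range: int) -> bool:
    return first_timestamp <= ts < first_timestamp + timestamp_range


def naive_stream(store: List[StoreEntry],
                 first_timestamp: int,
                 timestamp_range: int) -> List[str]:
    """Apply the filter to every entry by its own timestamp, with no special cases."""
    return [e.kind for e in store
            if in_filter(e.timestamp, first_timestamp, timestamp_range)]


store = [
    StoreEntry("channel_announcement", 0),
    StoreEntry("channel_update", 1_700_000_000),
]
# A non-zero filter excludes the placeholder timestamp, so only the
# channel_update is selected and it arrives without its announcement.
sent = naive_stream(store, first_timestamp=1_600_000_000,
                    timestamp_range=0xFFFFFFFF)
```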
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weakened this progressively over time, and gossip v1.5 makes spam
impossible by protocol, so we can wait until then.
Removing this code simplifies things a great deal!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Protocol: we no longer ratelimit gossip messages by channel, making our code far simpler.
Make sure plugin has got message to connectd before sending!
```
    def test_even_sendcustommsg(node_factory):
        l1, l2 = node_factory.get_nodes(2, opts={'log-level': 'io',
                                                 'allow_warning': True})
        l1.connect(l2)
        # Even-numbered message
        msg = hex(43690)[2:] + ('ff' * 30) + 'bb'
        # l2 will hang up when it gets this.
        l1.rpc.sendcustommsg(l2.info['id'], msg)
        l2.daemon.wait_for_log(r'\[IN\] {}'.format(msg))
        l1.daemon.wait_for_log('Invalid unknown even msg')
        wait_for(lambda: l1.rpc.listpeers(l2.info['id'])['peers'] == [])
        # Now with a plugin which allows it
        l1.connect(l2)
        l2.rpc.plugin_start(os.path.join(os.getcwd(), "tests/plugins/allow_even_msgs.py"))
        l1.rpc.sendcustommsg(l2.info['id'], msg)
        l2.daemon.wait_for_log(r'\[IN\] {}'.format(msg))
>       l2.daemon.wait_for_log(r'allow_even_msgs.*Got message 43690')
tests/test_misc.py:3623:
...
>           raise TimeoutError('Unable to find "{}" in logs.'.format(exs))
E           TimeoutError: Unable to find "[re.compile('allow_even_msgs.*Got message 43690')]" in logs.
contrib/pyln-testing/pyln/testing/utils.py:327: TimeoutError
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we get a WIRE_TX_ABORT then another message, we send the other message to the same
subd (even though the tx abort causes it to shutdown). This means we effectively
lose the next message, and timeout (see below from CI, reproduced locally).
So, have connectd ignore the subd after it forwards the WIRE_TX_ABORT. The next
message will, correctly, cause a fresh subdaemon to be spawned.
```
    @unittest.skipIf(TEST_NETWORK != 'regtest', 'elementsd doesnt yet support PSBT features we need')
    @pytest.mark.openchannel('v2')
    def test_v2_rbf_multi(node_factory, bitcoind, chainparams):
        l1, l2 = node_factory.get_nodes(2,
                                        opts={'may_reconnect': True,
                                              'dev-no-reconnect': None,
                                              'allow_warning': True})
        l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
        amount = 2**24
        chan_amount = 100000
        bitcoind.rpc.sendtoaddress(l1.rpc.newaddr()['bech32'], amount / 10**8 + 0.01)
        bitcoind.generate_block(1)
        # Wait for it to arrive.
        wait_for(lambda: len(l1.rpc.listfunds()['outputs']) > 0)
        res = l1.rpc.fundchannel(l2.info['id'], chan_amount)
        chan_id = res['channel_id']
        vins = bitcoind.rpc.decoderawtransaction(res['tx'])['vin']
        assert(only_one(vins))
        prev_utxos = ["{}:{}".format(vins[0]['txid'], vins[0]['vout'])]
        # Check that we're waiting for lockin
        l1.daemon.wait_for_log(' to DUALOPEND_AWAITING_LOCKIN')
        # Attempt to do abort, should fail since we've
        # already gotten an inflight
        with pytest.raises(RpcError):
            l1.rpc.openchannel_abort(chan_id)
        rate = int(find_next_feerate(l1, l2)[:-5])
        # We 4x the feerate to beat the min-relay fee
        next_feerate = '{}perkw'.format(rate * 4)
        # Initiate an RBF
        startweight = 42 + 172 # base weight, funding output
        initpsbt = l1.rpc.utxopsbt(chan_amount, next_feerate, startweight,
                                   prev_utxos, reservedok=True,
                                   min_witness_weight=110,
                                   excess_as_change=True)
        # Do the bump
        bump = l1.rpc.openchannel_bump(chan_id, chan_amount,
                                       initpsbt['psbt'],
                                       funding_feerate=next_feerate)
        # Abort this open attempt! We will re-try
        aborted = l1.rpc.openchannel_abort(chan_id)
        assert not aborted['channel_canceled']
        # We no longer disconnect on aborts, because magic!
        assert only_one(l1.rpc.listpeers()['peers'])['connected']
        # Do the bump, again, same feerate
>       bump = l1.rpc.openchannel_bump(chan_id, chan_amount,
                                       initpsbt['psbt'],
                                       funding_feerate=next_feerate)
tests/test_opening.py:668:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
contrib/pyln-client/pyln/client/lightning.py:1206: in openchannel_bump
    return self.call("openchannel_bump", payload)
contrib/pyln-testing/pyln/testing/utils.py:718: in call
    res = LightningRpc.call(self, method, payload, cmdprefix, filter)
contrib/pyln-client/pyln/client/lightning.py:398: in call
    resp, buf = self._readobj(sock, buf)
contrib/pyln-client/pyln/client/lightning.py:315: in _readobj
    b = sock.recv(max(1024, len(buff)))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pyln.client.lightning.UnixSocket object at 0x7f34675aae80>
length = 1024
    def recv(self, length: int) -> bytes:
        if self.sock is None:
            raise socket.error("not connected")
>       return self.sock.recv(length)
E       Failed: Timeout >1200.0s
```
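The routing rule described above (forward the WIRE_TX_ABORT, then forget the subd so the next message spawns a fresh one) can be sketched as follows. This is an illustrative Python model, not connectd's C code: `Subd`, `PeerRouter` and `spawn` are hypothetical names, and message types are plain strings here:

```python
from typing import Callable, List, Optional


class Subd:
    """Stand-in for a subdaemon: it only records what was forwarded to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[str] = []


class PeerRouter:
    def __init__(self, spawn: Callable[[], Subd]) -> None:
        self._spawn = spawn
        self._subd: Optional[Subd] = None

    def forward(self, msg_type: str) -> Subd:
        # No current subd (or we forgot it): start a fresh one.
        if self._subd is None:
            self._subd = self._spawn()
        target = self._subd
        target.received.append(msg_type)
        # The abort makes the subd shut down, so stop routing to it.
        if msg_type == "WIRE_TX_ABORT":
            self._subd = None
        return target
```

With this rule, the message after the abort lands in a new subd instead of being lost in one that is shutting down.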
Previously, we would forward the message to a subd, but now we have
the case where the subd is gone, but we're still connected. If the
peer sends anything but a reestablish in that state, we drop the connection.
Instead, an error should always make us fail the channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
On Mac, most tests report BROKEN because sodium creates an untracked fd pointing to /dev/random. dev_report_fds finds it at teardown and reports a BROKEN message.
We allow a single “char special” fd without reporting it as broken, improving QOL for Mac developers.
While we’re here, we add the fd mode to the log to help with future rogue fd issues.
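
As an illustration of the check, here is a Python sketch rather than the actual C change. `unexpected_fds` and `allowed_char_special` are hypothetical names; `os.fstat`, `stat.S_ISCHR` and `stat.filemode` are standard library calls:

```python
import os
import stat
from typing import Iterable, List, Tuple


def unexpected_fds(fds: Iterable[int],
                   allowed_char_special: int = 1) -> List[Tuple[int, str]]:
    """Return (fd, mode string) for every fd that should be reported."""
    report = []
    for fd in fds:
        mode = os.fstat(fd).st_mode
        # Tolerate a limited number of character-special fds (e.g. /dev/random).
        if stat.S_ISCHR(mode) and allowed_char_special > 0:
            allowed_char_special -= 1
            continue
        # Include the mode so a rogue fd is easier to identify from the log.
        report.append((fd, stat.filemode(mode)))
    return report
```

A second character-special fd is still reported, so only the one known sodium fd is excused.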
ChangeLog-None
This makes it easier to use outside simple subds, and now lightningd can
simply dump to log rather than returning JSON.
JSON formatting was a lot of work, and we only did it for lightningd, not for
subdaemons. Easier to use the logs in all cases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We still refuse to run dev commands if lightningd sends it to us
despite us not being in developer mode, but that's mainly paranoia.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also requires us to expose memleak when !DEVELOPER, however we only
ever used the memleak tracking when the LIGHTNINGD_DEV_MEMLEAK
environment variable was set, so keep that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Update the lightningd <-> channeld interface with many new commands needed to facilitate splicing.
Implement the channeld splicing protocol leveraging the interactivetx protocol.
Implement lightningd’s channel_control to support channeld in its splicing efforts.
Changelog-Added: Added the features to enable splicing & resizing of active channels.
Fixes: #6368
Changelog-Fixed: Protocol: we no longer gossip about recently-closed channels (Eclair gets upset with this).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We will access the freed connection to gossipd. This is weird to track
down when the *actual* issue is that gossipd died!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I never really liked this hack: websockets are useful, advertizing
them not so much.
Note that we never actually documented that we would advertize these!
Changelog-EXPERIMENTAL: Protocol: Removed support for advertizing websocket addresses in gossip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Make it the standard "return the error" pattern.
2. Rather than flags to indicate what types are allowed, have the callers
check the return explicitly.
3. Document the APIs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This contained cut & paste code, and it wasn't clear to me that
the first loop included DNS entries with IPv6 entries.
Instead, allow the iterator to take multiple types, and use
a switch statement so the compile will break as new types are added.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After the first iteration of the loop, we call memmem with a buflen that
points past the end of buf.
In practice we probably never read the uninitialized memory since we
guarantee the buffer ends with "\r\n", and since most/all libc
implementations probably read the haystack sequentially. But maybe
there's some libc with a crazy optimization out there. It's good to use
an accurate buflen just in case.
Discovered this while running some unit tests with MSan.
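
The bookkeeping at issue, shown as a Python analogue since Python has no memmem: `split_crlf` is a hypothetical helper, and `bytes.find(sub, start, end)` stands in for a memmem call on `buf + pos` with an explicit length:

```python
from typing import List


def split_crlf(buf: bytes, buflen: int) -> List[bytes]:
    """Split the first `buflen` bytes of `buf` on CRLF; later bytes are off-limits."""
    lines = []
    pos = 0
    while pos < buflen:
        # The searchable length shrinks as pos advances. Passing the full
        # buflen again here would let the search run past the valid data.
        remaining = buflen - pos
        idx = buf.find(b"\r\n", pos, pos + remaining)
        if idx < 0:
            break
        lines.append(buf[pos:idx])
        pos = idx + 2  # skip the delimiter itself
    return lines
```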
The push bit was convenient for connectd to send our own gossip
to peers upon connecting by naively traversing the gossip_store
and sending anything flagged `push`. This function is now
performed by gossipd, leaving no use for the push bit.
Changelog-Changed: `gossipd`: gossip_store PUSH bit is no longer set.