It doesn't get the right errno, and it says "create" not "bind".
```
2022-05-20T03:04:46.498Z DEBUG connectd: Failed to create 2 socket: Success
2022-05-20T03:04:46.500Z DEBUG connectd: REPLY WIRE_CONNECTD_INIT_REPLY with 0 fds
2022-05-20T03:04:46.501Z DEBUG connectd: connectd_init_done
2022-05-20T03:04:46.503Z **BROKEN** connectd: Failed to bind socket for 127.0.0.1:37871: Address already in use
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before this fix, there was the situation where a DEVELOPER=1 node would
announce non-public addresses on mainnet if detected. Since there
are some nodes on the internet that falsely report local addresses
we move this 'testing feature' to 'dev-allow-locahost' nodes.
Changelog-None
Instead of doing an allocation per entry, put the entry in directly.
This means only 30 bit resolution on 32-bit machines, but if a bit
of gossip gets accidently suppressed that's ok.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we moved gossip filtering to connectd, this aging got lost.
Without this, we hit the 10,000 entry limit before expiring full
gossip anti-echo cache. This is under 1M in allocations per peer, but
in DEVELOPER mode each allocation includes adds 3 notifiers (32 bytes
each) and a backtrace child (40 + 40 + 256 bytes), making it almost
10MB per peer, plus allocation overhead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: connectd: large memory usage with many peers fixed.
Per BIP-0171, the signature map is of pubkey to "The signature as would
be pushed to the stack from a scriptSig or witness".
Fixes 5298
Changelog-Fixed: PSBT: Fix signature encoding to comply with BIP-0171.
Signed-off-by: Jon Griffiths <jon_p_griffiths@yahoo.com>
Now it's formatted properly, we don't need the patch.
But we need to explicitly marshal/unmarshal into a byte stream,
which involves some code rearrangement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I have a test which reproduces this, too, and it's been seen in the
wild. It seems we can add a subd as we're closing, which causes
this assert to trigger.
Fixes: #5254
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had multiple reports of channels being unilaterally closed because
it seemed like the peer was sending old revocation numbers.
Turns out, it was actually old reestablish messages! When we have a
reconnection, we would put the new connection aside, and tell lightningd
to close the current connection: when it did, we would restart
processing of the initial reconnection.
However, we could end up with *multiple* "reconnecting" connections,
while waiting for an existing connection to close. Though the
connections were long gone, there could still be messages queued
(particularly the channel_reestablish message, which comes early on).
Eventually, a normal reconnection would cause us to process one of
these reconnecting connections, and channeld would see the (perhaps
very old!) messages, and get confused.
(I have a test which triggers this, but it also hangs the connect
command, due to other issues we will fix in the next release...)
Fixes: #5240
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This seems to prevent broad propagation, due to LND not allowing it. See
https://github.com/lightningnetwork/lnd/issues/6432
We still announce it if you disable deprecated-apis, so tests still work,
and hopefully we can enable it in future.
Fixes: #5196
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: Protocol: disabled websocket announcement due to LND propagation issues
We seem to have made node_announcement propagation *worse*, not
better. Explorers don't see my nodes updates.
At least some LND nodes never send us timestamp_filter, so we are
never actually stream *any* gossip. We should send gossip about
ourselves, even if they haven't set a filter (yet).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we more aggressively send our own gossip, to improve propagation chances.
I was seeing a strange crash:
Connectd gave bad CONNECT_PEER_CONNECTED message
The message is indeed mangled, around the remote_addr!
A quick review of the code revealed that we were not making a copy
when it was a reconnect, and so the remote_addr pointer was pointing
to memory which was freed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Mostly comments and docs: some places are actually paths, which
I have avoided changing. We may migrate them slowly, particularly
when they're user-visible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We do this (send warnings) in almost all cases anyway, so mainly this
is a textual update, but there are some changes:
1. Send ERROR not WARNING if they send a malformed commitment secret.
2. Send WARNING not ERROR if they get the shutdown_scriptpubkey wrong (vs upfront)
3. Send WARNING not ERROR if they send a bad shutdown_scriptpubkey (e.g. p2pkh in future)
4. Rename some vars 'err' to 'warn' to make it clear we send a warning.
This means test_option_upfront_shutdown_script can be made reliable, too,
and it now warns and doesn't automatically close channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd didn't actually suppress all gossip, resulting in a flake!
Doing it in connectd now makes much more sense.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I removed these prematurely: we *haven't* had a release since
introducing them!
This consists of reverting d15d629b8b
"plugins/fetchinvoice: remove obsolete string-based API." and
plugins/fetchinvoice: remove obsolete string-based
API. "onion_messages: remove obs2 support."
Some minor changes due to updated fromwire_tlv API since they
were removed, but not much.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: REVERT: Removed backwards compat with onion messages from v0.10.1.
Requiring the caller to allocate them is ugly, and differs from
other types.
This means we need a context arg if we don't have one already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No more "towire_offer", but "towire_tlv_offer".
This means we double-up on the unfortunately-named `tlv_payload` inside
the onion, but we should rename that in the spec when we remove
old payloads.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means doing some wire interpretation, and handling the transient
case where we switch from temporary to permenant channel_id, but it's
not that bad (and required for accurate demux when multiple channels
are involved for a single peer).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we always have it (either extracted from an unsolicited message,
or told to us by lightningd when it tells us it wants to talk), we can
always send it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means lightningd needs to create the temporary one and tell it to
openingd/dualopend, rather than the other way around.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either because lightningd tells us it wants to talk, or because the peer
says something about a channel.
We also introduce a behavior change: we disconnect after a failed open.
We might want to modify this later, but we it's a side-effect of openingd
not holding onto idle connections.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would return success from connect even though the peer was closing;
this is technically correct but fairly undesirable. Better is to pass
every connect attempt to connectd, and have it block if the peer is
exiting (and retry), otherwise tell us it's already connected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The message from lightningd simply acknowleges that we are allowed to
discard the peer (because no subdaemons are talking to it anymore).
This difference becomes more stark once connectd holds on to idle
peers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use tmpctx, rather than freeing manually everywhere (proof: next patch
added a branch and forgot to free it!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happens when we send a warning or lightningd tells us to send a
final message then close. Normally io logging is done by the
subdaemon that creates it, but this is a special case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Example request that is dying:
NEW REQUEST! lightning_websocketd:main [1955685] <-- bad request from safari
read 507
write_all 1
-> websocket_to_lightningd
-> read_payload_header
read 2
read_all 1
read -11 <--- This tried to read a part of the header, is this -EAGAIN?
read_all 0 should we be blocking on these reads?
*dies*
Fixes#5089
Changelog-Fixed: `experimental-websocket` intermittent read errors fixed
Signed-off-by: William Casarin <jb55@jb55.com>
When compiled without DEVELOPER this will now filter out `remote_addr` that
come from localhost. The testcase checks for DEVELOPER to test for correct
function of `remote_addr`.
Also, I renamed "test_connect" to "test_connect_basic" so it can be started
without all the other tests in that file that start with "test_connect..."
For example, if you do:
```
./lightningd/lightningd --network=regtest --experimental-websocket-port=19846
```
Then you're trying to reuse the normal port as the websocket port, but this
only fails at *listen* time, when we activate connectd. Catch this too.
Fixes incorrect fatal() message, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
By accessing `addr` after the loop, it's possible that it's one which
failed, in complex scenarios.
Also gives us a chance to warn if they specify a websocket but don't
actually end up advertizing it (you *must* advertize a normal addr as
well).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We always added to both arrays, might as well just keep one.
We make mayfail an explicit flag, rather than relying on the presence
of errstr, which is never NULL now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us give a better error message if listen fails, and also
moved the callback closer to where it's needed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>