Instead of backfilling gossip by buffering (up to) ten messages at
a time, only buffer one message at a time, as the peer's outbound
socket buffer drains. This moves the outbound backfill messages out
of `PeerHandler` and into the operating system buffer, where they
arguably belong.
Not buffering causes us to walk the gossip B-Trees somewhat more
often, but avoids allocating `Vec`s for the responses. While it's
probably (without having benchmarked it) a net performance loss, it
simplifies buffer tracking and leaves us with more room to play
with the buffer sizing constants as we add onion message
forwarding, which is an important win.
Note that because we change how often we check if we're out of
messages to send before pinging, we slightly change how many
messages are exchanged at once, impacting the
`test_do_attempt_write_data` constants.
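As a rough sketch of the new approach (hypothetical names and a
simplified gossip store, not the actual `PeerHandler` state), the
backfill walk now takes exactly one entry from the B-Tree per call
instead of collecting a batch into a `Vec`:
```
use std::collections::BTreeMap;

/// Hypothetical stand-in for per-peer gossip backfill state.
struct GossipBackfill {
    /// short_channel_id -> serialized announcement
    announcements: BTreeMap<u64, Vec<u8>>,
    /// Resume point for the B-Tree walk.
    next_key: u64,
}

impl GossipBackfill {
    /// Returns at most one message. Called again only once the previously
    /// returned message has drained towards the OS socket buffer, so we
    /// never hold more than one backfill message ourselves.
    fn next_backfill_msg(&mut self) -> Option<Vec<u8>> {
        let (&key, msg) = self.announcements.range(self.next_key..).next()?;
        self.next_key = key + 1;
        Some(msg.clone())
    }
}
```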
This avoids any extra calls to `read_event` after a write fails to
flush the write buffer fully, as is required by the PeerManager
API (though it isn't critical).
Unlike very ancient versions of lightning-net-tokio, this does not
rely on a single global process_events future, but instead has one
per connection. This could still cause significant contention, so
we'll ensure only two process_events calls can exist at once in
the next few commits.
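In rough terms (assuming a tokio channel as the wakeup mechanism
and an opaque `process_events` closure; the real lightning-net-tokio
wiring is more involved), each connection gets something like:
```
use tokio::sync::mpsc;

/// Spawned once per connection: drives `process_events` whenever this
/// connection's read/write halves signal that work may be pending,
/// rather than relying on a single global process_events future.
async fn per_connection_event_loop<F: Fn()>(
    process_events: F,
    mut wake_rx: mpsc::Receiver<()>,
) {
    while wake_rx.recv().await.is_some() {
        // Several connections may reach this point concurrently; a later
        // commit bounds the number of simultaneous callers to two.
        process_events();
    }
}
```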
I recently saw the following panic on one of my test nodes:
```
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()`
on an `Err` value: Os { code: 107, kind: NotConnected, message:
"Transport endpoint is not connected" }',
rust-lightning/lightning-net-tokio/src/lib.rs:250:38
```
Presumably what happened is that the connection was somehow closed
between us accepting it and us starting to process it. While this
is a somewhat surprising race, it's clearly reachable. The fix
proposed here is quite trivial: simply don't `unwrap` when trying
to fetch our peer's socket address; instead, treat the peer address
as `None` and discover the disconnection later when we go to read.
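In sketch form (using `tokio::net::TcpStream::peer_addr` directly;
the surrounding connection setup is omitted):
```
use std::net::SocketAddr;
use tokio::net::TcpStream;

/// If the peer hung up between `accept()` and this point, `peer_addr()`
/// returns an error (e.g. ENOTCONN). Rather than `unwrap`ing, carry the
/// address as `None` and let the next read surface the disconnection
/// through the normal path.
fn remote_addr_or_none(stream: &TcpStream) -> Option<SocketAddr> {
    stream.peer_addr().ok()
}
```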
It's somewhat strange to have a trait method which is named after
the intended action, rather than the action that occurred, leaving
it up to the implementor to decide what action to take.
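For illustration only (a hypothetical trait, not the actual
rust-lightning one), the two naming styles read quite differently:
```
/// Hypothetical trait sketching the naming concern only.
trait SocketEvents {
    /// Named after the intended action: the implementor decides how (and
    /// whether) to actually tear the socket down.
    fn disconnect_socket(&mut self);

    /// Named after the action that occurred: the socket is already gone,
    /// and the method merely reports that fact.
    fn socket_disconnected(&mut self);
}
```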
MessageSendEvent::PaymentFailureNetworkUpdate served as a hack to pass
an HTLCFailChannelUpdate from ChannelManager to NetGraphMsgHandler via
PeerManager. Instead, remove the event entirely and move the contained
data (renamed NetworkUpdate) to Event::PaymentFailed to be processed by
an event handler.
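Roughly, the consuming side now looks something like this
(locally-defined, simplified types; the real event and update enums
carry more variants and fields):
```
/// Simplified stand-ins for the real lightning types.
enum NetworkUpdate {
    ChannelClosed { short_channel_id: u64, is_permanent: bool },
    NodeFailure { node_id: [u8; 33], is_permanent: bool },
}

enum Event {
    PaymentFailed {
        payment_hash: [u8; 32],
        network_update: Option<NetworkUpdate>,
    },
}

trait NetworkGraphSink {
    fn apply_update(&mut self, update: NetworkUpdate);
}

/// The event handler, not PeerManager, now routes the failure's
/// NetworkUpdate into the network graph.
fn handle_event(event: Event, graph: &mut dyn NetworkGraphSink) {
    if let Event::PaymentFailed { network_update: Some(update), .. } = event {
        graph.apply_update(update);
    }
}
```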
It isn't exactly a critical error situation when we disconnect a
socket, so we shouldn't be printing to stderr, entirely bypassing
user logging, when it happens. We do still print to stderr if we
fail to write the first message to the socket, but this should
never happen on a reasonably-configured system with at least one
packet's worth of space available in the socket buffer.
This will be used to expose forwarding info for route hints in the next commit.
Co-authored-by: Valentine Wallace <vwallace@protonmail.com>
Co-authored-by: Antoine Riard <ariard@student.42.fr>