Commit Graph

39 Commits

Author SHA1 Message Date
Rusty Russell
dffbf8de85 gossipd: convert wire to new scheme.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2020-08-25 12:53:13 +09:30
Rusty Russell
2aad3ffcf8 common: tal_dup_talarr() helper.
This is a common thing to do, so create a macro.

Unfortunately, it still needs the type arg, because the paramter may
be const, and the return cannot be, and C doesn't have a general
"(-const)" cast.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2020-02-27 14:16:16 +10:30
Christian Decker
9fd84169bb common: Add an assertion for custommsgs in gossip handler
This is mainly meant as a marker so that we can later remove the code if we
decide to make the handling of custommsgs a non-developer option. It marks the
place that we would otherwise handle what in dev-mode is a custommsg.
2020-01-28 23:50:52 +01:00
Rusty Russell
1d0c433dc4 channeld: treat all incoming errors as "soft", so we retry.
We still close the channel if we *send* an error, but we seem to have hit
another case where LND sends an error which seems transient, so this will
make a best-effort attempt to preserve our channel in that case.

Some test have to be modified, since they don't terminate as they did
previously :(

Changelog-Changed: quirks: We'll now reconnect and retry if we get an error on an established channel. This works around lnd sending error messages that may be non-fatal.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-12-13 16:36:18 +01:00
Rusty Russell
4d0c2e93bf common: remove spammy debug msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-10-22 07:05:47 -07:00
Rusty Russell
596ed6a83b gossipd: better description of what's happening with the seeker.
By combining set_state() with selected_peer() we can give a single log
line describing what we're asking for, from whom.

We also add more verbosity to a few key areas, such as gossip rotation
and when gossipd tells a peer to send an error.  And move a comment which
was above the wrong function (due to rebase?).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-10-10 21:48:52 -05:00
darosior
0b0ad4c22d transition from status_trace() to status_debug 2019-09-10 02:02:51 +00:00
Rusty Russell
c99906a9a9 per-peer-daemons: tie in gossip filter.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-09-06 14:35:01 +02:00
Rusty Russell
596972366d wire: always ignore unknown odd messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-08-05 09:54:30 +00:00
Rusty Russell
728bb4e662 common/gossip_store: handle timestamp filtering.
This means we intercept the peer's gossip_timestamp_filter request
in the per-peer subdaemon itself.  The rest of the semantics are fairly
simple however.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
5591c0b5d8 gossipd: don't send gossip stream, let per-peer daemons read it themselves.
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project.  But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.

This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.

We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
a5f6ef385a gossipd: don't wrap messages when we send them to the peer.
They already send *us* gossip messages, so they have to be distinct anyway.
Why make us both do extra work?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
38d2899fbb common/per_per_state: generalize lightningd/peer_comm Part 1
Encapsulating the peer state was a win for lightningd; not surprisingly,
it's even more of a win for the other daemons, especially as we want
to add a little gossip information.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
f1b4b14be5 channeld: don't queue gossip msgs while waiting for foreign_channel_update.
We ask gossipd for the channel_update for the outgoing channel; any other
messages it sends us get queued for later processing.

But this is overzealous: we can shunt those msgs to the peer while
we're waiting.  This fixes a nasty case where we have to handle
WIRE_GOSSIPD_NEW_STORE_FD messages by queuing the fd for later.

This then means that WIRE_GOSSIPD_NEW_STORE_FD can be handled
internally inside handle_gossip_msg(), since it's always dealt with
the same, simplifying all callers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
f5a218f9d1 gossipd: send per-peer daemons offsets into gossip store.
Instead of reading the store ourselves, we can just send them an
offset.  This saves gossipd a lot of work, putting it where it belongs
(in the daemon responsible for the specific peer).

MCP bench results:
   store_load_msec:28509-31001(29206.6+/-9.4e+02)
   vsz_kb:580004-580016(580006+/-4.8)
   store_rewrite_sec:11.640000-12.730000(11.908+/-0.41)
   listnodes_sec:1.790000-1.880000(1.83+/-0.032)
   listchannels_sec:21.180000-21.950000(21.476+/-0.27)
   routing_sec:2.210000-11.160000(7.126+/-3.1)
   peer_write_all_sec:36.270000-41.200000(38.168+/-1.9)

Signficant savings in streaming gossip:
   -peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
   +peer_write_all_sec:35.780000-37.980000(36.43+/-0.81)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
d8db4e871f gossipd: provide new fd to per-peer daemons when we compact it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
13717c6ebb gossipd: hand a gossip_store_fd to all subdaemons.
This will let them read from the gossip store directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
c9a907cd71 common: handle peer input before gossipd input (for closingd, openingd)
Similar to the previous "handle peer input before gossip input", this
fixes similar potential deadlock for closingd and openingd which use
peer_or_gossip_sync_read.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-29 11:45:17 +01:00
Rusty Russell
5c60d7ffb2 gossipd: split wire types into msgs from lightningd and msgs from per-peer daemons
This avoids some very ugly switch() statements which mixed the two,
but we also take the chance to rename 'towire_gossip_' to
'towire_gossipd_' for those inter-daemon messages; they're messages to
gossipd, not gossip messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
136f10e4a3 common/read_peer_msg: remove.
Also means we simplify the handle_gossip_msg() since everyone wants it to
use sync_crypto_write().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-05 02:03:58 +00:00
Rusty Russell
0b08601951 sync_crypto_write/sync_crypto_read: just fail, don't return NULL.
There's only one thing the caller ever does, just do that internally.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-05 02:03:58 +00:00
Rusty Russell
09cce4a9c7 common/read_peer_msg: deconstruct into individual helper routines.
The One Big API is confusing, and has enough corner cases that we should
ditch it rather than add more.

See: https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction

In particular, when openingd is changed to chat to peers even when
it's not actively opening a channel, it wants to handle (most) errors
by continuing, not calling peer_failed().

This exposes the constituent parts.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-05 02:03:58 +00:00
Rusty Russell
b5fcd54ef0 channeld: don't read from gossipfd while we're reconnecting.
That was the cause of the bad gossip order failures: gossipd thought our
channel was live, but the other end didn't receive message last time.

Now gossipd doesn't use fd to kill us (connectd tells master to do so), we
can implement read_peer_msg_nogossip().

Fixes: #1706
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
a52d522525 gossipd: handle ping messages for remote peers too.
This simplifies our ping handling: make gossipd always do it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
2d533dc82e channeld: don't manually disable channel.
gossipd will do it when peer dies anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
fed5a117e7 Update ccan/structeq.
structeq() is too dangerous: if a structure has padding, it can fail
silently.

The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).

Upgrade ccan, and use it everywhere.  Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.

Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-04 23:57:00 +02:00
Rusty Russell
82ff891202 Update to latest BOLT version.
And remove the FIXMEs now that the gossip_query extension is merged.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-01 17:37:03 +02:00
Rusty Russell
bf1076080b common: typo fix.
Old gossip is rarely interesting.

Reported-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell
b68fb24758 read_peer_msg: handle incoming gossip from gossipd.
This means that openingd and closingd now forward our gossip.  But the real
reason we want to do this is that it gives an easy way for gossipd to kill
any active daemon, by closing its fd: previously closingd and openingd didn't
read the fd, so tended not to notice.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell
ab9d9ef3b8 gossipd: drain fd instead of passing around gossip index.
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).

In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).

This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.

This turns out to be quite simple, and is probably how I should have
done it originally :(

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell
e63b7bb539 take: allocate temporary variables off NULL.
If we're going to simply take() a pointer, don't allocate it off a random
object.  Using NULL makes our intent clear, particularly with allocating
packets we're going to take() onto a queue.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-16 00:16:10 +00:00
Rusty Russell
61f048bbf1 gossip: rename is_gossip_msg to is_msg_for_gossipd.
We're going to expand the range of messages which go through gossipd
when we support queries.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-13 16:34:55 +01:00
practicalswift
91a9c2923f Mark intentionally unused parameters as such (with "UNUSED") 2018-02-22 01:09:12 +00:00
Rusty Russell
cfa50d393a openingd: use peer_failed like normal instead of boutique negotiation_failed.
Because peer_failed would previously drop the connection, we had a
special 'negotiation_failed' message which made the master hand it
back to gossipd.  We don't need that any more.

This also meant we no longer need a special hook in read_peer_msg
for openingd to send this message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-19 02:56:51 +00:00
Rusty Russell
f76ff90485 status: split off error messages into a new 'peer_status' type.
Several daemons (onchaind, hsm) want to use the status messages, but
don't communicate with peers.  The coming changes made them drag in
more code they didn't need, so instead we have a different
non-overlapping type.

We combine the status_received_errmsg and status_sent_errmsg
into a single status_peer_error, with the presence or not of the
'error_for_them' field indicating direction. 

We also rename status_fatal_connection_lost() to
peer_failed_connection_lost() to fit in.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-19 02:56:51 +00:00
Rusty Russell
65ff5f8bb1 read_peer_msg: ignore errors not destined for this channel.
We quoted the spec, but somehow the implementation disappeared.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-19 02:56:51 +00:00
Rusty Russell
cc9ca82821 status: separate types for peer failure vs "impossible" failures.
Ideally we'd rename status_failed() to status_fatal(), but that's
too much churn for now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-08 19:07:12 +01:00
Rusty Russell
99e246becd channeld: rely on io_logging, not our own boutique logging.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-07 00:46:49 +00:00
Rusty Russell
ef9b16399c common: read_peer_msg helpers for per-peer subds.
It's a bit ugly because each caller has slightly different needs, but
we have three hooks and standard helpers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-01 05:57:56 +00:00