core-lightning

mirror of https://github.com/ElementsProject/lightning.git synced 2024-12-28 01:24:42 +01:00

Author	SHA1	Message	Date
Rusty Russell	247d249ea8	gossipd: provide helper to get a channel's cupdate, create routine to use it. The idea is that gossipd can give us the cupdate we need for an error, and we wire things up so that we ask for it (async) just before we send the error to the subdaemon. I tried many other things, but they were all too high-risk. 1. We need to ask gossipd every time, since it produces these lazily (in particular, it doesn't actually generate an offline update unless the channel is used). 2. We can't do async calls in random places, since we'll end up with an HTLC in limbo. What if another path tries to fail it at the same time? 3. This allows us to use a temporary_node_failure error, and upgrade it when gossipd replies. This doesn't change any existing assumptions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-02-28 09:44:47 +10:30
Rusty Russell	2aad3ffcf8	common: tal_dup_talarr() helper. This is a common thing to do, so create a macro. Unfortunately, it still needs the type arg, because the parameter may be const, and the return cannot be, and C doesn't have a general "(-const)" cast. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-02-27 14:16:16 +10:30
Rusty Russell	684ed4231f	common/wireaddr: don't include lightningd/lightningd. common should not include specific per-daemon files. Turns out this caused a lot of indirect includes to be exposed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-02-27 14:16:16 +10:30
Vasil Dimov	89ceb273f5	wire: remove towire_double() Before this patch we used to send `double`s over the wire by just copying them. This is not portable because the internal representation of a `double` is implementation specific. Instead of this, multiply any floating-point numbers that come from the outside (e.g. JSONs) by 1 million and round them to integers when handling them. * Introduce a new param_millionths() that expects a floating-point number and returns it multiplied by 1000000 as an integer. * Replace param_double() and param_percent() with param_millionths() * Previously the riskfactor would be allowed to be negative, which must have been unintentional. This patch changes that to require a non-negative number. Changelog-None	2020-02-27 09:07:04 +10:30
Rusty Russell	7ab5c424b6	gossipd: provide (stripped) channel_update when resolving a channel. I hadn't realized that lightningd asks gossipd every time we forward a payment. But I'm going to abuse it here to get the latest channel_update, otherwise (as lightningd takes over error message generation) lightningd needs to do an async request at various painful points. So have gossipd tell us the latest update (stripped so compatible with the strange in-onion-error format). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-02-25 11:12:12 +10:30
Rusty Russell	77e3df0a29	gossipd: remove assert which can trigger. We can actually fail to find a shorter route, but it's a fairly obscure case. Fixes: #3517 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-02-15 08:48:04 +10:30
darosior	3510c29e5d	common: move json_stream helpers to common/json Now that we have json_stream in common/, we can move all the related helpers from lightningd/json to common/json. This way everyone can benefit from them (including libplugin, the plugins themselves, potentially lightning-cli), not lightningd alone! Note that the Makefile of the common/test/ had to be modified, because the new helpers make use of common/wireaddr... Which turns out to #include <lightningd/lightningd.h>! So we couldn't just include the .c and add mocks if we redefined some structs (hello run-param).	2020-02-04 13:24:32 +10:30
Vasil Dimov	18a40c0c5d	build: re-record the result of `make update-mocks` Changelog-None	2020-02-03 15:38:11 +00:00
Rusty Russell	1099f6a5e1	common: use struct onionreply. This makes it clear we're dealing with a message which is a wrapped error reply (needing unwrap_onionreply), not an already-wrapped one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-01-23 16:17:42 +10:30
Rusty Russell	11dc1b341c	gossipd: hand all candidates up to lightningd to select routeboost. This lets us do more flexible filtering in the next patch. But it also keeps some weird logic out of gossipd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2020-01-04 08:07:22 +08:00
Christian Decker	db92c2ac5e	tlv: Remove unused TLV deserialization function	2019-12-03 00:37:15 +00:00
Rusty Russell	d119758b09	gossipd: don't crash if we have > 7000 stale short_channel_ids. Fixes: #3269 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: gossipd crash on huge number of unknown channels.	2019-11-21 04:21:38 +00:00
Rusty Russell	5a95e9f29a	gossipd: remove chainparams local var. We have a global, let's use it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-20 20:41:53 +01:00
Rusty Russell	6e433e0108	gossipd: work around LND reply_channel_range. We've been sending them errors for invalid replies; instead, this works around it. Changelog-Added: Workaround LND's reply_channel_range issues instead of sending error. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-18 11:01:20 +01:00
Rusty Russell	eed654f684	connectd, gossipd: use per-peer logging. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-18 04:50:22 +00:00
Rusty Russell	00cb5adfe6	common: allow subdaemons to specify the node_id in status messages. This is ignored in subdaemons which are per-peer, but very useful for multi-peer daemons like connectd and gossipd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-18 04:50:22 +00:00
Rusty Russell	fdd69af1f6	gossipd: don't discard node_announcements with old timestamps. It really, really doesn't matter. But we were dramatically reducing our view of the network: In my gossip_store (mainnet): channel_announcement: 30349 channel_update: 55119 node_announcement: 1783 Changelog-Fixed: No longer discard most node_announcements (fixes #3194) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-15 13:01:34 +01:00
Rusty Russell	3a25e9b8d6	gossipd: add hop-style to nodes to mark whether they speak TLV onion. We keep the feature bitmap on disk, so we cache this in the struct explicitly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-14 10:15:33 +01:00
Rusty Russell	5df9e5b7b4	gossipd: allow node_announcements and channel_announcements with unsupported features. The flat feature PR changes the rules so these are OK to propagate. That makes sense: the unsupported features means there's something unsupported about the node or channel, not the msg itself (for that we'd use a different message type). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-10 10:42:29 +01:00
Rusty Russell	5a8677edc6	gossipd: add txout_failure when a close is seen. This prevents a gratuitous lookup if we get a late channel_announce, but even better, it suppresses the "bad gossip" messages in case of a late channel_update, which have plagued Travis (especially since we got aggressive in pushing our own updates). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-07 03:50:53 +00:00
Rusty Russell	abe7133bd5	gossipd: use in_txout_failures to do lookup in channel_announcement. This correctly refreshes the txout entry against aging. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-07 03:50:53 +00:00
Rusty Russell	7f45e55d84	gossipd: set the push marker for our own messages. This is a better fix than doing it manually, which turned out to do it in the wrong order (node_announcement followed by channel_announcement) anyway. Should fix many "Bad gossip" messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-04 17:50:58 +01:00
Rusty Russell	bb370e66a8	gossipd: handle a "push" marker into the gossip_store. This tells clients to ignore any timestamp_filter and always send this message when it sees it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-04 17:50:58 +01:00
Rusty Russell	fe17acf07b	TAGS: reformat to fix when PRINTF_FMT() used. I was wondering why TAGS was missing some functions, and finally tracked it down: PRINTF_FMT() confuses etags if it's at the start of a function, and it ignores the rest of the file. So we put PRINTF_FMT at the end, but that doesn't work for definitions, only declarations. So we remove it from definitions and add gratuitous declarations in the few static places. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-11-01 17:27:20 -05:00
arowser	0985c6e219	Fix build fail on 32bit OS.	2019-10-23 07:23:33 +11:00
Rusty Russell	35bbba68a5	Revert "gossipd: query_messages: fail the connection if peer says it does not have up-to-date infos" This reverts commit `7fd2f6db6d`. We want to set 'complete' to false in future if we don't know everything!	2019-10-21 15:28:42 +02:00
Rusty Russell	bc430cced3	gossipd: fix false-positive memleak detection in pending_node_map. lightning_gossipd(17421): MEMLEAK: 0x564b4b17b5a8 ligtning_gossipd(17421): label=gossipd/routing.c:1490:struct pending_node_announce lightning_gossipd(17421): backtrace: lightning_gossipd(17421): ccan/ccan/tal/tal.c:437 (tal_alloc_) lightning_gossipd(17421): gossipd/routing.c:1490 (catch_node_announcement) lightning_gossipd(17421): gossipd/routing.c:1837 (handle_channel_announcement) lightning_gossipd(17421): gossipd/gossipd.c:238 (handle_channel_announcement_msg) lightning_gossipd(17421): gossipd/gossipd.c:461 (peer_msg_in) lightning_gossipd(17421): common/daemon_conn.c:31 (handle_read) lightning_gossipd(17421): ccan/ccan/io/io.c:59 (next_plan) lightning_gossipd(17421): ccan/ccan/io/io.c:407 (do_plan) lightning_gossipd(17421): ccan/ccan/io/io.c:417 (io_ready) lightning_gossipd(17421): ccan/ccan/io/poll.c:445 (io_loop) lightning_gossipd(17421): gossipd/gossipd.c:1700 (main) lightning_gossipd(17421): parents: lightning_gossipd(17421): gossipd/routing.c:294:struct routing_state Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-21 14:08:05 +02:00
Rusty Russell	78c9d69111	gossipd: makes probe larger. These are fairly cheap, and it's important to make sure we're not missing gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-18 16:12:42 +02:00
Rusty Russell	4b33b50625	gossipd: ask a peer for every channel it knows on startup. Asking for the last few blocks was logical, but my node is missing most gossip in practice. For the moment, simply ask a peer for every channel it knows, once we're started up. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-18 16:12:42 +02:00
Rusty Russell	79df507442	gossipd: exclude early blocks from random probes. When probing, no point probing for before lightning became cool. Current logic means we often probe below block 500,000, and think things are OK because there are no short_channel_ids. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-18 16:12:42 +02:00
Rusty Russell	79e2c3f89a	gossipd: don't crash if we're forced to discard corrupt gossip store. When we're in remove_all_gossip, we don't call free_chan, but free it manually. This trips over the developer-mode check that we called free_chan! Make it also insert the magic so that destroy_chan_check passes: lightning_gossipd: gossipd/routing.c:496: destroy_chan_check: Assertion `chan->sat.satoshis == (u64)chan' failed. lightning_gossipd: FATAL SIGNAL 6 (version v0.7.3rc2-2-gf89d7c1) 0x5632436a4544 send_backtrace common/daemon.c:41 0x5632436a45ea crashdump common/daemon.c:54 0x7f053c3c7f5f ??? ???:0 0x7f053c3c7ed7 ??? ???:0 0x7f053c3a9534 ??? ???:0 0x7f053c3a940e ??? ???:0 0x7f053c3b9011 ??? ???:0 0x563243698b9d destroy_chan_check gossipd/routing.c:496 0x5632436dca46 notify ccan/ccan/tal/tal.c:235 0x5632436dcf35 del_tree ccan/ccan/tal/tal.c:397 0x5632436dd2c1 tal_free ccan/ccan/tal/tal.c:481 0x56324369f004 remove_all_gossip gossipd/routing.c:2981 0x563243692f5d gossip_store_load gossipd/gossip_store.c:772 0x56324368eff4 gossip_init gossipd/gossipd.c:872 0x563243690cbb recv_req gossipd/gossipd.c:1580 0x5632436a4a69 handle_read common/daemon_conn.c:31 0x5632436cc7ae next_plan ccan/ccan/io/io.c:59 0x5632436cd32b do_plan ccan/ccan/io/io.c:407 0x5632436cd369 io_ready ccan/ccan/io/io.c:417 0x5632436cf52f io_loop ccan/ccan/io/poll.c:445 0x56324369102f main gossipd/gossipd.c:1700 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-17 23:40:05 +02:00
Rusty Russell	624b76e32e	gossipd: fix stale scid query. We always ended up sending an empty query when we had stale scids! And it turns out we consider such a query invalid: Bad query_short_channel_ids query_flags 010506226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f000100010100 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-15 18:05:54 +02:00
Rusty Russell	e462bd4de0	gossipd: increase number of gossiping peers. We only chose 3 peers to gossip with us (down from 8 last release). There's no justification for this number, or reason to believe that it is sufficient to keep us in sync. Be more conservative for now; we can always decrease it later once we have more data. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-15 18:05:54 +02:00
Rusty Russell	41d8308b68	seeker: be more random with node_announcement probes. polling the last 32 is fairly useless in practice, since they tend to be recent nodes; it won't detect long-forgotten ones. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-15 18:05:54 +02:00
Rusty Russell	1e59d2a738	gossipd: count channel_updates on new channels correctly. If we get a channel_update while we're still verifying the channel_announcement we didn't set the peer pointer, so it didn't get credit. As a result, the seeker tended to think we were done gossiping sooner than we were. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-15 18:05:54 +02:00
Rusty Russell	034ed1711c	gossipd: fix memleak when getnodes has no nodes. In this case, node_arr is NULL. Triggered by the next test. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-14 18:58:44 -05:00
Rusty Russell	ca53c1b699	gossipd: push our own gossip messages harder. I had a report of a 0.7.2 user whose node hadn't appeared on 1ml. Their node_announcement wasn't visible to my node, either. I suspect this is a consequence of recent versions reducing the amount of gossip they send, as well as large nodes increasingly turning off gossip altogether from some peers (as we do). We should ignore timestamp filters for our own channels: the easiest way to do this is to push them out directly from gossipd (other messages are sent via the store). We change channeld to wrap the local channel_announcements: previously we just handed it to gossipd as for any other gossip message we received from our peer. Now gossipd knows to push it out, as it's local. This interferes with the logic in tests/test_misc.py::test_htlc_send_timeout which expects the node_announcement message last, so we generalize that too. [ Thanks to @trueptolemy for bugfix! ] Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-14 15:00:37 -05:00
Rusty Russell	bd55f6d940	common/features: only support a single feature bitset. This is mainly an internal-only change, especially since we don't offer any globalfeatures. However, LND (as of next release) will offer global features, and also expect option_static_remotekey to be a global feature. So we send our (merged) feature bitset as both global and local in init, and fold those bitsets together when we get an init msg. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-11 02:52:04 +00:00
Rusty Russell	9485919a81	queries: make sure scids are in order. I thought LND had a bug, but turns out it doesn't like out-of-order short_channel_ids: in fact, the spec says they have to be in order! This means we use uintmap instead of a htable for unknown_scids and stale_scids so they're nicely ordered. But our nodes-missing-announcements probe is harder since they can also contain duplicates: we switch that to iterate through channels rather than nodes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	48f0362eae	seeker: handle non-synced state internally. We weren't supposed to do any gossiping until we were synced (and thus knew blockheight), but our seeker_check() didn't wait for it! Move the waiting all into seeker.c, so it can handle it all consistently. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	89d97b330e	seeker: don't try to fill peers when they connect. On testing, I found a node which would hang up every time we asked it for query_short_channel_ids (despite it offering features 0x81, meaning it should handle this message). Then it would reconnect, and we'd choose it again as our PROBING_NANNOUNCES peer! Instead, leave finding another peer to the once-a-minute seeker_check() function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	596ed6a83b	gossipd: better description of what's happening with the seeker. By combining set_state() with selected_peer() we can give a single log line describing what we're asking for, from whom. We also add more verbosity to a few key areas, such as gossip rotation and when gossipd tells a peer to send an error. And move a comment which was above the wrong function (due to rebase?). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	a88553ea6d	gossipd: fix typo in seeker random probe logic. No point probing past current block. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	0c7c765a28	seeker: do set_state() in callee, not caller. This means we sometimes do it redundantly, but this means it's done in fewer places. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	f4a6986d72	gossipd: remove unknown short_channel_ids as we ask for them. Otherwise we can get stuck asking for bogus ones over and over! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	d2a5f056a8	gossipd: restore dev-suppress-gossip functionality. Don't start new peers, and don't check on existing peers. This should get rid of most gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	7d207c50fa	gossipd: remove some spammy debug messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	a1644c1b6e	seeker: start doing a channel probe if we see unknown node_announcement msgs. It usually means we're missing something, but there's no way to ask what. Simply start a broad scid probe. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	f7cffbad98	seeker: try asking peer which gave us unknown data first. This should give more reliable results, though it risks us getting suckered into always consulting the same peer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	82a5efa932	gossipd: start streaming gossip from last gossip timestamp minus 10 minutes. We assume that the time for gossip propagation is < 10 minutes, so by going back that far from last gossip we won't miss anything. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	70e88b0dfb	gossipd: have seeker control which peers gossip, reduce to 3 and rotate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	877d1eaab3	gossipd: don't request channel_updates if we're being spammed. It's simple: if we wouldn't accept the timestamp we see, don't put the channel in the stale_scid_map. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	b33a73ced6	gossipd: don't hang up on slow peers. Just try to choose another. Under Travis, this causes many failures due to slowness (they only get 10 seconds in -dev mode). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	869b5e40b5	gossipd: simplify seeker state machine. We eliminate the "need peer" states and instead check if the random_peer_softref has been cleared. We can also unify our restart handlers for all these cases; even the probe_scids case, by giving gossip credit for the scids as they come in (at a discount, since scids are 8 bytes vs the ~200 bytes for normal gossip messages). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	918478b0ef	gossipd: use timestamp information to detect stale scids. If we have nothing better to do, ask about stale channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	af3bc4d11f	gossipd: use timestamp information to detect stale scids. Build up a map of short_channel_ids which we have old info for (only if peer supports gossip_query_ex). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	1f2a03f019	gossipd: hand (any) timestamps through to callback for query_channel_range. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	4dcb7df83e	gossipd: add query_option_flags support for asking query_channel_range. This asks peers to append the timestamps or checksums: if it has gossip_query_ex support, it will, otherwise it's ignored. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	270368e50b	gossipd: use gossip_query_ex to query only nodes when node probing. If the peer supports `gossip_query_ex` we can use query_flags to simply request the node_announcements when probing for nodes, rather than getting everything. If a peer doesn't support `gossip_query_ex` then it's harmless to add it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	d33ebd3629	seeker: probe for node announcements. We pick some nodes which don't seem to have node_announcements and we ask a channel associated with them. Again, if this reveals more node_announcements, we probe for twice as many next time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	ba3c79e560	seeker: seek unknown scids. If we have any unknown short_channel_ids, we ask a random peer for those channels. Once it responds, we probe again for a small random range in case more are missing, again enlarging if we find some. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	4b13c92802	seeker: use hash table for unknown short_channel_ids. Instead of a linear array which is fairly inefficient if it turns out we know nothing at all. We remove the gossip_missing() call by changing the api to remove_unknown_scid() to include a flag as to whether the scid turned out to be real or not. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	83575f27a1	seeker: add code to check range of scids. Once we've finished streaming gossip from the first peer, we ask a random peer (maybe the same one) for all short_channel_ids in the last 6 blocks from the latest channel we know about. If this reveals new channels we didn't know about, we expand the probe by a factor of 2 each time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	521c7f7121	seeker: take over gossip control. The seeker starts by asking a peer (the first peer!) for all gossip since a minute before the modified time of the gossip store. This algorithm is enhanced in successive patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	55323ec385	gossipd: move gossip seeking routines into new file seeker.c Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	79ca9bf998	gossipd: use per-peer information to make messages clearer. We can (usually) indicate what peer caused the bad gossip error. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	0091300ee3	gossipd: track what peer gave us gossip msgs so we can credit it. Since we have to validate, there can be a delay (and peer might vanish) between receiving the gossip and actually confirming it, hence the use of softref. We will use this information to check that the peers are making progress as we start asking them for specific information. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	296868daf4	gossipd: have gossip_store_load() return a timestamp. This is the modified-time of the file. We have to store it internally since we overwrite the gossip file with compaction on startup. This means the "are we behind on gossip?" heuristic is no longer inside gossip_store.c, which is cleaner. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
Rusty Russell	d75302deba	gossipd: random_peer() selector. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-10 21:48:52 -05:00
darosior	7fd2f6db6d	gossipd: query_messages: fail the connection if peer says it does not have up-to-date infos It is most likely not on the same network, and in any case not a good peer to gossip with.	2019-10-09 16:54:39 -05:00
darosior	2638947ddc	gossipd: query_scid: respond with complete to 0 on wrong chain_hash	2019-10-09 16:54:39 -05:00
darosior	d3c8225968	gossipd: add a BOLT#7 comment when wrong chain_hash in 'query_channel_range' And correct some typos	2019-10-09 16:54:39 -05:00
Rusty Russell	33c658ecfb	gossipd: advertize all our features in node_announcement. This preempts the acceptance of https://github.com/lightningnetwork/lightning-rfc/pull/666 but it's clear that feature bits are going to be distinct, so this is safe to do anyway. See https://github.com/lightningnetwork/lightning-rfc/pull/680 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-10-07 05:51:05 +00:00
Christian Decker	ef7a63d8f8	elements: Move from a global is_elements to a global chainparams We now have a pointer to chainparams, that fails valgrind if we do anything chain-specific before setting it. Suggested-by: Rusty Russell <@rustyrussell>	2019-10-03 04:32:57 +00:00
willcl-ark	8d4203e9a6	[dev-suppress-gossip] - Set new peers to GOSSIP_NONE with flag enabled Signed-off-by: willcl-ark <will8clark@gmail.com>	2019-10-03 04:13:55 +00:00
Rusty Russell	aab9e9f010	gossipd: remove internal dev helpers for queries. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-30 07:08:07 +00:00
Rusty Russell	8b3a298ce6	gossipd: generalize query_short_channel_ids to use a callback. We currently use a flag, but that's inflexible. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-30 07:08:07 +00:00
Rusty Russell	c07dff21dc	gossipd: generalize query_channel_range to use a callback. This means we'll be able to call it for internal reasons, not just dev testing as now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-30 07:08:07 +00:00
Rusty Russell	4bf0bc1f28	gossipd: age txout_failures map. We do this by keeping a current and an old map, and moving the current to old every hour or 10,000 entries. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-27 02:32:53 +00:00
Rusty Russell	1450a13c1f	gossipd: don't expose scids of unannounced channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-27 02:32:53 +00:00
Rusty Russell	722b4942ed	common: rename decode_short_channel_ids.{c,h} to decode_array.{c,h} This encoding scheme is no longer just used for short_channel_ids, so make the names more generic. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-27 02:32:53 +00:00
Rusty Russell	aa9024db51	gossipd: fix memleak false-positive 2019-09-26T02:00:47.832Z DEBUG lightning_hsmd(89822): Client: Received message 33 from client' 2019-09-26T02:00:47.837Z BROKEN lightning_gossipd(89828): MEMLEAK: 0x55eddc5d1fd8 2019-09-26T02:00:47.838Z BROKEN lightning_gossipd(89828): label=gossipd/routing.c:1579:struct unupdated_channel 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): backtrace: 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): ccan/ccan/tal/tal.c:437 (tal_alloc_) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): gossipd/routing.c:1579 (routing_add_channel_announcement) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): gossipd/routing.c:1867 (handle_pending_cannouncement) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): gossipd/gossipd.c:1543 (handle_txout_reply) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): gossipd/gossipd.c:1726 (recv_req) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): common/daemon_conn.c:31 (handle_read) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): ccan/ccan/io/io.c:59 (next_plan) 2019-09-26T02:00:47.838Z DEBUG lightning_gossipd(89828): ccan/ccan/io/io.c:407 (do_plan) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-27 00:01:34 +00:00
Rusty Russell	d24c850899	gossipd: restore a flag for fast pruning I was seeing some accidental pruning under load / Travis, and in particular we stopped accepting channel_updates because they were 103 seconds old. But making it too long makes the prune test untenable, so restore a separate flag that this test can use. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-27 00:01:34 +00:00
Rusty Russell	2b47922ea5	gossipd: move query functions into their own file. The only real change is dump_gossip() used to call maybe_create_next_scid_reply(), but now I've simply renamed that to maybe_send_query_responses() and we call it directly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
Rusty Russell	38124ec287	gossipd: don't ask peers for gossip until we're synced with bitcoind. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
Rusty Russell	a071a754b3	gossipd: place limit on pending announcements. Now we queue them, we should place a limit. It's not the worst thing in the world if we discard them (we'll catch up eventually), but we should try not to in case we're just a bit behind. Our behaviour here is also O(n^2) so we don't want a massive queue anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
Rusty Russell	fd2d74aa9b	gossipd: defer asking about txouts until we're synced or they're 6 deep. The first one means we don't discard channels just because we're not synced, and the second is implied by the spec: don't accept channel_announcement if the channel isn't 6 deep. Since LND defers in such cases, we do too (unless it's newer than the current block, in which case we simply discard). Otherwise there's a risk that a slow node might discard valid gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
Rusty Russell	2a74d53841	Move gossip_constants.h into common/ Turns out we weren't checking the BOLT comments before, so they needed an overhaul. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
Rusty Russell	6f9c5f2936	gossipd: get fed the blockheight from lightningd when we know it. This will let gossipd be more intelligent about gossiping before we're synced, and also it might know how far behind we are. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-25 04:01:56 +00:00
trueptolemy	d8dce6e61f	cleanup: Use `u32` as the type of `max_hops` in `gossipd`	2019-09-24 16:01:24 +02:00
Rusty Russell	b55ff34f93	gossipd: fix corner case where gossip msg too old after pending delay. Happened under Travis with --dev-fast-gossip (90 second prune time), but can happen anyway if gossip is almost 2 weeks old when we receive it: 2019-09-20T19:16:51.367Z DEBUG lightning_gossipd(20972): Received node_announcement for node 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 2019-09-20T19:16:51.376Z DEBUG lightning_gossipd(20972): Ignoring node_announcement timestamp 1569006918 for 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 2019-09-20T19:16:51.669Z BROKEN lightning_gossipd(20972): pending node_announcement 01013094af771d60f4de69bb39ce045e4edf4a06fe6c80078dfa4fab58ab5617d6ad4fa34b6d3437380db0a8293cea348bbc77f714ef71fcd8515bfc82336667441f00005d852546022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59022d2253494c454e544152544953542d633961313734610000000000000000000000000000 malformed? (version c9a174a) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-22 20:56:11 +02:00
Rusty Russell	27790832a5	gossipd: gossip_queries_ex is no longer experimental. The master spec has some typos which make it not parse, so I created a PR and generated the CSV from that: https://github.com/lightningnetwork/lightning-rfc/pull/673 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-22 01:17:11 +00:00
Rusty Russell	895e552475	BOLT: update to master with gossip_queries_ex. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-22 01:17:11 +00:00
Rusty Russell	e5564173e7	BOLT: update to cover minor changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-22 01:17:11 +00:00
Rusty Russell	6a8d18c7e3	gossipd: naming cleanups. Suggested-by: @cdecker. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	39c9dcbafc	ratelimit: adjust based on --dev-fast-gossip, test. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	147eaced2e	developer: consolidate gossip timing options into one --dev-fast-gossip. It's generally clearer to have simple hardcoded numbers with an #if DEVELOPER around it, than apparent variables which aren't, really. Interestingly, our pruning test was always kinda broken: we have to pass two cycles, since l2 will refresh the channel once to avoid pruning. Do the more obvious thing, and cut the network in half and check that l1 and l3 time out. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	8139164aa0	gossipd: disallow far future (+1 day) or far past (2 weeks) timestamps. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	76860683aa	gossipd: only allow one channel_update per direction per day. And similar for node_announcement. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	a92ead48bf	gossipd: ignore redundant channel_update and node_announcement. If you send a message which simply changes timestamp and signature, we drop it. You shouldn't be doing that, and the door to ignoring them was opened by option_gossip_query_ex, which would allow clients to ignore updates with the same checksum. This is more aggressive at reducing spam messages, but we allow refreshes (to be conservative, we allow them even when 1/2 of the way through the refresh period). I dropped the now-unnecessary sleep from test_gossip_pruning, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	46e0f1efcc	gossipd: refresh every 13 days, not every 7. One day is plenty of time to propagate the update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	06afb408d8	gossipd: bias lower bit of timestamp to ensure alternation. This is useful for various "partial timestamp" forms of propagation in future, esp. minisketch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	21a6d502db	gossipd: move gossip message generation into its own file. gossipd.c is doing too many things: this is a start. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	0bab2580fc	gossipd: clean up local channel updates. Make update_local_channel use a timer if it's too soon to make another update. 1. Implement cupdate_different() which compares two updates. 2. Make update_local_channel() take a single arg for timer usage. 3. Set timestamp of non-disable update back 5 minutes, so we can always generate a disable update if we need to. 4. Make update_local_channel() itself do the "unchanged update" suppression. 5. Keep pointer to the current timer so we override any old updates with a new one, to avoid a race. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	e1c431d278	gossipd: use local_chan_map more. We can look up local channels directly now, which offers simplifications. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	27d9b75456	gossipd: add shadow structure for local chans. Normally we'd put a pointer into struct half_chan for local information, but it would be NULL on 99.99% of nodes. Instead, keep a separate hash table. This immediately subsumes the previous "map of local-disabled channels", and will be enhanced further. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	70c4ac6d74	gossipd: suppress our own too-close node_announcement messages. Never make them less than gossip_min_interval apart. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	178baeba6c	gossipd: get gossip_min_interval from lightningd. Default is 5 x gossip interval == 5 minutes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	4cfd0524eb	gossipd: simplify duplicate node_announcement check. Write helpers to split it into non-timestamp, non-signature parts, and simply compare those. We extract a helper to do channel_update, too. This is more generic than our previous approach, and simpler. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
Rusty Russell	5ddd7866e4	gossipd: make create_node_announcement const-correct. sig is only non-const so we can override if NULL, but talz helps us here. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-20 06:55:00 +00:00
trueptolemy	5361a5d059	JSON-API: `getroute` now also support `exclude` nodes	2019-09-16 12:22:06 +08:00
trueptolemy	090a43fd3d	gossip: Add the `struct exclude_entry` and `enum exclude_entry_type`	2019-09-16 12:22:06 +08:00
Rusty Russell	a46e880f1d	gossipd: in DEVELOPER mode, catch missing free_chan() For memory-usage reasons, struct chan doesn't use a tal destructor, in favor of us calling free_chan in the right places. In DEVELOPER mode, we should check that is the case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-12 05:11:56 +00:00
Rusty Russell	91072f56b0	developer: add 'dev-gossip-set-time' call to manipulate gossipd's time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-12 05:11:56 +00:00
Rusty Russell	768d293149	gossipd: don't get upset if we can't add channel_update. In particular, the timestamp might be wrong once we start checking that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-12 05:11:56 +00:00
Rusty Russell	2577ad87d5	gossipd: use gossip_time_now() everywhere. We've been slack, but it's going to be important for testing ratelimiting. And it currently has a minor memory leak. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-12 05:11:56 +00:00
lisa neigut	fe6c7f8f80	gossip queries: patch up valgrind errors in tests These were giving me valgrind errors locally; fixed now.	2019-09-11 23:56:27 +00:00
Rusty Russell	afbed94a6c	gossipd: work around missing pwritev(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-11 05:58:36 +00:00
darosior	0b0ad4c22d	transition from status_trace() to status_debug	2019-09-10 02:02:51 +00:00
darosior	ea6c95b2b3	gossipd: don't ignore wrong chain in 'query_channel_range' Give a NULL reply with the 'complete' flag set to 0 instead	2019-09-10 02:02:51 +00:00
darosior	9be28fe40f	daemons tour: minor typos correction	2019-09-10 02:02:51 +00:00
Rusty Russell	c99906a9a9	per-peer-daemons: tie in gossip filter. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-06 14:35:01 +02:00
Rusty Russell	aca2e4f722	common/memleak: add dynamic hooks for assisting memleak. Rather than reaching into data structures, let them register their own callbacks. This avoids us having to expose "memleak_remove_xxx" functions, and call them manually. Under the hood, this is done by having a specially-named tal child of the thing we want to assist, containing the callback. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-06 14:35:01 +02:00
Rusty Russell	837e6232c3	common: reduce header differences for DEVELOPER vs non-DEVELOPER. `make update-mocks` is usually run in DEVELOPER mode, but then it includes definitions for functions which aren't declared in non-DEVELOPER mode. We hacked this in a few places, but it's fragile, and worse, now we have EXPERIMENTAL_FEATURES as well, it's complex. Instead, declare developer-only functions (but don't define them). This is a bit more awkward if you accidentally use one in non-DEVELOPER code (link error rather than compile error), but makes autogenerating test mocks much easier. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-06 14:35:01 +02:00
Rusty Russell	acf3952acc	JSON: remove handling of pre-Adelaide (B:T:N) short_channel_ids. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-09-06 14:19:14 +02:00
Rusty Russell	02b9b7f6e6	tests: update mocks for --enable-experimental-features builds. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-29 15:51:36 +02:00
Rusty Russell	28185c397c	gossipd: fix gossip send in case query_flags cause no output. Fortunately, again, only happens with EXPERIMENTAL_FEATURES. If the query causes us not to actually send anything, we won't get called again. This can validly happen if they only asked for the node_announcements, for example. (Found by protocol tests). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-27 12:35:25 +02:00
Rusty Russell	855dff704c	gossipd: test crc32 routines using test vectors from PR. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-27 12:35:25 +02:00
Rusty Russell	d1a1592cc8	gossipd: fix calculation of crc32 of update. Currently EXPERIMENTAL_FEATURES only, fortunately. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-27 12:35:25 +02:00
Rusty Russell	0ec8304901	gossipd: fix premature towire_reply_short_channel_ids_end if no node_announcement. Our "are we finished?" logic was wrong: it tested if there are no more node_announcements, but it's possible that there were no node_announcements for either end of the channel whose information we sent. This is actually quite unusual on the real network: looking at mainnet stats from last May, 4301 of 4337 nodes have node_announcements. However, with query flags it's much more likely, since they might not ask for node announcements at all. (Found by gossip protocol tests) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-26 23:09:00 +00:00
Rusty Russell	2f1e116510	gossipd: use htable_count() rather than reaching into htable struct. Now ccan/htable provides the helper, let's use it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-26 08:44:22 +00:00
Rusty Russell	51541f53d8	gossipd: test vectors for https://github.com/lightningnetwork/lightning-rfc/pull/557 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	4de47f6db5	gossipd: use default zlib compression, hack for zlib expansion. These both allow us to reproduce the test vectors in the next patch. But using Z_DEFAULT_COMPRESSION is a reasonable idea anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	f6cf4bf62a	spec: remove encoding byte from checksums. Make the TLV element a simple array. This is a bit neater, in fact, and makes the test vectors in that 557 PR work. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	8abd850d3c	gossipd: append timestamps & checksums to reply_channel_range if asked (EXPERIMENTAL) In fact, we always generate them, we only send them if asked. And we set the flags to 0 if not --enable-experimental-features, so we never send in that case. Generating checksums involves pulling the channel_update from the gossip_store, which is suboptimal: there's a FIXME to store the checksum in memory. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	c7853197ae	gossipd: generalize encoding functions We're about to use them for gossip extended info too, which don't put the encoding byte at the beginning of the data stream. So this removes some "scids" from function names and separates out the "prepend a byte" case from the "external encoding_type" case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	0de11da5e4	gossipd: decode and obey query_short_channel_ids's TLV query_flags (EXPERIMENTAL) These indicate what fields we are to return. If there's no TLV, or we haven't got --enable-experimental-features, it's set to all 1s so behaviour is unchanged. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	d2030539e1	EXPERIMENTAL: pull in PR 557 (with minor fixes): range query support. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-10 02:48:34 +00:00
Rusty Russell	f9ecc76d99	gossipd: check that we don't try to access a deleted gossip entry. We ignored this before, which meant that the DEVELOPER-mode check that we delete the correct record didn't check that it wasn't already deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-09 08:58:05 +02:00
Rusty Russell	f57f068592	gossipd: don't use O_APPEND on the gossip_store. We always know the length, so we don't need it. It causes much extra work when we want to delete a record, which I suspect may cause issues amongst some users who've been seeing gossip_store corruption. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-08-03 12:50:51 +02:00
ZmnSCPxj	3e74ca4b86	gossipd/routing.c: Correctly handle a duplicated entry in `exclude` of `getroute`.	2019-08-02 16:06:15 +02:00
Rusty Russell	cc70b9c4ec	wire: use common/bigsize routines Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-07-31 23:25:59 +00:00
Rusty Russell	6bb8525e5d	gossipd: fix crash when we prune old, un-updated channel announcements. We added a random channel to the list, but we can just free it immediately (since traversal of a uintmap isn't altered by deletion). This was introduced in `d1f43d993a` where we explicitly call free_chan rather than relying on destructors. Fixes: #2837 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-07-28 14:15:32 +02:00
lisa neigut	32eaae0cb9	wire-gen: move in-house wire declarations to new format tidying things up!	2019-07-24 06:31:46 +00:00
Rusty Russell	b3215a866b	gossipd: fix inverted test in debug print. ==1503== Use of uninitialised value of size 8 ==1503== at 0x566786B: _itoa_word (_itoa.c:179) ==1503== by 0x566AF0D: vfprintf (vfprintf.c:1642) ==1503== by 0x569790F: vsnprintf (vsnprintf.c:114) ==1503== by 0x156CCB: do_vfmt (str.c:66) ==1503== by 0x156DB1: tal_vfmt_ (str.c:92) ==1503== by 0x1289CD: status_vfmt (status.c:141) ==1503== by 0x128AAC: status_fmt (status.c:151) ==1503== by 0x118E05: route_prune (routing.c:2495) ==1503== by 0x11DE2D: gossip_refresh_network (gossipd.c:1997) ==1503== by 0x1292B8: timer_expired (timeout.c:39) ==1503== by 0x12088C: main (gossipd.c:3075) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-07-17 20:16:55 -05:00
lisa neigut	5c07afac7d	bolt: update to BOLT spec changes (extract format + type specifications) updates the bolt version to 6639cef095a2ecc7b8f0c48c6e7f2f906fbfbc58. this requires us to use the new bolt parser at generate-bolt.py and updates to all of the type specifications (ie. from u8 -> byte)	2019-07-16 06:10:58 +00:00
Rusty Russell	c95b4eedf4	gossipd: fail clearly if we can't open/create gossip_store. Otherwise we fail at the write, and then it's not clear why we couldn't open file: lightning_gossipd: Writing version to store: Bad file descriptor (version v0.7.1-16-g7ea5c5c) 0x560dcf1a3779 send_backtrace common/daemon.c:40 0x560dcf1a634d status_failed common/status.c:192 0x560dcf19726a gossip_store_new gossipd/gossip_store.c:195 0x560dcf199fd0 new_routing_state gossipd/routing.c:177 0x560dcf1a098b gossip_init gossipd/gossipd.c:2113 0x560dcf1a197a recv_req gossipd/gossipd.c:2946 0x560dcf1a38cd handle_read common/daemon_conn.c:31 0x560dcf1bae2c next_plan ccan/ccan/io/io.c:59 0x560dcf1bb314 do_plan ccan/ccan/io/io.c:407 0x560dcf1bb341 io_ready ccan/ccan/io/io.c:417 0x560dcf1bcb13 io_loop ccan/ccan/io/poll.c:445 0x560dcf1a1ba0 main gossipd/gossipd.c:3073 Reported-by: @JavierRSobrino Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-07-04 16:10:20 +02:00
lisa neigut	7046d0220c	makefiles: move all unit tests under `make check-units` Isolate unit tests under their own make directive.	2019-06-30 16:41:30 +09:30
Rusty Russell	c303d7d534	gossipd: only do (automatic) store compaction at startup. Rewriting the gossip_store is much more trivial when we don't have any pointers into it, so add some simple offline compaction code and disable the automatic compaction code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 20:03:10 -05:00
Rusty Russell	c15d9ed37c	gossip_store: make copy of corrupt gossip_store on failure. This should help debugging vastly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	8928f0b5f9	gossipd: remove gossip entirely if we hit a problem on load. The crashes in #2750 are mostly caused by us trying to partially truncate the store. The simplest fix for release is to discard the whole thing if we detect a problem. This is a workaround: it'd be far nicer to try to recover. Fixes: #2750 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	8ce3b86aa5	gossipd: tighter correctness checks during gossip_store load. We shouldn't be loading old timestamps, either. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	fc27250f80	gossipd: be more verbose and less assert()ive on bad node_announcement. We hit the timestamp assert on #2750; it shouldn't happen, but crashing doesn't leave much information. Reported-by: @m-schmook Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	f1b57063f7	bitcoin/tx: use fromwire_fail in pull_bitcoin_tx. This is the correct way to mark failure: it also sets *max to 0. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 03:56:59 +00:00
Rusty Russell	47b5f2e837	gossipd: truncate gossip_store.tmp for compaction. If something went wrong and there was an old one, we were appending to it! Reported-by: @SimonVrouwe Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-20 02:53:52 +00:00
Rusty Russell	5e3690b3c5	gossipd: delete channel_amount from the store when we delete channel_announcement. Otherwise we slowly build up cruft: compaction simply moves them since they're not deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-15 10:52:05 +02:00
Rusty Russell	10c503b4b4	gossip_store: clean up a truncated store. We might have channel_announcements which have no channel_update: normally these don't get written into the store until there is one, but if the store was truncated it can happen. We then get upset on compaction, since we don't have an in-memory representation of the channel_announcement. Similarly, we leave the node_announcement pending until after that channel_announcement, leading to a similar case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-15 10:52:05 +02:00
Rusty Russell	24cc371cdf	gossipd: gossip_store errors after rewrite are fatal. We can't continue, since we've moved the indexes. We'll just crash anyway, as seen from bugs #2742 and #2743. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-14 02:17:32 +00:00
Rusty Russell	eb5cc47bdd	gossipd: count deleted records correctly when loading gossip_store. The result of an incorrect count was that we failed on next compaction. Fixes: #2743 Fixes: #2742 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-14 02:17:32 +00:00
Rusty Russell	745634d9b9	gossipd: don't catch pending node_announcements more than once. We catch node_announcements for nodes where we haven't finished analyzing the channel_announcement yet (either because we're still checking UTXO, or in this case, because we're waiting for a channel_update). But we reference count the pending_node_announce, so if we have multiple channels pending, we might try to insert it twice. Clear it so this doesn't happen. There's a second bug where we continue to catch node_announcements until all the channel_announcements are no longer pending; this is fixed by removing it from the map. Fixes: #2735 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-13 05:58:09 +00:00
Rusty Russell	1e32b4ab29	gossipd: adjust gossip filters if we discover we're missing gossip. We pick up to three random peers and ask them to gossip more. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	6830233d0b	gossipd: control gossip level so we don't get flooded by peers. We seek a certain number of peers at each level of gossip; 3 "flood" if we're missing gossip, 2 at 24 hours past to catch recent gossip, and 8 with current gossip. The rest are given a filter which causes them not to gossip to us at all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	f5ea57d4c0	gossipd: reset gossip_missing if no reports for 10 minutes. An arbitrary timeout. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	b9053767e7	gossipd: query unknown short_channel_ids, note if they were really missing. The first sign that we're missing gossip is that we get a channel_update for an unknown channel. The peer might be wrong (or lying), but if it turns out to be a real channel, we were definitely missing something. This patch does two things: queries when we get an unknown channel_update, and then notes that a channel_announcement was from such an update when it's finally processed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	18069ab3da	gossipd: APIs return more information about routing message handling. In particular, we'll need to know the short_channel_id if a channel_update is unknown (implies we're missing a channel), and whether processing a pending channel_announcement was successful (implies that the channel was real). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	5ef7aa70d2	gossipd: prepare for internally-generated short-channel-id queries. Up until now we only generated these in dev mode for testing. Hoist into common code, turn counter into a flag (we're only allowed one!) and note if query is internal or not. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	21c920a8e8	gossipd: note if loaded store seems reasonably up-to-date. If not, we can ask peers for full gossip (for now we just set a flag). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	0d2a4830ed	ccan: update to faster and correct crc32c implementation. I decided to try a faster implementation, only to find our crc32c was not correct! Ouch. I removed the crc32c functions from ccan/crc, and added a new crc32c module which has the Mark Adler x86-64-optimized variants. We bump gossip_store version again, since csums have changed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:40:10 +00:00
Rusty Russell	ab31f40aa2	gossipd: don't charge ourselves fees when calculating route. This means there's now a semantic difference between the default `fromid` and setting `fromid` explicitly to our own node_id. In the default case, it means we don't charge ourselves fees on the route. This means we can spend the full channel balance. We still want to consider the pricing of local channels, however: there's a reason to discount one over another, and that is to bias things. So we add the first-hop fee to the risk value instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	b48c644e7a	listchannels: add `htlc_minimum_msat` and `htlc_maximum_msat` fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	1a3886c116	wallet: keep a list of unreleased transactions. We're going to use this in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-06 04:47:44 +00:00
Rusty Russell	628b65fb40	gossip_store: don't leave dangling channel_announce if we truncate. (Or, if we crashed before we got to write out the channel_update). It's a corner case, but one reported by @darosior and reproduced on my test node (both with bad gossip_store due to previous iterations of this patchset!). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	f8b98e032c	gossipd: Don't abort() on duplicate entries in gossip_store. Triggered by a previous variant of this PR, but a good idea to simply discard the store in general when we get a duplicate entry. We crash trying to delete old ones, which means writing to the store. But they should have already been deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	34c113a17a	gossipd: trivial clean up of routing_add_channel_update. For some reason I was reluctant to use the hc local variable; I even re-declared it! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	3e733afb2b	gossipd: remove broadcast map altogether. This clarifies things a fair bit: we simply add and remove from the gossip_store directly. Before this series: (--disable-developer, -Og) store_load_msec:20669-20902(20822.2+/-82) vsz_kb:439704-439712(439706+/-3.2) listnodes_sec:0.890000-1.000000(0.92+/-0.04) listchannels_sec:11.960000-13.380000(12.576+/-0.49) routing_sec:3.070000-5.970000(4.814+/-1.2) peer_write_all_sec:28.490000-30.580000(29.532+/-0.78) After: (--disable-developer, -Og) store_load_msec:19722-20124(19921.6+/-1.4e+02) vsz_kb:288320 listnodes_sec:0.860000-0.980000(0.912+/-0.056) listchannels_sec:10.790000-12.260000(11.65+/-0.5) routing_sec:2.540000-4.950000(4.262+/-0.88) peer_write_all_sec:17.570000-19.500000(18.048+/-0.73) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	dd83453b2f	gossipd/gossip_store: fix compacting, don't use broadcast ordering. We have a problem: if we get halfway through writing the compacted store and run out of disk space, we've already changed half the indexes. This changes it so we do nothing until writing is finished: then we iterate through and update indexes. It also weans us off broadcast ordering, which we can now eliminate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5161b79bfc	gossipd/gossip_store: keep count of deleted entries, don't use bs->count. We didn't count some records before, so we could compare the two counters. This is much simpler, and avoids reliance on bs. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	728bb4e662	common/gossip_store: handle timestamp filtering. This means we intercept the peer's gossip_timestamp_filter request in the per-peer subdaemon itself. The rest of the semantics are fairly simple however. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	948490ec58	gossipd: add timestamp in gossip store header. (We don't increment the gossip_store version, since there are only a few commits since the last time we did this). This lets the reader simply filter messages; this is especially nice since the channel_announcement timestamp is derived, not in the actual message. This also creates a 'struct gossip_hdr' which makes the code a bit clearer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	bad9734dc7	gossip_store: remove redundant copy_message. The single caller can easily use transfer_store_msg instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5591c0b5d8	gossipd: don't send gossip stream, let per-peer daemons read it themselves. Keeping the uintmap ordering all the broadcastable messages is expensive: 130MB for the million-channels project. But now we delete obsolete entries from the store, we can have the per-peer daemons simply read that sequentially and stream the gossip itself. This is the most primitive version, where all gossip is streamed; successive patches will bring back proper handling of timestamp filtering and initial_routing_sync. We add a gossip_state field to track what's happening with our gossip streaming: it's initialized in gossipd, and currently always set, but once we handle timestamps the per-peer daemon may do it when the first filter is sent. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	4399faf57c	gossipd: make writes to gossip_store atomic. There's a corner case where otherwise a reader could see the header and not the body of a message. It could handle that in various ways, but simplest (and most efficient) is to avoid it happening. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	a5f6ef385a	gossipd: don't wrap messages when we send them to the peer. They already send us gossip messages, so they have to be distinct anyway. Why make us both do extra work? Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	df00f20e4a	gossipd: erase old entries from the store, don't just append. We use the high bit of the length field: this way we can still check that the checksums are valid on deleted fields. Once this is done, serially reading the gossip_store file will result in a complete, ordered, minimal gossip broadcast. Also, the horrible corner case where we might try to delete things from the store during load time is completely gone: we only load non-deleted things. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	696dc6b597	gossipd: disable gossip_store upgrade. We're about to bump version again, and the code to upgrade it was quite hairy (and buggy!). It's not worthwhile for such a poorly-tested path: I will just add code to limit how much incoming gossip we get to avoid flooding when we upgrade, however. I also use a modern gossip_store version in our test_gossip_store_load test, instead of relying on the upgrade path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	43f2cbd250	gossipd: track gossip_store locations of local channels. We currently don't care, but the next patch means we have to find them again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	180a552fba	gossip_store: mark private updates separately from normal ones. They're really gossipd-internal, and we don't want per-peer daemons to confuse them with normal updates. I don't bump the gossip_store version; that's coming with another update anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	763697eb4c	gossipd: fix gossip_store calling delete. Now we handle node_announcements properly, we have a failure case where we try to move them when a channel is deleted while loading the store. We're going to remove this soon, in favor of in-place delete, so workaround this for now to avoid an assert() when we try to write to the store while loading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	21fe518513	gossip_store: fix 'bad node_announcement' by allowing node_announcement on un-updated channel. When we first receive a channel_update, we write both the channel_announcement and that channel_update to the store: we need that first update so we can set the channel_announcement timestamp. However, the channel_update can be replaced later. This means we can have a channel_announcement, a node_announcement which relies on it, then the channel_update later. So move the "this applies to a pending announcement" check lower, where gossip_store can use it too. Has a nice side-effect of avoiding one lookup of the node id. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	c233fc5063	gossipd: fix spurious unused error with gcc-9 -O3. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
Rusty Russell	c091a4ee40	gossipd: fix spurious gcc warning. It turns out that we don't look at type when we return 0, but gcc isn't quite smart enough for that. Initializing to -1 is good practice anyway for the failure path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
William Casarin	3f035cb3cc	gossipd: fix uninitialized free on short_route in goto path Fix a path where tal_free is called on an uninitialized variable If the first `goto bad_total` executes, then that path has uninitialized `short_route` but bad_total passes through to `out` whose first call is tal_free(short_route). This was noticed by a maybe-uninitialized heuristic on gcc 7.4.0: gossipd/routing.c: In function ‘find_shorter_route’: gossipd/routing.c:1096:2: error: ‘short_route’ may be used uninitialized in this function [-Werror=maybe-uninitialized] tal_free(short_route); Reported-by: @ZmnSCPxj <https://github.com/ElementsProject/lightning/pull/2674#issuecomment-495617253> Signed-off-by: William Casarin <jb55@jb55.com>	2019-06-03 00:07:11 +00:00
Rusty Russell	654e89b5fc	gossipd: free channels in routing_state destructor. Cleans up the tests. Suggested-by: @ZmnSCPxj Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	d1f43d993a	gossipd: use explicit destructor for struct chan. Each destructor2 costs 40 bytes, and struct chan is only 120 bytes. So this drops our memory usage quite a bit: MCP bench results change: -vsz_kb:580004-580016(580006+/-4.8) +vsz_kb:533148 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	59e75f1b2c	gossipd: reply to large listchannels in parts. This has two effects: most importantly, it avoids the problem where lightningd creates a 800MB JSON blob in response to listchannels, which causes OOM on the Raspberry Pi (our previous max allocation was 832MB). This is because lightning-cli can start draining the JSON while we're filling the buffer, so we end up with a max allocation of 68MB. But despite being less efficient (multiple queries to gossipd), it actually speeds things up due to the parallelism: MCP with -O3 -flto before vs after: -listchannels_sec:8.980000-9.330000(9.206+/-0.14) +listchannels_sec:7.500000-7.830000(7.656+/-0.11) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	cb9c44ef27	gossipd: remove unnecessary dev_unknown_channel_satoshis arg. We now have a test blockchain for MCP which has the correct channels, so this is not needed. Also fix a benchmark script bug where 'mv "$DIR"/log "$DIR"/log.old.$$' would fail if your log didn't exist from a previous run. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	85d8848ede	gossipd: neaten insert_broadcast a little. Suggested-by: @cdecker. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
darosior	d9db9dc1ae	gossipd: fix listnodes crash on non existing id 'node_arr' was not instantiated if an id was passed to listnodes and we could not get a node from it	2019-05-16 19:30:10 +02:00
Rusty Russell	f5a218f9d1	gossipd: send per-peer daemons offsets into gossip store. Instead of reading the store ourselves, we can just send them an offset. This saves gossipd a lot of work, putting it where it belongs (in the daemon responsible for the specific peer). MCP bench results: store_load_msec:28509-31001(29206.6+/-9.4e+02) vsz_kb:580004-580016(580006+/-4.8) store_rewrite_sec:11.640000-12.730000(11.908+/-0.41) listnodes_sec:1.790000-1.880000(1.83+/-0.032) listchannels_sec:21.180000-21.950000(21.476+/-0.27) routing_sec:2.210000-11.160000(7.126+/-3.1) peer_write_all_sec:36.270000-41.200000(38.168+/-1.9) Significant savings in streaming gossip: -peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) +peer_write_all_sec:35.780000-37.980000(36.43+/-0.81) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	0e37ac2433	common: move gossip_store read routine where subdaemons can access it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	d8db4e871f	gossipd: provide new fd to per-peer daemons when we compact it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	13717c6ebb	gossipd: hand a gossip_store_fd to all subdaemons. This will let them read from the gossip store directly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	89291b930e	gossipd: pass amount into gossip_store, rather than having it fetch. We need to store the channel capacity for channel_announcement: hand it in directly rather than having the gossip_store code do a lookup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	7ede5aac31	gossip_store: change format so we store raw messages. Save some overhead, plus gets us ready for giving subdaemons direct store access. This is the first time we upgrade the gossip_store, rather than just discarding. The downside is that we need to add an extra message after each channel_announcement, containing the channel capacity. After: store_load_msec:28337-30288(28975+/-7.4e+02) vsz_kb:582304-582316(582306+/-4.8) store_rewrite_sec:11.240000-11.800000(11.55+/-0.21) listnodes_sec:1.800000-1.880000(1.84+/-0.028) listchannels_sec:22.690000-26.260000(23.878+/-1.3) routing_sec:2.280000-9.570000(6.842+/-2.8) peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Differences: -vsz_kb:582320 +vsz_kb:582316 -listnodes_sec:2.100000-2.170000(2.118+/-0.026) +listnodes_sec:1.800000-1.880000(1.84+/-0.028) -peer_write_all_sec:51.600000-52.550000(52.188+/-0.34) +peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	c7034f271a	gossipd: avoid tal overhead in listnodes We know exactly how many there will be, so allocate an entire array up-front. -listnodes_sec:2.540000-2.610000(2.584+/-0.029) +listnodes_sec:2.100000-2.170000(2.118+/-0.026) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
trueptolemy	fefe7dfbab	Gossipd: cleanup extra repeated code	2019-05-06 08:52:36 +00:00
Rusty Russell	0ca0db765a	gossipd: fix crash if we truncate store. Entries we've already loaded expect to exist in the store. We could go back and remove them all, but instead just truncate at the known-good point. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-01 11:59:12 +02:00
Rusty Russell	b248bb155a	tools/bench-gossipd.sh: make it work (where possible) with DEVELOPER=0 Some tests require dev support, but the rest can run. We simplify the gossip_store output so it's the same in non-dev mode too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-24 13:46:39 -05:00
Rusty Russell	0fc42415c2	gossipd/routing: remove BFG implementation. Now we can benchmark, and remove 500 bytes per node. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35093-37907(36146+/-1.1e+03) vsz_kb:555168 store_rewrite_sec:12.120000-13.750000(12.7+/-0.6) listnodes_sec:1.270000-1.370000(1.322+/-0.039) listchannels_sec:29.770000-31.600000(30.82+/-0.64) routing_sec:0.00 peer_write_all_sec:63.630000-67.850000(65.432+/-1.7) MCP notable changes from pre-Dijkstra (>1 stddev): -vsz_kb:577456 +vsz_kb:555168 -routing_sec:60.70 +routing_sec:12.04 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	cfdb012b30	gossipd: re-add fuzz logic to routing. Do it inside the can_reach() function, which is less optimal for BFG which does 20 ops on the same channel, but fine for Dijkstra. This does have a measurable cost, so we might want to use non-cryptographic fuzz in future: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 113381 msec (1133813412 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	e197956032	gossipd/routing: Iterate on Dijkstra when route is too long. If a route is too long, we try to bias Dijkstra towards choosing a shorter route by adding a per-hop cost. We do a naive "shortest path" pass, then using that cost as a ceiling on per-hop cost, we do a binary search. There are some subtleties: we use risk rather than total as our counter field (we normally bias this by 1 anyway, so it's easy to make that a variable), and we set riskfactor to a minimal value once we're iterating. It's good enough to get a solution, we don't need to do a 2-dimensional search on riskfactor and riskbias. Of course, this is extremely slow if we hit it on our benchmark, though it doesn't happen in a more realistic network: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (79 succeeded) routes in 100000 nodes in 25341 msec (253412314 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	f8ffae837d	gossipd: speed Dijkstra a little. Our uintmap can be a little slow with all the reallocation, so leave NULL entries and walk to find the first one. Since we don't clean them up, keep a cache of where the min non-all-NULL value is in the heap. It's clearer benefit on really large tests, so here's 1M nodes: Comparison using gossipd/test/run-bench-find_route 1000000 10: Before: 10 (10 succeeded) routes in 1000000 nodes in 91995 msec (9199532898 nanoseconds per route) After: 10 (10 succeeded) routes in 1000000 nodes in 20605 msec (2060539287 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	7caa37f0f1	gossipd: implement Dijkstra. Use a uintmap as our minheap. Note that Dijkstra can give overlength routes, so some checks are disabled. Comparison using gossipd/test/run-bench-find_route 100000 10: Before: 10 (10 succeeded) routes in 100000 nodes in 120087 msec (12008708402 nanoseconds per route) After: 10 (10 succeeded) routes in 100000 nodes in 2269 msec (226925462 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	4d84a436f5	gossipd: temporarily disable fuzz in routing. This allows precise comparison between Dijkstra and Bellman-Ford without worrying about fuzz. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	594af8049b	gossipd: extract common functionality. This will be needed by Dijkstra as well. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	6dfa46d65a	gossipd/test: add test for handling overlong routes. This is a weakness with Dijkstra, so write an explicit unit test that we can find a short enough (but more expensive) route. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
trueptolemy	77236caa91	gossipd: fix the check for node announcement in broadcast_state_check() It should check whether node_id_1 was stored in pubkeys, rather than checking the scid.	2019-04-16 00:20:26 +00:00
trueptolemy	274f156b28	gossipd: rename empty_node_map() to new_node_map() empty_node_map() sounds like a destructor. new_node_map() makes sense and is better.	2019-04-14 23:12:00 +00:00
trueptolemy	ee036a2e36	Gossipd: change the pending_cannouncement list to htable	2019-04-14 05:39:31 +00:00
Rusty Russell	261921dee2	gossipd: adjust peers' broadcast_offset when compacting store. When we compact the store, we need to adjust the broadcast index for peers so they know where they're up to. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	fdb42c3170	gossipd: don't keep channel_updates in memory. This requires some trickiness when we want to re-add unannounced channels to the store after compaction, so we extract a common "copy_message" to transfer from old store to new. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:36034-37853(37109.8+/-5.9e+02) vsz_kb:577456 store_rewrite_sec:12.490000-13.250000(12.862+/-0.27) listnodes_sec:1.250000-1.480000(1.364+/-0.09) listchannels_sec:30.820000-31.480000(31.068+/-0.24) routing_sec:26.940000-27.990000(27.616+/-0.39) peer_write_all_sec:65.690000-68.600000(66.698+/-0.99) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1202316 +vsz_kb:577456 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0370ed2eca	gossipd: use pread in the store. The next patch causes us to access the store while loading (we read channel_updates for local peers), which messes up loading due to the lseek involved. Using pread() is atomic with seek & read, and also a bit more efficient. Make the header contiguous too, while we're here. We don't need pwrite: we always open with O_APPEND which means the seek-to-end is implicit. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:36771-38289(37529.6+/-5.3e+02) vsz_kb:1202316 store_rewrite_sec:12.460000-13.280000(12.784+/-0.29) listnodes_sec:1.240000-1.410000(1.34+/-0.058) listchannels_sec:29.850000-31.840000(30.908+/-0.69) routing_sec:27.800000-31.790000(28.822+/-1.5) peer_write_all_sec:66.200000-68.720000(67.44+/-0.84) MCP notable changes from previous patch (>1 stddev): -store_load_msec:39207-45089(41374.6+/-2.2e+03) +store_load_msec:36771-38289(37529.6+/-5.3e+02) -store_rewrite_sec:15.090000-16.790000(15.654+/-0.63) +store_rewrite_sec:12.460000-13.280000(12.784+/-0.29) -peer_write_all_sec:66.830000-76.850000(71.976+/-3.6) +peer_write_all_sec:66.200000-68.720000(67.44+/-0.84) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	2135c7a024	gossipd: allow reading from the store during load. When we no longer keep channel_updates in memory, there's a path where we access them on load: when we promote a local channel to an announced channel. This breaks at the moment, since gs->fd == -1; change it to a writable flag instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	aeb72a05e3	gossipd: remove some fields from struct chan. The txout_script field is unused; the local_disable only applies to the handful of local channels, so move that into a hash table. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:39207-45089(41374.6+/-2.2e+03) vsz_kb:1202316 store_rewrite_sec:15.090000-16.790000(15.654+/-0.63) listnodes_sec:1.290000-3.790000(1.938+/-0.93) listchannels_sec:30.190000-32.120000(31.31+/-0.69) routing_sec:28.220000-31.340000(29.314+/-1.2) peer_write_all_sec:66.830000-76.850000(71.976+/-3.6) MCP notable changes from previous patch (>1 stddev): -store_load_msec:35107-37944(36686+/-1e+03) +store_load_msec:39207-45089(41374.6+/-2.2e+03) -vsz_kb:1218036 +vsz_kb:1202316 -listchannels_sec:28.510000-30.270000(29.6+/-0.6) +listchannels_sec:30.190000-32.120000(31.31+/-0.69) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	3280466e19	gossipd: don't keep channel_announcement messages in memory. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35107-37944(36686+/-1e+03) vsz_kb:1218036 store_rewrite_sec:14.060000-17.970000(15.966+/-1.6) listnodes_sec:1.270000-1.350000(1.314+/-0.034) listchannels_sec:28.510000-30.270000(29.6+/-0.6) routing_sec:30.230000-31.510000(30.83+/-0.44) peer_write_all_sec:67.390000-70.710000(68.568+/-1.2) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1780516 +vsz_kb:1218036 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	2fd4a0121f	gossipd: unify is_chan_public / is_chan_announced. We used to have a `struct chan` while we're waiting for an update; now we keep that internally. So a `struct chan` without a channel_announcement in the store is private, and other is public. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	aafc489edb	gossipd: remove info fields from struct node. Reload them from disk if they do listnodes. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35390-38659(37336.4+/-1.3e+03) vsz_kb:1780516 store_rewrite_sec:13.800000-16.800000(15.02+/-0.98) listnodes_sec:1.280000-1.530000(1.382+/-0.096) listchannels_sec:28.700000-30.440000(29.34+/-0.68) routing_sec:30.120000-31.080000(30.526+/-0.35) peer_write_all_sec:65.910000-76.850000(69.462+/-4.1) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1792996 +vsz_kb:1780516 -listnodes_sec:1.030000-1.120000(1.068+/-0.032) +listnodes_sec:1.280000-1.530000(1.382+/-0.096) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0608c36301	gossipd: don't keep node_announcement messages in memory. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34779-38628(36903.4+/-1.4e+03) vsz_kb:1792996 store_rewrite_sec:14.440000-15.040000(14.672+/-0.24) listnodes_sec:1.030000-1.120000(1.068+/-0.032) listchannels_sec:27.860000-32.850000(30.05+/-1.7) routing_sec:30.020000-31.700000(31.044+/-0.56) peer_write_all_sec:65.100000-70.600000(68.422+/-2) -vsz_kb:1780516 +vsz_kb:1792996 -listnodes_sec:1.280000-1.530000(1.382+/-0.096) +listnodes_sec:1.030000-1.120000(1.068+/-0.032) MCP notable changes from previous patch (>1 stddev): -store_load_msec:30640-33236(32202+/-8.7e+02) +store_load_msec:34779-38628(36903.4+/-1.4e+03) -vsz_kb:1812956 +vsz_kb:1792996 -listnodes_sec:0.590000-0.660000(0.62+/-0.033) +listnodes_sec:1.030000-1.120000(1.068+/-0.032) -peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) +peer_write_all_sec:65.100000-70.600000(68.422+/-2) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	cb297b0a1b	gossipd: free tmpctx children in gossip_store_load loop. We're accumulating children, and we'll get more in the successive patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	3ef767fd52	gossipd: don't use cached node_announcement for redundancy checking Re-parse the existing message, since we're going to get rid of those fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	e02f5817fe	gossipd: don't create struct chan for yet-to-be-updated channels. We currently create a struct chan when we receive a `channel_announcement`, but we can only broadcast once we have a `channel_update` (since that provides the timestamp). This means a `struct chan` can be in a weird state where it exists, but is unusable (can't use without an update), and also means we need to keep the channel_announcement message around until an update arrives, so we can put it in the gossip_store. Instead, keep track of these "unupdated" channels separately, and check for them in all the places we search for a specific channel to update. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:30640-33236(32202+/-8.7e+02) vsz_kb:1812956 store_rewrite_sec:13.410000-16.970000(14.438+/-1.3) listnodes_sec:0.590000-0.660000(0.62+/-0.033) listchannels_sec:28.140000-29.560000(28.816+/-0.56) routing_sec:29.530000-32.590000(30.352+/-1.1) peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1812904 +vsz_kb:1812956 -store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) +store_rewrite_sec:13.410000-16.970000(14.438+/-1.3) -listnodes_sec:1.120000-1.230000(1.176+/-0.044) +listnodes_sec:0.590000-0.660000(0.62+/-0.033) -listchannels_sec:38.900000-50.580000(44.716+/-3.9) +listchannels_sec:28.140000-29.560000(28.816+/-0.56) -routing_sec:45.080000-48.160000(46.814+/-1.1) +routing_sec:29.530000-32.590000(30.352+/-1.1) -peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) +peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	d8aee68ba8	gossipd: handle duplicate nodes from unverified channel_announces properly. If we have a channel_announcement, we catch any node_announcement for either end while we validate the channel_announcement. But if we have multiple channel_announcements and the first one failed to verify, it would remove this catch, meaning we'd discard following node_announcements even though there was a pending channel_announcement. The answer is to use a simple reference count, and as a further optimization, only place the `pending_node_announce` if there's no node already. We also move the process_pending_node_announcement() calls lower down, so any new channel creation checks it. This is more robust, and will prove useful for the next patch, where we can use the same mechanism to handle node_announcements on channel_announcements which are verified, but don't yet have a channel_update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	da884751e8	gossipd: make routing_add_channel_update discard old timestamps. This is currently done higher up, in handle_channel_update(), but that's one reason why handle_channel_update() has to do a channel lookup. Moving the check down means handle_channel_update() can do a minimal "get node id for this channel" so it can check the signature. This helps, because the chan lookup semantics are changing in the next few patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	6b9069ee28	broadcast: don't keep payload pointer. If we need the payload, pull it from the gossip store. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:30189-52561(39416.4+/-8.8e+03) vsz_kb:1812904 store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) listnodes_sec:1.120000-1.230000(1.176+/-0.044) listchannels_sec:38.900000-50.580000(44.716+/-3.9) routing_sec:45.080000-48.160000(46.814+/-1.1) peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) MCP notable changes from previous patch (>1 stddev): -vsz_kb:2288784 +vsz_kb:1812904 -store_rewrite_sec:38.060000-39.130000(38.426+/-0.39) +store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) -listnodes_sec:0.750000-0.850000(0.794+/-0.042) +listnodes_sec:1.120000-1.230000(1.176+/-0.044) -listchannels_sec:30.740000-31.760000(31.096+/-0.35) +listchannels_sec:38.900000-50.580000(44.716+/-3.9) -routing_sec:29.600000-33.560000(30.472+/-1.5) +routing_sec:45.080000-48.160000(46.814+/-1.1) -peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) +peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	da845b660b	gossipd: gossip_store_get() to load a single store entry. This will allow us to load on demand, and not keep all messages in memory. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	1f08cfb3e3	gossipd: use file offset within store as broadcast index. Instead of an arbitrary counter, we can use the file offset for our partial ordering, removing a field. It takes some care when we compact the store, however, as this field changes. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34271-35283(34789.6+/-3.3e+02) vsz_kb:2288784 store_rewrite_sec:38.060000-39.130000(38.426+/-0.39) listnodes_sec:0.750000-0.850000(0.794+/-0.042) listchannels_sec:30.740000-31.760000(31.096+/-0.35) routing_sec:29.600000-33.560000(30.472+/-1.5) peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) MCP notable changes from previous patch (>1 stddev): -store_load_msec:35685-38538(37090.4+/-9.1e+02) +store_load_msec:34271-35283(34789.6+/-3.3e+02) -vsz_kb:2288768 +vsz_kb:2288784 -peer_write_all_sec:51.140000-58.350000(55.69+/-2.4) +peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	ec50ec6a71	gossipd: make gossip loading stats accurate. They didn't count the header sizes when reporting bytes, which is misleading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	eb4564c3cd	gossipd: embed broadcast information into each structure. This is more compact, but also required once we replace the arbitrary "index" with an actual offset into the gossip store. That will let us remove the in-memory variants entirely. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35685-38538(37090.4+/-9.1e+02) vsz_kb:2288768 store_rewrite_sec:35.530000-41.230000(37.904+/-2.3) listnodes_sec:0.720000-0.810000(0.762+/-0.041) listchannels_sec:30.750000-35.990000(32.704+/-2) routing_sec:29.570000-34.010000(31.374+/-1.8) peer_write_all_sec:51.140000-58.350000(55.69+/-2.4) MCP notable changes from previous patch (>1 stddev): -vsz_kb:2621808 +vsz_kb:2288768 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	62918fcb3b	gossip_store: avoid gratuitous copy on load. Doesn't make measurable difference, but an obvious optimization. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	617c23e735	gossipd: use u32 for timestamp. We used an s64 so we could use -1 and save a check, but that's just silly as we have adjacent non-u64 fields: wastes 7 bytes per node and 16 per channel. Interestingly, this seemed to make us a little slower for some reason. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35569-38776(37169.8+/-1.2e+03) vsz_kb:2621808 store_rewrite_sec:35.870000-40.290000(38.14+/-1.6) listnodes_sec:0.740000-0.800000(0.768+/-0.023) listchannels_sec:29.820000-32.730000(30.972+/-0.99) routing_sec:30.110000-30.590000(30.346+/-0.18) peer_write_all_sec:52.420000-59.160000(54.692+/-2.5) MCP notable changes from previous patch (>1 stddev): -store_load_msec:32825-36365(34615.6+/-1.1e+03) +store_load_msec:35569-38776(37169.8+/-1.2e+03) -vsz_kb:2637488 +vsz_kb:2621808 -store_rewrite_sec:35.150000-36.200000(35.59+/-0.4) +store_rewrite_sec:35.870000-40.290000(38.14+/-1.6) -listnodes_sec:0.590000-0.710000(0.682+/-0.046) +listnodes_sec:0.740000-0.800000(0.768+/-0.023) -peer_write_all_sec:49.020000-52.890000(50.376+/-1.5) +peer_write_all_sec:52.420000-59.160000(54.692+/-2.5) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0b484b111e	gossipd: make more compact getchannels entries. We can save significant space by combining both sides: so much that we can reduce the WIRE_LEN_LIMIT to something sane again. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34467-36764(35517.8+/-7.7e+02) vsz_kb:2637488 store_rewrite_sec:35.310000-36.580000(35.816+/-0.44) listnodes_sec:1.140000-2.780000(1.596+/-0.6) listchannels_sec:55.390000-58.110000(56.998+/-0.99) routing_sec:30.330000-30.920000(30.642+/-0.19) peer_write_all_sec:50.640000-53.360000(51.822+/-0.91) MCP notable changes from previous patch (>1 stddev): -store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) +store_rewrite_sec:35.310000-36.580000(35.816+/-0.44) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	91849dddc4	wire: use struct node_id for node ids. Don't turn them to/from pubkeys implicitly. This means nodeids in the store don't get converted, but bitcoin keys still do. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:33934-35251(34531.4+/-5e+02) vsz_kb:2637488 store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) listnodes_sec:1.020000-1.290000(1.146+/-0.086) listchannels_sec:51.110000-58.240000(54.826+/-2.5) routing_sec:30.000000-33.320000(30.726+/-1.3) peer_write_all_sec:50.370000-52.970000(51.646+/-1.1) MCP notable changes from previous patch (>1 stddev): -store_load_msec:46184-47474(46673.4+/-4.5e+02) +store_load_msec:33934-35251(34531.4+/-5e+02) -vsz_kb:2638880 +vsz_kb:2637488 -store_rewrite_sec:46.750000-48.280000(47.512+/-0.51) +store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	a2fa699e0e	Use node_id everywhere for nodes. I tried to just do gossipd, but it was uncontainable, so this ended up being a complete sweep. We didn't get much space saving in gossipd, even though we should save 24 bytes per node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	d4ab0592c5	fixup! gossipd: use simple inline array for nodes with few channels. Suggested-by: @cdecker Suggested-by: @niftynei	2019-04-09 12:37:16 -07:00
Rusty Russell	b6494c1994	gossipd: use simple inline array for nodes with few channels. Allocating a htable is overkill for most nodes; we can fit 11 pointers in the same space (10, since we use 1 to indicate we're using an array). MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:45947-47016(46683.4+/-4e+02) vsz_kb:2639240 store_rewrite_sec:46.950000-49.830000(48.048+/-0.95) listnodes_sec:1.090000-1.350000(1.196+/-0.095) listchannels_sec:48.960000-57.640000(53.358+/-2.8) routing_sec:29.990000-33.880000(31.088+/-1.4) peer_write_all_sec:49.360000-53.210000(51.338+/-1.4) MCP notable changes from previous patch (>1 stddev): - vsz_kb:2641316 + vsz_kb:2639240 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	417e1bab7d	gossipd: use iterator helpers for iterating node channels. Makes the next step easier. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:45791-46917(46330.4+/-3.6e+02) vsz_kb:2641316 store_rewrite_sec:47.040000-48.720000(47.684+/-0.57) listnodes_sec:1.140000-1.340000(1.2+/-0.072) listchannels_sec:50.970000-54.250000(52.698+/-1.3) routing_sec:29.950000-31.010000(30.332+/-0.37) peer_write_all_sec:51.570000-52.970000(52.1+/-0.54) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	891ee20a59	tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Outputs CSV. We add some stats for load times in developer mode, so we can easily read them out. peer_read_all_sec doesn't work, since we seem to reject about half the updates for having bad signatures. It's also very slow... routing fails, for unknown reasons, so that failure is ignored in routing_sec. Results from 5 runs, min-max(mean +/- stddev): store_load_msec,vsz_kb,store_rewrite_sec,listnodes_sec,listchannels_sec,routing_sec,peer_write_all_sec 39275-44779(40466.8+/-2.2e+03),2899248,41.010000-44.970000(41.972+/-1.5),2.280000-2.350000(2.304+/-0.025),49.770000-63.390000(59.178+/-5),33.310000-34.260000(33.62+/-0.35),42.100000-44.080000(43.082+/-0.67) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-2.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Suggested-by: @niftynei Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-1.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project MCP filename change. Header from folded patch 'tools-bench-gossipd.sh__dont_print_csv_by_default.patch': tools/bench-gossipd.sh: don't print CSV by default. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Make shellcheck happy. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	2bd7df93c6	gossipd: preserve unannounced channels across store compaction. Otherwise we'd forget them on restart, again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	c424c42668	gossipd: store local channel updates across restart, even if unannounced. Either private or simply not enough confirms. They would have been added on reconnect, but that's not ideal. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	7c8f506a0f	dev-compact-store-gossip: specific RPC so we can test gossip_store rewrite. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00

997 Commits