core-lightning

mirror of https://github.com/ElementsProject/lightning.git synced 2024-11-19 18:11:28 +01:00

Author	SHA1	Message	Date
Rusty Russell	6830233d0b	gossipd: control gossip level so we don't get flooded by peers. We seek a certain number of peers at each level of gossip; 3 "flood" if we're missing gossip, 2 at 24 hours past to catch recent gossip, and 8 with current gossip. The rest are given a filter which causes them not to gossip to us at all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	f5ea57d4c0	gossipd: reset gossip_missing if no reports for 10 minutes. An arbitrary timeout. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	b9053767e7	gossipd: query unknown short_channel_ids, note if they were really missing. The first sign that we're missing gossip is that we get a channel_update for an unknown channel. The peer might be wrong (or lying), but if it turns out to be a real channel, we were definitely missing something. This patch does two things: queries when we get an unknown channel_update, and then notes that a channel_announcement was from such an update when it's finally processed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	18069ab3da	gossipd: APIs return more information about routing message handling. In particular, we'll need to know the short_channel_id if a channel_update is unknown (implies we're missing a channel), and whether processing a pending channel_announcement was successful (implies that the channel was real). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	5ef7aa70d2	gossipd: prepare for internally-generated short-channel-id queries. Up until now we only generated these in dev mode for testing. Hoist into common code, turn counter into a flag (we're only allowed one!) and note if query is internal or not. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	21c920a8e8	gossipd: note if loaded store seems reasonably up-to-date. If not, we can ask peers for full gossip (for now we just set a flag). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	0d2a4830ed	ccan: update to faster and correct crc32c implementation. I decided to try a faster implementation, only to find our crc32c was not correct! Ouch. I removed the crc32c functions from ccan/crc, and added a new crc32c module which has the Mark Adler x86-64-optimized variants. We bump gossip_store version again, since csums have changed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:40:10 +00:00
Rusty Russell	ab31f40aa2	gossipd: don't charge ourselves fees when calculating route. This means there's now a semantic difference between the default `fromid` and setting `fromid` explicitly to our own node_id. In the default case, it means we don't charge ourselves fees on the route. This means we can spend the full channel balance. We still want to consider the pricing of local channels, however: there's a reason to discount one over another, and that is to bias things. So we add the first-hop fee to the risk value instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	b48c644e7a	listchannels: add `htlc_minimum_msat` and `htlc_maximum_msat` fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	1a3886c116	wallet: keep a list of unreleased transactions. We're going to use this in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-06 04:47:44 +00:00
Rusty Russell	628b65fb40	gossip_store: don't leave dangling channel_announce if we truncate. (Or, if we crashed before we got to write out the channel_update). It's a corner case, but one reported by @darosior and reproduced on my test node (both with bad gossip_store due to previous iterations of this patchset!). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	f8b98e032c	gossipd: Don't abort() on duplicate entries in gossip_store. Triggered by a previous variant of this PR, but a good idea to simply discard the store in general when we get a duplicate entry. We crash trying to delete old ones, which means writing to the store. But they should have already been deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	34c113a17a	gossipd: trivial clean up of routing_add_channel_update. For some reason I was reluctant to use the hc local variable; I even re-declared it! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	3e733afb2b	gossipd: remove broadcast map altogether. This clarifies things a fair bit: we simply add and remove from the gossip_store directly. Before this series: (--disable-developer, -Og) store_load_msec:20669-20902(20822.2+/-82) vsz_kb:439704-439712(439706+/-3.2) listnodes_sec:0.890000-1.000000(0.92+/-0.04) listchannels_sec:11.960000-13.380000(12.576+/-0.49) routing_sec:3.070000-5.970000(4.814+/-1.2) peer_write_all_sec:28.490000-30.580000(29.532+/-0.78) After: (--disable-developer, -Og) store_load_msec:19722-20124(19921.6+/-1.4e+02) vsz_kb:288320 listnodes_sec:0.860000-0.980000(0.912+/-0.056) listchannels_sec:10.790000-12.260000(11.65+/-0.5) routing_sec:2.540000-4.950000(4.262+/-0.88) peer_write_all_sec:17.570000-19.500000(18.048+/-0.73) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	dd83453b2f	gossipd/gossip_store: fix compacting, don't use broadcast ordering. We have a problem: if we get halfway through writing the compacted store and run out of disk space, we've already changed half the indexes. This changes it so we do nothing until writing is finished: then we iterate through and update indexes. It also weans us off broadcast ordering, which we can now eliminate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5161b79bfc	gossipd/gossip_store: keep count of deleted entries, don't use bs->count. We didn't count some records before, so we could compare the two counters. This is much simpler, and avoids reliance on bs. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	728bb4e662	common/gossip_store: handle timestamp filtering. This means we intercept the peer's gossip_timestamp_filter request in the per-peer subdaemon itself. The rest of the semantics are fairly simple however. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	948490ec58	gossipd: add timestamp in gossip store header. (We don't increment the gossip_store version, since there are only a few commits since the last time we did this). This lets the reader simply filter messages; this is especially nice since the channel_announcement timestamp is derived, not in the actual message. This also creates a 'struct gossip_hdr' which makes the code a bit clearer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	bad9734dc7	gossip_store: remove redundant copy_message. The single caller can easily use transfer_store_msg instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5591c0b5d8	gossipd: don't send gossip stream, let per-peer daemons read it themselves. Keeping the uintmap ordering all the broadcastable messages is expensive: 130MB for the million-channels project. But now we delete obsolete entries from the store, we can have the per-peer daemons simply read that sequentially and stream the gossip itself. This is the most primitive version, where all gossip is streamed; successive patches will bring back proper handling of timestamp filtering and initial_routing_sync. We add a gossip_state field to track what's happening with our gossip streaming: it's initialized in gossipd, and currently always set, but once we handle timestamps the per-peer daemon may do it when the first filter is sent. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	4399faf57c	gossipd: make writes to gossip_store atomic. There's a corner case where otherwise a reader could see the header and not the body of a message. It could handle that in various ways, but simplest (and most efficient) is to avoid it happening. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	a5f6ef385a	gossipd: don't wrap messages when we send them to the peer. They already send us gossip messages, so they have to be distinct anyway. Why make us both do extra work? Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	df00f20e4a	gossipd: erase old entries from the store, don't just append. We use the high bit of the length field: this way we can still check that the checksums are valid on deleted fields. Once this is done, serially reading the gossip_store file will result in a complete, ordered, minimal gossip broadcast. Also, the horrible corner case where we might try to delete things from the store during load time is completely gone: we only load non-deleted things. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	696dc6b597	gossipd: disable gossip_store upgrade. We're about to bump version again, and the code to upgrade it was quite hairy (and buggy!). It's not worthwhile for such a poorly-tested path: I will just add code to limit how much incoming gossip we get to avoid flooding when we upgrade, however. I also use a modern gossip_store version in our test_gossip_store_load test, instead of relying on the upgrade path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	43f2cbd250	gossipd: track gossip_store locations of local channels. We currently don't care, but the next patch means we have to find them again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	180a552fba	gossip_store: mark private updates separately from normal ones. They're really gossipd-internal, and we don't want per-peer daemons to confuse them with normal updates. I don't bump the gossip_store version; that's coming with another update anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	763697eb4c	gossipd: fix gossip_store calling delete. Now we handle node_announcements properly, we have a failure case where we try to move them when a channel is deleted while loading the store. We're going to remove this soon, in favor of in-place delete, so workaround this for now to avoid an assert() when we try to write to the store while loading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	21fe518513	gossip_store: fix 'bad node_announcement' by allowing node_announcement on un-updated channel. When we first receive a channel_update, we write both the channel_announcement and that channel_update to the store: we need that first update so we can set the channel_announcement timestamp. However, the channel_update can be replaced later. This means we can have a channel_announcement, a node_update which relies on it, then the channel_update later. So move the "this applies to a pending announcement" check lower, where gossip_store can use it too. Has a nice side-effect of avoiding one lookup of the node id. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	c233fc5063	gossipd: fix spurious unused error with gcc-9 -O3. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
Rusty Russell	c091a4ee40	gossipd: fix spurious gcc warning. It turns out that we don't look at type when we return 0, but gcc isn't quite smart enough for that. Initializing to -1 is good practice anyway for the failure path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
William Casarin	3f035cb3cc	gossipd: fix uninitialized free on short_route in goto path. Fix a path where tal_free is called on an uninitialized variable. If the first `goto bad_total` executes, then that path has uninitialized `short_route` but bad_total passes through to `out` whose first call is tal_free(short_route). This was noticed by a maybe-uninitialized heuristic on gcc 7.4.0: gossipd/routing.c: In function ‘find_shorter_route’: gossipd/routing.c:1096:2: error: ‘short_route’ may be used uninitialized in this function [-Werror=maybe-uninitialized] tal_free(short_route); Reported-by: @ZmnSCPxj <https://github.com/ElementsProject/lightning/pull/2674#issuecomment-495617253> Signed-off-by: William Casarin <jb55@jb55.com>	2019-06-03 00:07:11 +00:00
Rusty Russell	654e89b5fc	gossipd: free channels in routing_state destructor. Cleans up the tests. Suggested-by: @ZmnSCPxj Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	d1f43d993a	gossipd: use explicit destructor for struct chan. Each destructor2 costs 40 bytes, and struct chan is only 120 bytes. So this drops our memory usage quite a bit: MCP bench results change: -vsz_kb:580004-580016(580006+/-4.8) +vsz_kb:533148 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	59e75f1b2c	gossipd: reply to large listchannels in parts. This has two effects: most importantly, it avoids the problem where lightningd creates a 800MB JSON blob in response to listchannels, which causes OOM on the Raspberry Pi (our previous max allocation was 832MB). This is because lightning-cli can start draining the JSON while we're filling the buffer, so we end up with a max allocation of 68MB. But despite being less efficient (multiple queries to gossipd), it actually speeds things up due to the parallelism: MCP with -O3 -flto before vs after: -listchannels_sec:8.980000-9.330000(9.206+/-0.14) +listchannels_sec:7.500000-7.830000(7.656+/-0.11) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	cb9c44ef27	gossipd: remove unnecessary dev_unknown_channel_satoshis arg. We now have a test blockchain for MCP which has the correct channels, so this is not needed. Also fix a benchmark script bug where 'mv "$DIR"/log "$DIR"/log.old.$$' would fail if your log didn't exist from a previous run. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	85d8848ede	gossipd: neaten insert_broadcast a little. Suggested-by: @cdecker. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
darosior	d9db9dc1ae	gossipd: fix listnodes crash on non-existent id. 'node_arr' was not instantiated if an id was passed to listnodes and we could not get a node from it.	2019-05-16 19:30:10 +02:00
Rusty Russell	f5a218f9d1	gossipd: send per-peer daemons offsets into gossip store. Instead of reading the store ourselves, we can just send them an offset. This saves gossipd a lot of work, putting it where it belongs (in the daemon responsible for the specific peer). MCP bench results: store_load_msec:28509-31001(29206.6+/-9.4e+02) vsz_kb:580004-580016(580006+/-4.8) store_rewrite_sec:11.640000-12.730000(11.908+/-0.41) listnodes_sec:1.790000-1.880000(1.83+/-0.032) listchannels_sec:21.180000-21.950000(21.476+/-0.27) routing_sec:2.210000-11.160000(7.126+/-3.1) peer_write_all_sec:36.270000-41.200000(38.168+/-1.9) Significant savings in streaming gossip: -peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) +peer_write_all_sec:35.780000-37.980000(36.43+/-0.81) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	0e37ac2433	common: move gossip_store read routine where subdaemons can access it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	d8db4e871f	gossipd: provide new fd to per-peer daemons when we compact it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	13717c6ebb	gossipd: hand a gossip_store_fd to all subdaemons. This will let them read from the gossip store directly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	89291b930e	gossipd: pass amount into gossip_store, rather than having it fetch. We need to store the channel capacity for channel_announcement: hand it in directly rather than having the gossip_store code do a lookup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	7ede5aac31	gossip_store: change format so we store raw messages. Save some overhead, plus gets us ready for giving subdaemons direct store access. This is the first time we upgrade the gossip_store, rather than just discarding. The downside is that we need to add an extra message after each channel_announcement, containing the channel capacity. After: store_load_msec:28337-30288(28975+/-7.4e+02) vsz_kb:582304-582316(582306+/-4.8) store_rewrite_sec:11.240000-11.800000(11.55+/-0.21) listnodes_sec:1.800000-1.880000(1.84+/-0.028) listchannels_sec:22.690000-26.260000(23.878+/-1.3) routing_sec:2.280000-9.570000(6.842+/-2.8) peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Differences: -vsz_kb:582320 +vsz_kb:582316 -listnodes_sec:2.100000-2.170000(2.118+/-0.026) +listnodes_sec:1.800000-1.880000(1.84+/-0.028) -peer_write_all_sec:51.600000-52.550000(52.188+/-0.34) +peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	c7034f271a	gossipd: avoid tal overhead in listnodes. We know exactly how many there will be, so allocate an entire array up-front. -listnodes_sec:2.540000-2.610000(2.584+/-0.029) +listnodes_sec:2.100000-2.170000(2.118+/-0.026) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
trueptolemy	fefe7dfbab	Gossipd: cleanup extra repeated code	2019-05-06 08:52:36 +00:00
Rusty Russell	0ca0db765a	gossipd: fix crash if we truncate store. Entries we've already loaded expect to exist in the store. We could go back and remove them all, but instead just truncate at the known-good point. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-01 11:59:12 +02:00
Rusty Russell	b248bb155a	tools/bench-gossipd.sh: make it work (where possible) with DEVELOPER=0. Some tests require dev support, but the rest can run. We simplify the gossip_store output so it's the same in non-dev mode too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-24 13:46:39 -05:00
Rusty Russell	0fc42415c2	gossipd/routing: remove BFG implementation. Now we can benchmark, and remove 500 bytes per node. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35093-37907(36146+/-1.1e+03) vsz_kb:555168 store_rewrite_sec:12.120000-13.750000(12.7+/-0.6) listnodes_sec:1.270000-1.370000(1.322+/-0.039) listchannels_sec:29.770000-31.600000(30.82+/-0.64) routing_sec:0.00 peer_write_all_sec:63.630000-67.850000(65.432+/-1.7) MCP notable changes from pre-Dijkstra (>1 stddev): -vsz_kb:577456 +vsz_kb:555168 -routing_sec:60.70 +routing_sec:12.04 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	cfdb012b30	gossipd: re-add fuzz logic to routing. Do it inside the can_reach() function, which is less optimal for BFG which does 20 ops on the same channel, but fine for Dijkstra. This does have a measurable cost, so we might want to use non-cryptographic fuzz in future: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 113381 msec (1133813412 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	e197956032	gossipd/routing: Iterate on Dijkstra when route is too long. If a route is too long, we try to bias Dijkstra towards choosing a shorter route by adding a per-hop cost. We do a naive "shortest path" pass, then using that cost as a ceiling on per-hop cost, we do a binary search. There are some subtleties: we use risk rather than total as our counter field (we normally bias this by 1 anyway, so it's easy to make that a variable), and we set riskfactor to a minimal value once we're iterating. It's good enough to get a solution, we don't need to do a 2-dimensional search on riskfactor and riskbias. Of course, this is extremely slow if we hit it on our benchmark, though it doesn't happen in a more realistic network: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (79 succeeded) routes in 100000 nodes in 25341 msec (253412314 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
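
Several of the gossip_store commits listed above (df00f20e4a, 948490ec58, 728bb4e662) describe a record header that carries a length whose high bit marks deletion, a checksum, and a timestamp that readers use for filtering. The following is a minimal sketch of that idea in C, not the project's actual code: the names `store_hdr`, `STORE_LEN_DELETED_BIT`, and the helper functions are hypothetical, and the real on-disk layout and byte order may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed name: the high bit of the length field flags a deleted record. */
#define STORE_LEN_DELETED_BIT 0x80000000U

/* Hypothetical in-memory header; the real store's layout may differ. */
struct store_hdr {
	uint32_t len;       /* message length; top bit is the deleted flag */
	uint32_t crc;       /* checksum over the message body */
	uint32_t timestamp; /* consulted for timestamp filtering */
};

/* Length of the message body, with the flag bit masked off. */
static inline uint32_t store_msg_len(const struct store_hdr *hdr)
{
	return hdr->len & ~STORE_LEN_DELETED_BIT;
}

static inline bool store_is_deleted(const struct store_hdr *hdr)
{
	return (hdr->len & STORE_LEN_DELETED_BIT) != 0;
}

/* Mark a record deleted in place. The length and crc are untouched,
 * so the checksum of a deleted record can still be verified. */
static inline void store_mark_deleted(struct store_hdr *hdr)
{
	assert(!store_is_deleted(hdr));
	hdr->len |= STORE_LEN_DELETED_BIT;
}

/* A reader streams a record only if it is live and its timestamp lies
 * in the window [first, first + range). */
static inline bool store_should_stream(const struct store_hdr *hdr,
				       uint32_t first, uint32_t range)
{
	if (store_is_deleted(hdr))
		return false;
	/* Subtracting after the >= check avoids overflow in first + range. */
	return hdr->timestamp >= first && hdr->timestamp - first < range;
}
```

Because deletion only flips one bit, a sequential reader can skip dead records by masking the length and seeking past the body, which is what makes a serial read of the file usable as a gossip stream in this sketch.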

636 Commits