core-lightning

mirror of https://github.com/ElementsProject/lightning.git synced 2024-12-29 10:04:41 +01:00

Author	SHA1	Message	Date
Rusty Russell	8928f0b5f9	gossipd: remove gossip entirely if we hit a problem on load. The crashes in #2750 are mostly caused by us trying to partially truncate the store. The simplest fix for release is to discard the whole thing if we detect a problem. This is a workaround: it'd be far nicer to try to recover. Fixes: #2750 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	8ce3b86aa5	gossipd: tighter correctness checks during gossip_store load. We shouldn't be loading old timestamps, either. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	fc27250f80	gossipd: be more verbose and less assert()ive on bad node_announcement. We hit the timestamp assert on #2750; it shouldn't happen, but crashing doesn't leave much information. Reported-by: @m-schmook Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 22:03:35 +00:00
Rusty Russell	f1b57063f7	bitcoin/tx: use fromwire_fail in pull_bitcoin_tx. This is the correct way to mark failure: it also sets *max to 0. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-21 03:56:59 +00:00
Rusty Russell	47b5f2e837	gossipd: truncate gossip_store.tmp for compaction. If something went wrong and there was an old one, we were appending to it! Reported-by: @SimonVrouwe Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-20 02:53:52 +00:00
Rusty Russell	5e3690b3c5	gossipd: delete channel_amount from the store when we delete channel_announcement. Otherwise we slowly build up cruft: compaction simply moves them since they're not deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-15 10:52:05 +02:00
Rusty Russell	10c503b4b4	gossip_store: clean up a truncated store. We might have channel_announcements which have no channel_update: normally these don't get written into the store until there is one, but if the store was truncated it can happen. We then get upset on compaction, since we don't have an in-memory representation of the channel_announcement. Similarly, we leave the node_announcement pending until after that channel_announcement, leading to a similar case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-15 10:52:05 +02:00
Rusty Russell	24cc371cdf	gossipd: gossip_store errors after rewrite are fatal. We can't continue, since we've moved the indexes. We'll just crash anyway, as seen from bugs #2742 and #2743. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-14 02:17:32 +00:00
Rusty Russell	eb5cc47bdd	gossipd: count deleted records correctly when loading gossip_store. The result of an incorrect count was that we failed on next compaction. Fixes: #2743 Fixes: #2742 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-14 02:17:32 +00:00
Rusty Russell	745634d9b9	gossipd: don't catch pending node_announcements more than once. We catch node_announcements for nodes where we haven't finished analyzing the channel_announcement yet (either because we're still checking UTXO, or in this case, because we're waiting for a channel_update). But we reference count the pending_node_announce, so if we have multiple channels pending, we might try to insert it twice. Clear it so this doesn't happen. There's a second bug where we continue to catch node_announcements until all the channel_announcements are no longer pending; this is fixed by removing it from the map. Fixes: #2735 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-13 05:58:09 +00:00
Rusty Russell	1e32b4ab29	gossipd: adjust gossip filters if we discover we're missing gossip. We pick up to three random peers and ask them to gossip more. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	6830233d0b	gossipd: control gossip level so we don't get flooded by peers. We seek a certain number of peers at each level of gossip; 3 "flood" if we're missing gossip, 2 at 24 hours past to catch recent gossip, and 8 with current gossip. The rest are given a filter which causes them not to gossip to us at all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	f5ea57d4c0	gossipd: reset gossip_missing if no reports for 10 minutes. An arbitrary timeout. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	b9053767e7	gossipd: query unknown short_channel_ids, note if they were really missing. The first sign that we're missing gossip is that we get a channel_update for an unknown channel. The peer might be wrong (or lying), but if it turns out to be a real channel, we were definitely missing something. This patch does two things: queries when we get an unknown channel_update, and then notes that a channel_announcement was from such an update when it's finally processed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	18069ab3da	gossipd: APIs return more information about routing message handling. In particular, we'll need to know the short_channel_id if a channel_update is unknown (implies we're missing a channel), and whether processing a pending channel_announcement was successful (implies that the channel was real). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	5ef7aa70d2	gossipd: prepare for internally-generated short-channel-id queries. Up until now we only generated these in dev mode for testing. Hoist into common code, turn counter into a flag (we're only allowed one!) and note if query is internal or not. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	21c920a8e8	gossipd: note if loaded store seems reasonably up-to-date. If not, we can ask peers for full gossip (for now we just set a flag). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-12 00:37:46 +00:00
Rusty Russell	0d2a4830ed	ccan: update to faster and correct crc32c implementation. I decided to try a faster implementation, only to find our crc32c was not correct! Ouch. I removed the crc32c functions from ccan/crc, and added a new crc32c module which has the Mark Adler x86-64-optimized variants. We bump gossip_store version again, since csums have changed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:40:10 +00:00
Rusty Russell	ab31f40aa2	gossipd: don't charge ourselves fees when calculating route. This means there's now a semantic difference between the default `fromid` and setting `fromid` explicitly to our own node_id. In the default case, it means we don't charge ourselves fees on the route. This means we can spend the full channel balance. We still want to consider the pricing of local channels, however: there's a reason to discount one over another, and that is to bias things. So we add the first-hop fee to the risk value instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	b48c644e7a	listchannels: add `htlc_minimum_msat` and `htlc_maximum_msat` fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-11 23:19:11 +00:00
Rusty Russell	1a3886c116	wallet: keep a list of unreleased transactions. We're going to use this in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-06 04:47:44 +00:00
Rusty Russell	628b65fb40	gossip_store: don't leave dangling channel_announce if we truncate. (Or, if we crashed before we got to write out the channel_update). It's a corner case, but one reported by @darosior and reproduced on my test node (both with bad gossip_store due to previous iterations of this patchset!). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	f8b98e032c	gossipd: Don't abort() on duplicate entries in gossip_store. Triggered by a previous variant of this PR, but a good idea to simply discard the store in general when we get a duplicate entry. We crash trying to delete old ones, which means writing to the store. But they should have already been deleted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	34c113a17a	gossipd: trivial clean up of routing_add_channel_update. For some reason I was reluctant to use the hc local variable; I even re-declared it! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	3e733afb2b	gossipd: remove broadcast map altogether. This clarifies things a fair bit: we simply add and remove from the gossip_store directly. Before this series: (--disable-developer, -Og) store_load_msec:20669-20902(20822.2+/-82) vsz_kb:439704-439712(439706+/-3.2) listnodes_sec:0.890000-1.000000(0.92+/-0.04) listchannels_sec:11.960000-13.380000(12.576+/-0.49) routing_sec:3.070000-5.970000(4.814+/-1.2) peer_write_all_sec:28.490000-30.580000(29.532+/-0.78) After: (--disable-developer, -Og) store_load_msec:19722-20124(19921.6+/-1.4e+02) vsz_kb:288320 listnodes_sec:0.860000-0.980000(0.912+/-0.056) listchannels_sec:10.790000-12.260000(11.65+/-0.5) routing_sec:2.540000-4.950000(4.262+/-0.88) peer_write_all_sec:17.570000-19.500000(18.048+/-0.73) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	dd83453b2f	gossipd/gossip_store: fix compacting, don't use broadcast ordering. We have a problem: if we get halfway through writing the compacted store and run out of disk space, we've already changed half the indexes. This changes it so we do nothing until writing is finished: then we iterate through and update indexes. It also weans us off broadcast ordering, which we can now eliminate. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5161b79bfc	gossipd/gossip_store: keep count of deleted entries, don't use bs->count. We didn't count some records before, so we could compare the two counters. This is much simpler, and avoids reliance on bs. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	728bb4e662	common/gossip_store: handle timestamp filtering. This means we intercept the peer's gossip_timestamp_filter request in the per-peer subdaemon itself. The rest of the semantics are fairly simple however. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	948490ec58	gossipd: add timestamp in gossip store header. (We don't increment the gossip_store version, since there are only a few commits since the last time we did this). This lets the reader simply filter messages; this is especially nice since the channel_announcement timestamp is derived, not in the actual message. This also creates a 'struct gossip_hdr' which makes the code a bit clearer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	bad9734dc7	gossip_store: remove redundant copy_message. The single caller can easily use transfer_store_msg instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	5591c0b5d8	gossipd: don't send gossip stream, let per-peer daemons read it themselves. Keeping the uintmap ordering all the broadcastable messages is expensive: 130MB for the million-channels project. But now we delete obsolete entries from the store, we can have the per-peer daemons simply read that sequentially and stream the gossip itself. This is the most primitive version, where all gossip is streamed; successive patches will bring back proper handling of timestamp filtering and initial_routing_sync. We add a gossip_state field to track what's happening with our gossip streaming: it's initialized in gossipd, and currently always set, but once we handle timestamps the per-peer daemon may do it when the first filter is sent. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	4399faf57c	gossipd: make writes to gossip_store atomic. There's a corner case where otherwise a reader could see the header and not the body of a message. It could handle that in various ways, but simplest (and most efficient) is to avoid it happening. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	a5f6ef385a	gossipd: don't wrap messages when we send them to the peer. They already send us gossip messages, so they have to be distinct anyway. Why make us both do extra work? Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	df00f20e4a	gossipd: erase old entries from the store, don't just append. We use the high bit of the length field: this way we can still check that the checksums are valid on deleted fields. Once this is done, serially reading the gossip_store file will result in a complete, ordered, minimal gossip broadcast. Also, the horrible corner case where we might try to delete things from the store during load time is completely gone: we only load non-deleted things. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	696dc6b597	gossipd: disable gossip_store upgrade. We're about to bump version again, and the code to upgrade it was quite hairy (and buggy!). It's not worthwhile for such a poorly-tested path: I will just add code to limit how much incoming gossip we get to avoid flooding when we upgrade, however. I also use a modern gossip_store version in our test_gossip_store_load test, instead of relying on the upgrade path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	43f2cbd250	gossipd: track gossip_store locations of local channels. We currently don't care, but the next patch means we have to find them again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	180a552fba	gossip_store: mark private updates separately from normal ones. They're really gossipd-internal, and we don't want per-peer daemons to confuse them with normal updates. I don't bump the gossip_store version; that's coming with another update anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-04 01:29:39 +00:00
Rusty Russell	763697eb4c	gossipd: fix gossip_store calling delete. Now we handle node_announcements properly, we have a failure case where we try to move them when a channel is deleted while loading the store. We're going to remove this soon, in favor of in-place delete, so workaround this for now to avoid an assert() when we try to write to the store while loading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	21fe518513	gossip_store: fix 'bad node_announcement' by allowing node_announcement on un-updated channel. When we first receive a channel_update, we write both the channel_announcement and that channel_update to the store: we need that first update so we can set the channel_announcement timestamp. However, the channel_update can be replaced later. This means we can have a channel_announcement, a node_update which relies on it, then the channel_update later. So move the "this applies to a pending announcement" check lower, where gossip_store can use it too. Has a nice side-effect of avoiding one lookup of the node id. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 11:04:25 -07:00
Rusty Russell	c233fc5063	gossipd: fix spurious unused error with gcc-9 -O3. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
Rusty Russell	c091a4ee40	gossipd: fix spurious gcc warning. It turns out that we don't look at type when we return 0, but gcc isn't quite smart enough for that. Initializing to -1 is good practice anyway for the failure path. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-06-03 00:07:11 +00:00
William Casarin	3f035cb3cc	gossipd: fix uninitialized free on short_route in goto path. Fix a path where tal_free is called on an uninitialized variable. If the first `goto bad_total` executes, then that path has uninitialized `short_route`, but bad_total passes through to `out`, whose first call is tal_free(short_route). This was noticed by a maybe-uninitialized heuristic on gcc 7.4.0: gossipd/routing.c: In function ‘find_shorter_route’: gossipd/routing.c:1096:2: error: ‘short_route’ may be used uninitialized in this function [-Werror=maybe-uninitialized] tal_free(short_route); Reported-by: @ZmnSCPxj <https://github.com/ElementsProject/lightning/pull/2674#issuecomment-495617253> Signed-off-by: William Casarin <jb55@jb55.com>	2019-06-03 00:07:11 +00:00
Rusty Russell	654e89b5fc	gossipd: free channels in routing_state destructor. Cleans up the tests. Suggested-by: @ZmnSCPxj Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	d1f43d993a	gossipd: use explicit destructor for struct chan. Each destructor2 costs 40 bytes, and struct chan is only 120 bytes. So this drops our memory usage quite a bit: MCP bench results change: -vsz_kb:580004-580016(580006+/-4.8) +vsz_kb:533148 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	59e75f1b2c	gossipd: reply to large listchannels in parts. This has two effects: most importantly, it avoids the problem where lightningd creates a 800MB JSON blob in response to listchannels, which causes OOM on the Raspberry Pi (our previous max allocation was 832MB). This is because lightning-cli can start draining the JSON while we're filling the buffer, so we end up with a max allocation of 68MB. But despite being less efficient (multiple queries to gossipd), it actually speeds things up due to the parallelism: MCP with -O3 -flto before vs after: -listchannels_sec:8.980000-9.330000(9.206+/-0.14) +listchannels_sec:7.500000-7.830000(7.656+/-0.11) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	cb9c44ef27	gossipd: remove unnecessary dev_unknown_channel_satoshis arg. We now have a test blockchain for MCP which has the correct channels, so this is not needed. Also fix a benchmark script bug where 'mv "$DIR"/log "$DIR"/log.old.$$' would fail if your log didn't exist from a previous run. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
Rusty Russell	85d8848ede	gossipd: neaten insert_broadcast a little. Suggested-by: @cdecker. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-22 11:28:44 +00:00
darosior	d9db9dc1ae	gossipd: fix listnodes crash on non-existent id. 'node_arr' was not instantiated if an id was passed to listnodes and we could not get a node from it.	2019-05-16 19:30:10 +02:00
Rusty Russell	f5a218f9d1	gossipd: send per-peer daemons offsets into gossip store. Instead of reading the store ourselves, we can just send them an offset. This saves gossipd a lot of work, putting it where it belongs (in the daemon responsible for the specific peer). MCP bench results: store_load_msec:28509-31001(29206.6+/-9.4e+02) vsz_kb:580004-580016(580006+/-4.8) store_rewrite_sec:11.640000-12.730000(11.908+/-0.41) listnodes_sec:1.790000-1.880000(1.83+/-0.032) listchannels_sec:21.180000-21.950000(21.476+/-0.27) routing_sec:2.210000-11.160000(7.126+/-3.1) peer_write_all_sec:36.270000-41.200000(38.168+/-1.9) Significant savings in streaming gossip: -peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) +peer_write_all_sec:35.780000-37.980000(36.43+/-0.81) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	0e37ac2433	common: move gossip_store read routine where subdaemons can access it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	d8db4e871f	gossipd: provide new fd to per-peer daemons when we compact it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	13717c6ebb	gossipd: hand a gossip_store_fd to all subdaemons. This will let them read from the gossip store directly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	89291b930e	gossipd: pass amount into gossip_store, rather than having it fetch. We need to store the channel capacity for channel_announcement: hand it in directly rather than having the gossip_store code do a lookup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	7ede5aac31	gossip_store: change format so we store raw messages. Save some overhead, plus gets us ready for giving subdaemons direct store access. This is the first time we upgrade the gossip_store, rather than just discarding. The downside is that we need to add an extra message after each channel_announcement, containing the channel capacity. After: store_load_msec:28337-30288(28975+/-7.4e+02) vsz_kb:582304-582316(582306+/-4.8) store_rewrite_sec:11.240000-11.800000(11.55+/-0.21) listnodes_sec:1.800000-1.880000(1.84+/-0.028) listchannels_sec:22.690000-26.260000(23.878+/-1.3) routing_sec:2.280000-9.570000(6.842+/-2.8) peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Differences: -vsz_kb:582320 +vsz_kb:582316 -listnodes_sec:2.100000-2.170000(2.118+/-0.026) +listnodes_sec:1.800000-1.880000(1.84+/-0.028) -peer_write_all_sec:51.600000-52.550000(52.188+/-0.34) +peer_write_all_sec:48.160000-51.480000(49.608+/-1.1) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
Rusty Russell	c7034f271a	gossipd: avoid tal overhead in listnodes. We know exactly how many there will be, so allocate an entire array up-front. -listnodes_sec:2.540000-2.610000(2.584+/-0.029) +listnodes_sec:2.100000-2.170000(2.118+/-0.026) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-13 05:16:18 +00:00
trueptolemy	fefe7dfbab	Gossipd: cleanup extra repeated code	2019-05-06 08:52:36 +00:00
Rusty Russell	0ca0db765a	gossipd: fix crash if we truncate store. Entries we've already loaded expect to exist in the store. We could go back and remove them all, but instead just truncate at the known-good point. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-05-01 11:59:12 +02:00
Rusty Russell	b248bb155a	tools/bench-gossipd.sh: make it work (where possible) with DEVELOPER=0. Some tests require dev support, but the rest can run. We simplify the gossip_store output so it's the same in non-dev mode too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-24 13:46:39 -05:00
Rusty Russell	0fc42415c2	gossipd/routing: remove BFG implementation. Now we can benchmark, and remove 500 bytes per node. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35093-37907(36146+/-1.1e+03) vsz_kb:555168 store_rewrite_sec:12.120000-13.750000(12.7+/-0.6) listnodes_sec:1.270000-1.370000(1.322+/-0.039) listchannels_sec:29.770000-31.600000(30.82+/-0.64) routing_sec:0.00 peer_write_all_sec:63.630000-67.850000(65.432+/-1.7) MCP notable changes from pre-Dijkstra (>1 stddev): -vsz_kb:577456 +vsz_kb:555168 -routing_sec:60.70 +routing_sec:12.04 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	cfdb012b30	gossipd: re-add fuzz logic to routing. Do it inside the can_reach() function, which is less optimal for BFG which does 20 ops on the same channel, but fine for Dijkstra. This does have a measurable cost, so we might want to use non-cryptographic fuzz in future: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 113381 msec (1133813412 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	e197956032	gossipd/routing: Iterate on Dijkstra when route is too long. If a route is too long, we try to bias Dijkstra towards choosing a shorter route by adding a per-hop cost. We do a naive "shortest path" pass, then using that cost as a ceiling on per-hop cost, we do a binary search. There are some subtleties: we use risk rather than total as our counter field (we normally bias this by 1 anyway, so it's easy to make that a variable), and we set riskfactor to a minimal value once we're iterating. It's good enough to get a solution, we don't need to do a 2-dimensional search on riskfactor and riskbias. Of course, this is extremely slow if we hit it on our benchmark, though it doesn't happen in a more realistic network: $ gossipd/test/run-bench-find_route 100000 100: Before: 100 (79 succeeded) routes in 100000 nodes in 25341 msec (253412314 nanoseconds per route) After: 100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	f8ffae837d	gossipd: speed Dijkstra a little. Our uintmap can be a little slow with all the reallocation, so leave NULL entries and walk to find the first one. Since we don't clean them up, keep a cache of where the min non-all-NULL value is in the heap. It's clearer benefit on really large tests, so here's 1M nodes: Comparison using gossipd/test/run-bench-find_route 1000000 10: Before: 10 (10 succeeded) routes in 1000000 nodes in 91995 msec (9199532898 nanoseconds per route) After: 10 (10 succeeded) routes in 1000000 nodes in 20605 msec (2060539287 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	7caa37f0f1	gossipd: implement Dijkstra. Use a uintmap as our minheap. Note that Dijkstra can give overlength routes, so some checks are disabled. Comparison using gossipd/test/run-bench-find_route 100000 10: Before: 10 (10 succeeded) routes in 100000 nodes in 120087 msec (12008708402 nanoseconds per route) After: 10 (10 succeeded) routes in 100000 nodes in 2269 msec (226925462 nanoseconds per route) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	4d84a436f5	gossipd: temporarily disable fuzz in routing. This allows precise comparison between Dijkstra and Bellman-Ford without worrying about fuzz. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	594af8049b	gossipd: extract common functionality. This will be needed by Dijkstra as well. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
Rusty Russell	6dfa46d65a	gossipd/test: add test for handling overlong routes. This is a weakness with Dijkstra, so write an explicit unit test that we can find a short enough (but more expensive) route. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-18 06:33:09 +00:00
trueptolemy	77236caa91	gossipd: fix the check for node announcement in broadcast_state_check(). It should check whether node_id_1 was stored in pubkeys, rather than checking the scid.	2019-04-16 00:20:26 +00:00
trueptolemy	274f156b28	gossipd: rename empty_node_map() to new_node_map(). empty_node_map() sounds like a destructor; new_node_map() makes sense and is better.	2019-04-14 23:12:00 +00:00
trueptolemy	ee036a2e36	Gossipd: change the pending_cannouncement list to htable	2019-04-14 05:39:31 +00:00
Rusty Russell	261921dee2	gossipd: adjust peers' broadcast_offset when compacting store. When we compact the store, we need to adjust the broadcast index for peers so they know where they're up to. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	fdb42c3170	gossipd: don't keep channel_updates in memory. This requires some trickiness when we want to re-add unannounced channels to the store after compaction, so we extract a common "copy_message" to transfer from old store to new. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:36034-37853(37109.8+/-5.9e+02) vsz_kb:577456 store_rewrite_sec:12.490000-13.250000(12.862+/-0.27) listnodes_sec:1.250000-1.480000(1.364+/-0.09) listchannels_sec:30.820000-31.480000(31.068+/-0.24) routing_sec:26.940000-27.990000(27.616+/-0.39) peer_write_all_sec:65.690000-68.600000(66.698+/-0.99) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1202316 +vsz_kb:577456 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0370ed2eca	gossipd: use pread in the store. The next patch causes us to access the store while loading (we read channel_updates for local peers), which messes up loading due to the lseek involved. Using pread() is atomic with seek & read, and also a bit more efficient. Make the header contiguous too, while we're here. We don't need pwrite: we always open with O_APPEND which means the seek-to-end is implicit. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:36771-38289(37529.6+/-5.3e+02) vsz_kb:1202316 store_rewrite_sec:12.460000-13.280000(12.784+/-0.29) listnodes_sec:1.240000-1.410000(1.34+/-0.058) listchannels_sec:29.850000-31.840000(30.908+/-0.69) routing_sec:27.800000-31.790000(28.822+/-1.5) peer_write_all_sec:66.200000-68.720000(67.44+/-0.84) MCP notable changes from previous patch (>1 stddev): -store_load_msec:39207-45089(41374.6+/-2.2e+03) +store_load_msec:36771-38289(37529.6+/-5.3e+02) -store_rewrite_sec:15.090000-16.790000(15.654+/-0.63) +store_rewrite_sec:12.460000-13.280000(12.784+/-0.29) -peer_write_all_sec:66.830000-76.850000(71.976+/-3.6) +peer_write_all_sec:66.200000-68.720000(67.44+/-0.84) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	2135c7a024	gossipd: allow reading from the store during load. When we no longer keep channel_updates in memory, there's a path where we access them on load: when we promote a local channel to an announced channel. This breaks at the moment, since gs->fd == -1; change it to a writable flag instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	aeb72a05e3	gossipd: remove some fields from struct chan. The txout_script field is unused; the local_disable only applies to the handful of local channels, so move that into a hash table. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:39207-45089(41374.6+/-2.2e+03) vsz_kb:1202316 store_rewrite_sec:15.090000-16.790000(15.654+/-0.63) listnodes_sec:1.290000-3.790000(1.938+/-0.93) listchannels_sec:30.190000-32.120000(31.31+/-0.69) routing_sec:28.220000-31.340000(29.314+/-1.2) peer_write_all_sec:66.830000-76.850000(71.976+/-3.6) MCP notable changes from previous patch (>1 stddev): -store_load_msec:35107-37944(36686+/-1e+03) +store_load_msec:39207-45089(41374.6+/-2.2e+03) -vsz_kb:1218036 +vsz_kb:1202316 -listchannels_sec:28.510000-30.270000(29.6+/-0.6) +listchannels_sec:30.190000-32.120000(31.31+/-0.69) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	3280466e19	gossipd: don't keep channel_announcement messages in memory. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35107-37944(36686+/-1e+03) vsz_kb:1218036 store_rewrite_sec:14.060000-17.970000(15.966+/-1.6) listnodes_sec:1.270000-1.350000(1.314+/-0.034) listchannels_sec:28.510000-30.270000(29.6+/-0.6) routing_sec:30.230000-31.510000(30.83+/-0.44) peer_write_all_sec:67.390000-70.710000(68.568+/-1.2) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1780516 +vsz_kb:1218036 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	2fd4a0121f	gossipd: unify is_chan_public / is_chan_announced. We used to have a `struct chan` while we're waiting for an update; now we keep that internally. So a `struct chan` without a channel_announcement in the store is private, and other is public. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	aafc489edb	gossipd: remove info fields from struct node. Reload them from disk if they do listnodes. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35390-38659(37336.4+/-1.3e+03) vsz_kb:1780516 store_rewrite_sec:13.800000-16.800000(15.02+/-0.98) listnodes_sec:1.280000-1.530000(1.382+/-0.096) listchannels_sec:28.700000-30.440000(29.34+/-0.68) routing_sec:30.120000-31.080000(30.526+/-0.35) peer_write_all_sec:65.910000-76.850000(69.462+/-4.1) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1792996 +vsz_kb:1780516 -listnodes_sec:1.030000-1.120000(1.068+/-0.032) +listnodes_sec:1.280000-1.530000(1.382+/-0.096) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0608c36301	gossipd: don't keep node_announcement messages in memory. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34779-38628(36903.4+/-1.4e+03) vsz_kb:1792996 store_rewrite_sec:14.440000-15.040000(14.672+/-0.24) listnodes_sec:1.030000-1.120000(1.068+/-0.032) listchannels_sec:27.860000-32.850000(30.05+/-1.7) routing_sec:30.020000-31.700000(31.044+/-0.56) peer_write_all_sec:65.100000-70.600000(68.422+/-2) -vsz_kb:1780516 +vsz_kb:1792996 -listnodes_sec:1.280000-1.530000(1.382+/-0.096) +listnodes_sec:1.030000-1.120000(1.068+/-0.032) MCP notable changes from previous patch (>1 stddev): -store_load_msec:30640-33236(32202+/-8.7e+02) +store_load_msec:34779-38628(36903.4+/-1.4e+03) -vsz_kb:1812956 +vsz_kb:1792996 -listnodes_sec:0.590000-0.660000(0.62+/-0.033) +listnodes_sec:1.030000-1.120000(1.068+/-0.032) -peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) +peer_write_all_sec:65.100000-70.600000(68.422+/-2) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	cb297b0a1b	gossipd: free tmpctx children in gossip_store_load loop. We're accumulating children, and we'll get more in the successive patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	3ef767fd52	gossipd: don't use cached node_announcement for redundancy checking. Re-parse the existing message, since we're going to get rid of those fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	e02f5817fe	gossipd: don't create struct chan for yet-to-be-updated channels. We currently create a struct chan when we receive a `channel_announcement`, but we can only broadcast once we have a `channel_update` (since that provides the timestamp). This means a `struct chan` can be in a weird state where it exists, but is unusable (can't use without an update), and also means we need to keep the channel_announcement message around until an update arrives, so we can put it in the gossip_store. Instead, keep track of these "unupdated" channels separately, and check for them in all the places we search for a specific channel to update. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:30640-33236(32202+/-8.7e+02) vsz_kb:1812956 store_rewrite_sec:13.410000-16.970000(14.438+/-1.3) listnodes_sec:0.590000-0.660000(0.62+/-0.033) listchannels_sec:28.140000-29.560000(28.816+/-0.56) routing_sec:29.530000-32.590000(30.352+/-1.1) peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) MCP notable changes from previous patch (>1 stddev): -vsz_kb:1812904 +vsz_kb:1812956 -store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) +store_rewrite_sec:13.410000-16.970000(14.438+/-1.3) -listnodes_sec:1.120000-1.230000(1.176+/-0.044) +listnodes_sec:0.590000-0.660000(0.62+/-0.033) -listchannels_sec:38.900000-50.580000(44.716+/-3.9) +listchannels_sec:28.140000-29.560000(28.816+/-0.56) -routing_sec:45.080000-48.160000(46.814+/-1.1) +routing_sec:29.530000-32.590000(30.352+/-1.1) -peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) +peer_write_all_sec:60.380000-61.320000(60.836+/-0.37) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	d8aee68ba8	gossipd: handle duplicate nodes from unverified channel_announces properly. If we have a channel_announcement, we catch any node_announcement for either end while we validate the channel_announcement. But if we have multiple channel_announcements and the first one failed to verify, it would remove this catch, meaning we'd discard following node_announcements even though there was a pending channel_announcement. The answer is to use a simple reference count, and as a further optimization, only place the `pending_node_announce` if there's no node already. We also move the process_pending_node_announcement() calls lower down, so any new channel creation checks it. This is more robust, and will prove useful for the next patch, where we can use the same mechanism to handle node_announcements on channel_announcements which are verified, but don't yet have a channel_update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	da884751e8	gossipd: make routing_add_channel_update discard old timestamps. This is currently done higher up, in handle_channel_update(), but that's one reason why handle_channel_update() has to do a channel lookup. Moving the check down means handle_channel_update() can do a minimal "get node id for this channel" so it can check the signature. This helps, because the chan lookup semantics are changing in the next few patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	6b9069ee28	broadcast: don't keep payload pointer. If we need the payload, pull it from the gossip store. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:30189-52561(39416.4+/-8.8e+03) vsz_kb:1812904 store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) listnodes_sec:1.120000-1.230000(1.176+/-0.044) listchannels_sec:38.900000-50.580000(44.716+/-3.9) routing_sec:45.080000-48.160000(46.814+/-1.1) peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) MCP notable changes from previous patch (>1 stddev): -vsz_kb:2288784 +vsz_kb:1812904 -store_rewrite_sec:38.060000-39.130000(38.426+/-0.39) +store_rewrite_sec:21.390000-27.070000(23.596+/-2.4) -listnodes_sec:0.750000-0.850000(0.794+/-0.042) +listnodes_sec:1.120000-1.230000(1.176+/-0.044) -listchannels_sec:30.740000-31.760000(31.096+/-0.35) +listchannels_sec:38.900000-50.580000(44.716+/-3.9) -routing_sec:29.600000-33.560000(30.472+/-1.5) +routing_sec:45.080000-48.160000(46.814+/-1.1) -peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) +peer_write_all_sec:58.780000-87.150000(72.278+/-9.7) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	da845b660b	gossipd: gossip_store_get() to load a single store entry. This will allow us to load on demand, and not keep all messages in memory. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	1f08cfb3e3	gossipd: use file offset within store as broadcast index. Instead of an arbitrary counter, we can use the file offset for our partial ordering, removing a field. It takes some care when we compact the store, however, as this field changes. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34271-35283(34789.6+/-3.3e+02) vsz_kb:2288784 store_rewrite_sec:38.060000-39.130000(38.426+/-0.39) listnodes_sec:0.750000-0.850000(0.794+/-0.042) listchannels_sec:30.740000-31.760000(31.096+/-0.35) routing_sec:29.600000-33.560000(30.472+/-1.5) peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) MCP notable changes from previous patch (>1 stddev): -store_load_msec:35685-38538(37090.4+/-9.1e+02) +store_load_msec:34271-35283(34789.6+/-3.3e+02) -vsz_kb:2288768 +vsz_kb:2288784 -peer_write_all_sec:51.140000-58.350000(55.69+/-2.4) +peer_write_all_sec:49.220000-52.690000(50.892+/-1.3) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	ec50ec6a71	gossipd: make gossip loading stats accurate. They didn't count the header sizes when reporting bytes, which is misleading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	eb4564c3cd	gossipd: embed broadcast information into each structure. This is more compact, but also required once we replace the arbitrary "index" with an actual offset into the gossip store. That will let us remove the in-memory variants entirely. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35685-38538(37090.4+/-9.1e+02) vsz_kb:2288768 store_rewrite_sec:35.530000-41.230000(37.904+/-2.3) listnodes_sec:0.720000-0.810000(0.762+/-0.041) listchannels_sec:30.750000-35.990000(32.704+/-2) routing_sec:29.570000-34.010000(31.374+/-1.8) peer_write_all_sec:51.140000-58.350000(55.69+/-2.4) MCP notable changes from previous patch (>1 stddev): -vsz_kb:2621808 +vsz_kb:2288768 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	62918fcb3b	gossip_store: avoid gratuitous copy on load. Doesn't make measurable difference, but an obvious optimization. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	617c23e735	gossipd: use u32 for timestamp. We used an s64 so we could use -1 and save a check, but that's just silly as we have adjacent non-u64 fields: wastes 7 bytes per node and 16 per channel. Interestingly, this seemed to make us a little slower for some reason. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:35569-38776(37169.8+/-1.2e+03) vsz_kb:2621808 store_rewrite_sec:35.870000-40.290000(38.14+/-1.6) listnodes_sec:0.740000-0.800000(0.768+/-0.023) listchannels_sec:29.820000-32.730000(30.972+/-0.99) routing_sec:30.110000-30.590000(30.346+/-0.18) peer_write_all_sec:52.420000-59.160000(54.692+/-2.5) MCP notable changes from previous patch (>1 stddev): -store_load_msec:32825-36365(34615.6+/-1.1e+03) +store_load_msec:35569-38776(37169.8+/-1.2e+03) -vsz_kb:2637488 +vsz_kb:2621808 -store_rewrite_sec:35.150000-36.200000(35.59+/-0.4) +store_rewrite_sec:35.870000-40.290000(38.14+/-1.6) -listnodes_sec:0.590000-0.710000(0.682+/-0.046) +listnodes_sec:0.740000-0.800000(0.768+/-0.023) -peer_write_all_sec:49.020000-52.890000(50.376+/-1.5) +peer_write_all_sec:52.420000-59.160000(54.692+/-2.5) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-11 18:31:34 -07:00
Rusty Russell	0b484b111e	gossipd: make more compact getchannels entries. We can save significant space by combining both sides: so much that we can reduce the WIRE_LEN_LIMIT to something sane again. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:34467-36764(35517.8+/-7.7e+02) vsz_kb:2637488 store_rewrite_sec:35.310000-36.580000(35.816+/-0.44) listnodes_sec:1.140000-2.780000(1.596+/-0.6) listchannels_sec:55.390000-58.110000(56.998+/-0.99) routing_sec:30.330000-30.920000(30.642+/-0.19) peer_write_all_sec:50.640000-53.360000(51.822+/-0.91) MCP notable changes from previous patch (>1 stddev): -store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) +store_rewrite_sec:35.310000-36.580000(35.816+/-0.44) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	91849dddc4	wire: use struct node_id for node ids. Don't turn them to/from pubkeys implicitly. This means nodeids in the store don't get converted, but bitcoin keys still do. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:33934-35251(34531.4+/-5e+02) vsz_kb:2637488 store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) listnodes_sec:1.020000-1.290000(1.146+/-0.086) listchannels_sec:51.110000-58.240000(54.826+/-2.5) routing_sec:30.000000-33.320000(30.726+/-1.3) peer_write_all_sec:50.370000-52.970000(51.646+/-1.1) MCP notable changes from previous patch (>1 stddev): -store_load_msec:46184-47474(46673.4+/-4.5e+02) +store_load_msec:33934-35251(34531.4+/-5e+02) -vsz_kb:2638880 +vsz_kb:2637488 -store_rewrite_sec:46.750000-48.280000(47.512+/-0.51) +store_rewrite_sec:34.720000-35.130000(34.94+/-0.14) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	a2fa699e0e	Use node_id everywhere for nodes. I tried to just do gossipd, but it was uncontainable, so this ended up being a complete sweep. We didn't get much space saving in gossipd, even though we should save 24 bytes per node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	d4ab0592c5	fixup! gossipd: use simple inline array for nodes with few channels. Suggested-by: @cdecker Suggested-by: @niftynei	2019-04-09 12:37:16 -07:00
Rusty Russell	b6494c1994	gossipd: use simple inline array for nodes with few channels. Allocating a htable is overkill for most nodes; we can fit 11 pointers in the same space (10, since we use 1 to indicate we're using an array). MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:45947-47016(46683.4+/-4e+02) vsz_kb:2639240 store_rewrite_sec:46.950000-49.830000(48.048+/-0.95) listnodes_sec:1.090000-1.350000(1.196+/-0.095) listchannels_sec:48.960000-57.640000(53.358+/-2.8) routing_sec:29.990000-33.880000(31.088+/-1.4) peer_write_all_sec:49.360000-53.210000(51.338+/-1.4) MCP notable changes from previous patch (>1 stddev): - vsz_kb:2641316 + vsz_kb:2639240 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	417e1bab7d	gossipd: use iterator helpers for iterating node channels. Makes the next step easier. MCP results from 5 runs, min-max(mean +/- stddev): store_load_msec:45791-46917(46330.4+/-3.6e+02) vsz_kb:2641316 store_rewrite_sec:47.040000-48.720000(47.684+/-0.57) listnodes_sec:1.140000-1.340000(1.2+/-0.072) listchannels_sec:50.970000-54.250000(52.698+/-1.3) routing_sec:29.950000-31.010000(30.332+/-0.37) peer_write_all_sec:51.570000-52.970000(52.1+/-0.54) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-09 12:37:16 -07:00
Rusty Russell	891ee20a59	tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Outputs CSV. We add some stats for load times in developer mode, so we can easily read them out. peer_read_all_sec doesn't work, since we seem to reject about half the updates for having bad signatures. It's also very slow... routing fails, for unknown reasons, so that failure is ignored in routing_sec. Results from 5 runs, min-max(mean +/- stddev): store_load_msec,vsz_kb,store_rewrite_sec,listnodes_sec,listchannels_sec,routing_sec,peer_write_all_sec 39275-44779(40466.8+/-2.2e+03),2899248,41.010000-44.970000(41.972+/-1.5),2.280000-2.350000(2.304+/-0.025),49.770000-63.390000(59.178+/-5),33.310000-34.260000(33.62+/-0.35),42.100000-44.080000(43.082+/-0.67) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-2.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Suggested-by: @niftynei Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-1.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project MCP filename change. Header from folded patch 'tools-bench-gossipd.sh__dont_print_csv_by_default.patch': tools/bench-gossipd.sh: don't print CSV by default. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project.patch': fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project Make shellcheck happy. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	2bd7df93c6	gossipd: preserve unannounced channels across store compaction. Otherwise we'd forget them on restart, again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	c424c42668	gossipd: store local channel updates across restart, even if unannounced. Either private or simply not enough confirms. They would have been added on reconnect, but that's not ideal. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	7c8f506a0f	dev-compact-store-gossip: specific RPC so we can test gossip_store rewrite. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	5b12007a4f	gossipd: dev option to allow unknown channels. This lets us benchmark without a valid blockchain. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Header from folded patch 'fixup!_gossipd__dev_option_to_allow_unknown_channels.patch': fixup! gossipd: dev option to allow unknown channels. Suggested-by: @cdecker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	f8f6533dba	dev: --dev-gossip-time so gossipd doesn't prune old data. This is useful for canned data, such as the million channels project. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Rusty Russell	b2c93beaed	gossipd: use htable instead of simple array for node's channels. For giant nodes, it seems we spend a lot of time memmoving this array. Normally we'd go for a linked list, but that's actually hard: each channel has two nodes, so needs two embedded list pointers, and when iterating there's no good way to figure out which embedded pointer we'd be using. So we (ab)use htable; we don't really need an index, but it's good for cache-friendly iteration (our main operation). We can actually change to a hybrid later to avoid the extra allocation for small nodes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-04-08 04:41:43 +00:00
Christian Decker	f3c234529e	gossip: Cache txout query failures If we asked `bitcoind` for a txout and it failed we were not storing that information anywhere, meaning that when we see the channel announcement the next time we'd be reaching out to `lightningd` and `bitcoind` again, just to see it fail again. This adds an in-memory cache for these failures so we can just ignore these the next time around. Fixes #2503 Signed-off-by: Christian Decker <decker.christian@gmail.com>	2019-04-01 23:54:19 +00:00
Christian Decker	426b22fdcb	gossip: Bump `gossip_getnodes_reply` result count to be u32 as well Otherwise we'll just have the same issue once we reach 65k nodes. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2019-03-27 12:48:52 +01:00
Christian Decker	25e829c7d1	gossip: Make the `listchannels` reply result count a u32 Fixes #2504 Signed-off-by: Christian Decker <decker.christian@gmail.com> Reported-by: Antoine Le Calvez <@alecalve>	2019-03-27 12:48:52 +01:00
Rusty Russell	00f3a84af2	test: fix thinko in gossipd/test/run-bench-find_route.c Reported-by: @cdecker Fixes: #2440 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-03-05 11:42:43 +01:00
Rusty Russell	38e7d19dd5	Makefile: check for direct amount_sat/amount_msat access. We need to do it in various places, but we shouldn't do it lightly: the primitives are there to help us get overflow handling correct. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 08:01:37 +00:00
Rusty Russell	28f5da7b2f	tools/generate-wire: use amount_msat / amount_sat for peer protocol. Basically we tell it that every field ending in '_msat' is a struct amount_msat, and 'satoshis' is an amount_sat. The exceptions are channel_update's fee_base_msat which is a u32, and final_incorrect_htlc_amount's incoming_htlc_amt which is also a 'struct amount_msat'. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 08:01:37 +00:00
Rusty Russell	3ac0e814d0	daemons: use amount_msat/amount_sat in all internal wire transfers. As a side-effect of using amount_msat in gossipd/routing.c, we explicitly handle overflows and don't need to pre-prune ridiculous-fee channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 08:01:37 +00:00
Rusty Russell	85b8b25749	bitcoin/chainparams: use amount_sat / amount_msat Simple changes, but ripples through the code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 08:01:37 +00:00
Rusty Russell	83adb94583	lightningd and routing: use struct amount_msat. We use it in route_hop, and paper over it in the JSON APIs. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 03:44:44 +00:00
Rusty Russell	7fad7bccba	common/amount: new types struct amount_msat and struct amount_sat. They're generally used pass-by-copy (unusual for C structs, but convenient since they're basically u64) and all possibly problematic operations return WARN_UNUSED_RESULT bool to make you handle the over/underflow cases. The new #include in json.h means bolt11.c sees the amount.h definition of MSAT_PER_BTC, so delete its local version. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-21 00:44:57 +00:00
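The pattern this commit describes (a small struct wrapping a u64, passed by copy, with arithmetic helpers that return a bool the caller must check) can be sketched as follows. This is only an illustration: the names `amount_msat_sketch`, `msat_add`, `msat_sub` and `WARN_UNUSED` are hypothetical and are not taken from the repository's `common/amount.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for WARN_UNUSED_RESULT: ask the compiler to
 * complain if the caller ignores the returned bool. */
#if defined(__GNUC__)
#define WARN_UNUSED __attribute__((warn_unused_result))
#else
#define WARN_UNUSED
#endif

/* Hypothetical wrapper type: basically a u64, so cheap to pass by copy. */
struct amount_msat_sketch {
	uint64_t millisatoshis;
};

/* Returns false (and leaves *out untouched) if a + b would overflow. */
static WARN_UNUSED bool msat_add(struct amount_msat_sketch *out,
				 struct amount_msat_sketch a,
				 struct amount_msat_sketch b)
{
	if (a.millisatoshis > UINT64_MAX - b.millisatoshis)
		return false;
	out->millisatoshis = a.millisatoshis + b.millisatoshis;
	return true;
}

/* Returns false (and leaves *out untouched) if a - b would underflow. */
static WARN_UNUSED bool msat_sub(struct amount_msat_sketch *out,
				 struct amount_msat_sketch a,
				 struct amount_msat_sketch b)
{
	if (a.millisatoshis < b.millisatoshis)
		return false;
	out->millisatoshis = a.millisatoshis - b.millisatoshis;
	return true;
}
```

Because the wrapped field cannot be added with a bare `+` by accident, every caller is pushed towards an explicit `if (!msat_add(...))` branch for the overflow case.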
Michael Schmoock	302a78f4eb	fix: add inline exception for recent cppcheck false positive	2019-02-18 01:06:01 +00:00
Rusty Russell	b99293fbb6	short_channel_id: don't accept :-separated in JSON if --allow-deprecated-apis=false We need to still accept it when parsing the database, but this flag should allow upgrade testing for devs building on top Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-08 16:52:30 -08:00
Rusty Russell	3ae0c20026	getroute: change definition (and pay default) for riskfactor. Up until now, riskfactor was useless due to implementation bugs, and also the default setting is wrong (too low to have an effect on reasonable payment scenarios). Let's simplify the definition (by assuming that P(failure) of a node is 1), to make it a simple percentage. I examined the current network fees to see what would work, and under this definition, a default of 10 seems reasonable (equivalent to 1000 under the old definition). It is this change which finally fixes our test case! The riskfactor is now 40msat (1500000 * 14 * 10 / 5259600 = 39.9), comparable with the worst-case fuzz of 50msat (1001 * 0.05 = 50). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-06 18:39:52 +01:00
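The arithmetic quoted in this commit message can be reproduced with a short sketch. The commit only gives the numbers; reading 1500000 as an amount in msat, 14 as a delay in blocks, 10 as the riskfactor and 5259600 as a fixed divisor is an assumption made here for illustration, and the function names are hypothetical.

```c
/* Hypothetical helper reproducing the commit's first calculation:
 * amount * delay * riskfactor / 5259600.
 * The meaning assigned to each parameter is an assumption. */
static double risk_penalty_msat(double amount_msat, double delay_blocks,
				double riskfactor)
{
	return amount_msat * delay_blocks * riskfactor / 5259600.0;
}

/* Hypothetical helper reproducing the second calculation:
 * a value scaled by a fuzz fraction (0.05 for 5%). */
static double worst_case_fuzz_msat(double value_msat, double fuzz_fraction)
{
	return value_msat * fuzz_fraction;
}
```

With the inputs from the message, the first helper gives roughly 39.9 and the second roughly 50, so under this reading the two effects are of the same order of magnitude.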
Rusty Russell	05f95b59c1	gossipd: take into account risk in final route comparison. We were only comparing by total msatoshis. Note, this still isn't sufficient to fix our indirect problem, as our risk values are all 1 (the minimum): lightning_gossipd(25480): 2 hop solution: 1501990 + 2 lightning_gossipd(25480): 3 hop solution: 1501971 + 3 ... lightning_gossipd(25480): => chose 3 hop solution Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-06 18:39:52 +01:00
Rusty Russell	662bb0c565	gossipd: fix riskfactor passing. We used a u16, and a 1000 multiplier, which meant we wrapped at riskfactor 66. We also never undid the multiplier, so we ended up applying 1000x the riskfactor they specified. This changes us to pass the riskfactor with a 1M multiplier. The next patch changes the definition of riskfactor to be more useful. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-06 18:39:52 +01:00
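The wrap at riskfactor 66 follows from 66 * 1000 = 66000 exceeding the largest u16 value, 65535. A minimal sketch of that effect is below; the function names are hypothetical, and the 32-bit type used for the 1M multiplier is an assumption, since the commit message does not name the new field's type.

```c
#include <stdint.h>

/* Hypothetical model of the old behaviour: multiply by 1000 and
 * squeeze the result into 16 bits, discarding the high bits. */
static uint16_t scale_riskfactor_u16(unsigned int riskfactor)
{
	return (uint16_t)(riskfactor * 1000u);
}

/* Hypothetical model of the fix: a 1M multiplier in a wider field.
 * uint32_t is an assumption; it would only wrap above 4294. */
static uint32_t scale_riskfactor_u32(unsigned int riskfactor)
{
	return (uint32_t)riskfactor * 1000000u;
}
```

A riskfactor of 65 still fits in the u16 (65000), while 66 silently becomes 66000 - 65536 = 464.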
Rusty Russell	6a26b0c18d	gossipd: increase randomness in route selection. We have a seed, which is for (future!) unit testing consistency. This makes it change every time, so our pay_direct_test is more useful. I tried restarting the node around the loop, but it tended to fail rebinding to the same port for some reason? Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-02-06 18:39:52 +01:00
Rusty Russell	afab1f7b3c	gossipd: handle onion errors internally. As a general rule, lightningd shouldn't parse user packets. We move the parsing into gossipd, and have it respond only to permanent failures. Note that we should not unconditionally remove a channel on WIRE_INVALID_ONION_HMAC, as this can be triggered (and we do!) by feeding sendpay a route with an incorrect pubkey. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-23 22:08:08 +01:00
Rusty Russell	4eddf57fd9	gossipd: don't mark channels unroutable. For transient failures, the pay plugin should simply exclude those from route considerations. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-23 22:08:08 +01:00
Rusty Russell	018a3f1d58	short_channel_id: make mk_short_channel_id return a failure. We had a bug `0ba547ee10` caused by short_channel_id overflow. If we'd caught this, we'd have terminated the peer instead of crashing, so add appropriate checks. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-21 12:31:06 +01:00
Rusty Russell	e2777642c0	getroute: add direction to route returned. We also ignore it in sendpay. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-17 13:02:24 +01:00
Rusty Russell	0ba547ee10	gossipd: handle overflowing query properly (avoid slow 100% CPU reports) Don't do this: (gdb) bt #0 0x00007f37ae667c40 in ?? () from /lib/x86_64-linux-gnu/libz.so.1 #1 0x00007f37ae668b38 in ?? () from /lib/x86_64-linux-gnu/libz.so.1 #2 0x00007f37ae669907 in deflate () from /lib/x86_64-linux-gnu/libz.so.1 #3 0x00007f37ae674c65 in compress2 () from /lib/x86_64-linux-gnu/libz.so.1 #4 0x000000000040cfe3 in zencode_scids (ctx=0xc1f118, scids=0x2599bc49 "\a\325{", len=176320) at gossipd/gossipd.c:218 #5 0x000000000040d0b3 in encode_short_channel_ids_end (encoded=0x7fff8f98d9f0, max_bytes=65490) at gossipd/gossipd.c:236 #6 0x000000000040dd28 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=8) at gossipd/gossipd.c:576 #7 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=16) at gossipd/gossipd.c:595 #8 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=32) at gossipd/gossipd.c:596 #9 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=64) at gossipd/gossipd.c:595 #10 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=128) at gossipd/gossipd.c:596 #11 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=256) at gossipd/gossipd.c:595 #12 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=512) at gossipd/gossipd.c:595 #13 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=1024) at gossipd/gossipd.c:595 #14 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2047) at gossipd/gossipd.c:596 #15 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4095) at gossipd/gossipd.c:595 
#16 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8191) at gossipd/gossipd.c:595 #17 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16382) at gossipd/gossipd.c:595 #18 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=32764) at gossipd/gossipd.c:595 #19 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=65528) at gossipd/gossipd.c:595 #20 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=131056) at gossipd/gossipd.c:595 #21 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=262112) at gossipd/gossipd.c:595 #22 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=524225) at gossipd/gossipd.c:595 #23 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=1048450) at gossipd/gossipd.c:595 #24 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2096900) at gossipd/gossipd.c:595 #25 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4193801) at gossipd/gossipd.c:595 #26 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8387603) at gossipd/gossipd.c:595 #27 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16775207) at gossipd/gossipd.c:595 #28 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=33550414) at gossipd/gossipd.c:596 #29 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=67100829) at gossipd/gossipd.c:595 #30 0x000000000040ddc6 in queue_channel_ranges 
(peer=0x3868fc8, first_blocknum=514201, number_of_blocks=134201659) at gossipd/gossipd.c:595 #31 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=268403318) at gossipd/gossipd.c:595 #32 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=536806636) at gossipd/gossipd.c:595 #33 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=1073613273) at gossipd/gossipd.c:595 #34 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=2147226547) at gossipd/gossipd.c:595 #35 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=4294453094) at gossipd/gossipd.c:595 #36 0x000000000040df26 in handle_query_channel_range (peer=0x3868fc8, msg=0x37e0678 "\001\ao\342\214\n\266\361\263r\301\246\242F\256c\367O\223\036\203e\341Z\b\234h\326\031") at gossipd/gossipd.c:625 The cause was that converting a block number to an scid truncates it at 24 bits. When we look through the index from (truncated number) to (real end number) we get every channel, which is too large to encode, so we iterate again. This fixes both that problem, and also the issue that we'd end up dividing into many empty sections until we get to the highest block number. Instead, we just tack the empty blocks on to the end of the final query. (My initial version requested 0xFFFFFFFE blocks, but the dev code which records what blocks were returned can't make a bitmap that big on 32 bit). Reported-by: George Vaccaro Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 11:34:45 -08:00
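The 24-bit truncation described in this commit can be illustrated with a small sketch. It assumes the usual short_channel_id layout of a 24-bit block height, a 24-bit transaction index and a 16-bit output index packed into 64 bits; the function names are hypothetical and are not the repository's helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unchecked packer: silently keeps only the low 24 bits
 * of the block number, which is the failure mode described above. */
static uint64_t pack_scid_unchecked(uint32_t blocknum, uint32_t txnum,
				    uint16_t outnum)
{
	return ((uint64_t)(blocknum & 0xFFFFFF) << 40)
		| ((uint64_t)(txnum & 0xFFFFFF) << 16)
		| outnum;
}

/* Extract the (possibly truncated) block number again. */
static uint32_t scid_blocknum(uint64_t scid)
{
	return (uint32_t)(scid >> 40);
}

/* Hypothetical checked packer: refuses values that do not fit, so the
 * caller can handle the failure instead of using a wrong range. */
static bool pack_scid_checked(uint64_t *scid, uint32_t blocknum,
			      uint32_t txnum, uint16_t outnum)
{
	if (blocknum > 0xFFFFFF || txnum > 0xFFFFFF)
		return false;
	*scid = pack_scid_unchecked(blocknum, txnum, outnum);
	return true;
}
```

For example, a block number of 17290511 (one of the values in the trace) is larger than 2^24 = 16777216, so the unchecked packer stores 17290511 - 16777216 = 513295 instead, and a range lookup starting from it begins far earlier than the caller intended.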
Rusty Russell	9f1f79587e	short_channel_id_dir: new primitive for one direction of short_channel_id Currently only used by gossipd for channel elimination. Also print them in canonical form (/[01]), so tests need to be changed. Suggested-by: @cdecker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	80753bfbd5	Feedback from @niftynei. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	dc2ee9639b	listchannels: allow source arg to list channels by their source node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	358b7fda91	getroute: allow caller to specify maximum hops. This is required for routeboost. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	599ec5efbe	gossipd: allow an array of excluded channels for getroute_request. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	be64dd84ca	waitsendpay: indicate which channel direction the error was. You can figure this yourself by knowing the route, but it's better to report it directly here. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	c0cfddfa95	test/run-bench-find_route: fix so it runs properly. We didn't populate the channels properly so it always failed. Additionally, somewhere along the line we kept using the single scid so we only created one channel. Also, the next patch will start comparing the pubkeys, so make valid ones: use an array so we don't affect the benchmark too much. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	1567238dd9	invoice: option to expose/not-expose private channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	fe4a600bc7	routeboost: don't use channels to dead-end nodes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	547d6ab878	routeboost: expose private channel in invoice iff we have no public ones. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	f321b1d35f	getroute: remove seed arg, document fromid, make default fuzzpercent match docs. seed isn't very useful at this level: I've left it in routing.c because it might be useful for detailed testing. Pretty sure it's unused, so I simply removed it. The fuzzpercent is documented to default at 5%, but actually was 75%. Fix that too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Rusty Russell	26dda57cc0	utils: make tal_arr_expand safer. Christian and I both unwittingly used it in form: tal_arr_expand(&x) = tal(x, ...) Since '=' isn't a sequence point, the compiler can (and does!) cache the value of x, handing it to tal *after* tal_arr_expand() moves it due to tal_resize(). The new version is somewhat less convenient to use, but doesn't have this problem, since the assignment is always evaluated after the resize. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2019-01-15 12:01:38 +01:00
Christian Decker	659a26ea5a	misc: Update short_channel_id representation to use 'x' separators Reported-by: Alex Bosworth <@alexbosworth> Signed-off-by: Christian Decker <decker.christian@gmail.com>	2019-01-15 03:50:27 +00:00
Christian Decker	94eb2620dc	bolt: Updated the BOLT specification to the latest version This is mainly just copying over the copy-editing from the lightning-rfc repository. [ Split to just perform changes after the UNKNOWN_PAYMENT_HASH change --RR ] Signed-off-by: Christian Decker <decker.christian@gmail.com> Reported-by: Rusty Russell <@rustyrussell>	2019-01-15 02:19:56 +00:00
Christian Decker	65054ae72e	bolt: Updated the BOLT specification to a07dc3df3b4611989e3359f28f96c574f7822850 This is mainly just copying over the copy-editing from the lightning-rfc repository. [ Split to just perform changes prior to the UNKNOWN_PAYMENT_HASH change --RR ] Signed-off-by: Christian Decker <decker.christian@gmail.com> Reported-by: Rusty Russell <@rustyrussell>	2019-01-15 02:19:56 +00:00
Rusty Russell	23540fe956	common: make funding_tx and withdraw_tx share UTXO code. They both do the same thing: convert utxos into tx inputs. Share code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-12-06 23:11:51 +01:00
Rusty Russell	ab735dcbe6	gossipd: wire up memleak detection. For simplicity we dump leaks to logs, and just return a bool to master. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-22 05:15:42 +00:00
Rusty Russell	78771ca371	gossipd: mark timers as not being leaks. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-22 05:15:42 +00:00
Rusty Russell	5a81dbd783	common/daemon: enable/cleanup memleak in daemon_setup / daemon_shutdown. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-22 05:15:42 +00:00
Rusty Russell	29b672b117	gossipd: hear no wumbo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 21:43:37 +00:00
Rusty Russell	9620393109	gossipd: store chainparams internally. We keep a chain_hash in struct daemon, because otherwise we end up with `&peer->daemon->rstate->chainparams->genesis_blockhash` which is a bit ridiculous. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 21:43:37 +00:00
Rusty Russell	5312ec1e34	gossipd: add documentation comments now it's relatively understandable. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	ea2c03e2e2	gossipd: don't have code to exit final loop; we always leave via master_gone. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	4038061d0f	gossipd: use take() in getroute_req. Trivial optimization. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	5c60d7ffb2	gossipd: split wire types into msgs from lightningd and msgs from per-peer daemons This avoids some very ugly switch() statements which mixed the two, but we also take the chance to rename 'towire_gossip_' to 'towire_gossipd_' for those inter-daemon messages; they're messages to gossipd, not gossip messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	07b16e37d0	daemon_conn: don't rely on outq_empty callback telling us to retry queue. We had at least one bug caused by it not returning true when it had queued something. Instead, just re-check the queue after it's called. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	4e9eba1965	gossipd: rework query_channel_range to accept overlapping range. We shouldn't insist on an exact response match: they can batch it and send a whole batch, as long as it overlaps what we ask. We also change to a bitmap to save some memory. This isn't noted in the CHANGELOG since we don't actually send gossip range queries except for testing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	363564301f	gossipd: be more rigorous in handling peer messages vs. daemon requests. Messages from a peer may be invalid in many ways: we send an error packet in that case. Rather than internally calling peer_error, however, we make it explicit by having the handle_ functions return NULL or an error packet. Messages from the daemon itself should not be invalid: we log an error and close the fd to them if it is. Previously we logged an error but didn't kill them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Rusty Russell	1bd76861fd	gossipd: reorder functions into related groups (MOVEONLY) It's MOVEONLY but for the removal of the '#ifndef TESTING' which was needed for old test code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-11-21 00:36:31 +00:00
Christian Decker	8e83d43c39	opts: Split early from non-early args so plugins can register theirs The idea is that `plugin` is an early arg that is parsed (from command line or the config file). We can then start the plugins and have them tell us about the options they'd like to add to the mix, before we actually parse them. Signed-off-by: Christian Decker <@cdecker>	2018-11-13 00:44:50 +01:00
Rusty Russell	3c97f3954e	daemon_conn: make it a tal object, typesafe callbacks. It means an extra allocation at startup, but it means we can hide the definition, and use standard patterns (new_daemon_conn and typesafe callbacks). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-29 04:06:16 +00:00
Rusty Russell	0e6aec081a	gossipd: make sure that freeing peer closes connection to it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-29 04:06:16 +00:00
Rusty Russell	689d51cba5	common/daemon_conn: remove finished function. For the moment, caller sets it manually. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-29 04:06:16 +00:00
Rusty Russell	c236361efd	wireaddr: update bolt version, remove 'padding' from addresses. Nobody used this, so it was removed from the spec. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-28 23:51:05 +00:00
Rusty Russell	66dcba099d	gossipd: hand raw pubkeys in getnodes and getchannels entries. We spend quite a bit of time in libsecp256k1 moving them to and from DER encoding. With a bit of care, we can transfer the raw bytes from gossipd and manually decode them so a malformed one can't make us abort(). Before: real 0m0.629000-0.695000(0.64985+/-0.019)s After: real 0m0.359000-0.433000(0.37645+/-0.023)s At this point, the main issues are 11% of time spent in ccan/io's backend_wake (I tried using a hash table there, but that actually makes the small-number-of-fds case slower), and 65% of gossipd's time is in marshalling the response (all those tal_resize add up!). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-19 22:02:11 +00:00
Rusty Russell	bbc36a7bec	gossipd: update node announcement even if we change within a second. Usually Travis triggers corner cases because it's so slow, but this time the moons aligned, and it managed to fail test_node_reannounce because it generated the updated node_announcement with the same timestamp as the old one. This is because we only updated "last_announce_timestamp" when we generated the announcement, not when we got it off the wire or loaded it from the gossip store. The fix is to ask the routing code what the latest timestamp is; we could still generate a clashing timestamp if (1) the gossip store is lost, and (2) we restart within one second. Hard to care. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-16 04:24:03 +00:00
lisa neigut	0ae1d03513	BOLT7: broadcast `htlc_maximum_msat` in `channel_update`s Have c-lightning nodes send out the largest value for `htlc_maximum_msat` that makes sense, ie the lesser of the peer's max_inflight_htlc value or the total channel capacity minus the total channel reserve.	2018-10-16 03:32:27 +00:00
Rusty Russell	afac01380d	gossipd: don't initialize broadcast interval, make field name explicit. We initialize it to 30 seconds, but it's always overridden by the gossip_init message (and usually to 60 seconds, so it's doubly misleading). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-15 23:04:17 +00:00
Rusty Russell	3991425111	gossipd: don't accept forwarding short_channel_ids we don't own. Gossipd provided a generic "get endpoints of this scid" and we only use it in one place: to look up htlc forwards. But lightningd just assumed that one would be us. Instead, provide a simpler API which only returns the peer node if any, and now we handle it much more gracefully. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-15 23:04:17 +00:00
Rusty Russell	030fe1ce53	gossipd: don't expose private channels for routeboost. We don't create unannouncable channels, but other implementations can. Not only is it rude to expose these via invoices, it's probably not useable anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-15 23:04:17 +00:00
lisa neigut	762c795c9b	gossip: reject channel_update with invalid `htlc_max_msat` If the channel update signals an invalid `htlc_maximum_msat` value, we ignore the update.	2018-10-09 23:22:52 +00:00
lisa neigut	1b6bd3fded	wire: add test for parsing optional version of channel_update	2018-10-09 23:22:52 +00:00
lisa neigut	a289282bad	gossipd: use u64 for `htlc_minimum_msat` field It's u64 in the spec, so we should use u64 too.	2018-10-09 23:22:52 +00:00
lisa neigut	b9331e5ac8	gossipd: parse and respect optional `htlc_maximum_msat` If another channel has set the optional `htlc_maximum_msat` field, we should correctly parse that field and respect it when drawing up routes for payments.	2018-10-09 23:22:52 +00:00
Rusty Russell	de37586a97	gossipd: use riskfactor in getroute, not "1". AFAICT, this was there in the original commit by @cdecker. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-09 08:40:52 +00:00
Rusty Russell	d946e965a6	gossipd: test that fromwire from lightningd messages succeeds. Also tiny drive-by cleanup for gossip_disable_local_channels to modern form. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-09 08:40:52 +00:00
Rusty Russell	864812019f	gossipd: use tal_arr_expand instead of open-coding it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-09 08:40:52 +00:00
Rusty Russell	915ffe35ed	gossipd: clean up getnodes handling. globalfeatures should not be accessed if we haven't received a channel_update. Treat it like the other fields which are only initialized and marshalled/unmarshalled if the timestamp is positive. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-09 08:40:52 +00:00
Rusty Russell	df27fc55af	More renaming of gfeatures to globalfeatures. Use the BOLT #1 naming. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-10-09 08:40:52 +00:00
Rusty Russell	bb5e2ffafb	gossipd: don't create redundant node_announcements. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 18:20:17 +02:00
Rusty Russell	afc92dd757	gossipd: use array[32] not pointer for alias. And use ARRAY_SIZE() everywhere which will break compile if it's not a literal array, plus assertions that it's the same length. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 18:20:17 +02:00
Rusty Russell	0baa5f7071	gossipd: send node announcement on startup. I suspect this fixes #1660 too, but checking would be good. Fixes: #1781 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 18:20:17 +02:00
Rusty Russell	2f667c5227	gossipd: routine to get route_info for known incoming channels. For routeboost, we want to select from all our enabled channels with sufficient incoming capacity. Gossipd knows which are enabled (ie. we have received a `channel_update` from the peer), but doesn't know the current incoming capacity. So we get gossipd to give us all the candidates, and lightningd selects from those. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 15:03:42 +02:00
Rusty Russell	f64eee717d	gossipd: make helpers const-correct. Always be const if you can. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 15:03:42 +02:00
Rusty Russell	95c9a73fbb	gossipd: set sent flag when sending reply_short_channel_ids_end Otherwise, if we don't announce the last node, we'll not flush this out; it will be delayed until the next time we send gossip! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 14:39:25 +02:00
Rusty Russell	fbb7bafc3b	gossipd: don't include channel in query_short_channel_ids reply if no channel_update. This is consistent: we don't broadcast a channel_announce until we've seen a channel_update, so we probably shouldn't advertise it here. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 14:39:25 +02:00
Rusty Russell	41b0872f58	Use localfeatures and globalfeatures consistently. That's what BOLT #1 calls them; make it easier for people to grep. Reported-by: @niftynei Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-28 04:14:28 +00:00
Rusty Russell	96f05549b2	common/utils.h: add tal_arr_expand helper. We do this a lot, and had boutique helpers in various places. So add a more generic one; for convenience it returns a pointer to the new end element. I prefer the name tal_arr_expand to tal_arr_append, since it's up to the caller to populate the new array entry. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-27 22:57:19 +02:00
Rusty Russell	e450c6bbdb	gossipd: remove time-delayed local channel_update, produce DISABLE on-demand. We have a lot of infrastructure to delay local channel_updates to avoid spamming on each peer reconnect; we had to keep track of pending ones though, in case we needed the very latest for sending an error when failing an HTLC. Instead, it's far simpler to set the local_disabled flag on a channel when we disconnect, but only send a disabling channel_update if we actually fail an HTLC. Note: handle_channel_update() TAKES update (due to tal_arr_dup), but we didn't use that before. Now we do, add annotation. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-26 03:21:35 +00:00
Rusty Russell	16e16a725e	gossipd: apply private updates to announce channel. We trade channel_update before channel_announce makes the channel public, and currently forget them when we finally get the channel_announce. We should instead apply them, and not rely on retransmission (which we remove in the next patch!). This earlier channel_update means test_gossip_jsonrpc triggers too early, so have that wait for node_announcement. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-26 03:21:35 +00:00
Rusty Russell	66105e83ea	gossipd: simplify "broadcast channel_announcement now we have channel_update" logic It's simpler and more robust to just check that it's not yet announced (the broadcast index will be 0). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-26 03:21:35 +00:00
Rusty Russell	8455b12781	Revert "gossipd: handle premature node_announcements in the store." This reverts commit `e2f426903d`. With the new store version, this can't happen. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-21 17:56:15 +02:00
Rusty Russell	48de77d56e	gossipd: invalidate old gossip_stores. Incrementing version number means stores which were prior to the previous commit will be removed, and refreshed. The simplest fix, if not the most efficient. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-21 17:56:15 +02:00
lisa neigut	b1ceaf9910	gossipd: Update BOLT-split flags in channel_update BOLT 7's been updated to split the flags field in `channel_update` into two: `channel_flags` and `message_flags`. This changeset does the minimal necessary to get to building with the new flags.	2018-09-21 00:24:12 +00:00
Rusty Russell	e012e94ab2	hsmd: rename hsm_client_wire_csv to hsm_wire.csv That matches the other CSV names (HSM was the first, so it was written before the pattern emerged). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-20 09:49:39 +02:00
Rusty Russell	8f1f1784b3	hsmd: remove hsmd/client.c It was only used by handshake.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-20 09:49:39 +02:00
Rusty Russell	704d30edce	ping: complete JSON RPC ping commands even if one ping gets no response. We would never complete further ping commands if we had < responses than pings. Oops. Fixes: #1928 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-14 22:11:23 +02:00
Rusty Russell	97c7ba2f80	gossipd: fix reordering of node_announcements in presence of an unannounced channel. If we receive a channel_announce but not a channel_update, we store the announce but don't put it in the broadcast map. When we delete a channel, we check if the node_announcement broadcast now precedes all channel_announcements, and if so, we move it to the end of the map. However, with a channel_announcement at index '0', this test fails. This is at least one potential cause of the node map getting out of order. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-04 14:36:05 +02:00
Rusty Russell	e2f426903d	gossipd: handle premature node_announcements in the store. These happen after we compact the store; every log I've seen of a restart on a real node has a message about truncating the store, because node_announcements predate channel_announcements. I extracted one such case from testnet, and reduced it to test here. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-04 14:36:05 +02:00
Rusty Russell	0d46a3d6b0	Put the 'd' back in the daemons. @renepickhardt: why is it actually lightningd.c with a d but hsm.c without d ? And delete unused gossipd/gossip.h. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-03 05:01:40 +00:00
Rusty Russell	317a830e94	devtools: dump-gossipstore. Not very useful by itself, but when combined with decodemsg it can tell us quite a bit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-09-03 00:39:06 +00:00
Rusty Russell	f80955c932	broadcast: don't leak in broadcast_del. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	5d1f71c3c0	gossipd: don't leak fields in create_node_announcement. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	a475098928	gossipd: fix leak in gossip_store_add_channel_delete. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	1c81486b48	routing: fix falsely flagged leak. pending goes away on a timer, sure, but might as well use tmpctx here. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	b10bae1ceb	gossipd: use ctx arg in create_channel_update. Turns out it was always `tmpctx` anyway, so this isn't a real bug right now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	2db77f5d1d	gossipd: minor modifications for memleak detection to work. 1. Move the list to the start of `struct peer`: memleak walks the list correctly this way. 2. Don't create tal parent loop daemon->conn->daemon. The second one is silly anyway: we exit via master_gone when the master conn is closed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-24 19:54:32 +02:00
Rusty Russell	83eadb3548	gossipd: fix SUPERVERBOSE usage, enhance, when turned on. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-23 14:46:22 +02:00
Rusty Russell	74521b3fb7	gossipd: don't delay the very first channel_update. Lightning charge tests stopped working without a timeout, being unable to find a route. The 15 second delay doesn't matter in real life, but in these scenarios it does. This fixes it by making sure the channel is usable immediately. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-21 00:49:12 +02:00
conanoc	b1900b18ab	Fix DEVELOPER guard for ping ping_req() should be outside of DEVELOPER guard now.	2018-08-15 06:48:55 +00:00
Christian Decker	6627da5eb5	routing: Do not consider risk when capping transfers Reported-by: Rusty Russell <@rustyrussell> Signed-off-by: Christian Decker <@cdecker>	2018-08-06 22:46:02 +02:00
Christian Decker	84905eac2b	routing: Make the capacity a parameter to new_chan As pointed out by @rustyrussell the capacity is now always defined, so we can fold that into the construction of the channel itself. Reported-by: Rusty Russell <@rustyrussell> Signed-off-by: Christian Decker <@cdecker>	2018-08-06 22:46:02 +02:00
Christian Decker	8201764117	routing: Skip channels that require larger HTLCs than we are routing The `htlc_minimum_msat` parameter was ignored so far, and we'd be attempting to pay and hitting a brick wall by doing so. This patch just skips channels that are not eligible anyway.	2018-08-06 22:46:02 +02:00
Christian Decker	14000a22bc	routing: Skip channels that don't have sufficient capacity We know the total channel capacity after checking for its existence on-chain, so we can actually make use of that information to discard channels that don't have a sufficient capacity anyway, reducing the number of failed attempts.	2018-08-06 22:46:02 +02:00
Christian Decker	8a34933c1a	gossip: Annotate locally added channels with their capacity We were adding channels without their capacity, and eventually annotated them when we exchanged `channel_update`s. This worked as long as we weren't considering the channel capacity, but would result in local-only channels to be unusable once we start checking.	2018-08-06 22:46:02 +02:00
Rusty Russell	584ee26200	gossipd: fix thinko in node_announcement address parsing which made us miss final address 'cursor < ser + max' isn't valid because we reduce 'max' as we go! Effectively we'll stop once we're past halfway, which can only happen with ipv6 + a torv2 address. The fix is one-line, but we rename 'max' to 'len' which makes its purpose clearer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-06 19:33:46 +02:00
Rusty Russell	0b08601951	sync_crypto_write/sync_crypto_read: just fail, don't return NULL. There's only one thing the caller ever does, just do that internally. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-08-05 02:03:58 +00:00
practicalswift	7969cc335e	Allocate off ctx instead of tmpctx in encode_short_channel_ids_start(const tal_t *ctx)	2018-08-01 13:09:16 +09:30
practicalswift	b5682a773b	Remove dead stores	2018-07-31 12:45:02 +02:00
Rusty Russell	5cf34d6618	Remove tal_len, use tal_count() or tal_bytelen(). tal_count() is used where there's a type, even if it's char or u8, and tal_bytelen() is going to replace tal_len() for clarity: it's only needed where a pointer is void. We shim tal_bytelen() for now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-30 11:31:17 +02:00
Rusty Russell	36730ddb6d	gossipd: dev-suppress-gossip. Useful for testing that we only get an update via the error message. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-27 14:12:00 +02:00
Rusty Russell	73b3782943	gossipd: send latest update in error message, even if delayed. We delay internally to reduce broadcasting route flap, but errors are a special case: we want to send the latest, otherwise we might send an old (non-disabled) update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-27 14:12:00 +02:00
Rusty Russell	3c66d5fa03	gossipd: add flag for locally disabling channel. We used to just manually set ROUTING_FLAGS_DISABLED, but that means we then suppressed the real channel_update because we thought it was a duplicate! So use a local flag: set it for the channel when the peer disconnects, and clear it when channeld sends a local update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-27 14:12:00 +02:00
Rusty Russell	d241bd762c	connectd: don't use gossip_getnodes_entry. gossip_getnodes_entry was used by gossipd for reporting nodes, and for reporting peers. But the local_features field is only available for peers, and most other fields are only available from node_announcement. Note that the connectd change actually means we get less information about peers: gossipd used to do the node lookup for peers and include the node_announcement information if it had it. Since generate_wire.py can't create arrays-of-arrays, we add a 'struct peer_features' to encapsulate the two feature arrays for each peer, and for convenience we add it to lightningd/gossip_msg. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	7b2641ed0d	gossipd: remove peer-related fields and wire messages. This completes the removal of peer-related messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	0d442b5ff2	gossipd: move files into connectd. These source files are only used for peer-related things, so move them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	dba7f9002f	gossipd: provide connectd with address resolution. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	3d3d2ef9af	gossipd: remove connectd functionality, enable connectd. This patch guts gossipd of all peer-related functionality, and hands all the peer-related requests to channeld instead. gossipd now gets the final announcable addresses in its init msg, since it doesn't handle socket binding any more. lightningd now actually starts connectd, and activates it. The init messages for both gossipd and connectd still contain redundant fields which need cleaning up. There are shims to handle the fact that connectd's wire messages are still (mostly) gossipd messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	92d66a5451	gossipd: take connectd fd on initialization. connectd has a dedicated fd to gossipd, so it can ask for a new gossip_fd for a peer. gossipd has a standalone routine to create a remote peer (this will eventually be the only way gossipd creates a new peer). For now lightningd creates a socketpair but doesn't run connectd, so gossipd never sees any requests on this fd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	e1dfb1b178	gossipd: simplify per-peer features. Store the two we care about as booleans. Once connectd is complete we won't even have the feature bitmaps for peers. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	f747ad8f73	common/daemon_conn: add daemon_conn_wake() helper. We've been open-coding it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	16b8f1eb83	gossipd: actually use global features to create our own node_announcement. It's currently empty, but I was surprised we still used "NULL". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	a52d522525	gossipd: handle ping messages for remote peers too. This simplifies our ping handling: make gossipd always do it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-25 02:13:52 +00:00
Rusty Russell	9bf238e001	hsmd: provide message for master to get basepoints & funding pubkey for a channel This is only used by the master daemon, but it's not secret information. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-24 00:40:01 +02:00
Rusty Russell	dfaf74d972	hsmd: add routines to sign onchain transactions, part 1. This handles the "to-us" transactions which return funds to the wallet. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-24 00:40:01 +02:00
Rusty Russell	019ba86b91	gossipd: use optional fields. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-17 12:32:00 +02:00
Christian Decker	14c6310a4f	gossip: Fix concurrent PR merge issue with structeq PR #1618 in parallel with the migration to macro `structeq` created this. Fixes #1674	2018-07-08 19:04:46 +02:00
Rusty Russell	ed83bbe623	pytest: fix flaky race in test_gossip_query_channel_range. We weren't waiting for gossipd to actually process the dev_set_max_scids_encode_size message, so under Travis it sometimes split the reply before processing that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:26:23 +02:00
Rusty Russell	57794b9285	gossipd: also delay locally-generated disables when peer vanishes. Note that we mark both directions of the channel disabled immediately, it's just the broadcast of the update which is delayed, just like the ones generated when channeld tells us to. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	f9b8237d50	gossipd: delay generation of local updates. We disable the channel every time the peer disconnects; if it reconnects we get two updates. The simplest solution: delay all updates by 15 seconds. Replace any pending delayed update. If update is redundant after 15 seconds, discard. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	ef59a8f4aa	gossipd: suppress redundant local updates which we would generate. This doesn't do anything for us now, since we actually tend to produce DISABLE/ENABLE update pairs. But the infrastructure is useful for the next patch. We also add more details to the trace message in the core update code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	8e571ba688	listnodes: expose global features. Since nobody sets these yet, it's a bit moot, but it will be great in future. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	9fa738a741	listpeers: expose peer features as 'local_features' and 'global_features' For now, just the connected peers. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	7b735fbeee	gossipd: fix json_listpeers printing node information. json_listpeers returns an array of peers, and an array of nodes: the latter is a subset of the former, and is used for printing alias/color information. This changes it so there is a 1:1 correspondence between the peer information and nodes, meaning no more O(n^2) search. If there is no node_announce for a peer, we use a negative timestamp (already used to indicate that the rest of the gossip_getnodes_entry is not valid). Other fixes: 1. Use get_node instead of iterating through the node map. 2. A node without addresses is perfectly valid: we have to use the timestamp to see if the alias/color are set. Previously we wouldn't print that if it didn't also advertize an address. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-07 16:07:53 +02:00
Rusty Russell	fed5a117e7	Update ccan/structeq. structeq() is too dangerous: if a structure has padding, it can fail silently. The new ccan/structeq instead provides a macro to define foo_eq(), which does the right thing in case of padding (which none of our structures currently have anyway). Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which is only testing code and can use raw memcmp(): valgrind will tell us if padding exists. Interestingly, we still declared short_channel_id_eq, even though we didn't define it any more! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-04 23:57:00 +02:00
Rusty Russell	4a1ca0fb99	gossipd: don't use raw secp256k1_pubkey in routing. We wrap it in 'struct pubkey' for typesafety and consistency, and the next patch takes advantage of that when we move to pubkey_eq. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-04 23:57:00 +02:00
Rusty Russell	82ff891202	Update to latest BOLT version. And remove the FIXMEs now that the gossip_query extension is merged. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-01 17:37:03 +02:00
Rusty Russell	f67182ff20	gossipd: order node_announcement addresses correctly, remove duplicate types. Fixes: #1596 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-01 15:03:21 +02:00
Rusty Russell	284f0a04c9	gossipd: don't announce bound address if given with --bind-addr, even if public. Only --addr implies announce-if-public: --bind-addr does not. It's also possible to have --bind-addr to an automatic Tor address: you'd have to dig the onion address out of the logs or getinfo to use it, but it's possible. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-01 15:03:21 +02:00
Rusty Russell	9d3ce87700	decode_short_ids: move to common. We want to use it in devtools/decodemsg. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-07-01 14:55:29 +02:00
arowser	25f60f9456	remove unused return value	2018-06-30 04:27:34 +00:00
Christian Decker	4a5cff8490	gossip: Try to detect broken ISP resolvers and discard broken replies This is a best effort attempt to skip connection attempts if we detect a broken ISP resolver. A broken ISP resolver is a resolver that will replace NXDOMAIN replies with a dummy response. This is best effort in that it'll only detect a single fixed dummy reply, it'll check only on startup, and will not detect if we switched networks. It should be good enough for most cases, and in the worst case it will result in a connection attempt that does not complete. Signed-off-by: Christian Decker <decker.christian@gmail.com> Reported-by: Glenn Willen <@gwillen>	2018-06-21 11:21:16 +02:00
Christian Decker	91c2416657	gossip: Do not use DNS if we were told not to Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-21 11:21:16 +02:00
Christian Decker	ceef61dbbd	gossip: Pass use_dns option down to gossipd Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-21 11:21:16 +02:00
William Casarin	d7aa0528b8	gossipd: fix compile error, uninitialized variable Seems to be a problem with gcc 6.4+? Fixes #1527 Signed-off-by: William Casarin <jb55@jb55.com>	2018-06-20 21:25:03 +00:00
Rusty Russell	833e8387aa	gossipd: fix up BOLT references. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-18 12:31:09 +02:00
Christian Decker	71ec8193b2	gossip: Avoid integer count overflow in gossip_store Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-18 12:04:25 +02:00
Rusty Russell	f5ebf8e231	gossipd: send correct channel_update in response to query_short_channel_ids Cut & paste means we sometimes sent NULL: ``` 2018-06-15T00:13:51.908Z lightningd(23653): lightning_closingd-03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f chan #436: Gossipd gave us bad send_gossip message 0bc80000 ``` Fixes: #1581 Reported-by: @Xian001 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 15:39:30 +02:00
Rusty Russell	60b3f0e376	gossipd: remove oververbose logging when we uncompress short_channel_id array Reported-by: Xian001 (#1581) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 15:39:30 +02:00
Rusty Russell	9d721ecb99	gossipd: add assertions to try to catch mysterious crash. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 11:53:47 +02:00
Rusty Russell	5c19c55841	gossipd: fix take leak when peer is dying. In this case, local and remote are both NULL; so if someone tries to send a packet with take(), we need to free it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 11:53:47 +02:00
Rusty Russell	a7e6cdb418	gossipd: peer->local->peer_out queue should have lifetime of peer->local. The current code attaches it to peer, which is a slight leak. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 11:53:47 +02:00
Rusty Russell	e098578731	gossipd: fix leak when we fail to dup fds. In this case, peer would stay around, but conn would be freed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 11:53:47 +02:00
Rusty Russell	f6ff89e596	gossipd: fix use-after-free when we fail to make connection. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-15 11:53:47 +02:00
Christian Decker	4279e5cdbd	gossip: Fix "already reaching" issue I think this is what is causing #1536: getting disconnected causes gossipd to attempt to reach the peer again, unconditionally setting the flag to tell the master. At the same time the master also issues a reaching command (which is allowed since it is its first), but then it clashes on the already set flag. Setting this flag only when the master actually needs to be told should fix this. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-15 01:06:42 +00:00
Christian Decker	985af483cf	gossip: Wrap insert_broadcast and gossip_store_add in persistent_broadcast They should sync up nicely otherwise we may be overestimating the stale rate. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	6632f44133	gossip: Disable gossip_store temporarily while replaying messages Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	2b5e1ee65f	gossip: Enable the consistency check only when really pedantic Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	8a5bebed59	gossip: Disable future compactions if we fail a compaction A failed compaction shouldn't be deadly, but we should also not attempt to do one on every gossip message after the first one fails. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	74a1cbd877	gossip: Implement gossip_store compaction Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	b9a2400a5f	gossip: Simplify message handling in gossip_store `gossip_store_add` is the entry point for messages from the network, so it should do the bookkeeping and disable on failures. `gossip_store_append` is the shared function that wraps messages and writes it to the given file. This is shared between the from network path and the compaction path, so we don't directly use the `gossip_store` instance, but `fd`s. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	60efa314fe	gossip: Separate writing to gossip_store fd from append We write both when coming from outside, as well as when compacting, so we extract the write functionality to use it in both cases. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	e6ab594904	gossip: Have gossip_store annotate gossip messages This makes the exposed interface much smaller, cleaner and will allow us to just replay gossip messages from the broadcast. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	0546ca446d	gossip: Pass routing_state to the gossip_store We'll need it later to annotate the raw gossip messages, e.g., the capacity of a channel. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	eaba5a249a	gossip: Introduce bookkeeping into gossip_store for rewrite Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	552ddb8dfd	gossip: Pass broadcast_state to gossip_store We'll be sourcing messages from this `broadcast_state` when rewriting the `gossip_store`. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	37dc458b4d	gossip: Have the broadcast_state track its message count This is far more precise than bolting on the stale tracking in the `gossip_store`. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-09 13:38:46 +02:00
Christian Decker	4e7fc99ae1	gossip: Duplicate removes can result in null pointers in broadcast Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-06-08 20:00:27 +02:00
Rusty Russell	5d6a9f3fb0	gossipd: check consistency. This is a hack to check that our gossip state is consistent on every insert and delete. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-08 17:53:34 +02:00
Rusty Russell	da55d3c0ff	gossipd: handle node_announcement when channel_announcement removed. Two cases: 1. Node no longer has any public channels: remove node_announcement. 2. Node's node_announcement now precedes all the channel_announcements: move node_announcement to the end. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-08 17:53:34 +02:00
Rusty Russell	def18a7bc1	gossipd: implement broadcast_del to delete a specific index. Required if we want to reorder node_announcement broadcasts. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-08 17:53:34 +02:00
Rusty Russell	a38c619486	gossipd: keep index of node and channel announcements. This lets us detect if a node announce precedes a channel announce once we delete the node announcement. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-08 17:53:34 +02:00
Rusty Russell	1bb7713274	gossipd: minor cleanups. Suggested-by: @cdecker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	035d6067e4	Rename consider_own_node_announce to maybe_send_own_node_announce. Suggested-by: @cdecker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	5ec454c7b2	gossipd: don't queue node_announce unless we've queued channel_announce. We accept a node_announce if we have a channel_announce, but we can't queue it until we queue the channel_announce, which we only do once we have received a channel_update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	f52245d442	gossipd: support and use zlib encoding in short_channel_id encoding. We still use uncompressed if zlib turns out to be larger. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	9e51e196c1	gossipd: dev-set-max-scids-encode-size to artificially force "full" replies. We cap each reply at a single one, which forces the code into our recursion logic. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	118f099dd8	gossip: dev-query-channel-range to test query_channel_range. We keep a crappy bitmap, and finish when their replies cover everything we asked. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	0dda5d4e1c	gossipd: handle query_channel_range We send them all the short_channel_ids we have in a given range. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	c34b49c356	gossipd: add dev-send-timestamp-filter command for testing timestamp filtering. Since we currently only (ab)use it to send everything, we need a way to generate boutique queries for testing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	db6a6442cb	gossipd: single-thread the gossip timer. We have a function called 'wake_pkt_out' which is really 'start gossiping', so rename it to 'wake_gossip_out'. In addition, it's fired both on a timer, and in response to our first gossip_timestamp_filter, which leads to very confusing (though, technically, not incorrect) behavior. Keep a single timer at all times, which now doubles as the flag indicating we're syncing right now. Set it once we're done syncing gossip. Technically this means we go from once-every-60-seconds to quiet-for-60-seconds-between-gossip, but that's OK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	531c82b6ad	gossipd: handle gossip_timestamp_filter message. And initialize filter (to "never") when we negotiated LOCAL_GOSSIP_QUERIES, and send initial filter message. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	97bb6c5a28	gossipd: ensure incoming timestamps are reasonable. This is kind of orthogonal to the other changes, but makes sense: if we would instantly or never prune the message, don't accept it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	7a32637b5f	gossipd: add timestamp to each broadcast message. This lets us filter by timestamp. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	4d8b29089b	gossipd: wire up infrastructure to generate query_short_channel_ids msg. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	7ee5da858c	gossipd: handle query_short_channel_ids message. This doesn't handle zlib yet. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	32c39c2979	gossipd: send node announcements after short_channel_id replies. We use the same system as for gossip: we trickle out replies when we're otherwise idle. As we trickle out replies to query_short_channel_ids, we remember the pubkeys of nodes we mention. At the end, we sort and uniquify, and then send any node_announcements we have for those. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	5864415d31	gossipd: infrastructure to handle short_channel_id replies. We use the same system as for gossip: we trickle out replies when we're otherwise idle. This is minimal infrastructure: we don't actually process the query_short_channel_ids message yet, nor do we append node announcements. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	6c6da45f53	wire: Update to latest BOLT draft. This includes the gossip query messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	803e4f8895	gossipd: announce nodes after channel announcement. In general, we need to only publish node announcements after publishing channel announcements, though we can accept node announcements as soon as we see channel announcements. So we keep a flag for those node_announcement which haven't been broadcast yet. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	c2cc3823db	gossipd: announce own node only after channel announcement actually broadcast. handle_pending_cannouncement might not actually add the announcement, as it could be waiting for a channel_update. We need to wait for the actual announcement before considering announcing our node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	c2189229ca	gossipd: only broadcast channel_announcement once we have a channel_update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Rusty Russell	2431742285	gossipd: don't publish private updates after channel_announce. We generate new ones anyway; removing this code means the upcoming fixes now only need to change one place. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-06-06 03:25:56 +00:00
Christian Decker	c550fd1752	gossip: Clean up the code to disable a local channel Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	c17848a3f3	gossip: Disable local channels after loading the gossip_store We don't have any connection yet, so how could they be active? Disable both sides to avoid trying to route through them or telling others to use them as `contact_points` in invoices. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	f2dc406172	moveonly: Hoist gossip_disable_channel higher up We'll need it in the next commit Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	ba31dd2d9d	gossip: Avoid sending duplicate disable messages Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	8e278044e3	gossip: Disable channels when we lose the connection to the peer We're telling gossipd about disconnections anyway, so let's just use that signal to disable both sides of the channel. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	3e5b798c60	gossip: Fix disable flags in handle_disable_channel Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	9982e24a1c	gossip: Add local_channel_close message to disable channels upon close This was failing some of our integration tests, i.e., the ones closing a channel and not waiting for sigexchange. The remote node would often not be quick enough to send us its disabling channel_update, and hence we'd still remember the incoming direction. That could then be sent out as part of an invoice, and fail subsequently. So just set both directions to be disabled and let the onchain spend clean up once it happens. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-31 02:30:27 +00:00
Christian Decker	402125a70e	gossip: Add CRC32 checksum to the gossip_store Signed-off-by: Christian Decker <decker.christian@gmail.com> Reported-by: Rusty Russell @rustyrussell	2018-05-29 12:16:00 +00:00
Rusty Russell	88053bd1ca	gossipd: remove too-loose timestamp workaround. Now timestamps always increment, we don't have to allow them to do the wrong thing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-21 09:17:57 -07:00
Rusty Russell	6454d7af84	gossip: cleanup keepalive updates to use the same create_channel_update() code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-21 09:17:57 -07:00
Rusty Russell	fca5a9ef30	channeld: tell gossipd to generate channel_updates. This resolves the problem where both channeld and gossipd can generate updates, and they can have the same timestamp. gossipd is always able to generate them, so can ensure timestamp moves forward. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-21 09:17:57 -07:00
Rusty Russell	adbe02c6be	gossip: temporarily allow replacement of updates with same timestamp. We erroneously create updates with the same timestamps when tests run quickly, and the second one is ignored. We've already noted that this should be fixed: gossipd should generate all the updates, as it already has to do the case where channeld crashed, for example. But that's a bigger change. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
Rusty Russell	c546b1bbb6	gossipd: specify origin of updates in errors. @cdecker points out that in test_forward, where we manually create a route, we get an error back which contains an update for an unknown channel. We should still note this, but it's not an error for testing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
Rusty Russell	8ee60e2d8e	testing: make sure we don't see gossip in bad order. This is something which generally shouldn't happen, but we didn't notice it previously. We ignore this warning in the case where a channel was deleted: this happens because one side can send an update while the other notices that the channel is closed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
Rusty Russell	177a1fc88e	gossipd: handle local channel creation separately from update. Note: this will break the gossip_store if they have current channels, but it will fail to parse and be discarded. Have local_add_channel do just that: the update is logically separate and can be sent separately. This removes the ugly 'bool add_to_store' flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
Rusty Russell	540c68d7ca	gossipd/gossip_constants.h: Single place for BOLT constants. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
Rusty Russell	b965ef7d1d	routing: make sure we fail if we can't unmarshal announcements. This is how we notice if the gossip store is corrupt! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-19 15:52:56 -04:00
practicalswift	fab3b214b4	Avoid static analyzer warning about integer wraparound	2018-05-15 05:26:29 +00:00
Rusty Russell	1125682ceb	wireaddr: new type, ADDR_INTERNAL_FORPROXY, use it if we can't/won't resolve. Tor wasn't actually working for me to connect to anything, but it worked for 'ssh -D' testing. Note that the resulting 'netaddr' is a bit weird, but I guess it's honest. $ ./cli/lightning-cli connect 021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b { "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b" } $ ./cli/lightning-cli listpeers { "peers": [ { "state": "GOSSIPING", "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b", "netaddr": [ "ln1qg0je0lugpzu5ttsv78vlrkhteyg9yy8fjw68qr57mfhsfyrxurzkq522ah.lseed.bitcoinstats.com:9735" ], "connected": true, "owner": "lightning_gossipd" } ] } Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	2a0acd3492	tor: log proxy communications using status_io. Good for debugging (you have to send SIGUSR1 to lightning_gossipd to turn it on though, and --log-level=io on the lightningd cmdline to have it output IO messages by default). I also noticed that io_tor_connect_after_req_host() does a useless test on reach->buffer[0] after it's written: remove it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	570283bc76	gossipd: don't use fake addrhint for non-addrhint resolutions. Use a wireaddr_internal directly (which is what we want). Also, don't hardcode 9735, use DEFAULT_PORT internally in seed_resolve_addr(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	de063edb54	gossip: extract function to derive seedname. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	0d23f4fb4a	gossipd: hand io_tor_connect the host as a string. Previously it converted the wireaddr to a string internally: to support unresolved names we need that done externally. We actually tell the SOCKS5 proxy to do a domain lookup already, even though we give it an IP/IPv6 address, so this change is sufficient to support connect-by-name. Note replacement of assert() with an explicit case statement, which has the benefit that the compiler complains when we add new ADDR_INTERNAL types. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	a1dc4eef56	wireaddr: tell caller that we failed due to wanting DNS lookup, don't try. This is useful for the next patch, where we want to hand the unresolved name through to the proxy. This also addresses @Saibato's worry that we still called getaddrinfo() (with the AI_NUMERICHOST option) even if we didn't want a lookup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	5345e43354	gossipd: rename use_tor to use_proxy. Not all of them, but it's really about using the SOCKS proxy rather than really using Tor at this level. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	bcb047a729	gossipd: fix uninitialized var. We assert() that it's set by one of the branches (it should be!) but if we don't hit one it's uninitialized, not NULL. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-11 09:15:54 +00:00
Rusty Russell	cca791d1cb	routing: clean up channel public/active states. 1. If we have a channel_announcement, the channel is public, otherwise it's not. Not all channels are public, as they can be local: those have a NULL channel_announcement. 2. If we don't have a channel_update, we know nothing about that half of the channel, and no other fields are valid. 3. We can tell if a half channel is disabled by the flags field directly. Note that we never send halfchannels without an update over gossip_getchannels_reply so that marshalling/unmarshalling can be vastly simplified. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 21:35:53 +02:00
Rusty Russell	9d1e496b11	gossipd: use a real update in local_add_channel. We generate one now, so let's use it. That lets us simplify the code, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 21:35:53 +02:00
Rusty Russell	c71e16f784	broadcast: invert ownership of messages. Make the update/announce messages own the element in the broadcast map not the other way around. Then we keep a pointer to the message, and when we free it (eg. channel closed, update replaces it), it gets freed from the broadcast map automatically. The result is much nicer! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 21:35:53 +02:00
Rusty Russell	8940528bdb	gossipd: don't include private announcements into broadcast map. Basically, if we don't have an announcement for the channel, stash it, and once we get an announcement, replay if necessary. Fixes: #1485 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 21:35:53 +02:00
Rusty Russell	d1b28f832d	gossipd: when reconnecting, make sure we free old connection. Looks like old connection got a callback, and we blew up since the old peer was freed: 2018-05-06T10:57:11.865Z lightning_gossipd(14387): ...will try again in 300 seconds 2018-05-06T10:57:16.397Z lightning_gossipd(14387): peer_out WIRE_INIT 2018-05-06T10:57:16.405Z lightning_gossipd(14387): peer_in WIRE_INIT 2018-05-06T10:57:16.406Z lightning_gossipd(14387): peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4: reconnect for local peer 2018-05-06T10:57:16.406Z lightning_gossipd(14387): peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4 now remote 2018-05-06T10:57:16.406Z lightning_gossipd(14387): UPDATE WIRE_GOSSIP_PEER_CONNECTED 2018-05-06T10:57:16.406Z lightning_gossipd(14387): UPDATE WIRE_GOSSIP_PEER_CONNECTED 2018-05-06T10:57:16.406Z lightning_gossipd(14387): Handing back peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4 to master 2018-05-06T10:57:16.420Z lightning_gossipd(14387): hand_back_peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4: now local again 2018-05-06T10:57:16.420Z lightning_gossipd(14387): FATAL SIGNAL 11 2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: common/daemon.c:42 (crashdump) 0x416991 2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x7f70cf57a4af 2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: common/msg_queue.c:38 (msg_dequeue) 0x418232 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: gossipd/gossip.c:816 (peer_pkt_out) 0x404ac4 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/io.c:59 (next_plan) 0x4316db 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/io.c:427 (io_do_always) 0x4322ce 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/poll.c:228 (handle_always) 0x433abd 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/poll.c:249 (io_loop) 0x433b48 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: gossipd/gossip.c:2407 (main) 0x4093aa 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x7f70cf56582f 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x402ad8 2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0xffffffffffffffff 2018-05-06T10:57:16.421Z lightning_gossipd(14387): STATUS_FAIL_INTERNAL_ERROR: FATAL SIGNAL Fixes: #1469 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 21:11:00 +02:00
Rusty Russell	89c76a5a78	Move always-use-proxy auto-override to master daemon. This means it will affect connect commands too (though it's too late to stop DNS lookups caused by commandline options). We also warn that this is one case where we allow forcing through Tor without a proxy set: it just means all connections will fail. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	1106c40217	tor: add new 'autotor:' address option. This takes the Tor service address in the same option, rather than using a separate one. Gossipd now digests this like any other type. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	a8c0bca6a8	gossipd: take over negotiation of autogenerated Tor addresses. For the moment, this is a straight handing of current parameters through from master to the gossip daemon. Next we'll change that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	5a0bc83b20	Tor: don't do seed queries if we're supposed to always use proxy. Risks leakage. We could do lookup via the proxy, but that's a TODO. There's only one occurrence of getaddrinfo (and no gethostbyname), so we add a flag to the callers. Note: the use of --always-use-proxy suppresses all DNS lookups, even those from connect commands and the command line. FIXME: An implicit setting of use_proxy_always is done in gossipd if it determines that we are announcing nothing but Tor addresses, but that does not suppress 'connect'. This is fixed in a later patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	9d8e3cf3da	gossip: handle Tor proxy better. 1. Only force proxy use if we don't announce any non-TOR address. There's no option to turn it off, so this makes more sense. 2. Don't assume we want an IPv4 socket to reach proxy, use the family from the struct addrinfo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	c3ccc14f19	Tor: remove --tor prefix from SOCKS5 options. It's usually for Tor, but we can use a socks5 proxy without it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	003cd29733	tor: clean up io_tor_connect. Instead of storing a wireaddr and converting to an addrinfo every time, just convert once (which also avoids the memory leak in the current code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	d87a6c3a48	wireaddr: more helpers, to convert to addrinfo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	c1e0a4d572	gossip/tor: rearrange functions to avoid predeclarations. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	e229f113b9	gossipd: don't try to reach tor if we don't have a proxy. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	11db7ca9e6	options: use NULL for unset Tor settings. Rename tor_proxyaddrs and tor_serviceaddrs to tor_proxyaddr and tor_serviceaddr: the 's' at the end suggests that there can be more than one. Make them NULL or non-NULL, rather than using all-zero if unset. Hand them the same way to gossipd; it's a bit of a hack since we don't have optional fields, so we use a counter which is always 0 or 1. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	d9f13230cf	gossip/tor.c: new file for socks proxy code. All gossipd needs from common/tor is do_we_use_tor_addr(), so move that and the rest of the tor-specific handshake code into gossip/tor.c Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Rusty Russell	6d69e7b066	netaddress: fix up IsTor() We don't actually use it, but let's fix it anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Saibato	877f63e99e	Initial TOR v2/v3 support. This is a rebased and combined patch for Tor support. It is extensively reworked in the following patches, but the basis remains Saibato's work, so it seemed fairest to begin with this. Minor changes: 1. Use --announce-addr instead of --tor-external. 2. I also reverted some whitespace and unrelated changes from the patch. 3. Removed unnecessary ';' after } in functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-10 02:28:44 +00:00
Christian Decker	81dc82de14	gossip: Clean up stale `store` argument to `handle_gossip_msg` This is a leftover from before splitting the `gossip_store` injection path from the handling of gossip messages. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-08 22:06:04 +02:00
Rusty Russell	d40d22b68e	gossipd: don't try to connect to non-routable addresses. Someone could try to announce an internal address, and we might probe it. This breaks tests, so we add '--dev-allow-localhost' for our tests, so we don't eliminate that one. Of course, now we need to skip some more tests in non-developer mode. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	af065417e1	gossipd: handle wildcard addresses correctly. If we're given a wildcard address, we can't announce it like that: we need to try to turn it into a real address (using guess_address). Then we use that address. As a side-effect of this cleanup, we only announce any '--addr' if it's routable. This fix means that our tests have to force '--announce-addr' because otherwise localhost isn't routable. This means that gossipd really controls the addresses now, and breaks them into two arrays: what we bind to, and what we announce. That is now what we return to the master for json_getinfo(), which prints them as 'bindings' and 'addresses' respectively. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	52917ff6c9	More flexible address wildcards, only add wildcard if nothing else. 1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6' which means ':1234' can be used to set only the portnum. 2. Only add this protocol wildcard if --autolisten=1 (default) and no other addresses specified. 3. Pass it down to gossipd, so it can handle errors correctly: in most cases, it's fatal not to be able to bind to a port, but for this case, it's OK if we can only bind to one of IPv4/v6 (fatal iff neither). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	73cd009a4c	gossipd/lightningd: use wireaddr_internal. This replacement is a little menial, but it explicitly catches all the places where we allow a local socket. The actual implementation of opening an AF_UNIX socket is almost hidden in the patch. The detection of "valid address" is now more complex: p->addr.itype != ADDR_INTERNAL_WIREADDR || p->addr.u.wireaddr.type != ADDR_TYPE_PADDING But most places we do this, we should audit: I'm pretty sure we can't get an invalid address any more from gossipd (they may be in db, but we should fix that too). Closes: #1323 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	e6c678e5df	gossipd: take over address determination, from master. It does all the other address handling, do this too. It also proves useful as we clean up wildcard address handling. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	356e5dcea8	wireaddr: helpers to convert to/from IPv4/v6 addresses. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	fe96fe10c7	Clean up network options. It's become clear that our network options are insufficient, with the coming addition of Tor and unix domain support. Currently: 1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline, or any address is specified explicitly. If they're routable, we announce. 2. --addr is used to announce, but not to control binding. After this change: 1. --port is deprecated. 2. --addr controls what we bind to and announce. 3. --bind-addr/--announce-addr can be used to control one and not the other. 4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable). 5. --offline still overrides listening (though announcing is still the same). This means we can bind to as many ports/interfaces as we want, and for special effects we can announce different things (eg. we're sitting behind a port forward or a proxy). What remains to implement is semi-automatic binding: we should be able to say '--addr=0.0.0.0:9999' and have the address resolve at bind time, or even '--addr=0.0.0.0:0' and have the port autoresolve too (you could determine what it was from 'lightning-cli getinfo'). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Rusty Russell	ed466a8523	lightningd: make explicit listen and reconnect flags. We set no_reconnect with --offline, but that doesn't work if !DEVELOPER. Make the flag positive, and non-DEVELOPER mode for gossipd. We also don't override portnum with --offline, but have an explicit 'listen' flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-05-07 22:37:28 +02:00
Christian Decker	9cfd09dc4a	gossip: HalfChans are public if we have an update and the Chan is Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-07 01:10:48 +00:00
Christian Decker	b028a363d8	gossip: Make sure we never add a channel twice Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-05-07 01:10:48 +00:00
practicalswift	8cc02f63bc	gossipd: Handle failed lseek(...)	2018-05-06 20:45:10 +02:00
practicalswift	5db73c6e27	Avoid static analyzer warnings about potentially uninitialized values	2018-05-01 17:14:33 +02:00
Rusty Russell	f083a699e2	gossipd: separate init and activate. This means gossipd is live and we can tell it things, but it won't receive incoming connections. The split also means that the main daemon continues (eg. loading peers from db) while gossipd is loading from the store, potentially speeding startup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-30 12:01:36 +02:00
practicalswift	abf510740d	Force the use of the POSIX C locale for all commands and their subprocesses	2018-04-27 14:02:59 +02:00
ZmnSCPxj	69cdfba3c8	gossip: Use gossiped node_announcement to locate nodes. So we can get via address hint, DNS seed, or node_announcement gossip.	2018-04-26 11:45:38 +00:00
Rusty Russell	83e847575c	gossipd: don't handle multiple connect requests, combine them in lightningd. Christian points out that this is the pattern used elsewhere, for example. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	8a16963f22	channeld: get told when announce depth already reached. If channeld dies for some reason (eg, reconnect) and we didn't yet announce the channel, we can miss doing so. This is unusual, because if lightningd restarts it rearms the callback which gives us funding_locked, so it only happens if just channeld dies before sending the announcement message. This problem applies to both temporary announcement (for gossipd) and the real one. For the temporary one, simply re-send on startup, and remove the error msg gossipd gives if it sees a second one. For the real one, we need a flag to tell us the depth is sufficient; the peer will ignore re-sends anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	3b29d2b75a	gossipd: don't create a new chain of timers on every connect command. When a connect fails, if it's an important peer, we set a timer. If we have a manual connect command, this means we do this again, leading to another timer. For a manual command, free any existing timer; the normal fail logic will start another if necessary. Reported-by: @ZmnSCPxj Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	c6483a57d0	gossipd: give more distinct errors. At least say whether we failed to connect at all, or failed cryptographic handshake, or failed reading/writing init messages. The errno can be "Operation now in progress" if the other end closes the socket on us: this happens when we handshake with the wrong key and it hangs up on us. Fixing this would require work on ccan/io though. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	a134ca9659	gossipd: use exponential backoff on reconnect for important peers. We start at 1 second, back off to 5 minutes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	bc4809aa85	gossipd: make sure master only ever sees one active connection. When we get a reconnection, kill the current remote peer, and wait for the master to tell us it's dead. Then we hand it the new peer. Previously, we would end up with gossipd holding multiple peers, and the logging was really hard to interpret; I'm not completely convinced that we did the right thing when one terminated, either. Note that this now means we can have peers with neither ->local nor ->remote populated, so we check that more carefully. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	be1f33b265	gossipd: have master explicitly tell us when peer is disconnected. Currently we intuit it from the fd being closed, but that may happen out of order with when the master thinks it's dead. So now if the gossip fd closes we just ignore it, and we'll get a notification from the master when the peer is disconnected. The notification is slightly ugly in that we have to disable it for a channel when we manually hand the channel back to gossipd. Note: as stands, this is racy with reconnects. See the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	ab9d9ef3b8	gossipd: drain fd instead of passing around gossip index. (This was sitting in my gossip-enhancement patch queue, but it simplifies this set too, so I moved it here). In `94711969f` we added an explicit gossip_index so when gossipd gets peers back from other daemons, it knows what gossip it has sent (since gossipd can send gossip after the other daemon is already complete). This solution is insufficient for the more general case where gossipd wants to send other messages reliably, so replace it with the other solution: have gossipd drain the "gossip fd" which the daemon returns. This turns out to be quite simple, and is probably how I should have done it originally :( Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	72c459dd6c	gossipd: keep reaching struct only when we're actively connecting, and don't retry. 1. Lifetime of 'struct reaching' now only while we're actively doing connect. 2. Always free after a single attempt: if it's an important peer, retry on a timer. 3. Have a single response message to master, rather than relying on peer_connected on success and other msgs on failure. 4. If we are actively connecting and we get another command for the same id, just increment the counter. The result is much simpler in the master daemon, and much nicer for reconnection: if they say to connect they get an immediate response, rather than waiting for 10 retries. Even if it's an important peer, it fires off another reconnect attempt, unless it's actively connecting now. This removes exponential backoff: that's restored in next patch. It also doesn't handle multiple addresses for a single peer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	20e3a18af5	gossipd: maintain a separate structure to track important peers. Rather than using a flag in reaching/peer; we make it self-contained as the next patch puts it straight into a timer callback. Also remove unused 'succeeded' field from struct peer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	8c2c1fe1c2	openingd: tell gossipd that the peer is important once funding tx in place. And on channel_fail_permanent and closing (the two places we drop to chain), we tell gossipd it's no longer important. Fixes: #1316 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	c9fa9817f6	gossipd: explicitly track which peers are important. These don't have a maximum number of reconnect attempts, and ensure that we try to reconnect when the peer dies. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Rusty Russell	b1498f07c5	gossipd: exponential backoff for reconnect (5 minute ceiling). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-26 05:47:57 +00:00
Christian Decker	b84804009a	gossip: Use the DNS seeds to look up nodes if we don't have an addr Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-25 12:34:55 +02:00
Christian Decker	c635396766	common: Moving some bech32 related utilities to bech32_util These were so far only used for bolt11 construction, but we'll need them for the DNS seed as well, so here we just pull them out into their own unit and prefix them. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-25 12:34:55 +02:00
Rusty Russell	5551c161ca	gossipd: finish startup before master prints that it's ready. We're about to remove automatic retrying of connect, and that uncovered that we actually print out our "Server started" message before we create the listening socket. Move the init higher (outside the db transaction) and make it a request/response, then loop until it's done. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-23 20:18:15 +00:00
Christian Decker	64fbea1528	gossip_store: Save local_add_channel messages and replay them Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-22 12:50:34 +02:00
Christian Decker	7497f972f1	moveonly: Move handle_local_add_channel to routing.h Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-22 12:50:34 +02:00
Christian Decker	ddbf016152	gossip: Pass rstate to handle_local_add_channel directly Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-22 12:50:34 +02:00
conanoc	7170521895	change spaces to tabs, align function parameters	2018-04-21 15:55:00 +02:00
conanoc	0733770559	Adjust indents	2018-04-21 15:55:00 +02:00
Rusty Russell	b0c2e3cd5c	gossipd: use a separate CSV file for the gossip_store types. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	57b38cac71	gossip_store: empty, don't truncate, on error. Christian points out that we don't get spend notifications for old channels if we truncate the store. We'd need more work to do this, either validating the channels are still unspent, or replaying old blocks from the truncation point. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	d5767fb3bb	gossipd: print stats even if we truncate store. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	2b8293c9f6	gossipd: don't use pwrite, better error messaging on init. Since we open with O_APPEND, any write() will append as we want it to. But we want to distinguish a new store creation from a truncation due to bad version. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	7d0a76c533	gossipd: make store load truncate on errors. We don't need pread, we just need read, and we can loop internally. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	3e1b584e73	gossipd: always add message internally before store. If something goes (fatally) wrong, we won't add it to the store. This reveals a latent bug in routing_add_channel_announcement() and friend which did a take() on msg, which it doesn't own. TAKES means that it will take ownership IF the caller requests, not an unconditional ownership transfer (which is an antipattern). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	abbbfac8e2	gossipd: return bool from message announce routines. Now we can tell if they fail, so we can respond appropriately if we're loading from the store. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	e8a052eb6d	routing: add more debugging to announcement replaced fail. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	30c1ab424f	gossipd: reorder handle_node_announcement I found the logic a bit confusing, so this reworks to bunch the "no node" cases together. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
Rusty Russell	4aca909acb	routing: don't store node_announce unannounced nodes. We enter nodes in the map when we create channels, but those channels could be local and unannounced. This triggered a failure in test_gossip_persistence since the store truncated when it saw the first thing was a node_announce. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-11 15:58:18 +02:00
ZmnSCPxj	86290b54d4	routing: Use 64-bit msatoshi for messages to and from routing. Internally both payment and routing use 64-bit, but the interface between them used 32-bit. Since both components already support 64-bit we should use that.	2018-04-09 20:45:26 +02:00
Christian Decker	a121b7dbc3	gossip: Make gossipd less noisy when receiving requests This is very noisy when syncing with the blockchain Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-09 00:21:20 +00:00
Christian Decker	2de7f622cb	gossip: Add an explicit debug message when handing back a peer Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-04-09 00:21:20 +00:00
practicalswift	693d6fddab	Adjust loglevel for error message "Failed to get peername for incoming conn"	2018-04-03 14:05:27 +02:00
Rusty Russell	1a4a59d221	common/daemon: common routines for all daemons. In particular, the main daemon and subdaemons share the backtrace code, with hooks for logging. The daemon hook inserts the io_poll override, which means we no longer need io_debug.[ch]. Though most daemons don't need it, they still link against ccan/io, so it's harmless (suggested by @ZmnSCPxj). This was tested manually to make sure we get backtraces still. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-03 14:03:28 +02:00
Rusty Russell	20bbd92564	utils: add subdaemon_shutdown() to consolidate subdaemon cleanup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2018-04-03 14:03:28 +02:00
Christian Decker	63f22d70b5	gossip: Store channel deletions so we don't re-add them on restart If we only remember the actions that added channels then we'd restore them when re-reading the gossip_store, so put a tombstone in there to remember to delete it. These will be cleared upon re-writing the store since the announcements won't be written anymore. Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-03-30 16:35:00 +02:00
Christian Decker	9132a097b5	gossip: Free the channel when notified of its funding being spent Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-03-30 16:35:00 +02:00
Christian Decker	5571f2143e	gossip: Added message to notify gossipd of outpoint spends Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-03-30 16:35:00 +02:00
Christian Decker	0e0ad1aa4d	gossip: Check that we have a node before applying changes This was a tricky one to find, it turns out that some nodes are sending node_announcements even if they don't have a channel announced yet. If they are a peer and the channel is currently verifying then we'll have a local channel in the network view, hence accept the node_announcement, but when replaying, the node_announcement will be replayed and we won't have a channel yet. This just skips node_announcements, which is always safe. Reported-by: @laszlohanyecz Signed-off-by: Christian Decker <decker.christian@gmail.com>	2018-03-29 23:15:33 +02:00
practicalswift	7e9750ffee	Reduce variable scopes	2018-03-26 01:31:21 +00:00

997 Commits