Commit Graph

868 Commits

Author SHA1 Message Date
Rusty Russell
bad9734dc7 gossip_store: remove redundant copy_message.
The single caller can easily use transfer_store_msg instead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
5591c0b5d8 gossipd: don't send gossip stream, let per-peer daemons read it themselves.
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project.  But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.

This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.

We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
4399faf57c gossipd: make writes to gossip_store atomic.
There's a corner case where otherwise a reader could see the header and
not the body of a message.  It could handle that in various ways,
but simplest (and most efficient) is to avoid it happening.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
a5f6ef385a gossipd: don't wrap messages when we send them to the peer.
They already send *us* gossip messages, so they have to be distinct anyway.
Why make us both do extra work?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
df00f20e4a gossipd: erase old entries from the store, don't just append.
We use the high bit of the length field: this way we can still check
that the checksums are valid on deleted fields.

Once this is done, serially reading the gossip_store file will result
in a complete, ordered, minimal gossip broadcast.  Also, the horrible
corner case where we might try to delete things from the store during
load time is completely gone: we only load non-deleted things.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
696dc6b597 gossipd: disable gossip_store upgrade.
We're about to bump version again, and the code to upgrade it was
quite hairy (and buggy!).  It's not worthwhile for such a
poorly-tested path: I will just add code to limit how much incoming
gossip we get to avoid flooding when we upgrade, however.

I also use a modern gossip_store version in our test_gossip_store_load
test, instead of relying on the upgrade path.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
43f2cbd250 gossipd: track gossip_store locations of local channels.
We currently don't care, but the next patch means we have to find them
again.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
180a552fba gossip_store: mark private updates separately from normal ones.
They're really gossipd-internal, and we don't want per-peer daemons
to confuse them with normal updates.

I don't bump the gossip_store version; that's coming with another update
anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
Rusty Russell
763697eb4c gossipd: fix gossip_store calling delete.
Now we handle node_announcements properly, we have a failure case where we
try to move them when a channel is deleted while loading the store.

We're going to remove this soon, in favor of in-place delete, so
workaround this for now to avoid an assert() when we try to write to
the store while loading.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-03 11:04:25 -07:00
Rusty Russell
21fe518513 gossip_store: fix 'bad node_announcement' by allowing node_announcement on un-updated channel.
When we first receive a channel_update, we write both the
channel_announcement and that channel_update to the store: we need
that first update so we can set the channel_announcement timestamp.

However, the channel_update can be replaced later.  This means we can
have a channel_announcement, a node_update which relies on it, then
the channel_update later.

So move the "this applies to a pending announcement" check lower, where
gossip_store can use it too.  Has a nice side-effect of avoiding
one lookup of the node id.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-03 11:04:25 -07:00
Rusty Russell
c233fc5063 gossipd: fix spurious unused error with gcc-9 -O3.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-03 00:07:11 +00:00
Rusty Russell
c091a4ee40 gossipd: fix spurious gcc warning.
It turns out that we don't look at type when we return 0, but gcc isn't
quite smart enough for that.  Initializing to -1 is good practice anyway
for the failure path.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-03 00:07:11 +00:00
William Casarin
3f035cb3cc gossipd: fix uninitialized free on short_route in goto path
Fix a path where tal_free is called on an uninitialized variable

If the first `goto bad_total` executes, then that path has
uninitialized `short_route` but bad_total passes through to `out`
whose first call is tal_free(short_route).

This was noticed by a maybe-uninitialized heuristic on gcc 7.4.0:

gossipd/routing.c: In function ‘find_shorter_route’:
gossipd/routing.c:1096:2: error: ‘short_route’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
  tal_free(short_route);

Reported-by: @ZmnSCPxj <https://github.com/ElementsProject/lightning/pull/2674#issuecomment-495617253>
Signed-off-by: William Casarin <jb55@jb55.com>
2019-06-03 00:07:11 +00:00
Rusty Russell
654e89b5fc gossipd: free channels in routing_state destructor.
Cleans up the tests.

Suggested-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-22 11:28:44 +00:00
Rusty Russell
d1f43d993a gossipd: use explicit destructor for struct chan.
Each destructor2 costs 40 bytes, and struct chan is only 120 bytes.  So
this drops our memory usage quite a bit:

MCP bench results change:
   -vsz_kb:580004-580016(580006+/-4.8)
   +vsz_kb:533148

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-22 11:28:44 +00:00
Rusty Russell
59e75f1b2c gossipd: reply to large listchannels in parts.
This has two effects: most importantly, it avoids the problem where
lightningd creates a 800MB JSON blob in response to listchannels,
which causes OOM on the Raspberry Pi (our previous max allocation was
832MB).  This is because lightning-cli can start draining the JSON
while we're filling the buffer, so we end up with a max allocation of
68MB.

But despite being less efficient (multiple queries to gossipd), it
actually speeds things up due to the parallelism:

MCP with -O3 -flto before vs after:
-listchannels_sec:8.980000-9.330000(9.206+/-0.14)
+listchannels_sec:7.500000-7.830000(7.656+/-0.11)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-22 11:28:44 +00:00
Rusty Russell
cb9c44ef27 gossipd: remove unnecessary dev_unknown_channel_satoshis arg.
We now have a test blockchain for MCP which has the correct channels,
so this is not needed.

Also fix a benchmark script bug where 'mv "$DIR"/log
"$DIR"/log.old.$$' would fail if you log didn't exist from a previous run.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-22 11:28:44 +00:00
Rusty Russell
85d8848ede gossipd: neaten insert_broadcast a little.
Suggested-by: @cdecker.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-22 11:28:44 +00:00
darosior
d9db9dc1ae gossipd: fix listnodes crash on non existing id
'node_arr' was not instanciated if an id was passed to listnodes and we could not get a node from it
2019-05-16 19:30:10 +02:00
Rusty Russell
f5a218f9d1 gossipd: send per-peer daemons offsets into gossip store.
Instead of reading the store ourselves, we can just send them an
offset.  This saves gossipd a lot of work, putting it where it belongs
(in the daemon responsible for the specific peer).

MCP bench results:
   store_load_msec:28509-31001(29206.6+/-9.4e+02)
   vsz_kb:580004-580016(580006+/-4.8)
   store_rewrite_sec:11.640000-12.730000(11.908+/-0.41)
   listnodes_sec:1.790000-1.880000(1.83+/-0.032)
   listchannels_sec:21.180000-21.950000(21.476+/-0.27)
   routing_sec:2.210000-11.160000(7.126+/-3.1)
   peer_write_all_sec:36.270000-41.200000(38.168+/-1.9)

Signficant savings in streaming gossip:
   -peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
   +peer_write_all_sec:35.780000-37.980000(36.43+/-0.81)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
0e37ac2433 common: move gossip_store read routine where subdaemons can access it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
d8db4e871f gossipd: provide new fd to per-peer daemons when we compact it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
13717c6ebb gossipd: hand a gossip_store_fd to all subdaemons.
This will let them read from the gossip store directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
89291b930e gossipd: pass amount into gossip_store, rather than having it fetch.
We need to store the channel capacity for channel_announcement: hand it
in directly rather than having the gossip_store code do a lookup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
7ede5aac31 gossip_store: change format so we store raw messages.
Save some overhead, plus gets us ready for giving subdaemons direct
store access.  This is the first time we *upgrade* the gossip_store,
rather than just discarding.

The downside is that we need to add an extra message after each
channel_announcement, containing the channel capacity.

After:
  store_load_msec:28337-30288(28975+/-7.4e+02)
  vsz_kb:582304-582316(582306+/-4.8)
  store_rewrite_sec:11.240000-11.800000(11.55+/-0.21)
  listnodes_sec:1.800000-1.880000(1.84+/-0.028)
  listchannels_sec:22.690000-26.260000(23.878+/-1.3)
  routing_sec:2.280000-9.570000(6.842+/-2.8)
  peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)

Differences:
  -vsz_kb:582320
  +vsz_kb:582316
  -listnodes_sec:2.100000-2.170000(2.118+/-0.026)
  +listnodes_sec:1.800000-1.880000(1.84+/-0.028)
  -peer_write_all_sec:51.600000-52.550000(52.188+/-0.34)
  +peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell
c7034f271a gossipd: avoid tal overhead in listnodes
We know exactly how many there will be, so allocate an entire array up-front.

  -listnodes_sec:2.540000-2.610000(2.584+/-0.029)
  +listnodes_sec:2.100000-2.170000(2.118+/-0.026)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
trueptolemy
fefe7dfbab Gossipd: cleanup extra repeated code 2019-05-06 08:52:36 +00:00
Rusty Russell
0ca0db765a gossipd: fix crash if we truncate store.
Entries we've already loaded expect to exist in the store.  We could go
back and remove them all, but instead just truncate at the known-good
point.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-01 11:59:12 +02:00
Rusty Russell
b248bb155a tools/bench-gossipd.sh: make it work (where possible) with DEVELOPER=0
Some tests require dev support, but the rest can run.  We simplify
the gossip_store output so it's the same in non-dev mode too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-24 13:46:39 -05:00
Rusty Russell
0fc42415c2 gossipd/routing: remove BFG implementation.
Now we can benchmark, and remove 500 bytes per node.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35093-37907(36146+/-1.1e+03)
	vsz_kb:555168
	store_rewrite_sec:12.120000-13.750000(12.7+/-0.6)
	listnodes_sec:1.270000-1.370000(1.322+/-0.039)
	listchannels_sec:29.770000-31.600000(30.82+/-0.64)
	routing_sec:0.00
	peer_write_all_sec:63.630000-67.850000(65.432+/-1.7)

MCP notable changes from pre-Dijkstra (>1 stddev):
	-vsz_kb:577456
	+vsz_kb:555168
	-routing_sec:60.70
	+routing_sec:12.04

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
cfdb012b30 gossipd: re-add fuzz logic to routing.
Do it inside the can_reach() function, which is less optimal for BFG
which does 20 ops on the same channel, but fine for Dijkstra.

This does have a measurable cost, so we might want to use
non-cryptographic fuzz in future:

$ gossipd/test/run-bench-find_route 100000 100:

Before:
	100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route)

After:
	100 (100 succeeded) routes in 100000 nodes in 113381 msec (1133813412 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
e197956032 gossipd/routing: Iterate on Dijkstra when route is too long.
If a route is too long, we try to bias Dijkstra towards choosing a
shorter route by adding a per-hop cost.  We do a naive "shortest path"
pass, then using that cost as a ceiling on per-hop cost, we do a
binary search.

There are some subtleties: we use risk rather than total as our
counter field (we normally bias this by 1 anyway, so it's easy to make
that a variable), and we set riskfactor to a mimimal value once we're
iterating.  It's good enough to get a solution, we don't need to do a
2-dimensional search on riskfactor and riskbias.

Of course, this is extremely slow if we hit it on our benchmark,
though it doesn't happen in a more realistic network:

$ gossipd/test/run-bench-find_route 100000 100:

Before:
	100 (79 succeeded) routes in 100000 nodes in 25341 msec (253412314 nanoseconds per route)

After:
	100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
f8ffae837d gossipd: speed Dijkstra a little.
Our uintmap can be a little slow with all the reallocation, so leave
NULL entries and walk to find the first one.  Since we don't clean
them up, keep a cache of where the min non-all-NULL value is in the
heap.

It's clearer benefit on really large tests, so here's 1M nodes:

Comparison using gossipd/test/run-bench-find_route 1000000 10:

Before:
	10 (10 succeeded) routes in 1000000 nodes in 91995 msec (9199532898 nanoseconds per route)

After:
	10 (10 succeeded) routes in 1000000 nodes in 20605 msec (2060539287 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
7caa37f0f1 gossipd: implement Dijkstra.
Use a uintmap as our minheap.

Note that Dijkstra can give overlength routes, so some checks are disabled.

Comparison using gossipd/test/run-bench-find_route 100000 10:

Before:
	10 (10 succeeded) routes in 100000 nodes in 120087 msec (12008708402 nanoseconds per route)
After:
	10 (10 succeeded) routes in 100000 nodes in 2269 msec (226925462 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
4d84a436f5 gossipd: temporarily disable fuzz in routing.
This allows precise comparison between Dijkstra and Bellman-Ford without
worrying about fuzz.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
594af8049b gossipd: extract common functionality.
This will be needed by Dijkstra as well.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell
6dfa46d65a gossipd/test: add test for handling overlong routes.
This is a weakness with Dijkstra, so write an explicit unit test that
we can find a short enough (but more expensive) route.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
trueptolemy
77236caa91 gossipd: fix the check for node announcement in broadcast_state_check()
There should check if node_id_1 was stored in pubkeys, other than checking scid.
2019-04-16 00:20:26 +00:00
trueptolemy
274f156b28 gossiped: rename empty_node_map() to new_node_map()
empty_node_map() sounds like a destructor. new_node_map() makes sense and is better.
2019-04-14 23:12:00 +00:00
trueptolemy
ee036a2e36 Gossipd: change the pending_cannouncement list to htable 2019-04-14 05:39:31 +00:00
Rusty Russell
261921dee2 gossipd: adjust peers' broadcast_offset when compacting store.
When we compact the store, we need to adjust the broadast index for
peers so they know where they're up to.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
fdb42c3170 gossipd: don't keep channel_updates in memory.
This requires some trickiness when we want to re-add unannounced channels
to the store after compaction, so we extract a common "copy_message" to
transfer from old store to new.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:36034-37853(37109.8+/-5.9e+02)
	vsz_kb:577456
	store_rewrite_sec:12.490000-13.250000(12.862+/-0.27)
	listnodes_sec:1.250000-1.480000(1.364+/-0.09)
	listchannels_sec:30.820000-31.480000(31.068+/-0.24)
	routing_sec:26.940000-27.990000(27.616+/-0.39)
	peer_write_all_sec:65.690000-68.600000(66.698+/-0.99)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1202316
	+vsz_kb:577456

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
0370ed2eca gossipd: use pread in the store.
The next patch causes us to access the store while loading (we read
channel_updates for local peers), which messes up loading due to the
lseek involved.

Using pread() is atomic with seek & read, and also a bit more
efficient.  Make the header contiguous too, while we're here.

We don't need pwrite: we always open with O_APPEND which means the
seek-to-end is implicit.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:36771-38289(37529.6+/-5.3e+02)
	vsz_kb:1202316
	store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
	listnodes_sec:1.240000-1.410000(1.34+/-0.058)
	listchannels_sec:29.850000-31.840000(30.908+/-0.69)
	routing_sec:27.800000-31.790000(28.822+/-1.5)
	peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:39207-45089(41374.6+/-2.2e+03)
	+store_load_msec:36771-38289(37529.6+/-5.3e+02)
	-store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
	+store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
	-peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)
	+peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
2135c7a024 gossipd: allow reading from the store during load.
When we no longer keep channel_updates in memory, there's a path where
we access them on load: when we promote a local channel to an
announced channel.

This breaks at the moment, since gs->fd == -1; change it to a writable
flag instead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
aeb72a05e3 gossipd: remove some fields from struct chan.
The txout_script field is unused; the local_disable only applies to
the handful of local channels, so move that into a hash table.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:39207-45089(41374.6+/-2.2e+03)
	vsz_kb:1202316
	store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
	listnodes_sec:1.290000-3.790000(1.938+/-0.93)
	listchannels_sec:30.190000-32.120000(31.31+/-0.69)
	routing_sec:28.220000-31.340000(29.314+/-1.2)
	peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:35107-37944(36686+/-1e+03)
	+store_load_msec:39207-45089(41374.6+/-2.2e+03)
	-vsz_kb:1218036
	+vsz_kb:1202316
	-listchannels_sec:28.510000-30.270000(29.6+/-0.6)
	+listchannels_sec:30.190000-32.120000(31.31+/-0.69)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
3280466e19 gossipd: don't keep channel_announcement messages in memory.
MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35107-37944(36686+/-1e+03)
	vsz_kb:1218036
	store_rewrite_sec:14.060000-17.970000(15.966+/-1.6)
	listnodes_sec:1.270000-1.350000(1.314+/-0.034)
	listchannels_sec:28.510000-30.270000(29.6+/-0.6)
	routing_sec:30.230000-31.510000(30.83+/-0.44)
	peer_write_all_sec:67.390000-70.710000(68.568+/-1.2)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1780516
	+vsz_kb:1218036

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
2fd4a0121f gossipd: unify is_chan_public / is_chan_announced.
We used to have a `struct chan` while we're waiting for an update; now we
keep that internally.  So a `struct chan` without a channel_announcement
in the store is private, and other is public.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
aafc489edb gossipd: remove info fields from struct node.
Reload them from disk if they do listnodes.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35390-38659(37336.4+/-1.3e+03)
	vsz_kb:1780516
	store_rewrite_sec:13.800000-16.800000(15.02+/-0.98)
	listnodes_sec:1.280000-1.530000(1.382+/-0.096)
	listchannels_sec:28.700000-30.440000(29.34+/-0.68)
	routing_sec:30.120000-31.080000(30.526+/-0.35)
	peer_write_all_sec:65.910000-76.850000(69.462+/-4.1)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1792996
	+vsz_kb:1780516
	-listnodes_sec:1.030000-1.120000(1.068+/-0.032)
	+listnodes_sec:1.280000-1.530000(1.382+/-0.096)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
0608c36301 gossipd: don't keep node_announcement messages in memory.
MCP results from 5 runs, min-max(mean +/- stddev):
store_load_msec:34779-38628(36903.4+/-1.4e+03)
vsz_kb:1792996
store_rewrite_sec:14.440000-15.040000(14.672+/-0.24)
listnodes_sec:1.030000-1.120000(1.068+/-0.032)
listchannels_sec:27.860000-32.850000(30.05+/-1.7)
routing_sec:30.020000-31.700000(31.044+/-0.56)
peer_write_all_sec:65.100000-70.600000(68.422+/-2)

-vsz_kb:1780516
+vsz_kb:1792996
-listnodes_sec:1.280000-1.530000(1.382+/-0.096)
+listnodes_sec:1.030000-1.120000(1.068+/-0.032)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:30640-33236(32202+/-8.7e+02)
	+store_load_msec:34779-38628(36903.4+/-1.4e+03)
	-vsz_kb:1812956
	+vsz_kb:1792996
	-listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	+listnodes_sec:1.030000-1.120000(1.068+/-0.032)
	-peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)
	+peer_write_all_sec:65.100000-70.600000(68.422+/-2)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
cb297b0a1b gossipd: free tmpctx children in gossip_store_load loop.
We're accumulating children, and we'll get more in the successive
patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
3ef767fd52 gossipd: don't use cached node_announcement for redundancy checking
Re-parse the existing message, since we'e going to get rid of those
fields.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
e02f5817fe gossipd: don't create struct chan for yet-to-be-updated channels.
We currently create a struct chan when we receive a `channel_announcement`,
but we can only broadcast once we have a `channel_update` (since that
provides the timestamp).

This means a `struct chan` can be in a weird state where it exists,
but is unusable (can't use without an update), and also means we need to
keep the channel_announcement message around until an update arrives, so
we can put it in the gossip_store.

Instead, keep track of these "unupdated" channels separately, and check
for them in all the places we search for a specific channel to update.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:30640-33236(32202+/-8.7e+02)
	vsz_kb:1812956
	store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
	listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	listchannels_sec:28.140000-29.560000(28.816+/-0.56)
	routing_sec:29.530000-32.590000(30.352+/-1.1)
	peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1812904
	+vsz_kb:1812956
	-store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	+store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
	-listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	+listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	-listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	+listchannels_sec:28.140000-29.560000(28.816+/-0.56)
	-routing_sec:45.080000-48.160000(46.814+/-1.1)
	+routing_sec:29.530000-32.590000(30.352+/-1.1)
	-peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)
	+peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
d8aee68ba8 gossipd: handle duplicate nodes from unverified channel_announces properly.
If we have a channel_announcement, we catch any node_announcement for
either end while we validate the channel_announcement.  But if we have
multiple channel_announcements and the first one failed to verify, it
would remove this catch, meaning we'd discard following node_announcements
even though there was a pending channel_announcement.

The answer is to use a simple reference count, and as a further
optimization, only place the `pending_node_announce` if there's no
node already.

We also move the process_pending_node_announcement() calls lower down,
so *any* new channel creation checks it.  This is more robust, and
will prove useful for the next patch, where we can use the same
mechanism to handle node_announcements on channel_announcements which
are verified, but don't yet have a channel_update.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
da884751e8 gossipd: make routing_add_channel_update discard old timestamps.
This is currently done higher up, in handle_channel_update(), but
that's one reason why handle_channel_update() has to do a channel
lookup.  Moving the check down means handle_channel_update() can do a
minimal "get node id for this channel" so it can check the signature.

This helps, because the chan lookup semantics are changing in the next
few patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
6b9069ee28 broadcast: don't keep payload pointer.
If we need the payload, pull it from the gossip store.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:30189-52561(39416.4+/-8.8e+03)
	vsz_kb:1812904
	store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	routing_sec:45.080000-48.160000(46.814+/-1.1)
	peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:2288784
	+vsz_kb:1812904
	-store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
	+store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	-listnodes_sec:0.750000-0.850000(0.794+/-0.042)
	+listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	-listchannels_sec:30.740000-31.760000(31.096+/-0.35)
	+listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	-routing_sec:29.600000-33.560000(30.472+/-1.5)
	+routing_sec:45.080000-48.160000(46.814+/-1.1)
	-peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)
	+peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
da845b660b gossipd: gossip_store_get() to load a single store entry.
This will allow us to load on demand, and not keep all messages in
memory.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
1f08cfb3e3 gossipd: use file offset within store as broadcast index.
Instead of an arbitrary counter, we can use the file offset for our
partial ordering, removing a field.  It takes some care when we compact
the store, however, as this field changes.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:34271-35283(34789.6+/-3.3e+02)
	vsz_kb:2288784
	store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
	listnodes_sec:0.750000-0.850000(0.794+/-0.042)
	listchannels_sec:30.740000-31.760000(31.096+/-0.35)
	routing_sec:29.600000-33.560000(30.472+/-1.5)
	peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:35685-38538(37090.4+/-9.1e+02)
	+store_load_msec:34271-35283(34789.6+/-3.3e+02)
	-vsz_kb:2288768
	+vsz_kb:2288784
	-peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)
	+peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
ec50ec6a71 gossipd: make gossip loading stats accurate.
They didn't count the header sizes when reporting bytes, which is
misleading.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
eb4564c3cd gossipd: embed broadcast information into each structure.
This is more compact, but also required once we replace the arbitrary
"index" with an actual offset into the gossip store.  That will let us
remove the in-memory variants entirely.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35685-38538(37090.4+/-9.1e+02)
	vsz_kb:2288768
	store_rewrite_sec:35.530000-41.230000(37.904+/-2.3)
	listnodes_sec:0.720000-0.810000(0.762+/-0.041)
	listchannels_sec:30.750000-35.990000(32.704+/-2)
	routing_sec:29.570000-34.010000(31.374+/-1.8)
	peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:2621808
	+vsz_kb:2288768

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
62918fcb3b gossip_store: avoid gratuitous copy on load.
Doesn't make measurable difference, but an obvious optimization.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
617c23e735 gossipd: use u32 for timestamp.
We used an s64 so we could use -1 and save a check, but that's just
silly as we have adjacent non-u64 fields: wastes 7 bytes per node
and 16 per channel.

Interestingly, this seemed to make us a little slower for some reason.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35569-38776(37169.8+/-1.2e+03)
	vsz_kb:2621808
	store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
	listnodes_sec:0.740000-0.800000(0.768+/-0.023)
	listchannels_sec:29.820000-32.730000(30.972+/-0.99)
	routing_sec:30.110000-30.590000(30.346+/-0.18)
	peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:32825-36365(34615.6+/-1.1e+03)
	+store_load_msec:35569-38776(37169.8+/-1.2e+03)
	-vsz_kb:2637488
	+vsz_kb:2621808
	-store_rewrite_sec:35.150000-36.200000(35.59+/-0.4)
	+store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
	-listnodes_sec:0.590000-0.710000(0.682+/-0.046)
	+listnodes_sec:0.740000-0.800000(0.768+/-0.023)
	-peer_write_all_sec:49.020000-52.890000(50.376+/-1.5)
	+peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell
0b484b111e gossipd: make more compact getchannels entries.
We can save significant space by combining both sides: so much that we
can reduce the WIRE_LEN_LIMIT to something sane again.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:34467-36764(35517.8+/-7.7e+02)
	vsz_kb:2637488
	store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)
	listnodes_sec:1.140000-2.780000(1.596+/-0.6)
	listchannels_sec:55.390000-58.110000(56.998+/-0.99)
	routing_sec:30.330000-30.920000(30.642+/-0.19)
	peer_write_all_sec:50.640000-53.360000(51.822+/-0.91)

MCP notable changes from previous patch (>1 stddev):
	-store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
	+store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell
91849dddc4 wire: use struct node_id for node ids.
Don't turn them to/from pubkeys implicitly.  This means nodeids in the store
don't get converted, but bitcoin keys still do.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:33934-35251(34531.4+/-5e+02)
	vsz_kb:2637488
	store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
	listnodes_sec:1.020000-1.290000(1.146+/-0.086)
	listchannels_sec:51.110000-58.240000(54.826+/-2.5)
	routing_sec:30.000000-33.320000(30.726+/-1.3)
	peer_write_all_sec:50.370000-52.970000(51.646+/-1.1)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:46184-47474(46673.4+/-4.5e+02)
	+store_load_msec:33934-35251(34531.4+/-5e+02)
	-vsz_kb:2638880
	+vsz_kb:2637488
	-store_rewrite_sec:46.750000-48.280000(47.512+/-0.51)
	+store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell
a2fa699e0e Use node_id everywhere for nodes.
I tried to just do gossipd, but it was uncontainable, so this ended up being
a complete sweep.

We didn't get much space saving in gossipd, even though we should save
24 bytes per node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell
d4ab0592c5 fixup! gossipd: use simple inline array for nodes with few channels.
Suggested-by: @cdecker
Suggested-by: @niftynei
2019-04-09 12:37:16 -07:00
Rusty Russell
b6494c1994 gossipd: use simple inline array for nodes with few channels.
Allocating a htable is overkill for most nodes; we can fit 11 pointers
in the same space (10, since we use 1 to indicate we're using an array).

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:45947-47016(46683.4+/-4e+02)
	vsz_kb:2639240
	store_rewrite_sec:46.950000-49.830000(48.048+/-0.95)
	listnodes_sec:1.090000-1.350000(1.196+/-0.095)
	listchannels_sec:48.960000-57.640000(53.358+/-2.8)
	routing_sec:29.990000-33.880000(31.088+/-1.4)
	peer_write_all_sec:49.360000-53.210000(51.338+/-1.4)

MCP notable changes from previous patch (>1 stddev):
-	vsz_kb:2641316
+	vsz_kb:2639240

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell
417e1bab7d gossipd: use iterator helpers for iterating node channels.
Makes the next step easier.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:45791-46917(46330.4+/-3.6e+02)
	vsz_kb:2641316
	store_rewrite_sec:47.040000-48.720000(47.684+/-0.57)
	listnodes_sec:1.140000-1.340000(1.2+/-0.072)
	listchannels_sec:50.970000-54.250000(52.698+/-1.3)
	routing_sec:29.950000-31.010000(30.332+/-0.37)
	peer_write_all_sec:51.570000-52.970000(52.1+/-0.54)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell
891ee20a59 tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project
Outputs CSV.  We add some stats for load times in developer mode, so we can
easily read them out.

peer_read_all_sec doesn't work, since we seem to reject about half the
updates for having bad signatures.  It's also very slow...

routing fails, for unknown reasons, so that failure is ignored in routing_sec.

Results from 5 runs, min-max(mean +/- stddev):
	store_load_msec,vsz_kb,store_rewrite_sec,listnodes_sec,listchannels_sec,routing_sec,peer_write_all_sec
	39275-44779(40466.8+/-2.2e+03),2899248,41.010000-44.970000(41.972+/-1.5),2.280000-2.350000(2.304+/-0.025),49.770000-63.390000(59.178+/-5),33.310000-34.260000(33.62+/-0.35),42.100000-44.080000(43.082+/-0.67)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-2.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

Suggested-by: @niftynei



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-1.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

MCP filename change.



Header from folded patch 'tools-bench-gossipd.sh__dont_print_csv_by_default.patch':

tools/bench-gossipd.sh: don't print CSV by default.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

Make shellcheck happy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
2bd7df93c6 gossipd: preserve unannounced channels across store compaction.
Otherwise we'd forget them on restart, again.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
c424c42668 gossipd: store local channel updates across restart, even if unannounced.
Either private or simply not enough confirms.  They would have been added
on reconnect, but that's not ideal.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
7c8f506a0f dev-compact-store-gossip: specific RPC so we can test gossip_store rewrite.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
5b12007a4f gossipd: dev option to allow unknown channels.
This lets us benchmark without a valid blockchain.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_gossipd__dev_option_to_allow_unknown_channels.patch':

fixup! gossipd: dev option to allow unknown channels.

Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
f8f6533dba dev: --dev-gossip-time so gossipd doesn't prune old data.
This is useful for canned data, such as the million channels project.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell
b2c93beaed gossipd: use htable instead of simple array for node's channels.
For giant nodes, it seems we spend a lot of time memmoving this array.
Normally we'd go for a linked list, but that's actually hard: each
channel has two nodes, so needs two embedded list pointers, and when
iterating there's no good way to figure out which embedded pointer
we'd be using.

So we (ab)use htable; we don't really need an index, but it's good for
cache-friendly iteration (our main operation).  We can actually change
to a hybrid later to avoid the extra allocation for small nodes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Christian Decker
f3c234529e gossip: Cache txout query failures
If we asked `bitcoind` for a txout and it failed we were not storing that
information anywhere, meaning that when we see the channel announcement the
next time we'd be reaching out to `lightningd` and `bitcoind` again, just to
see it fail again. This adds an in-memory cache for these failures so we can
just ignore these the next time around.

Fixes #2503

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2019-04-01 23:54:19 +00:00
Christian Decker
426b22fdcb gossip: Bump gossip_getnodes_reply result count to be u32 as well
Otherwise we'll just have the same issue once we reach 65k nodes.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2019-03-27 12:48:52 +01:00
Christian Decker
25e829c7d1 gossip: Make the listchannels reply result count a u32
Fixes #2504

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Antoine Le Calvez <@alecalve>
2019-03-27 12:48:52 +01:00
Rusty Russell
00f3a84af2 test: fix thinko in gossipd/test/run-bench-find_route.c
Reported-by: @cdecker
Fixes: #2440
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-03-05 11:42:43 +01:00
Rusty Russell
38e7d19dd5 Makefile: check for direct amount_sat/amount_msat access.
We need to do it in various places, but we shouldn't do it lightly:
the primitives are there to help us get overflow handling correct.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 08:01:37 +00:00
Rusty Russell
28f5da7b2f tools/generate-wire: use amount_msat / amount_sat for peer protocol.
Basically we tell it that every field ending in '_msat' is a struct
amount_msat, and 'satoshis' is an amount_sat.  The exceptions are
channel_update's fee_base_msat which is a u32, and
final_incorrect_htlc_amount's incoming_htlc_amt which is also a
'struct amount_msat'.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 08:01:37 +00:00
Rusty Russell
3ac0e814d0 daemons: use amount_msat/amount_sat in all internal wire transfers.
As a side-effect of using amount_msat in gossipd/routing.c, we explicitly
handle overflows and don't need to pre-prune ridiculous-fee channels.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 08:01:37 +00:00
Rusty Russell
85b8b25749 bitcoin/chainparams: use amount_sat / amount_msat
Simple changes, but ripples through the code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 08:01:37 +00:00
Rusty Russell
83adb94583 lightningd and routing: use struct amount_msat.
We use it in route_hop, and paper over it in the JSON APIs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 03:44:44 +00:00
Rusty Russell
7fad7bccba common/amount: new types struct amount_msat and struct amount_sat.
They're generally used pass-by-copy (unusual for C structs, but
convenient they're basically u64) and all possibly problematic
operations return WARN_UNUSED_RESULT bool to make you handle the
over/underflow cases.

The new #include in json.h means we bolt11.c sees the amount.h definition
of MSAT_PER_BTC, so delete its local version.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-21 00:44:57 +00:00
Michael Schmoock
302a78f4eb fix: add inline exception for recent cppcheck false positive 2019-02-18 01:06:01 +00:00
Rusty Russell
b99293fbb6 short_channel_id: don't accept :-separated in JSON if --allow-deprecated-apis=false
We need to still accept it when parsing the database, but this flag
should allow upgrade testing for devs building on top

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-08 16:52:30 -08:00
Rusty Russell
3ae0c20026 getroute: change definition (and pay default) for riskfactor.
Up until now, riskfactor was useless due to implementation bugs, and
also the default setting is wrong (too low to have an effect on
reasonable payment scenarios).

Let's simplify the definition (by assuming that P(failure) of a node
is 1), to make it a simple percentage.  I examined the current network
fees to see what would work, and under this definition, a default of
10 seems reasonable (equivalent to 1000 under the old definition).

It is *this* change which finally fixes our test case!  The riskfactor
is now 40msat (1500000 * 14 * 10 / 5259600 = 39.9), comparable with
worst-case fuzz is 50msat (1001 * 0.05 = 50).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-06 18:39:52 +01:00
Rusty Russell
05f95b59c1 gossipd: take into account risk in final route comparison.
We were only comparing by total msatoshis.

Note, this *still* isn't sufficient to fix our indirect problem, as
our risk values are all 1 (the minimum):

	lightning_gossipd(25480): 2 hop solution: 1501990 + 2
	lightning_gossipd(25480): 3 hop solution: 1501971 + 3
	...
	lightning_gossipd(25480): => chose 3 hop solution

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-06 18:39:52 +01:00
Rusty Russell
662bb0c565 gossipd: fix riskfactor passing.
We used a u16, and a 1000 multiplier, which meant we wrapped at
riskfactor 66.  We also never undid the multiplier, so we ended up
applying 1000x the riskfactor they specified.

This changes us to pass the riskfactor with a 1M multiplier.  The next
patch changes the definition of riskfactor to be more useful.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-06 18:39:52 +01:00
Rusty Russell
6a26b0c18d gossipd: increase randomness in route selection.
We have a seed, which is for (future!) unit testing consistency.  This
makes it change every time, so our pay_direct_test is more useful.

I tried restarting the noed around the loop, but it tended to fail
rebinding to the same port for some reason?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-02-06 18:39:52 +01:00
Rusty Russell
afab1f7b3c gossipd: handle onion errors internally.
As a general rule, lightningd shouldn't parse user packets.  We move the
parsing into gossipd, and have it respond only to permanent failures.

Note that we should *not* unconditionally remove a channel on
WIRE_INVALID_ONION_HMAC, as this can be triggered (and we do!) by
feeding sendpay a route with an incorrect pubkey.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-23 22:08:08 +01:00
Rusty Russell
4eddf57fd9 gossipd: don't mark channels unroutable.
For transient failures, the pay plugin should simply exclude those
from route considerations.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-23 22:08:08 +01:00
Rusty Russell
018a3f1d58 short_channel_id: make mk_short_channel_id return a failure.
We had a bug 0ba547ee10 caused by
short_channel_id overflow.  If we'd caught this, we'd have terminated
the peer instead of crashing, so add appropriate checks.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-21 12:31:06 +01:00
Rusty Russell
e2777642c0 getroute: add direction to route returned.
We also ignore it in sendpay.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-17 13:02:24 +01:00
Rusty Russell
0ba547ee10 gossipd: handle overflowing query properly (avoid slow 100% CPU reports)
Don't do this:
  (gdb) bt
  #0  0x00007f37ae667c40 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
  #1  0x00007f37ae668b38 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
  #2  0x00007f37ae669907 in deflate () from /lib/x86_64-linux-gnu/libz.so.1
  #3  0x00007f37ae674c65 in compress2 () from /lib/x86_64-linux-gnu/libz.so.1
  #4  0x000000000040cfe3 in zencode_scids (ctx=0xc1f118, scids=0x2599bc49 "\a\325{", len=176320) at gossipd/gossipd.c:218
  #5  0x000000000040d0b3 in encode_short_channel_ids_end (encoded=0x7fff8f98d9f0, max_bytes=65490) at gossipd/gossipd.c:236
  #6  0x000000000040dd28 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=8) at gossipd/gossipd.c:576
  #7  0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=16) at gossipd/gossipd.c:595
  #8  0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=32) at gossipd/gossipd.c:596
  #9  0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=64) at gossipd/gossipd.c:595
  #10 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=128) at gossipd/gossipd.c:596
  #11 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=256) at gossipd/gossipd.c:595
  #12 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=512) at gossipd/gossipd.c:595
  #13 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=1024) at gossipd/gossipd.c:595
  #14 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2047) at gossipd/gossipd.c:596
  #15 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4095) at gossipd/gossipd.c:595
  #16 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8191) at gossipd/gossipd.c:595
  #17 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16382) at gossipd/gossipd.c:595
  #18 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=32764) at gossipd/gossipd.c:595
  #19 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=65528) at gossipd/gossipd.c:595
  #20 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=131056) at gossipd/gossipd.c:595
  #21 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=262112) at gossipd/gossipd.c:595
  #22 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=524225) at gossipd/gossipd.c:595
  #23 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=1048450) at gossipd/gossipd.c:595
  #24 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2096900) at gossipd/gossipd.c:595
  #25 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4193801) at gossipd/gossipd.c:595
  #26 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8387603) at gossipd/gossipd.c:595
  #27 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16775207) at gossipd/gossipd.c:595
  #28 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=33550414) at gossipd/gossipd.c:596
  #29 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=67100829) at gossipd/gossipd.c:595
  #30 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=134201659) at gossipd/gossipd.c:595
  #31 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=268403318) at gossipd/gossipd.c:595
  #32 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=536806636) at gossipd/gossipd.c:595
  #33 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=1073613273) at gossipd/gossipd.c:595
  #34 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=2147226547) at gossipd/gossipd.c:595
  #35 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=4294453094) at gossipd/gossipd.c:595
  #36 0x000000000040df26 in handle_query_channel_range (peer=0x3868fc8, msg=0x37e0678 "\001\ao\342\214\n\266\361\263r\301\246\242F\256c\367O\223\036\203e\341Z\b\234h\326\031") at gossipd/gossipd.c:625

The cause was that converting a block number to an scid truncates it
at 24 bits.  When we look through the index from (truncated number) to
(real end number) we get every channel, which is too large to encode,
so we iterate again.

This fixes both that problem, and also the issue that we'd end up
dividing into many empty sections until we get to the highest block
number.  Instead, we just tack the empty blocks on to then end of the
final query.

(My initial version requested 0xFFFFFFFE blocks, but the dev code
which records what blocks were returned can't make a bitmap that big
on 32 bit).

Reported-by: George Vaccaro
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 11:34:45 -08:00
Rusty Russell
9f1f79587e short_channel_id_dir: new primitive for one direction of short_channel_id
Currently only used by gossipd for channel elimination.

Also print them in canonical form (/[01]), so tests need to be
changed.

Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
80753bfbd5 Feedback from @niftynei.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
dc2ee9639b listchannels: allow source arg to list channels by their source node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
358b7fda91 getroute: allow caller to specify maximum hops.
This is required for routeboost.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
599ec5efbe gossipd: allow an array of excluded channels for getroute_request.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
be64dd84ca waitsendpay: indicate which channel direction the error was.
You can figure this yourself by knowing the route, but it's better to report
it directly here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
c0cfddfa95 test/run-bench-find_route: fix so it runs properly.
We didn't populate the channels properly so it always failed.

Additionally, somewhere along the line we kept using the single scid
so we only created one channel.

Also, the next patch will start comparing the pubkeys, so make valid
ones: use an array so we don't affect the benchmark too much.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
1567238dd9 invoice: option to expose/not-expose private channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
fe4a600bc7 routeboost: don't use channels to dead-end nodes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
547d6ab878 routeboost: expose private channel in invoice iff we have no public ones.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
f321b1d35f getroute: remove seed arg, document fromid, make default fuzzpercent match docs.
seed isn't very useful at this level: I've left it in routing.c
because it might be useful for detailed testing.  Pretty sure it's unused,
so I simply removed it.

The fuzzpercent is documented to default at 5%, but actually was 75%.
Fix that too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Rusty Russell
26dda57cc0 utils: make tal_arr_expand safer.
Christian and I both unwittingly used it in form:

	*tal_arr_expand(&x) = tal(x, ...)

Since '=' isn't a sequence point, the compiler can (and does!) cache
the value of x, handing it to tal *after* tal_arr_expand() moves it
due to tal_resize().

The new version is somewhat less convenient to use, but doesn't have
this problem, since the assignment is always evaluated after the
resize.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-01-15 12:01:38 +01:00
Christian Decker
659a26ea5a misc: Update short_channel_id representation to use 'x' separators
Reported-by: Alex Bosworth <@alexbosworth>
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2019-01-15 03:50:27 +00:00
Christian Decker
94eb2620dc bolt: Updated the BOLT specification to the latest version
This is mainly just copying over the copy-editing from the
lightning-rfc repository.

[ Split to just perform changes after the UNKNOWN_PAYMENT_HASH change --RR ]

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
2019-01-15 02:19:56 +00:00
Christian Decker
65054ae72e bolt: Updated the BOLT specification to a07dc3df3b4611989e3359f28f96c574f7822850
This is mainly just copying over the copy-editing from the
lightning-rfc repository.

[ Split to just perform changes prior to the UNKNOWN_PAYMENT_HASH change --RR ]

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
2019-01-15 02:19:56 +00:00
Rusty Russell
23540fe956 common: make funding_tx and withdraw_tx share UTXO code.
They both do the same thing: convert utxos into tx inputs.  Share code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-12-06 23:11:51 +01:00
Rusty Russell
ab735dcbe6 gossipd: wire up memleak detection.
For simplicity we dump leaks to logs, and just return a bool to master.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-22 05:15:42 +00:00
Rusty Russell
78771ca371 gossipd: mark timers as not being leaks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-22 05:15:42 +00:00
Rusty Russell
5a81dbd783 common/daemon: enable/cleanup memleak in daemon_setup / daemon_shutdown.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-22 05:15:42 +00:00
Rusty Russell
29b672b117 gossipd: hear no wumbo.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 21:43:37 +00:00
Rusty Russell
9620393109 gossipd: store chainparams internally.
We keep a chain_hash in struct daemon, becayse otherwise we end up with
`&peer->daemon->rstate->chainparams->genesis_blockhash` which is a bit
ridiculous.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 21:43:37 +00:00
Rusty Russell
5312ec1e34 gossipd: add documentation comments now it's relatively understandable.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
ea2c03e2e2 gossipd: don't have code to exit final loop; we always leave via master_gone.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
4038061d0f gossipd: use take() in getroute_req.
Trivial optimization.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
5c60d7ffb2 gossipd: split wire types into msgs from lightningd and msgs from per-peer daemons
This avoids some very ugly switch() statements which mixed the two,
but we also take the chance to rename 'towire_gossip_' to
'towire_gossipd_' for those inter-daemon messages; they're messages to
gossipd, not gossip messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
07b16e37d0 daemon_conn: don't rely on outq_empty callback telling us to retry queue.
We had at least one bug caused by it not returning true when it had
queued something.  Instead, just re-check thq queue after it's called.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
4e9eba1965 gossipd: rework query_channel_range to accept overlapping range.
We shouldn't insist on an exact reponse match: they can batch it and send
a whole batch, as long as it overlaps what we ask.

We also change to a bitmap to save some memory.

This isn't note in the CHANGELOG since we don't actually send gossip
range queries except for testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
363564301f gossipd: be more rigorous in handling peer messages vs. daemon requests.
Messages from a peer may be invalid in many ways: we send an error
packet in that case.  Rather than internally calling peer_error,
however, we make it explicit by having the handle_ functions return
NULL or an error packet.

Messages from the daemon itself should not be invalid: we log an error
and close the fd to them if it is.  Previously we logged an error but
didn't kill them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Rusty Russell
1bd76861fd gossipd: reorder functions into related groups (MOVEONLY)
It's MOVEONLY but for the removal of the '#ifndef TESTING' which was
needed for old test code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-11-21 00:36:31 +00:00
Christian Decker
8e83d43c39 opts: Split early from non-early args so plugins can register theirs
The idea is that `plugin` is an early arg that is parsed (from command
line or the config file). We can then start the plugins and have them
tell us about the options they'd like to add to the mix, before we
actually parse them.

Signed-off-by: Christian Decker <@cdecker>
2018-11-13 00:44:50 +01:00
Rusty Russell
3c97f3954e daemon_conn: make it a tal object, typesafe callbacks.
It means an extra allocation at startup, but it means we can hide the definition,
and use standard patterns (new_daemon_conn and typesafe callbacks).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-29 04:06:16 +00:00
Rusty Russell
0e6aec081a gossipd: make sure that freeing peer closes connection to it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-29 04:06:16 +00:00
Rusty Russell
689d51cba5 common/daemon_conn: remove finished function.
For the moment, caller sets it manually.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-29 04:06:16 +00:00
Rusty Russell
c236361efd wireaddr: update bolt version, remove 'padding' from addresses.
Nobody used this, so it was removed from the spec.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-28 23:51:05 +00:00
Rusty Russell
66dcba099d gossipd: hand raw pubkeys in getnodes and getchannels entries.
We spend quite a bit of time in libsecp256k1 moving them to and from
DER encoding.  With a bit of care, we can transfer the raw bytes from
gossipd and manually decode them so a malformed one can't make us
abort().

Before:
	real	0m0.629000-0.695000(0.64985+/-0.019)s

After:
	real	0m0.359000-0.433000(0.37645+/-0.023)s

At this point, the main issues are 11% of time spent in ccan/io's
backend_wake (I tried using a hash table there, but that actually makes
the small-number-of-fds case slower), and 65% of gossipd's time is
in marshalling the response (all those tal_resize add up!).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-19 22:02:11 +00:00
Rusty Russell
bbc36a7bec gossipd: update node announcement even if we change within a second.
Usually Travis triggers corner cases because it's so slow, but this
time the moons aligned, and it managed to fail test_node_reannounce
because it generated the updated node_announcement with the same
timestamp as the old one.

This is because we only updated "last_announce_timestamp" when
we generated the announcement, not when we got it off the wire or
loaded it from the gossip store.

The fix is to ask the routing code what the latest timestamp is;
we could still generate a clashing timestamp if (1) the gossip store
is lost, and (2) we restart within one second.  Hard to care.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-16 04:24:03 +00:00
lisa neigut
0ae1d03513 BOLT7: broadcast htlc_maximum_msat in `channel_update s
Have c-lightning nodes send out the largest value for
`htlc_maximum_msat` that makes sense, ie the lesser of
the peer's max_inflight_htlc value or the total channel
capacity minus the total channel reserve.
2018-10-16 03:32:27 +00:00
Rusty Russell
afac01380d gossipd: don't initialize broadcast interval, make field name explicit.
We initialize it to 30 seconds, but it's *always* overridden by the
gossip_init message (and usually to 60 seconds, so it's doubly
misleading).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-15 23:04:17 +00:00
Rusty Russell
3991425111 gossipd: don't accept forwarding short_channel_ids we don't own.
Gossipd provided a generic "get endpoints of this scid" and we only
use it in one place: to look up htlc forwards.  But lightningd just
assumed that one would be us.

Instead, provide a simpler API which only returns the peer node
if any, and now we handle it much more gracefully.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-15 23:04:17 +00:00
Rusty Russell
030fe1ce53 gossipd: don't expose private channels for routeboost.
We don't create unannouncable channels, but other implementations can.
Not only is it rude to expose these via invoices, it's probably not
useable anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-15 23:04:17 +00:00
lisa neigut
762c795c9b gossip: reject channel_update with invalid htlc_max_msat
If the channel update signals an invalid `htlc_maximum_msat` value,
we ignore the update.
2018-10-09 23:22:52 +00:00
lisa neigut
1b6bd3fded wire: add test for parsing optional version of channel_update 2018-10-09 23:22:52 +00:00
lisa neigut
a289282bad gossipd: use u64 for htlc_minimum_msat field
It's u64 in the spec, so we should use u64 too.
2018-10-09 23:22:52 +00:00
lisa neigut
b9331e5ac8 gossipd: parse and respect optional htlc_maximum_msat
If another channel has set the optional `htlc_maximum_msat` field,
we should correctly parse that field and respect it when drawing up
routes for payments.
2018-10-09 23:22:52 +00:00
Rusty Russell
de37586a97 gossipd: use riskfactor in getroute, not "1".
AFAICT, this was there in the original commit by @cdecker.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-09 08:40:52 +00:00
Rusty Russell
d946e965a6 gossipd: test that fromwire from lightningd messages succeeds.
Also tiny drive-by cleanup for gossip_disable_local_channels to modern form.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-09 08:40:52 +00:00
Rusty Russell
864812019f gossipd: use tal_arr_expand instead of open-coding it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-09 08:40:52 +00:00
Rusty Russell
915ffe35ed gossipd: clean up getnodes handling.
globalfeatures should not be accessed if we haven't received a
channel_update.  Treat it like the other fields which are only
initialized and marshalled/unmarshalled if the timestamp is positive.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-09 08:40:52 +00:00
Rusty Russell
df27fc55af More renaming of gfeatures to globalfeatures.
Use the BOLT #1 naming.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-09 08:40:52 +00:00
Rusty Russell
bb5e2ffafb gossipd: don't create redundant node_announcements.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 18:20:17 +02:00
Rusty Russell
afc92dd757 gossipd: use array[32] not pointer for alias.
And use ARRAY_SIZE() everywhere which will break compile if it's not a
literal array, plus assertions that it's the same length.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 18:20:17 +02:00
Rusty Russell
0baa5f7071 gossipd: send node announcement on startup.
I suspect this fixes #1660 too, but checking would be good.

Fixes: #1781
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 18:20:17 +02:00
Rusty Russell
2f667c5227 gossipd: routine to get route_info for known incoming channels.
For routeboost, we want to select from all our enabled channels with
sufficient incoming capacity.  Gossipd knows which are enabled (ie. we
have received a `channel_update` from the peer), but doesn't know the
current incoming capacity.

So we get gossipd to give us all the candidates, and lightningd
selects from those.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 15:03:42 +02:00
Rusty Russell
f64eee717d gossipd: make helpers const-correct.
Always be const if you can.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 15:03:42 +02:00
Rusty Russell
95c9a73fbb gossipd: set sent flag when sending reply_short_channel_ids_end
Otherwise, if we don't announce the last node, we'll not flush this
out; it will be delayed until the next time we send gossip!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 14:39:25 +02:00
Rusty Russell
fbb7bafc3b gossipd: don't include channel in query_short_channel_ids reply if no channel_update.
This is consistent: we don't broadcast a channel_announce until we've seen
a channel_update, so we probably shouldn't advertise it here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 14:39:25 +02:00
Rusty Russell
41b0872f58 Use localfeatures and globalfeatures consistently.
That's what BOLT #1 calls them; make it easier for people to grep.

Reported-by: @niftynei
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-28 04:14:28 +00:00
Rusty Russell
96f05549b2 common/utils.h: add tal_arr_expand helper.
We do this a lot, and had boutique helpers in various places.  So add
a more generic one; for convenience it returns a pointer to the new
end element.

I prefer the name tal_arr_expand to tal_arr_append, since it's up to
the caller to populate the new array entry.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-27 22:57:19 +02:00
Rusty Russell
e450c6bbdb gossipd: remove time-delayed local channel_update, produce DISABLE on-demand.
We have a lot of infrastructure to delay local channel_updates to
avoid spamming on each peer reconnect; we had to keep tracking of
pending ones though, in case we needed the very latest for sending an
error when failing an HTLC.

Instead, it's far simpler to set the local_disabled flag on a channel
when we disconnect, but only send a disabling channel_update if we
actually fail an HTLC.

Note: handle_channel_update() TAKES update (due to tal_arr_dup), but we
didn't use that before.  Now we do, add annotation.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-26 03:21:35 +00:00
Rusty Russell
16e16a725e gossipd: apply private updates to announce channel.
We trade channel_update before channel_announce makes the channel
public, and currently forget them when we finally get the
channel_announce.  We should instead apply them, and not rely on
retransmission (which we remove in the next patch!).

This earlier channel_update means test_gossip_jsonrpc triggers too
early, so have that wait for node_announcement.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-26 03:21:35 +00:00
Rusty Russell
66105e83ea gossipd: simplify "broadcast channel_announcement now we have channel_update" logic
It's simpler and more robust to just check that it's not yet announced
(the broadcast index will be 0).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-26 03:21:35 +00:00
Rusty Russell
8455b12781 Revert "gossipd: handle premature node_announcements in the store."
This reverts commit e2f426903d.

With the new store version, this can't happen.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-21 17:56:15 +02:00
Rusty Russell
48de77d56e gossipd: invalidate old gossip_stores.
Incrementing version number means stores which were prior to the previous
commit will be removed, and refreshed.  The simplest fix, if not the most
efficient.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-21 17:56:15 +02:00
lisa neigut
b1ceaf9910 gossipd: Update BOLT-split flags in channel_update
BOLT 7's been updated to split the flags field in `channel_update`
into two: `channel_flags` and `message_flags`. This changeset does the
minimal necessary to get to building with the new flags.
2018-09-21 00:24:12 +00:00
Rusty Russell
e012e94ab2 hsmd: rename hsm_client_wire_csv to hsm_wire.csv
That matches the other CSV names (HSM was the first, so it was written
before the pattern emerged).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-20 09:49:39 +02:00
Rusty Russell
8f1f1784b3 hsmd: remove hsmd/client.c
It was only used by handshake.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-20 09:49:39 +02:00
Rusty Russell
704d30edce ping: complete JSON RPC ping commands even if one ping gets no response.
We would never complete further ping commands if we had < responses
than pings.  Oops.

Fixes: #1928
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-14 22:11:23 +02:00
Rusty Russell
97c7ba2f80 gossipd: fix reordering of node_announcements in presence of a unannounced channel.
If we receive a channel_announce but not a channel_update, we store the announce
but don't put it in the broadcast map.

When we delete a channel, we check if the node_announcement broadcast
now preceeds all channel_announcements, and if so, we move it to the
end of the map.  However, with a channel_announcement at index '0',
this test fails.

This is at least one potential cause of the node map getting out of order.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-04 14:36:05 +02:00
Rusty Russell
e2f426903d gossipd: handle premature node_announcements in the store.
These happen after we compact the store; every log I've seen of a
restart on a real node has a message about truncating the store,
because node_announcements predate channel_announcements.

I extracted one such case from testnet, and reduced it to test here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-04 14:36:05 +02:00
Rusty Russell
0d46a3d6b0 Put the 'd' back in the daemons.
@renepickhardt: why is it actually lightningd.c with a d but hsm.c without d ?

And delete unused gossipd/gossip.h.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-03 05:01:40 +00:00
Rusty Russell
317a830e94 devtools: dump-gossipstore.
Not very useful by itself, but when combined with decodemsg it can tell
us quite a bit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-03 00:39:06 +00:00
Rusty Russell
f80955c932 broadcast: don't leak in broadcast_del.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
5d1f71c3c0 gossipd: don't leak fields in create_node_announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
a475098928 gossipd: fix leak in gossip_store_add_channel_delete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
1c81486b48 routing: fix falsely flagged leak.
pending goes away on a timer, sure, but might as well use tmpctx here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
b10bae1ceb gossipd: use ctx arg in create_channel_update.
Turns out it was always `tmpctx` anyway, so this isn't a real bug right now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
2db77f5d1d gossipd: minor modifications for memleak detection to work.
1. Move the list to the start of `struct peer`: memleak walks the
   list correctly this way.
2. Don't create tal parent loop daemon->conn->daemon.

The second one is silly anyway: we exit via master_gone when the master
conn is closed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-24 19:54:32 +02:00
Rusty Russell
83eadb3548 gossipd: fix SUPERVERBOSE usage, enhance, when turned on.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-23 14:46:22 +02:00
Rusty Russell
74521b3fb7 gossipd: don't delay the very first channel_update.
Lightning charge tests stopped working without a timeout, being unable
to find a route.  The 15 second delay doesn't matter in real life, but
in these scenarios it does.  This fixes it by making sure the channel
is usable immediately.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-21 00:49:12 +02:00
conanoc
b1900b18ab Fix DEVELOPER guard for ping
ping_req() should be outside of DEVELOPER guard now.
2018-08-15 06:48:55 +00:00
Christian Decker
6627da5eb5 routing: Do not consider risk when capping transfers
Reported-by: Rusty Russell <@rustyrussell>
Signed-off-by: Christian Decker <@cdecker>
2018-08-06 22:46:02 +02:00
Christian Decker
84905eac2b routing: Make the capacity a parameter to new_chan
As pointed out by @rustyrussell the capacity is now always defined, so we can
fold that into the construction of the channel itself.

Reported-by: Rusty Russell <@rustyrussell>
Signed-off-by: Christian Decker <@cdecker>
2018-08-06 22:46:02 +02:00
Christian Decker
8201764117 routing: Skip channels that require larger HTLCs than we are routing
The `htlc_minimum_msat` parameter was ignored so far, and we'd be attempting to
pay and hitting a brick wall by doing so. This patch just skips channels that
are not eligible anyway.
2018-08-06 22:46:02 +02:00
Christian Decker
14000a22bc routing: Skip channels that don't have sufficient capacity
We know the total channel capacity after checking for its existence on-chain, so
we can actually make use of that information to discard channels that don't have
a sufficient capacity anyway, reducing the number of failed attempts.
2018-08-06 22:46:02 +02:00
Christian Decker
8a34933c1a gossip: Annotate locally added channels with their capacity
We were adding channels without their capacity, and eventually annotated them
when we exchanged `channel_update`s. This worked as long as we weren't
considering the channel capacity, but would result in local-only channels to be
unusable once we start checking.
2018-08-06 22:46:02 +02:00
Rusty Russell
584ee26200 gossipd: fix thinko in node_announcement address parsing which made us miss final address
'cursor < ser + max' isn't valid because we reduce 'max' as we go!  Effectively
we'll stop once we're past halfway, which can only happen with ipv6 + a torv2
address.

Ths fix is one-line, but we rename 'max' to 'len' which makes its purpose
clearer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-06 19:33:46 +02:00
Rusty Russell
0b08601951 sync_crypto_write/sync_crypto_read: just fail, don't return NULL.
There's only one thing the caller ever does, just do that internally.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-08-05 02:03:58 +00:00
practicalswift
7969cc335e Allocate off ctx instead of tmpctx in encode_short_channel_ids_start(const tal_t *ctx) 2018-08-01 13:09:16 +09:30
practicalswift
b5682a773b Remove dead stores 2018-07-31 12:45:02 +02:00
Rusty Russell
5cf34d6618 Remove tal_len, use tal_count() or tal_bytelen().
tal_count() is used where there's a type, even if it's char or u8, and
tal_bytelen() is going to replace tal_len() for clarity: it's only needed
where a pointer is void.

We shim tal_bytelen() for now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-30 11:31:17 +02:00
Rusty Russell
36730ddb6d gossipd: dev-suppress-gossip.
Useful for testing that we only get an update via the error message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-27 14:12:00 +02:00
Rusty Russell
73b3782943 gossipd: send latest update in error message, even if delayed.
We delay internally to reduce broadcastig route flap, but errors are
a special case: we want to send the latest, otherwise we might send an
old (non-disabled) update.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-27 14:12:00 +02:00
Rusty Russell
3c66d5fa03 gossipd: add flag for locally disabling channel.
We used to just manually set ROUTING_FLAGS_DISABLED, but that means we
then suppressed the real channel_update because we thought it was a
duplicate!

So use a local flag: set it for the channel when the peer disconnects,
and clear it when channeld sends a local update.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-27 14:12:00 +02:00
Rusty Russell
d241bd762c connectd: don't use gossip_getnodes_entry.
gossip_getnodes_entry was used by gossipd for reporting nodes, and for
reporting peers.  But the local_features field is only available for peers,
and most other fields are only available from node_announcement.

Note that the connectd change actually means we get less information
about peers: gossipd used to do the node lookup for peers and include the
node_announcement information if it had it.

Since generate_wire.py can't create arrays-of-arrays, we add a 'struct
peer_features' to encapsulate the two feature arrays for each peer, and
for convenience we add it to lightningd/gossip_msg.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
7b2641ed0d gossipd: remove peer-related fields and wire messages.
This completes the removal of peer-related messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
0d442b5ff2 gossipd: move files into connectd.
These source files are only used for peer-related things, so move them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
dba7f9002f gossipd: provide connectd with address resolution.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
3d3d2ef9af gossipd: remove connectd functionality, enable connectd.
This patch guts gossipd of all peer-related functionality, and hands
all the peer-related requests to channeld instead.

gossipd now gets the final announcable addresses in its init msg, since
it doesn't handle socket binding any more.

lightningd now actually starts connectd, and activates it.  The init
messages for both gossipd and connectd still contain redundant fields
which need cleaning up.

There are shims to handle the fact that connectd's wire messages are
still (mostly) gossipd messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
92d66a5451 gossipd: take connectd fd on initialization.
connectd has a dedicated fd to gossipd, so it can ask for a new gossip_fd
for a peer.

gossipd has a standalone routine to create a remote peer (this will
eventually be the only way gossipd creates a new peer).

For now lightningd creates a socketpair but doesn't run connectd, so
gossipd never sees any requests on this fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
e1dfb1b178 gossipd: simplify per-peer features.
Store the two we care about as booleans.  Once connectd is complete we won't
even have the feature bitmaps for peers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
f747ad8f73 common/daemon_conn: add daemon_conn_wake() helper.
We've been open-coding it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
16b8f1eb83 gossipd: actually use global features to create our own node_announcement.
It's currently empty, but I was surprised we still used "NULL".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
a52d522525 gossipd: handle ping messages for remote peers too.
This simplifies our ping handling: make gossipd always do it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-25 02:13:52 +00:00
Rusty Russell
9bf238e001 hsmd: provide message for master to get basepoints & funding pubkey for a channel
This is only used by the master daemon, but it's not secret information.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-24 00:40:01 +02:00
Rusty Russell
dfaf74d972 hsmd: add routines to sign onchain transactions, part 1.
This handles the "to-us" transactions which return funds to the wallet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-24 00:40:01 +02:00
Rusty Russell
019ba86b91 gossipd: use optional fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-17 12:32:00 +02:00
Christian Decker
14c6310a4f gossip: Fix concurrent PR merge issue with structeq
PR #1618 in parallel with the migration to macro `structeq` created this.

Fixes #1674
2018-07-08 19:04:46 +02:00
Rusty Russell
ed83bbe623 pytest: fix flaky race in test_gossip_query_channel_range.
We weren't waiting for gossipd to actually process the
dev_set_max_scids_encode_size message, so under Travis it sometimes
split the reply before processing that.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:26:23 +02:00
Rusty Russell
57794b9285 gossipd: also delay locally-generated disables when peer vanishes.
Note that we mark both directions of the channel disabled immediately,
it's just the broadcast of the update which is delayed, just like the
ones generated when channeld tells us to.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
f9b8237d50 gossipd: delay generation of local updates.
We disable the channel every time the peer disconnects; if it reconnects
we get two updates.

The simplest solution: delay all updates by 15 seconds.  Replace any
pending delayed update.  If update is redundant after 15 seconds,
discard.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
ef59a8f4aa gossipd: suppress redundant local updates which we would generate.
This doesn't do anything for us now, since we actually tend to produce
DISABLE/ENABLE update pairs.  But the infrastructure is useful for the
next patch.

We also add more details to the trace message in the core update code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
8e571ba688 listnodes: expose global features.
Since nobody sets these yet, it's a bit moot, but it will be great in
future.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
9fa738a741 listpeers: expose peer features as 'local_features' and 'global_features'
For now, just the connected peers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
7b735fbeee gossipd: fix json_listpeers printing node information.
json_listpeers returns an array of peers, and an array of nodes: the latter
is a subset of the former, and is used for printing alias/color information.

This changes it so there is a 1:1 correspondance between the peer information
and nodes, meaning no more O(n^2) search.

If there is no node_announce for a peer, we use a negative timestamp
(already used to indicate that the rest of the gossip_getnodes_entry
is not valid).

Other fixes:
1. Use get_node instead of iterating through the node map.
2. A node without addresses is perfectly valid: we have to use the timestamp
   to see if the alias/color are set.  Previously we wouldn't print that
   if it didn't also advertize an address.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-07 16:07:53 +02:00
Rusty Russell
fed5a117e7 Update ccan/structeq.
structeq() is too dangerous: if a structure has padding, it can fail
silently.

The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).

Upgrade ccan, and use it everywhere.  Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.

Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-04 23:57:00 +02:00
Rusty Russell
4a1ca0fb99 gossipd: don't use raw secp256k1_pubkey in routing.
We wrap it in 'struct pubkey' for typesafety and consistency, and the
next patch takes advantage of that when we move to pubkey_eq.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-04 23:57:00 +02:00
Rusty Russell
82ff891202 Update to latest BOLT version.
And remove the FIXMEs now that the gossip_query extension is merged.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-01 17:37:03 +02:00
Rusty Russell
f67182ff20 gossipd: order node_announcement addresses correctly, remove duplicate types.
Fixes: #1596
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-01 15:03:21 +02:00
Rusty Russell
284f0a04c9 gossipd: don't announce bound address if given with --bind-addr, even if public.
Only --addr implies announce-if-public: --bind-addr does not.

It's also possible to have --bind-addr to an automatic Tor address:
you'd have to dig the onion address out of the logs or getinfo to use
it, but it's possible.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-01 15:03:21 +02:00
Rusty Russell
9d3ce87700 decode_short_ids: move to common.
We want to use it in devtools/decodemsg.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-01 14:55:29 +02:00
arowser
25f60f9456 remove unused return value 2018-06-30 04:27:34 +00:00
Christian Decker
4a5cff8490 gossip: Try to detect broken ISP resolvers and discard broken replies
This is a best effort attempt to skip connection attempts if we detect a broken
ISP resolver. A broken ISP resolver is a resolver that will replace NXDOMAIN
replies with a dummy response. This is best effort in that it'll only detect a
single fixed dummy reply, it'll check only on startup, and will not detect if we
switched networks. It should be good enough for most cases, and in the worst
case it will result in a connection attempt that does not complete.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Glenn Willen <@gwillen>
2018-06-21 11:21:16 +02:00
Christian Decker
91c2416657 gossip: Do not use DNS if we were told not to
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-21 11:21:16 +02:00
Christian Decker
ceef61dbbd gossip: Pass use_dns option down to gossipd
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-21 11:21:16 +02:00
William Casarin
d7aa0528b8 gossipd: fix compile error, uninitialized variable
Seems to be a problem with gcc 6.4+?

Fixes #1527

Signed-off-by: William Casarin <jb55@jb55.com>
2018-06-20 21:25:03 +00:00
Rusty Russell
833e8387aa gossipd: fix up BOLT references.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-18 12:31:09 +02:00
Christian Decker
71ec8193b2
gossip: Avoid integer count overflow in gossip_store
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-18 12:04:25 +02:00
Rusty Russell
f5ebf8e231 gossipd: send correct channel_update in response to query_short_channel_ids
Cut & paste means we sometimes sent NULL:

```
2018-06-15T00:13:51.908Z lightningd(23653): lightning_closingd-03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f chan #436: Gossipd gave us bad send_gossip message 0bc80000
```

Fixes: #1581
Reported-by: @Xian001
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 15:39:30 +02:00
Rusty Russell
60b3f0e376 gossipd: remove oververbose logging when we uncompress short_channel_id array
Reported-by: Xian001 (#1581)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 15:39:30 +02:00
Rusty Russell
9d721ecb99 gossipd: add assertions to try to catch mysterious crash.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 11:53:47 +02:00
Rusty Russell
5c19c55841 gossipd: fix take leak when peer is dying.
In this case, local and remote are *both* NULL; so if someone tries to
send a packet with take(), we need to free it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 11:53:47 +02:00
Rusty Russell
a7e6cdb418 gossipd: peer->local->peer_out queue should have lifetime of peer->local.
The current code attaches it to peer, which is a slight leak.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 11:53:47 +02:00
Rusty Russell
e098578731 gossipd: fix leak when we fail to dup fds.
In this case, peer would stay around, but conn would be freed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 11:53:47 +02:00
Rusty Russell
f6ff89e596 gossipd: fix use-after-free when we fail to make connection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-15 11:53:47 +02:00
Christian Decker
4279e5cdbd gossip: Fix "already reaching" issue
I think this is what is causing #1536: getting disconnected causes gossipd to
attempt to reach the peer again, unconditionally setting the flag to tell the
master. At the same time the master also issues a reaching command (which is
allowed since it is its first), but then it clashes on the already set
flag. Setting this flag only when the master actually needs to be told should
fix this.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-15 01:06:42 +00:00
Christian Decker
985af483cf gossip: Wrap insert_broadcast and gossip_store_add in persistent_broadcast
They should sync up nicely otherwise we may be overestimating the stale rate.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
6632f44133 gossip: Disable gossip_store temporarily while replaying messages
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
2b5e1ee65f gossip: Enable the consistency check only when really pedantic
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
8a5bebed59 gossip: Disable future compactions if we fail a compaction
A failed compaction shouldn't be deadly, but we should also not attempt to do
one on every gossip message after the first one fails.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
74a1cbd877 gossip: Implement gossip_store compaction
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
b9a2400a5f gossip: Simplify message handling in gossip_store
`gossip_store_add` is the entry point for messages from the network, so it
should do the bookkeeping and disable on failures. `gossip_store_append` is the
shared function that wraps messages and writes it to the given file. This is
shared between the from network path and the compaction path, so we don't
directly use the `gossip_store` instance, but `fd`s.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
60efa314fe gossip: Separate writing to gossip_store fd from append
We write both when coming from outside, as well as when compacting, so we
extract the write functionality to use it in both cases.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
e6ab594904 gossip: Have gossip_store annotate gossip messages
This makes the exposed interface much smaller, cleaner and will allow us to just
replay gossip messages from the broadcast.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
0546ca446d gossip: Pass routing_state to the gossip_store
We'll need it later to annotate the raw gossip messages, e.g., the capacity of a
channel.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
eaba5a249a gossip: Introduce bookkeeping into gossip_store for rewrite
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
552ddb8dfd gossip: Pass broadcast_state to gossip_store
We'll be sourcing messages from this `broadcast_state` when rewriting the
`gossip_store`.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
37dc458b4d gossip: Have the broadcast_state track its message count
This is far more precise than bolting on the stale tracking in the
`gossip_store`.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-09 13:38:46 +02:00
Christian Decker
4e7fc99ae1 gossip: Duplicate removes can result in null pointers in broadcast
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-06-08 20:00:27 +02:00
Rusty Russell
5d6a9f3fb0 gossipd: check consistency.
This is a hack to check that our gossip state is consistent on every
insert and delete.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-08 17:53:34 +02:00
Rusty Russell
da55d3c0ff gossipd: handle node_announcement when channel_announcement removed.
Two cases:
1. Node no longer has any public channels: remove node_announcement.
2. Node's node_announcement now preceeds all the channel_announcements:
   move node_announcement to the end.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-08 17:53:34 +02:00
Rusty Russell
def18a7bc1 gossipd: implement broadcast_del to delete a specific index.
Required if we want to reorder node_announcement broadcasts.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-08 17:53:34 +02:00
Rusty Russell
a38c619486 gossipd: keep index of node and channel announcements.
This lets detect if a node announce preceeds a channel announce once we
delete the node announcement.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-08 17:53:34 +02:00
Rusty Russell
1bb7713274 gossipd: minor cleanups.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
035d6067e4 Rename consider_own_node_announce to maybe_send_own_node_announce.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
5ec454c7b2 gossipd: don't queue node_announce unless we've queued channel_announce.
We *accept* a node_announce if we have a channel_announce, but we
can't queue it until we queue the channel_announce, which we only do
once we have recieved a channel_update.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
f52245d442 gossipd: support and use zlib encoding in short_channel_id encoding.
We still use uncompressed if zlib turns out to be larger.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
9e51e196c1 gossipd: dev-set-max-scids-encode-size to artificially force "full" replies.
We cap each reply at a single one, which forces the code into our
recursion logic.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
118f099dd8 gossip: dev-query-channel-range to test query_channel_range.
We keep a crappy bitmap, and finish when their replies cover
everything we asked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
0dda5d4e1c gossipd: handle query_channel_range
We send them all the short_channel_ids we have in a given range.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
c34b49c356 gossipd: add dev-send-timestamp-filter command for testing timestamp filtering.
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
db6a6442cb gossipd: single-thread the gossip timer.
We have a function called 'wake_pkt_out' which is really 'start
gossiping', so rename it to 'wake_gossip_out'.

In addition, it's fired both on a timer, and in response to our first
gossip_timestamp_filter, which leads to very confusing (though,
technically, not incorrect) behavior.

Keep a single timer at all times, which now doubles as the flag to
indicating we're syncing right now.  Set it once we're done syncing
gossip.

Technically this means we got from once-every-60-seconds to
quiet-for-60-seconds-between-gossip, but that's OK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
531c82b6ad gossipd: handle gossip_timestamp_filter message.
And initialize filter (to "never") when we negotiated LOCAL_GOSSIP_QUERIES,
and send initial filter message.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
97bb6c5a28 gossipd: ensure incoming timestamps are reasonable.
This is kind of orthogonal to the other changes, but makes sense: if we
would instantly or never prune the message, don't accept it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
7a32637b5f gossipd: add timestamp to each broadcast message.
This lets us filter by timestamp.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
4d8b29089b gossipd: wire up infrastructure to generate query_short_channel_ids msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
7ee5da858c gossipd: handle query_short_channel_ids message.
This doesn't handle zlib yet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
32c39c2979 gossipd: send node announcements after short_channel_id replies.
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.

As we trickle out replies to query_short_channel_ids, we remember the
pubkeys of nodes we mention.  At the end, we sort and uniquify, and
then send any node_announcements we have for those.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
5864415d31 gossipd: infrastructure to handle short_channel_id replies.
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.

This is minimal infrastructure: we don't actually process the
query_short_channel_ids message yet, nor do we append node
announcements.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
6c6da45f53 wire: Update to lastest BOLT draft.
This includes the gossip query messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
803e4f8895 gossipd: announce nodes after channel announcement.
In general, we need to only publish node announcements after
publishing channel announcements, though we can accept node
announcements as soon as we see channel announcements.  So we keep a
flag for those node_announcement which haven't been broadcast yet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
c2cc3823db gossipd: announce own node only after channel announcement actually broadcast.
handle_pending_cannouncement might not actually add the announcment,
as it could be waiting for a channel_update.  We need to wait for
the actual announcement before considering announcing our node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
c2189229ca gossipd: only broadcast channel_announcement once we have a channel_update.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Rusty Russell
2431742285 gossipd: don't publish private updates after channel_announce.
We generate new ones anyway; removing this code changes fixes coming
up which now only need to change one place.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-06-06 03:25:56 +00:00
Christian Decker
c550fd1752 gossip: Clean up the code to disable a local channel
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
c17848a3f3 gossip: Disable local channels after loading the gossip_store
We don't have any connection yet, so how could they be active? Disable both
sides to avoid trying to route through them or telling others to use them as
`contact_points` in invoices.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
f2dc406172 moveonly: Hoist gossip_disable_channel higher up
We'll need it in the next commit

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
ba31dd2d9d gossip: Avoid sending duplicate disable messages
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
8e278044e3 gossip: Disable channels when we lose the connection to the peer
We're telling gossipd about disconnections anyway, so let's just use that signal
to disable both sides of the channel.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
3e5b798c60 gossip: Fix disable flags in handle_disable_channel
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
9982e24a1c gossip: Add local_channel_close message to disable channels upon close
This was failing some of our integration tests, i.e., the ones closing a channel
and not waiting for sigexchange. The remote node would often not be quick enough
to send us its disabling channel_update, and hence we'd still remember the
incoming direction. That could then be sent out as part of an invoice, and fail
subsequently. So just set both directions to be disabled and let the onchain
spend clean up once it happens.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-31 02:30:27 +00:00
Christian Decker
402125a70e gossip: Add CRC32 checksum to the gossip_store
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell @rustyrussell
2018-05-29 12:16:00 +00:00
Rusty Russell
88053bd1ca gossipd: remove too-loose timestamp workaround.
Now timestamps always increment, we don't have to allow them to do the
wrong thing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-21 09:17:57 -07:00
Rusty Russell
6454d7af84 gossip: cleanup keepalive updates to use the same create_channel_update() code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-21 09:17:57 -07:00
Rusty Russell
fca5a9ef30 channeld: tell gossipd to generate channel_updates.
This resolves the problem where both channeld and gossipd can generate
updates, and they can have the same timestamp.  gossipd is always able
to generate them, so can ensure timestamp moves forward.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-21 09:17:57 -07:00
Rusty Russell
adbe02c6be gossip: temporarily allow replacement of updates with same timestamp.
We erroneously create updates with the same timestamps when tests run
quickly, and the second one is ignored.

We've already noted that this should be fixed: gossipd should generate
all the updates, as it already has to do the case where channeld
crashed, for example.  But that's a bigger change.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
Rusty Russell
c546b1bbb6 gossipd: specify origin of updates in errors.
@cdecker points out that in test_forward, where we manually create a route,
we get an error back which contains an update for an unknown channel.

We should still note this, but it's not an error for testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
Rusty Russell
8ee60e2d8e testing: make sure we don't see gossip in bad order.
This is something which generally shouldn't happen, but we didn't
notice it previously.

We ignore this warning in the case where a channel was deleted: this
happens because one side can send an update while the other notices
that the channel is closed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
Rusty Russell
177a1fc88e gossipd: handle local channel creation separately from update.
Note: this will break the gossip_store if they have current channels,
but it will fail to parse and be discarded.

Have local_add_channel do just that: the update is logically separate
and can be sent separately.

This removes the ugly 'bool add_to_store' flag.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
Rusty Russell
540c68d7ca gossipd/gossip_constants.h: Single place for BOLT constants.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
Rusty Russell
b965ef7d1d routing: make sure we fail if we can't unmarshal announcements.
This is how we notice if the gossip store is corrupt!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-19 15:52:56 -04:00
practicalswift
fab3b214b4 Avoid static analyzer warning about integer wraparound 2018-05-15 05:26:29 +00:00
Rusty Russell
1125682ceb wireaddr: new type, ADDR_INTERNAL_FORPROXY, use it if we can't/wont resolve.
Tor wasn't actually working for me to connect to anything, but it worked
for 'ssh -D' testing.

Note that the resulting 'netaddr' is a bit weird, but I guess it's honest.

    $ ./cli/lightning-cli connect 021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b
    {
      "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b"
    }
    $ ./cli/lightning-cli listpeers
    {
      "peers": [
        {
          "state": "GOSSIPING", 
          "id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b", 
          "netaddr": [
            "ln1qg0je0lugpzu5ttsv78vlrkhteyg9yy8fjw68qr57mfhsfyrxurzkq522ah.lseed.bitcoinstats.com:9735"
          ], 
          "connected": true, 
          "owner": "lightning_gossipd"
        }
      ]
    }

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
2a0acd3492 tor: log proxy communications using status_io.
Good for debugging (you have to send SIGUSR1 to lightning_gossipd to turn
it on though, and --log-level=io on the lightningd cmdline to have it
output IO messages by default).

I also noticed that io_tor_connect_after_req_host() does a useless
test on reach->buffer[0] after it's *written*: remove it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
570283bc76 gossipd: don't use fake addrhint for non-addrhint resolutions.
Use a wireaddr_internal directly (which is what we want).

Also, don't hardcode 9735, use DEFAULT_PORT internally in
seed_resolve_addr().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
de063edb54 gossip: extract function to derive seedname.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
0d23f4fb4a gossipd: hand io_tor_connect the host as a string.
Previously it converted the wireaddr to a string internally: to support
unresolved names we need that done externally.

We actually tell the SOCKS5 proxy to do a domain lookup already, even
though we give use IP/IPv6 address, so this change is sufficient to
support connect-by-name.

Note replacement of assert() with an explicit case statement, which
has the benefit that the compiler complains when we add new
ADDR_INTERNAL types.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
a1dc4eef56 wireaddr: tell caller that we failed due to wanting DNS lookup, don't try.
This is useful for the next patch, where we want to hand the unresolved
name through to the proxy.

This also addresses @Saibato's worry that we still called getaddrinfo()
(with the AI_NUMERICHOST option) even if we didn't want a lookup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
5345e43354 gossipd: rename use_tor to use_proxy,
Not all of them, but it's really about using the SOCKS proxy rather than
really using Tor at this level.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
bcb047a729 gossipd: fix uninitialized var.
We assert() that it's set by one of the branches (it should be!) but
if we don't hit one it's uninitialized, not NULL.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-11 09:15:54 +00:00
Rusty Russell
cca791d1cb routing: clean up channel public/active states.
1. If we have a channel_announcement, the channel is public, otherwise
   it's not.  Not all channels are public, as they can be local: those
   have a NULL channel_announcement.

2. If we don't have a channel_update, we know nothing about that half
   of the channel, and no other fields are valid.

3. We can tell if a half channel is disabled by the flags field directly.

Note that we never send halfchannels without an update over
gossip_getchannels_reply so that marshalling/unmarshalling can be
vastly simplified.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 21:35:53 +02:00
Rusty Russell
9d1e496b11 gossipd: use a real update in local_add_channel.
We generate one now, so let's use it.  That lets us simplify the
code, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 21:35:53 +02:00
Rusty Russell
c71e16f784 broadcast: invert ownership of messages.
Make the update/announce messages own the element in the broadcast map
not the other way around.

Then we keep a pointer to the message, and when we free it
(eg. channel closed, update replaces it), it gets freed from the
broadcast map automatically.

The result is much nicer!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 21:35:53 +02:00
Rusty Russell
8940528bdb gossipd: don't include private announcements into broadcast map.
Basically, if we don't have an announcement for the channel, stash it,
and once we get an announcement, replay if necessary.

Fixes: #1485
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 21:35:53 +02:00
Rusty Russell
d1b28f832d gossipd: when reconnecting, make sure we free old connection.
Looks like old connection got a callback, and we blew up since
the old peer was freed:

2018-05-06T10:57:11.865Z lightning_gossipd(14387): ...will try again in 300 seconds
2018-05-06T10:57:16.397Z lightning_gossipd(14387): peer_out WIRE_INIT
2018-05-06T10:57:16.405Z lightning_gossipd(14387): peer_in WIRE_INIT
2018-05-06T10:57:16.406Z lightning_gossipd(14387): peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4: reconnect for local peer
2018-05-06T10:57:16.406Z lightning_gossipd(14387): peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4 now remote
2018-05-06T10:57:16.406Z lightning_gossipd(14387): UPDATE WIRE_GOSSIP_PEER_CONNECTED
2018-05-06T10:57:16.406Z lightning_gossipd(14387): UPDATE WIRE_GOSSIP_PEER_CONNECTED
2018-05-06T10:57:16.406Z lightning_gossipd(14387): Handing back peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4 to master
2018-05-06T10:57:16.420Z lightning_gossipd(14387): hand_back_peer 03b30e131241fe28fc923d74a060a8c7abfcc91323c485f8a9cf964575cb4fd3f4: now local again
2018-05-06T10:57:16.420Z lightning_gossipd(14387): FATAL SIGNAL 11
2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: common/daemon.c:42 (crashdump) 0x416991
2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x7f70cf57a4af
2018-05-06T10:57:16.420Z lightning_gossipd(14387): backtrace: common/msg_queue.c:38 (msg_dequeue) 0x418232
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: gossipd/gossip.c:816 (peer_pkt_out) 0x404ac4
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/io.c:59 (next_plan) 0x4316db
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/io.c:427 (io_do_always) 0x4322ce
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/poll.c:228 (handle_always) 0x433abd
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: ccan/ccan/io/poll.c:249 (io_loop) 0x433b48
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: gossipd/gossip.c:2407 (main) 0x4093aa
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x7f70cf56582f
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0x402ad8
2018-05-06T10:57:16.421Z lightning_gossipd(14387): backtrace: (null):0 ((null)) 0xffffffffffffffff
2018-05-06T10:57:16.421Z lightning_gossipd(14387): STATUS_FAIL_INTERNAL_ERROR: FATAL SIGNAL

Fixes: #1469
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 21:11:00 +02:00
Rusty Russell
89c76a5a78 Move always-use-proxy auto-override to master daemon.
This means it will effect connect commands too (though it's too
late to stop DNS lookups caused by commandline options).

We also warn that this is one case where we allow forcing through Tor
without a proxy set: it just means all connections will fail.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-05-10 02:28:44 +00:00