Commit Graph

445 Commits

Author SHA1 Message Date
Johan T. Halseth
c9afc93151
discovery/gossiper: add local updates to graph immediately
Since the batch interval can potentially be long, adding local updates
to the graph could be slow. This would slow down operations like adding
our own channel update and announcements during the funding process, and
updating edge policies for local channels.

Now we instead check whether the update is remote or not, and only for
remote updates use the SchedulerOption to lazily add them to the graph.
2021-02-10 23:54:03 +01:00
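The local/remote split described in c9afc93151 can be pictured with a small sketch. The `SchedulerOption` type, `LazyAdd`, `optionsFor`, `commitDelay`, and the 500ms interval below are illustrative assumptions, not a copy of lnd's scheduler code.

```go
package main

import (
	"fmt"
	"time"
)

// batchInterval is an illustrative value, not lnd's actual default.
const batchInterval = 500 * time.Millisecond

// schedulerOptions and SchedulerOption are hypothetical stand-ins for the
// scheduler option the commit message refers to.
type schedulerOptions struct {
	lazy bool
}

type SchedulerOption func(*schedulerOptions)

// LazyAdd marks an update as safe to hold until the next batch commit.
func LazyAdd() SchedulerOption {
	return func(o *schedulerOptions) { o.lazy = true }
}

// optionsFor picks the options for an update: remote updates are added
// lazily, local updates get no options and are committed right away.
func optionsFor(isRemote bool) []SchedulerOption {
	if isRemote {
		return []SchedulerOption{LazyAdd()}
	}
	return nil
}

// commitDelay reports the longest time an update may wait before it is
// written to the graph.
func commitDelay(interval time.Duration, opts ...SchedulerOption) time.Duration {
	var o schedulerOptions
	for _, opt := range opts {
		opt(&o)
	}
	if o.lazy {
		return interval
	}
	return 0
}

func main() {
	fmt.Println("local: ", commitDelay(batchInterval, optionsFor(false)...))
	fmt.Println("remote:", commitDelay(batchInterval, optionsFor(true)...))
}
```

Functional options keep the default (immediate) path free of extra arguments, so only callers handling remote gossip need to opt in to batching.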
Johan T. Halseth
7e34132c53
routing: let graph methods take scheduler option 2021-02-10 23:54:03 +01:00
Wilmer Paulino
904003fbcb discovery: use source of ann upon confirmed channel ann batch
We do this instead of using the source of the AnnounceSignatures
message, as we filter out the source when broadcasting any
announcements, leading to the remote node not receiving our channel
update. Note that this is done more for the sake of correctness and to
address a flake within the integration tests, as channel updates are
sent directly and reliably to channel counterparts.
2021-02-10 13:22:28 -08:00
Conner Fromknecht
58e924ad1c
discovery: don't historical sync when NumActiveSyncers == 0
Currently when numgraphsyncpeers=0, lnd will still attempt to perform
an initial historical sync. We change this behavior here to forgo
historical sync entirely when numgraphsyncpeers is zero, since the
routing table isn't being updated anyway while the node is active.

This permits a no-graph lnd mode where no syncing occurs at all.
2021-02-10 09:35:45 -08:00
Olaoluwa Osuntokun
555de44d9f Revert "Merge pull request #4895 from wpaulino/disallow-premature-chan-updates"
This reverts commit 6e6384114c, reversing
changes made to 98ea433271.
2021-02-09 19:55:45 -08:00
Conner Fromknecht
b1fee734ec
discovery/sync_manager: remove unneeded markGraphSyncing
AFAICT it's not possible to flip back from being synced_to_chain, so we
remove the underlying call that could reflect this. The method is moved
into the test file since it's still used to test correctness of other
portions of the flow.
2021-01-29 00:19:48 -08:00
Conner Fromknecht
e42301dee2
lntest: call markGraphSynced from gossipSyncer
Rather than performing this call in the SyncManager, we give each
gossipSyncer the ability to mark the first sync completed. This permits
pinned syncers to contribute towards the rpc-level synced_to_graph
value, allowing the value to be true after the first pinned syncer or
regular syncer completes. Unlike regular syncers, pinned syncers can
proceed in parallel, possibly decreasing the waiting time if consumers
rely on this field before proceeding to load their application.
2021-01-29 00:19:48 -08:00
Conner Fromknecht
fcd5cb625a
config: expose gossip.pinned-syncers for conf
The pinned syncer set is exposed as a comma-separated list of pubkeys.
2021-01-29 00:19:47 -08:00
Conner Fromknecht
340414356d
discovery: perform initial historical sync for pinned peers 2021-01-29 00:19:47 -08:00
Conner Fromknecht
2f0d56d539
discovery: add support for PinnedSyncers
A pinned syncer is an ActiveSyncer that is configured to always remain
active for the lifetime of the connection. Pinned syncers do not count
towards the total NumActiveSyncer count, which are rotated periodically.

This feature allows nodes to more tightly synchronize their routing
tables by ensuring they are always receiving gossip from a distinguished
subset of peers.
2021-01-29 00:19:47 -08:00
Conner Fromknecht
9e932f2a64
discovery/sync_manager: Pause/Resume HistoricalSyncTicker
This gives each initial historical syncer an equal amount of time before
being rotated, even if some fail.
2021-01-29 00:19:47 -08:00
Conner Fromknecht
ef0cd82c1f
discovery/sync_manager: make setHistoricalSyncer closure 2021-01-29 00:19:46 -08:00
Conner Fromknecht
72fbd1283b
discovery/sync_manager: break out IsGraphSynced check 2021-01-29 00:19:46 -08:00
Conner Fromknecht
7c6aa20bd8
discovery: handle err for linter 2021-01-29 00:19:46 -08:00
Wilmer Paulino
7ef1f3f636
discovery: use source of ann upon confirmed channel ann batch
We do this instead of using the source of the AnnounceSignatures
message, as we filter out the source when broadcasting any
announcements, leading to the remote node not receiving our channel
update. Note that this is done more for the sake of correctness and to
address a flake within the integration tests, as channel updates are
sent directly and reliably to channel counterparts.
2021-01-06 13:16:44 -08:00
Wilmer Paulino
00d4e92362
discovery: prevent rebroadcast of premature channel updates
As similarly done with premature channel announcements, we'll no longer
allow premature channel updates to be rebroadcast once mature. This is
no longer necessary as channel announcements that we're not aware of are
usually broadcast to us with their accompanying channel updates.
2021-01-06 12:52:41 -08:00
Wilmer Paulino
871a6f1690
discovery: prevent rebroadcast of previously premature announcements 2020-12-08 15:18:08 -08:00
Wilmer Paulino
a4f33ae63c
discovery: adhere to proper channel chunk splitting for ReplyChannelRange 2020-12-08 15:18:07 -08:00
Wilmer Paulino
c5fc7334a4
discovery: limit NumBlocks to best known height for outgoing QueryChannelRange
This is done to ensure we don't receive replies for channels in blocks
not currently known to us, which we wouldn't be able to process.
2020-12-08 15:18:06 -08:00
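A minimal sketch of the clamping idea in c5fc7334a4, assuming the queried range is inclusive of the best known height. The helper name `numBlocksForQuery` is hypothetical and the real change may compute the value differently.

```go
package main

import (
	"fmt"
	"math"
)

// numBlocksForQuery returns a NumBlocks value such that the range starting
// at firstHeight ends at bestHeight (inclusive), so the peer never replies
// with channels from blocks we have not seen yet.
func numBlocksForQuery(firstHeight, bestHeight uint32) uint32 {
	if firstHeight > bestHeight {
		return 0
	}
	// Compute in uint64 so the +1 cannot wrap around.
	n := uint64(bestHeight) - uint64(firstHeight) + 1
	if n > math.MaxUint32 {
		return math.MaxUint32
	}
	return uint32(n)
}

func main() {
	fmt.Println(numBlocksForQuery(600000, 650000))
	fmt.Println(numBlocksForQuery(700000, 650000))
}
```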
Olaoluwa Osuntokun
13a2598ded
discovery: add new option to toggle gossip rate limiting
In this commit, we add a new option to toggle gossip rate limiting. This
new option can be useful in contexts that require near instant
propagation of gossip messages like integration tests.
2020-11-30 16:38:56 -08:00
Olaoluwa Osuntokun
7e298f1434
Merge pull request #3367 from cfromknecht/batched-graph-updates
Batched graph updates
2020-11-25 18:40:40 -08:00
Wilmer Paulino
791ba3eb50
discovery: rate limit incoming channel updates
This change was largely motivated by an increase in high disk usage as a
result of channel update spam. With an in memory graph, this would've
gone mostly undetected except for the increased bandwidth usage, which
this doesn't aim to solve yet. To minimize the effects to disks, we
begin to rate limit channel updates in two ways. Keep alive updates,
those which only increase their timestamps to signal liveliness, are now
limited to one per lnd's rebroadcast interval (current default of 24H).
Non keep alive updates are now limited to one per block per direction.
2020-11-25 15:38:08 -08:00
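The two limits described in 791ba3eb50 can be sketched as follows. The `rateLimiter` type, its fields, and the way a keep-alive is flagged are assumptions made for illustration; only the two rules (one keep-alive per rebroadcast interval, one other update per block per direction) come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// rebroadcastInterval mirrors the 24H default mentioned in the commit message.
const rebroadcastInterval = 24 * time.Hour

// updateKey identifies one direction of one channel.
type updateKey struct {
	chanID    uint64
	direction uint8
}

// rateLimiter is a hypothetical in-memory tracker for both limits.
type rateLimiter struct {
	lastKeepAlive map[updateKey]time.Duration // time of last accepted keep-alive
	lastChange    map[updateKey]uint32        // height of last accepted change
}

func newRateLimiter() *rateLimiter {
	return &rateLimiter{
		lastKeepAlive: make(map[updateKey]time.Duration),
		lastChange:    make(map[updateKey]uint32),
	}
}

// allow reports whether an update should be accepted. elapsed is the time
// since an arbitrary start and height is the current block height.
func (r *rateLimiter) allow(k updateKey, keepAlive bool, elapsed time.Duration, height uint32) bool {
	if keepAlive {
		last, seen := r.lastKeepAlive[k]
		if seen && elapsed-last < rebroadcastInterval {
			return false
		}
		r.lastKeepAlive[k] = elapsed
		return true
	}
	last, seen := r.lastChange[k]
	if seen && last == height {
		return false
	}
	r.lastChange[k] = height
	return true
}

func main() {
	r := newRateLimiter()
	k := updateKey{chanID: 1, direction: 0}
	fmt.Println(r.allow(k, true, 0, 100))           // first keep-alive
	fmt.Println(r.allow(k, true, time.Hour, 100))   // too soon
	fmt.Println(r.allow(k, false, time.Hour, 100))  // first change in block
	fmt.Println(r.allow(k, false, 2*time.Hour, 100)) // same block again
}
```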
Conner Fromknecht
f8154c65c5
discovery/gossiper: increase validation barrier size to 1000
This allows for 1000 different validation operations to proceed
concurrently. Now that we are batching operations at the db level, the
average number of outstanding requests will be higher since the commit
latency has increased. To compensate, we allow for more outstanding
requests to keep the gossiper busy while batches are constructed.
2020-11-24 16:39:47 -08:00
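One common way to bound concurrent work, as the barrier in f8154c65c5 does, is a buffered channel used as a counting semaphore. The `barrier` type below is a simplified assumption and leaves out the dependency tracking a real validation barrier would need.

```go
package main

import "fmt"

// barrier limits how many validation jobs may be in flight at once.
type barrier struct {
	slots chan struct{}
}

func newBarrier(size int) *barrier {
	return &barrier{slots: make(chan struct{}, size)}
}

// acquire blocks until a slot is free.
func (b *barrier) acquire() { b.slots <- struct{}{} }

// tryAcquire takes a slot if one is free and reports whether it did.
func (b *barrier) tryAcquire() bool {
	select {
	case b.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by acquire or tryAcquire.
func (b *barrier) release() { <-b.slots }

func main() {
	b := newBarrier(1000)
	b.acquire()
	fmt.Println("in flight:", len(b.slots))
	b.release()
	fmt.Println("in flight:", len(b.slots))
}
```

A larger capacity lets more jobs queue up behind a slow batched commit, which matches the motivation given in the commit message.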
Conner Fromknecht
fb9218d100
discovery/gossiper: channel announcements can't be outdated 2020-11-24 16:38:14 -08:00
Andras Banki-Horvath
d89f51d1d0
multi: add reset closure to kvdb.Update
Similarly as with kvdb.View this commit adds a reset closure to the
kvdb.Update call in order to be able to reset external state if the
underlying db backend needs to retry the transaction.
2020-11-05 17:57:12 +01:00
Andras Banki-Horvath
2a358327f4
multi: add reset closure to kvdb.View
This commit adds a reset() closure to the kvdb.View function which will
be called before each retry (including the first) of the view
transaction. The reset() closure can be used to reset external state
(eg slices or maps) where the view closure puts intermediate results.
2020-11-05 17:57:12 +01:00
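Why the reset closure in 2a358327f4 matters can be shown with a toy retry loop. The `view` function below is a hypothetical stand-in with a simplified signature, not the kvdb API; it only reproduces the "reset before every attempt, including the first" behavior described in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("transaction conflict, retry")

// view runs f up to maxAttempts times and calls reset before every attempt.
func view(maxAttempts int, f func(attempt int) error, reset func()) error {
	err := errors.New("no attempts made")
	for attempt := 0; attempt < maxAttempts; attempt++ {
		reset()
		if err = f(attempt); err == nil {
			return nil
		}
	}
	return err
}

// collectKeys gathers results into a slice outside the transaction closure.
// The first attempt fails after appending, which simulates a backend retry.
func collectKeys(useReset bool) []string {
	var keys []string
	reset := func() {}
	if useReset {
		reset = func() { keys = nil }
	}
	err := view(3, func(attempt int) error {
		keys = append(keys, "a", "b")
		if attempt == 0 {
			return errConflict
		}
		return nil
	}, reset)
	if err != nil {
		return nil
	}
	return keys
}

func main() {
	fmt.Println("with reset:   ", collectKeys(true))
	fmt.Println("without reset:", collectKeys(false))
}
```

Without the reset, results from the failed attempt stay in the slice and the retry appends them a second time.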
yyforyongyu
ef38b12fda
multi: use timeout field in dialer 2020-09-16 11:50:04 +08:00
eugene
49d8f04197 multi: migrate instances of mockSigner to the mock package
This commit moves all localized instances of mock implementations of
the Signer interface to the lntest/mock package. This allows us to
remove a lot of code and have it housed under a single interface in
many cases.
2020-08-28 15:43:51 -04:00
Conner Fromknecht
cff52f7622
Merge pull request #4352 from matheusdtech/discovery-lock-premature
discovery: correctly lock premature messages
2020-06-26 22:50:09 -07:00
Brian Mancini
28931390ff discovery: prevent endBlock overflow in replyChanRangeQuery
Modifies syncer.replyChanRangeQuery method to use the LastBlockHeight
method on the query. LastBlockHeight safely calculates the ending
block height and prevents an overflow of start_block + num_blocks.

Prior to this change, query messages that had a start_block +
num_blocks that overflows uint32_max would return zero results in the
reply message.

Tests are added to cover the bug and ensure proper start and end values
are supplied to the channel graph filter.
2020-06-18 16:48:09 -04:00
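The overflow described in 28931390ff comes from adding two uint32 values. A sketch of an overflow-safe end height, assuming the range is inclusive and an empty range ends at its start; the free function `lastBlockHeight` is illustrative and is not the method from the commit.

```go
package main

import (
	"fmt"
	"math"
)

// lastBlockHeight returns the last block covered by a range that starts at
// firstBlock and spans numBlocks, saturating at the maximum uint32 instead
// of wrapping around.
func lastBlockHeight(firstBlock, numBlocks uint32) uint32 {
	if numBlocks == 0 {
		return firstBlock
	}
	// Widen to uint64 so the addition cannot overflow.
	last := uint64(firstBlock) + uint64(numBlocks) - 1
	if last > math.MaxUint32 {
		return math.MaxUint32
	}
	return uint32(last)
}

func main() {
	first, num := uint32(math.MaxUint32-5), uint32(100)

	// The naive sum wraps around to a small number, so a filter using it
	// as the end height matches nothing.
	fmt.Println("naive end:", first+num)
	fmt.Println("safe end: ", lastBlockHeight(first, num))
}
```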
Matheus Degiovani
44f83731bc discovery: Correctly lock premature announcements
This reworks the locking behavior of the Gossiper so that a race
condition on channel updates and block notifications doesn't cause any
loss of messages.

This fixes an issue that manifested mostly as flakes on itests during
WaitForNetworkChannelOpen calls.

The previous behavior allowed ChannelUpdates to be missed if they
happened concurrently to block notifications. The
processNetworkAnnoucement call would check for the current block height,
then lock the gossiper and add the msg to the prematureAnnoucements
list. New blocks would trigger an update to the current block height
then a lock and check of the aforementioned list.

However, especially during itests, it could happen that the missing lock
before checking the height could cause a race condition if the following
sequence of events happened:

- A new ChannelUpdate message was received and started processing on a
  separate goroutine
- The isPremature() call was made and verified that the ChannelUpdate
  was in fact premature
- The goroutine was scheduled out
- A new block started processing in the gossiper. It updated the block
  height, asked and was granted the lock for the gossiper and verified
  there were zero premature announcements. The lock was released.
- The goroutine processing the ChannelUpdate asked for the gossiper lock
  and was granted it. It added the ChannelUpdate in the
  prematureAnnoucements list. This can never be processed now.

The way to fix this behavior is to ensure that both isPremature checks
done inside processNetworkAnnoucement and best block updates are made
inside the same critical section (i.e. while holding the same lock) so
that they can't both check and update the prematureAnnoucements list
concurrently.
2020-06-05 15:58:33 -03:00
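The fix described in 44f83731bc, doing the premature check and the block update under one lock, can be sketched like this. The `gossiper` type and its methods are simplified assumptions rather than the actual lnd code.

```go
package main

import (
	"fmt"
	"sync"
)

// gossiper holds the best height and the premature list behind one mutex.
type gossiper struct {
	mu         sync.Mutex
	bestHeight uint32
	premature  map[uint32][]string
}

func newGossiper(bestHeight uint32) *gossiper {
	return &gossiper{bestHeight: bestHeight, premature: make(map[uint32][]string)}
}

// processUpdate checks the height and queues the message in the same
// critical section. It returns true if the message was processed at once.
func (g *gossiper) processUpdate(msgHeight uint32, msg string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if msgHeight > g.bestHeight {
		g.premature[msgHeight] = append(g.premature[msgHeight], msg)
		return false
	}
	return true
}

// newBlock bumps the height and drains matured messages under the same lock.
func (g *gossiper) newBlock(height uint32) []string {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.bestHeight = height
	matured := g.premature[height]
	delete(g.premature, height)
	return matured
}

// runRace sends n updates for height 101 while block 101 arrives and counts
// how many were handled and how many were left stranded in the list.
func runRace(n int) (handled, stranded int) {
	g := newGossiper(100)
	results := make(chan bool, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- g.processUpdate(101, "update")
		}()
	}
	matured := g.newBlock(101)
	wg.Wait()
	close(results)

	handled = len(matured)
	for processed := range results {
		if processed {
			handled++
		}
	}
	g.mu.Lock()
	stranded = len(g.premature[101])
	g.mu.Unlock()
	return handled, stranded
}

func main() {
	handled, stranded := runRace(100)
	fmt.Println("handled:", handled, "stranded:", stranded)
}
```

Because each update sees either the old height (and is queued before the drain) or the new height (and is processed directly), no message can be added to the list after its block has already been handled.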
Matheus Degiovani
ccc8f8e48f discovery: Log new blocks
This should help debug some flaky itests.
2020-06-05 13:31:40 -03:00
Conner Fromknecht
d0d2ca403d
multi: rename ReadTx to RTx 2020-05-26 18:20:37 -07:00
Roei Erez
ae2c37e043 Ensure chain notifier is started before accessed.
The use case comes from the RPC layer that is ready before the
chain notifier which is used in the sub server.
2020-04-30 12:54:33 +03:00
Conner Fromknecht
0f94b8dc62
multi: return input.Signature from SignOutputRaw 2020-04-10 14:27:35 -07:00
Conner Fromknecht
ec784db511
multi: remove returned error from WipeChannel
The linter complains about not checking the return value from
WipeChannel in certain places. Instead of checking we simply remove the
returned error because the in-memory modifications cannot fail.
2020-04-02 17:39:29 -07:00
Conner Fromknecht
4e793497c8
Merge pull request #2669 from cfromknecht/use-netann-in-discovery
netann+discovery+server: consolidate network announcements to netann pkg
2020-03-23 13:38:06 -07:00
Conner Fromknecht
92456d063d
discovery: remove unused updateChanPolicies struct 2020-03-19 13:43:57 -07:00
Conner Fromknecht
5c2fc4a2d6
discovery/gossiper: use netann pkg for signing channel updates 2020-03-19 13:43:39 -07:00
Olaoluwa Osuntokun
ace7a78494
discovery: convert to use new kvdb abstraction 2020-03-18 19:35:07 -07:00
Conner Fromknecht
089ac647d8
discovery/chan_series: use netann.ChannelUpdateFromEdge helper 2020-03-17 16:24:25 -07:00
Conner Fromknecht
7b0d564692
discovery: move remotePubFromChanInfo to gossiper, remove utils 2020-03-17 16:24:10 -07:00
Conner Fromknecht
6a813e3433
discovery/multi: move CreateChanAnnouncement to netann 2020-03-17 16:23:54 -07:00
Conner Fromknecht
d82aacbdc5
discovery/utils: use netann.ChannelUpdateFromEdge 2020-03-17 16:23:37 -07:00
Conner Fromknecht
df44d19936
discovery/multi: move SignAnnouncement to netann 2020-03-17 16:23:01 -07:00
Wilmer Paulino
57b69e3b1a
discovery: check ChainHash in QueryChannelRange messages
If the provided ChainHash in a QueryChannelRange message does not match
that of our current chain, then we should send a blank response, rather
than reply with channels for the wrong chain.
2020-01-17 11:51:09 -08:00
Wilmer Paulino
1bacdfb41e
discovery: interpret block range from ReplyChannelRange messages
We move from our legacy way of interpreting ReplyChannelRange messages
which was incorrect. Previously, we'd rely on the Complete field of the
ReplyChannelRange message to determine when our peer had sent all of
their replies. Now, we properly adhere to the specification by
interpreting the block ranges of these messages as intended.

Due to the large number of nodes deployed with the previous method, we
still maintain and detect when we are communicating with them, such that
we are still able to sync with them for backwards compatibility.
2020-01-06 14:03:13 -08:00
Wilmer Paulino
d688e13d35
discovery: remove unnecessary test check
It's not possible to send another reply once all replies have been sent
without another request. The purpose of the check is also done within
another test, TestGossipSyncerReplyChanRangeQueryNoNewChans, so it can
be removed from here.
2020-01-06 14:02:31 -08:00
Wilmer Paulino
c7c0853531
discovery: cover requested range in ReplyChannelRange messages
In order to properly adhere to the spec, when handling a
QueryChannelRange message, we must reply with a series of
ReplyChannelRange messages, that when consumed together cover the
entirety of the block range requested.
2020-01-06 14:00:15 -08:00
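A rough sketch of how a series of replies can cover the whole requested range, as c7c0853531 describes. The `reply` type, the `splitReplies` helper, and the handling of a block that straddles two chunks are assumptions; channels are represented only by their block heights, sorted ascending and lying within the requested range.

```go
package main

import "fmt"

// reply is a simplified stand-in for one ReplyChannelRange message.
type reply struct {
	firstBlock uint32
	numBlocks  uint32
	heights    []uint32 // block heights of the channels in this chunk
}

// splitReplies splits heights into chunks of at most chunkSize and assigns
// each chunk a block range so that the ranges together cover [first, last].
func splitReplies(first, last uint32, heights []uint32, chunkSize int) []reply {
	if chunkSize < 1 {
		chunkSize = 1
	}
	var replies []reply
	start := first
	for {
		// The final chunk always extends to the end of the requested range.
		if len(heights) <= chunkSize {
			return append(replies, reply{
				firstBlock: start,
				numBlocks:  last - start + 1,
				heights:    heights,
			})
		}
		chunk := heights[:chunkSize]
		heights = heights[chunkSize:]
		end := chunk[len(chunk)-1]
		replies = append(replies, reply{
			firstBlock: start,
			numBlocks:  end - start + 1,
			heights:    chunk,
		})
		// The next range starts right after this one, unless the next
		// channel sits in the same block as the last one sent.
		start = end + 1
		if heights[0] == end {
			start = end
		}
	}
}

func main() {
	for _, r := range splitReplies(100, 200, []uint32{100, 120, 120, 150, 199}, 2) {
		fmt.Printf("first=%d num=%d channels=%v\n", r.firstBlock, r.numBlocks, r.heights)
	}
}
```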
Wilmer Paulino
1f781ea431
discovery: use inclusive range in FilterChannelRange
FilterChannelRange takes an inclusive range, so it was possible for us
to return channels for an additional block that was not requested.
2020-01-06 14:00:14 -08:00
Johan T. Halseth
c04ef68cc3
Merge pull request #3826 from arik-so/wrong_chain_error_fix
fix order in wrong chain error message
2019-12-12 09:46:31 +01:00
Arik Sosman
e83df875ad
fix wrong chain error message 2019-12-11 17:43:24 -08:00
Olaoluwa Osuntokun
6a9b96122d
discovery: properly set FirstBlockHeight and NumBlocks in responses
In this commit we fix a bug in `lnd` that could cause other
implementations which implement a strict version of the spec to
disconnect when trying to sync their channel graph using the gossip
query feature. Before this commit, we would embed the request to a
`QueryChannelRange` in the response, causing some clients to reject the
response as the `FirstBlockHeight` and `NumBlocks` field would be
identical for each chunk of the response.

In order to remedy this, we now properly set these two fields with each
returned chunk. Note that even after this commit, we keep our existing
behavior surrounding the `Complete` field as is. Otherwise, current
`lnd` clients which rely on this field (rather than the two
aforementioned fields) wouldn't be able to properly detect when a set of
responses to their query was "complete".

Partially fixes #3728.
2019-12-10 17:05:58 -08:00
Conner Fromknecht
5e27b5022c
multi: remove LocalFeatures and GlobalFeatures 2019-11-08 05:32:00 -08:00
Conner Fromknecht
16318c5a41
multi: merge local+global features from remote peer 2019-11-08 05:31:47 -08:00
cryptagoras
0ad6c4748f
discovery/gossiper: fix minor typo
It was missing a space
"addingto waiting batch" -> "adding to waiting batch"
2019-10-15 14:16:03 +03:00
Olaoluwa Osuntokun
3f8526a0ca
peer+lnpeer: add new methods to expose local+global features for lnpeer interface 2019-09-25 18:26:01 -07:00
Wilmer Paulino
04a7cda3d5
Merge pull request #3534 from alrs/discovery-test-improvements
discovery: Goroutine Test Fixes and Linting
2019-09-25 16:12:30 -07:00
Lars Lehtonen
0cae1e69ab discovery: error string lint fixes
discovery: lint fix to remove append loop
2019-09-25 18:42:38 +00:00
Lars Lehtonen
58c23074d1 discovery: use error channels with test goroutines 2019-09-25 18:41:42 +00:00
Joost Jager
c80feeb4b3
routing+discovery: extract local channel manager
The policy update logic that resided part in the gossiper and
part in the rpc server is extracted into its own object.

This prepares for additional validation logic to be added for policy
updates that would otherwise make the gossiper heavier.

It is also a small first step towards separation of our own channel data
from the rest of the graph.
2019-09-23 13:07:08 +02:00
Joost Jager
4b2eb9cb81
discovery: push max htlc migration further up the call tree
As a preparation for making the gossiper less responsible for validating
and supplementing local channel policy updates, this commit moves the
on-the-fly max htlc migration up the call tree. The plan for a follow up
commit is to move it out of the gossiper completely for local channel
updates, so that we don't need to return a list of final applied policies
anymore.
2019-09-23 13:07:06 +02:00
Joost Jager
339ff357d1
channeldb: invalidate channel signature cache on update 2019-09-23 13:07:04 +02:00
Joost Jager
5090bb27ad
discovery: remove redundant signature setting
The signature is retrieved, not used and overwritten with a
new signature.
2019-09-23 13:07:02 +02:00
Conner Fromknecht
1d41d4d666
multi: move WaitPredicate, WaitNoError, WaitInvariant to lntest/wait 2019-09-19 12:46:29 -07:00
Johan T. Halseth
92123c603d
gossiper: retransmit self NodeAnnouncement 2019-09-16 10:54:42 +02:00
Johan T. Halseth
24004fcb37
gossiper+server: define SelfNodeAnnouncement 2019-09-16 10:54:42 +02:00
Johan T. Halseth
e36d15582c
discovery/gossiper test: add TestRetransmit
This commit adds a test that ensures outdated announcements are
retransmitted when the RetransmitTicker ticks.
2019-09-16 10:54:38 +02:00
Johan T. Halseth
70d63abe9f
discovery/test: set global test timestamp 2019-09-16 10:23:01 +02:00
Johan T. Halseth
8b9fd039ec
discovery/gossiper test: remove mockGraphSource.SelfEdges 2019-09-16 10:23:01 +02:00
Johan T. Halseth
e201fbe396
discovery+server: RetransmitDelay->RetransmitTicker
Also let retransmitStaleChannels take a timestamp, to make it easier to
test.
2019-09-16 10:23:01 +02:00
Johan T. Halseth
74c9551564
discovery+server: make RebroadcastInterval part of config 2019-09-16 10:23:00 +02:00
Johan T. Halseth
3d8f194670
discovery/gossiper: extract adding nodeAnnouncement into method 2019-09-16 10:23:00 +02:00
Joost Jager
3d7de2ad39
multi: remove dead code 2019-09-10 17:21:59 +02:00
Valentine Wallace
8ce7f82da0 discovery+switch: apply zero forwarding policy updates in-memory as well as on disk
In this commit, we fix a bug where if a user updates a forwarding policy to be
zero, the update will be applied to the policy correctly on-disk, but not
in-memory.

We solve this issue by having the gossiper return the list of on-disk updated
policies and passing these policies to the switch, so the switch can assume
that zero-valued fields are intentional and not just uninitialized.
2019-09-09 23:39:44 -07:00
Wilmer Paulino
2e122a807b
Merge pull request #3406 from cfromknecht/die-spew
pilot+discovery: die spew
2019-08-22 15:33:56 -07:00
Wilmer Paulino
e15e524637
discovery: prevent broadcast of anns received during initial graph sync
There's no need to broadcast these as we assume that online nodes have
already received them. For nodes that were offline, they should receive
them as part of their initial graph sync.
2019-08-21 12:06:33 -07:00
Conner Fromknecht
e2a53f71d0
pilot+discovery: remove info spews 2019-08-20 14:13:05 -07:00
Wilmer Paulino
c405e89197
discovery: check non-nil syncer upon historical sync tick 2019-08-13 18:23:05 -07:00
Wilmer Paulino
977c139f3c
discovery: handle graph synced status after stalled initial historical sync
This ensures that the graph synced status is marked true at some point
once a historical sync has completed. Before this commit, a stalled
historical sync could cause us to never mark the graph as synced.
2019-08-06 17:56:55 -07:00
Wilmer Paulino
af4234f680
discovery: allow the SyncManager to report whether the graph is synced 2019-08-06 17:56:54 -07:00
Conner Fromknecht
8cebddfe50
discovery/gossiper: thread IgnoreHistoricalFilters to sync manager 2019-07-30 17:25:47 -07:00
Conner Fromknecht
a3e690e253
discovery/sync_manager: init all syncers with IgnoreHistoricalFilters 2019-07-30 17:25:31 -07:00
Conner Fromknecht
35a2de23a3
discovery/syncer: add flag to prevent historical gossip filter dump 2019-07-30 17:25:15 -07:00
Conner Fromknecht
a4097c113a
discovery/gossiper: remove unused SynchronizeNode method 2019-07-15 14:11:18 -07:00
Conner Fromknecht
933e723ec7
Merge pull request #3178 from federicobond/once-refactor
multi: replace manual CAS with sync.Once in several more modules
2019-07-08 20:33:44 -07:00
Olaoluwa Osuntokun
9c957193cf
discovery: remove retries from DNS based SampleNodeAddrs, allow down seeds
In this commit, we modify the `SampleNodeAddrs` method to no longer
retry itself. Instead, we'll now leave this task to the caller of
this method. Additionally, we'll no longer return with an error if we
can't hit a particular seed. Instead, we'll log the error and move onto
the next seed. Finally, we'll also no longer require that the DNS seed
has a secondary seed in order to support a wider array of DNS seeds.
2019-06-28 16:10:47 -07:00
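The "log and move on" behavior described in 9c957193cf might look roughly like the sketch below. The `sampleNodeAddrs` signature, the `fakeLookup` helper, and the example hostnames are hypothetical; a real implementation would perform DNS queries.

```go
package main

import (
	"errors"
	"fmt"
)

var errSeedDown = errors.New("seed unreachable")

// fakeLookup stands in for a DNS query against a seed.
func fakeLookup(seed string) ([]string, error) {
	if seed == "down.example" {
		return nil, errSeedDown
	}
	return []string{seed + ":9735"}, nil
}

// sampleNodeAddrs tries each seed once. A failing seed is logged and
// skipped instead of aborting the whole call; retrying is left to the caller.
func sampleNodeAddrs(seeds []string, want int, lookup func(string) ([]string, error)) []string {
	var addrs []string
	for _, seed := range seeds {
		found, err := lookup(seed)
		if err != nil {
			fmt.Printf("skipping seed %s: %v\n", seed, err)
			continue
		}
		addrs = append(addrs, found...)
		if len(addrs) >= want {
			break
		}
	}
	return addrs
}

func main() {
	seeds := []string{"down.example", "a.example", "b.example"}
	fmt.Println(sampleNodeAddrs(seeds, 2, fakeLookup))
}
```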
Wilmer Paulino
67132d4ee3
discovery: set source of node announcement broadcast to belonging node
We do this to ensure the node announcement propagates to our channel
counterparty. At times, the node announcement does not propagate to them
when opening our first channel due to a race condition between
IsPublicNode and processing announcement signatures. This isn't
necessary for channel updates and announcement signatures as we send
those to our channel counterparty directly through the reliable sender.
2019-06-20 13:59:37 -07:00
Federico Bond
0a9141763e multi: replace manual CAS with sync.Once in several more modules 2019-06-12 09:37:26 -03:00
Federico Bond
9bd3055fb8 discovery,fundingmanager: avoid serialization in NotifyWhenOnline 2019-06-04 16:36:21 -03:00
Conner Fromknecht
f8287b0080
Merge pull request #2985 from johng/sub-batch
Broadcast gossip announcements in sub batches
2019-05-28 17:05:06 -07:00
Johan T. Halseth
6ba6982ae7
discovery/sync_manager_test: add TestSyncManagerHistoricalSyncOnReconnect
TestSyncManagerHistoricalSyncOnReconnect tests that the sync manager will
re-trigger a historical sync when a new peer connects after a historical
sync has completed, but we have lost all peers.
2019-05-24 11:05:30 +02:00
Johan T. Halseth
526486ae24
discovery/sync_manager: restart historical sync on first connected peer
To handle the case where we have been without peers, and get a new
connection, we reset the historical scan booleans when the first active
syncer is connected to trigger another historical sync.
2019-05-24 11:05:29 +02:00
John Griffith
cf2885dd4a discovery: test we calculate and generate correct sub batch sizes 2019-05-23 10:51:25 +01:00
John Griffith
9eb5fe9587 discovery: split gossiper announcement into sub batches 2019-05-23 10:51:25 +01:00
Johan T. Halseth
c4415f0400
Merge pull request #3044 from cfromknecht/spelling-fixes
multi: fix spelling mistakes
2019-05-07 08:50:36 +02:00
Conner Fromknecht
17b2140cb5
multi: fix spelling mistakes 2019-05-04 15:35:37 -07:00
Olaoluwa Osuntokun
985902be27
Merge pull request #2916 from cfromknecht/split-syncer-query-reply
discovery: make gossip replies synchronous
2019-04-29 17:40:13 -07:00
Johan T. Halseth
ee257fd0eb
multi: move Route to sub-pkg routing/route 2019-04-29 14:52:33 +02:00
Conner Fromknecht
5251ebe117
discovery/syncer_test: fix benign off-by-one in DelayDOS test
Prior to this change, the numQueryResponses that we calculated would be
one more than what we actually wanted since it didn't account for the
initial QueryChannelRange msg. This resulted in the test sending one
extra delayed query than was configured. This doesn't fundamentally
impact the test, but does make what happens in the test more reflective
of the configuration.
2019-04-26 20:05:23 -07:00
Conner Fromknecht
bf4543e2bd
discovery/syncer: make gossip sends synchronous
This commit makes all replies in the gossip syncer synchronous, meaning
that they will wait for each message to be successfully written to the
remote peer before attempting to send the next. This helps throttle
messages the remote peer has requested, preventing unintended
disconnects when the remote peer is slow to process messages. This
change also helps with congestion in the peer by forcing the syncer to
buffer the messages instead of dumping them into the peer's queue.
2019-04-26 20:05:10 -07:00
Conner Fromknecht
23d10336c2
discovery/syncer: separate query + reply handlers
This commit creates a distinct replyHandler, completely isolating the
requesting state machine from the processing of queries from the remote
peer. Before the two were interlaced, and the syncer could only reply to
messages in certain states. Now the two will be completely separated,
which is a preliminary step to make the replies synchronous (as otherwise
we would be blocking our own requesting state machine).

With this change, the channelGraphSyncer of each peer will drive the
replyHandler of the other. The two can now operate independently, or
even spun up conditionally depending on advertised support for gossip
queries, as shown below:

          A                                 B
 channelGraphSyncer ---control-msg--->
                                        replyHandler
 channelGraphSyncer <--control-msg----
           gossiper <--gossip-msgs----

                    <--control-msg---- channelGraphSyncer
       replyHandler
                    ---control-msg---> channelGraphSyncer
                    ---gossip-msgs---> gossiper
2019-04-26 20:03:14 -07:00
Wilmer Paulino
d68842ee9e
discovery: queue active syncers until initial historical sync signal
In this commit, we begin to queue any active syncers until the initial
historical sync has completed. We do this to ensure we can properly
handle any new channel updates at tip. This is required for fresh nodes
that are syncing the channel graph for the first time. If we begin
accepting updates at tip while the initial historical sync is still
ongoing, then we risk not processing certain updates since we've yet to
learn of the channels themselves.
2019-04-24 13:20:57 -07:00
Wilmer Paulino
07136a5bc2
discovery: handle initial historical sync disconnection
In this commit, we add logic to handle a peer with whom we're performing
an initial historical sync disconnecting. This is required to ensure we
get as much of the graph as possible when starting a fresh node. It will
also serve useful to ensure we do not get stalled once we prevent active
GossipSyncers from starting until the initial historical sync has
completed.
2019-04-24 13:20:55 -07:00
Wilmer Paulino
227e492ccf
discovery: make historicalSync transition synchronous
We do this to ensure that the state transition from chansSynced to
syncingChans has occurred by the time we return back to the caller.
2019-04-24 13:20:18 -07:00
Wilmer Paulino
72e9674cff
discovery: simplify chooseRandomSyncer helper 2019-04-24 13:20:16 -07:00
Wilmer Paulino
29baa12254
discovery: synchronize new/stale GossipSyncers with syncerHandler
Now that the roundRobinHandler is no longer present, this commit aims to
clean up and simplify some of the logic surrounding initializing/tearing
down new/stale GossipSyncers from the SyncManager. Along the way, we
also synchronize these calls with the syncerHandler, which will serve
useful in future work that allows us to recover from initial historical
sync disconnections.
2019-04-24 13:19:09 -07:00
Wilmer Paulino
5db2cf6273
discovery+server: remove roundRobinHandler and related code
Since ActiveSync GossipSyncers no longer synchronize our state with the
remote peers, none of the logic surrounding the round-robin is required
within the SyncManager.
2019-04-24 13:19:07 -07:00
Wilmer Paulino
9a6e8ecb9e
discovery: remove channel synchronization from ActiveSync GossipSyncers
In this commit, we remove the ability for ActiveSync GossipSyncers to
synchronize our graph with our remote peers. This serves as a starting
point towards allowing the daemon to only synchronize our graph through
historical syncs, which will be routinely done by the SyncManager.
2019-04-24 13:19:06 -07:00
Wilmer Paulino
aed0c2a90e
discovery: support optional message fields when processing announcements
In this commit, we extend the gossiper with support for external callers
to provide optional fields that can serve as useful when processing a
specific network announcement. This will serve useful for light clients,
which are unable to obtain the channel point and capacity for a given
channel, but can provide them manually for their own set of channels.
2019-04-18 21:57:39 -07:00
Wilmer Paulino
90475d5339
discovery: check nil policy within isMsgStale
If both policies don't exist, then this would result in a panic. Since
they don't exist, we can assume the policy we're currently evaluating is
fresh.
2019-04-15 12:49:34 -07:00
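The nil guard from 90475d5339 in miniature. The `policy` type and the rule that a message is stale only when the stored policy is strictly newer are assumptions for this sketch.

```go
package main

import "fmt"

// policy is a simplified stand-in for a stored channel edge policy.
type policy struct {
	lastUpdate uint32 // timestamp of the stored update
}

// isStale reports whether an incoming update with msgTimestamp is older
// than what is already stored. A missing policy means nothing is stored
// yet, so the incoming update is treated as fresh.
func isStale(existing *policy, msgTimestamp uint32) bool {
	if existing == nil {
		return false
	}
	return existing.lastUpdate > msgTimestamp
}

func main() {
	fmt.Println(isStale(nil, 1000))
	fmt.Println(isStale(&policy{lastUpdate: 2000}, 1000))
}
```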
Wilmer Paulino
4bb4b0fe4e discovery: increase DefaultHistoricalSyncInterval to one hour
Assuming a graph size of 50,000 channels, an interval of 20 minutes
would cause nodes to consume about 600MB per month in bandwidth doing
these routine historical sync spot checks. In this commit, we increase
to one hour, which consumes about 300MB per month.
2019-04-11 15:46:22 -07:00
Olaoluwa Osuntokun
53beed7aaf
discovery: send policy updates for private channels directly to the remote peer
In this commit, we modify the main loop in `processChanPolicyUpdate` to
send updates for private channels directly to the remote peer via the
reliable message sender. This fixes a prior issue where the remote peer
wouldn't receive new updates as this method doesn't go through the
traditional path for channel updates.
2019-04-10 17:05:51 -07:00
Olaoluwa Osuntokun
a9d6273828
discovery: add new TestPropagateChanPolicyUpdate test case
In this commit, we add a new test case to exercise a recent bug fix to
ensure that we no longer broadcast private channel policy changes. Along
the way, a few helper functions were added to slim down the test to the
core logic compared to some of the existing tests in this package. In
the future, these new helper functions should be utilized more widely for
tests in this package in order to cut down on some of the duplicated
logic.
2019-04-10 17:05:49 -07:00
Olaoluwa Osuntokun
a6ae397f8c
discovery: ensure we don't broadcast policy changes for private channels 2019-04-10 17:05:48 -07:00
Olaoluwa Osuntokun
921eea9f57
discovery: update mockGraphSource to implement ForAllOutgoingChannels 2019-04-10 17:05:47 -07:00
Olaoluwa Osuntokun
aba32de1f4
discovery: set AnnSigner for mock signer in testCtx 2019-04-10 17:05:44 -07:00
Olaoluwa Osuntokun
13b91e6ea1 discovery: properly set short chan IDs for ann sigs in tests 2019-04-10 17:05:37 -07:00
Olaoluwa Osuntokun
3e5a6f1022
Merge pull request #2905 from cfromknecht/split-chunk-size
discovery: make batch size distinct from chunk size, reduce to 500
2019-04-09 19:27:20 -07:00
Conner Fromknecht
a4b4fe666a
discovery: make batch size distinct from chunk size, reduce to 500
This commit reduces the number of channels a syncer will request from
the remote node in a single QueryShortChanIDs message. The current size
is derived from the chunkSize, which is meant to signal the maximum
number of short chan ids that can fit in a single ReplyChannelRange
message. For EncodingSortedPlain, this number is 8000, and we use the
same number to dictate the size of the batch from the remote peer.

We modify this by introducing a separately configurable batchSize, so
that both can be tuned independently. The value is chosen to reduce the
amount of buffering the remote party will perform, only requiring them
to queue 500 responses, as opposed to 8000. In turn, this reduces large
spikes in allocation on the remote node at the expense of a few extra
round trips for the control messages. However, this will be negligible since
the control messages are much smaller than the messages being returned.
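
As a rough illustration of the batching described above, the sketch below splits a list of short channel IDs into batches of at most 500. The names `batchSize` and `splitIntoBatches` are hypothetical and chosen for this example; they are not taken from the lnd source.

```go
package main

import "fmt"

// batchSize is a hypothetical constant mirroring the value named in the
// commit message: the number of short channel IDs requested per query.
const batchSize = 500

// splitIntoBatches slices ids into consecutive batches of at most size
// entries. The returned batches share the backing array of ids.
func splitIntoBatches(ids []uint64, size int) [][]uint64 {
	if size <= 0 {
		return nil
	}
	var batches [][]uint64
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	return batches
}

func main() {
	// 8000 IDs is the chunk size quoted for EncodingSortedPlain.
	ids := make([]uint64, 8000)
	for i := range ids {
		ids[i] = uint64(i)
	}
	batches := splitIntoBatches(ids, batchSize)
	fmt.Println(len(batches)) // prints 16
}
```

With a full chunk of 8000 IDs, the remote peer would be asked for 16 smaller batches instead of one large one.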
2019-04-06 15:27:26 -07:00
Conner Fromknecht
9df6af237e
discovery/sync_manager: restore lazy gossip sends 2019-04-06 03:32:03 -07:00
Wilmer Paulino
00338c5ec2
discovery: properly handle SyncManager shutdown signal 2019-04-03 19:32:56 -07:00
Wilmer Paulino
46ceaf8cf6
discovery: only replace stale active syncer if disconnected
In this commit, we address a bug where we'd attempt to replace the
stale active syncer when it transitioned to a passive syncer. This
replacement logic is only intended to happen when the active syncer
disconnects, as rotateActiveSyncerCandidate chooses and queues its own
replacement.
2019-04-03 16:43:31 -07:00
Wilmer Paulino
8b6a9bb5d3
discovery: make timestamp range check inclusive within FilterGossipMsgs
As required by the spec:

> SHOULD send all gossip messages whose timestamp is greater or equal to
first_timestamp, and less than first_timestamp plus timestamp_range.
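
A minimal sketch of the range check the quoted spec text describes, assuming plain `uint32` Unix timestamps. The function name `inTimestampRange` is hypothetical, and the actual FilterGossipMsgs code may represent times differently.

```go
package main

import "fmt"

// inTimestampRange reports whether ts falls in the half-open interval
// [first, first+rng): inclusive at the start, exclusive at the end.
// The sum is computed in uint64 so that first+rng cannot overflow.
func inTimestampRange(ts, first, rng uint32) bool {
	end := uint64(first) + uint64(rng)
	return uint64(ts) >= uint64(first) && uint64(ts) < end
}

func main() {
	// A message stamped exactly at first_timestamp is included,
	// one stamped exactly at the end of the range is not.
	fmt.Println(inTimestampRange(100, 100, 50), inTimestampRange(150, 100, 50))
	// prints: true false
}
```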
2019-04-03 15:44:46 -07:00
Wilmer Paulino
70be812747
discovery+server: use new gossiper's SyncManager subsystem 2019-04-03 15:44:43 -07:00
Wilmer Paulino
a188657b2f
discovery: introduce gossiper SyncManager subsystem
In this commit, we introduce a new subsystem for the gossiper: the
SyncManager. This subsystem is a major overhaul on the way the daemon
performs the graph query sync state machine with peers.

Along with this subsystem, we also introduce the concept of an active
syncer. An active syncer is simply a GossipSyncer currently operating
under an ActiveSync sync type. Before this commit, all GossipSyncers
would act as active syncers, which means that we were receiving new
graph updates from all of them. This isn't necessary, and it greatly
increases bandwidth usage as the network grows. The SyncManager changes
this by requiring a specific number of active syncers. Once we reach
this specified number, any future peers will have a GossipSyncer with a
PassiveSync sync type.

The SyncManager is responsible for three main things:

1. Choosing different peers randomly to receive graph updates from to
ensure we don't only receive them from the same set of peers.

2. Choosing different peers to force a historical sync with to ensure we
have as much of the public network as possible. The first syncer
registered with the manager will also attempt a historical sync.

3. Managing an in-order queue of active syncers where the next cannot be
started until the current one has completed its state machine to ensure
they don't overlap and request the same set of channels, which
significantly reduces bandwidth usage and addresses a number of issues.
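
The active/passive split described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the `syncManager` type, its fields, and `register` are hypothetical names for this example, and the sketch ignores the random rotation, historical syncs, and in-order queueing that the commit describes.

```go
package main

import "fmt"

// SyncerType is a simplified stand-in for the sync type of a syncer.
type SyncerType int

const (
	ActiveSync SyncerType = iota
	PassiveSync
)

// syncManager is a hypothetical, stripped-down manager that only tracks
// which peers are active and which are passive.
type syncManager struct {
	numActiveSyncers int
	active           map[string]struct{}
	passive          map[string]struct{}
}

func newSyncManager(numActive int) *syncManager {
	return &syncManager{
		numActiveSyncers: numActive,
		active:           make(map[string]struct{}),
		passive:          make(map[string]struct{}),
	}
}

// register assigns ActiveSync until the configured number of active
// syncers is reached; every later peer becomes a passive syncer.
func (m *syncManager) register(peer string) SyncerType {
	if len(m.active) < m.numActiveSyncers {
		m.active[peer] = struct{}{}
		return ActiveSync
	}
	m.passive[peer] = struct{}{}
	return PassiveSync
}

func main() {
	m := newSyncManager(2)
	for _, peer := range []string{"alice", "bob", "carol"} {
		fmt.Println(peer, m.register(peer) == ActiveSync)
	}
}
```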
2019-04-03 15:08:32 -07:00
Wilmer Paulino
e075817e44
discovery: introduce GossipSyncer signal delivery of chansSynced state
In this commit, we introduce another feature to the GossipSyncer in
which it can deliver a signal to an external caller once it reaches its
terminal chansSynced state. This is yet to be used, but will prove
useful with a round-robin sync mechanism, where we wait to finish
syncing with a specific peer before moving on to the next.
2019-04-03 15:08:31 -07:00
Wilmer Paulino
042241dc48
discovery: allow gossip syncer to perform historical syncs
In this commit, we introduce the ability for gossip syncers to perform
historical syncs. This allows us to reconcile any channels we're missing
that the remote peer has starting from the genesis block of the chain.
This commit serves as a prerequisite to the SyncManager, introduced in a
later commit, where we'll be able to make spot checks by performing
historical syncs with peers to ensure we have as much of the graph as
possible.
2019-04-03 15:08:30 -07:00
Wilmer Paulino
ca4fbd598c
discovery: introduce GossipSyncer sync transitions
In this commit, we introduce the ability for GossipSyncers to
transition their sync type. This allows us to be more flexible with our
gossip syncers, as we can now prevent them from receiving new graph
updates at any time. It's now possible to transition between the
different sync types, as long as the GossipSyncer has reached its
terminal chansSynced sync state. Certain transitions require some
additional wire messages to be sent, like in the case of an ActiveSync
GossipSyncer transitioning to a PassiveSync type.
2019-04-03 15:08:29 -07:00
Wilmer Paulino
acc42c1b68
discovery: set GossipSyncer update horizon to current time
With the introduction of the gossip sync manager in a later commit,
retrieving the backlog of updates within the last hour is no longer
necessary as we'll be forcing full syncs periodically.
2019-04-03 15:08:28 -07:00
Wilmer Paulino
8d7c0a9899
discovery: replace GossipSyncer syncChanUpdates flag with SyncerType
In this commit, we introduce a new type: SyncerType. This type denotes
the type of sync a GossipSyncer is currently under. We only introduce
the two possible entry states, ActiveSync and PassiveSync. An ActiveSync
GossipSyncer will exchange channels with the remote peer and receive new
graph updates from them, while a PassiveSync GossipSyncer will not and
will only respond to the remote peer's queries.

This commit does not modify the behavior and is only meant to be a
refactor.
2019-04-03 15:08:27 -07:00
Wilmer Paulino
7e92b9a4e2
discovery: export gossipSyncer 2019-04-03 15:08:26 -07:00
Wilmer Paulino
d954cfc4ba
discovery: include peerPub in gossipSyncerCfg 2019-04-03 15:08:25 -07:00
Wilmer Paulino
c72db902f0
discovery: replace waitPredicate with lntest version 2019-04-03 15:08:12 -07:00
Wilmer Paulino
0ab97957ea discovery: check if stale within isMsgStale for ChannelUpdate messages
In this commit, we address an assumption of the gossiper's recently
introduced reliable sender. The reliable sender is currently only used
for messages of unannounced channels. This makes sense as peers should
be able to retrieve messages from the network if they've previously
announced. However, within isMsgStale, we assumed that the reliable
sender would be used for every ChannelUpdate being sent, even if the
channel is already announced. Due to this, checking if the policy is
stale was unnecessary. But since this isn't the case, we should actually
be checking whether it is stale to prevent sending it later on.
2019-03-28 17:22:23 -07:00
Wilmer Paulino
93414bd27a discovery: clean up TestSignatureAnnouncementRetryAtStartup
In this commit, we remove code from
TestSignatureAnnouncementRetryAtStartup that is not crucial to the
assumptions the test is exercising.
2019-03-28 17:22:23 -07:00
Wilmer Paulino
95ed11b01b discovery: properly store and retrieve edge policies within mockGraphSource
In this commit, we address an issue with our router mock in which it was
not properly storing and retrieving edge policies. Previously, they were
being appended to a slice of policies, but this doesn't always work, such as
when you attempt to update the same edge twice. Instead, the slice can
only contain up to two entries, each one being the latest version of
each direction.
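
The replace-or-append behavior described above might look roughly like the sketch below. The `edgePolicy` and `mockGraph` types and the `updateEdge` method are hypothetical simplifications for this example, not the mock's actual definitions.

```go
package main

import "fmt"

// edgePolicy is a simplified stand-in for a channel edge policy.
type edgePolicy struct {
	ChannelID  uint64
	Direction  uint8 // 0 or 1, one per side of the channel
	LastUpdate int64
}

// mockGraph keeps at most two policies per channel, one per direction.
type mockGraph struct {
	edges map[uint64][]edgePolicy
}

func newMockGraph() *mockGraph {
	return &mockGraph{edges: make(map[uint64][]edgePolicy)}
}

// updateEdge overwrites the stored policy for the same direction if one
// exists, and appends otherwise, so repeated updates never grow the
// slice beyond two entries.
func (g *mockGraph) updateEdge(p edgePolicy) {
	policies := g.edges[p.ChannelID]
	for i := range policies {
		if policies[i].Direction == p.Direction {
			policies[i] = p
			return
		}
	}
	g.edges[p.ChannelID] = append(policies, p)
}

func main() {
	g := newMockGraph()
	g.updateEdge(edgePolicy{ChannelID: 1, Direction: 0, LastUpdate: 10})
	g.updateEdge(edgePolicy{ChannelID: 1, Direction: 0, LastUpdate: 20})
	g.updateEdge(edgePolicy{ChannelID: 1, Direction: 1, LastUpdate: 15})
	fmt.Println(len(g.edges[1]), g.edges[1][0].LastUpdate) // prints 2 20
}
```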
2019-03-28 17:22:22 -07:00
Wilmer Paulino
3a2c4ec594 discovery: add missing offline peer check before sending message reliably 2019-03-28 17:21:28 -07:00
Wilmer Paulino
3e81e89062 discovery: add log when attempting to send msg reliably and peer is offline 2019-03-28 17:21:28 -07:00
Wilmer Paulino
5cec4513de
discovery: reject announcements for known zombie edges
In this commit, we leverage the recently introduced zombie edge index to
quickly reject announcements for edges we've previously deemed as
zombies. Care has been taken to ensure we don't reject fresh updates for
edges we've considered zombies.
2019-03-27 13:08:03 -07:00
Wilmer Paulino
23796d3247
routing+discovery: extend ChannelGraphSource with zombie index methods 2019-03-27 13:07:30 -07:00
Conner Fromknecht
0ae06c8189
discovery+server: send lazy gossip msgs 2019-03-05 17:08:48 -08:00
Conner Fromknecht
f39edd8000
peer: add SendMessageLazy 2019-03-05 17:08:22 -08:00
Wilmer Paulino
12168f022e
server+discovery: send channel updates to remote peers reliably
In this commit, we also allow channel updates for our channels to be
sent reliably to our channel counterparty. This is especially crucial
for private channels, since they're not announced, in order to ensure
each party can receive funds from the other side.
2019-02-14 18:33:27 -08:00
Wilmer Paulino
4996d49118
server+discovery: use reliableSender to replace existing resend logic 2019-02-14 18:33:27 -08:00
Wilmer Paulino
2f679f6015
discovery/reliable_sender: implement message-agnostic reliable sender
In this commit, we implement a new subsystem for the gossiper that
uses some of the existing logic for resending channel announcement
signatures and implements it in a way to make it message-agnostic,
meaning that any type of message can be resent. Along the way we also
modify the way this works to prevent multiple goroutines per peer _and_
message.

A peerHandler will be spawned for each peer to which we attempt to send
a message reliably. This handler is responsible for managing requests
to reliably send messages to a peer while also taking the peer's
connection lifecycle into account by requesting notifications for when
the peer connects/disconnects. A peer connection notification is first
requested to determine when we should attempt to send any pending
messages. After the messages are sent, a peer disconnection notification
is requested to ensure we don't continue to request connection
notifications while the peer remains connected. Once there are no more
pending messages left to be sent for a given peer, the peerHandler can
be torn down.
2019-02-14 18:33:27 -08:00
Wilmer Paulino
6e556aa897
discovery/gossiper_test: prevent race conditions within mockGraphSource 2019-02-14 18:33:27 -08:00
Wilmer Paulino
73b4bc4b68
server+discovery: remove channeldb.DB reference within the gossiper
Now that we've replaced the built-in messageStore with the
channeldb.GossipMessageStore, the reference to channeldb.DB is no longer
needed.
2019-02-14 18:29:39 -08:00
Wilmer Paulino
2277535e6b
server+discovery: replace gossiper message store with MessageStore 2019-02-14 18:29:39 -08:00
Wilmer Paulino
847b064461
discovery/message_store: add gossip message store
In this commit, we add a new store within the database that'll be
responsible for storing gossip messages which we need to reliably send
to peers. This aims to replace the current messageStore that exists
within the gossiper, so much of this logic is borrowed from there.
One of the main differences between the two is that we now index
messages with a new key format in which we take into account the
message's type. This allows us to store different messages for a
specific channel with a peer. The old key format is still supported in
order to prevent a database migration.
2019-02-14 18:29:39 -08:00