Only include the final hop's cltv delta in the total timelock
calculation if the route does not include a blinded path. This is
because in a blinded path, the final hops final cltv delta will be
included in the blinded path's accumlated cltv delta value.
With this commit, we remove the responsibility of remembering not to set
the `finalHop.cltvDelta` from the caller of `newRoute`. The relevant
test is updated accordingly.
Later on in this series, we will need to know during path finding if an
edge we are traversing was derived from a blinded payment path. In
preparation for that, we add a BlindedPayment member to the
`unifiedEdge` struct.
The reason we will need this later on is because: In the case where we
receive multiple blinded paths from the receipient, we will first swap
out the final hop node of each path with a single unified target node so
that path finding can work as normal. Once we have selected a route
though, we will want to know which path an edge belongs to so that we
can swap the correct destination node back in.
Add a constructor for unified edge. In upcoming commits, we will add a
new member to unifiedEdge and a constructor forces us to not forget to
populate a required member.
Expand the AdditionalEdges interface with a BlindedPayment method. In
upcoming commits, we will want to know if an AdditionalEdge was derived
from a blinded payment or not and we will also need some information
from the blinded payment it was derived from. So we expand the interface
here to avoid needing to do type casts later on. The new method may
return nil if the edge was not derived from a blinded payment.
This commit is purely a refactor. In it, we let the `BlindedEdge` struct
carry a pointer to the `BlindedPayment` that it was derived from. This
is done now because later on in the PR series, we will need more
information about the `BlindedPayment` that an edge was derived from.
Since we now pass in the whole BlindedPayment, we swap out the
`cipherText` member for a `hopIndex` member so that we dont carry around
two sources of truth in the same struct.
This commit removes another check for TLV payload support of the
destination node. We assume TLV payloads as the default everywhere else,
so we just remove two checks that were previously forgotten.
This removes duplication of in-memory data during the periodic flushing
stage of the mission control store.
The existing code entirely duplicates the in-memory cache of the store,
which is very wasteful when only a few additional results are being
rotated into the store.
This has a significant performance penalty specially for wallets that
remain online for a long time with a low volume of payments. The worst
case scenario are wallets that see at most 1 new payment a second, where
the entire in-memory cache is recreated every second.
This commit improves the situation by determining what will be the
actual changes that need to be committed before initiating the db
transaction and only keeping track of these to update the in-memory
cache if the db tx is successful.
This modifies the mission control store to avoid running the one second
ticker for flushing data when there is no work to be done.
This improves performance of a quiscent LN node by avoiding a one second
interval busy loop that does nothing when there are no payments flowing
through the node.
This modifies the mission control store to avoid doing any work when no
new payment result entries are in the queue to be processed.
The mission control store maintains keeps the latest N (in production:
1000) entries in its DB, evicting older entries when new ones are added.
Currently, its implementation is somewhat less performant than it could
be.
This commit adds an early return to the storeResults function to avoid
doing any DB or memory operations when its outstanding queue is empty,
improving the performance during quiescent periods of the LN node's
execution.
In this commit we set up the payment loop context
according to user-provided parameters. The
`cancelable` parameter indicates whether the user
is able to interrupt the payment loop by cancelling
the server stream context. We'll additionally wrap
the context in a deadline if the user provided a
payment timeout.
We remove the timeout channel of the payment_lifecycle.go
and in favor of the deadline context.
Fixes the problem that inbound base fee and fee rate are overwritten
with 0 if they are not specified in PolicyUpdateRequest. This ensures
backward compatibility with older rpc clients that do not yet support
the inbound feature.
In this commit, the tlv extension of a channel update message is parsed.
If an inbound fee schedule is encountered, it is reported in the
graph rpc calls.
Since we fixed a number of issues in chainntnfs.NewBitcoindBackend that
makes it compatible with bitcoind v26.0, we now want to use that
function in all our unit tests.
With the chainntnfs.NewMiner now being optimized for not creating
nodes with colliding ports, we use it in all unit tests that spin up
temporary miners.
The final hop size is calculated differently therefore we extract
the logic in its own function and also account for the case where
the final hop might be a blinded hop.
In the previous commit the AdditionalEdge interface was introduced
with both of its implementations `BlindedEdge` and `PrivateEdge`.
In both cases where we append a route either by a blinded route
section or private route hints we now use these new types. In
addition the `PayloadSizeFunc` type is introduced in the
`unifiedEdge` struct. This is necessary to have the payload size
function at hand when searching for a route hence not overshooting
the max sphinx package size of 1300 bytes.
A new interface AdditionalEdge is introduced which allows us to
enhance the CachedEdgePolicy with additional information. We
currently only need a PayloadSize function which is then used to
determine the exact payload size when building a route to adhere
to the sphinx package size limit (BOLT04).
In this commit, the FetchChanInfos ChannelGraph method is updated to
take in an optional read transaction for the case where it is called
from within another transaction.
We do not expect blinding errors from the final node:
1. If the introduction is the recipient, they should use regular errors.
2. Otherwise, nodes have no business sending this error when they are
not part of a blinded route.
Note: this refactor updates the inequality used from >= 2 to > 1 to
align with the rest of this file so that we express this concept
consistently throughout the code.
This commit adds handling for route blinding errors that are reported
by the introduction node in a multi-hop blinded route. As the
introduction node is always responsible for handling blinded errors,
it is not penalized - only the final hop is penalized to discourage the
blinded route without filling up mission control with ephemeral
results.
If this error code is reported by a node that is not an introduction
node, we penalize the node because it is returning an error code that
it should not be using.
This commit adds handling for errors that originate after the
introduction node when making payment to a blinded route. This
indicates that the introduction node is not obeying the spec, so
it is punished for the violation.
Previously, we'd use the value of nextChanID to infer whether a payload
was for the final hop in a route. This commit updates our packing logic
to explicitly signal to account for blinded routes, which allow zero
value nextChanID in intermediate hops. This is a preparatory commit
that allows us to more thoroughly validate payloads.
Add the missing channel field to the final hop in our clear text
route test case. Note that this is the channel of the hop. With the
addition of stricter validation, we'll need this so that the
penultimate hop has a non-zero next channel ID.
This commit introduces a wrapper function fetchFundingTxWrapper
which calls fetchFundingTx in a goroutine. This is to avoid an issue
with pruned nodes where the router is attempting to stop, but the
prunedBlockDispatcher is waiting to connect to peers that can serve
the block. This can cause the shutdown process to hang until we
connect to a peer that can send us the block.
This commit makes sure a testing payment is created via
`createDummyLightningPayment` to ensure the payment hash is unique to
avoid collision of the same payment hash being used in uint tests. Since
the tests are running in parallel and accessing db, if two difference
tests are using the same payment hash, no clean test state can be
guaranteed.
This commit adds unit tests for `resumePayment`. In addition, the
`resumePayment` has been split into two parts so it's easier to be
tested, 1) sending the htlc, and 2) collecting results. As seen in the
new tests, this split largely reduces the complexity involved and makes
the unit test flow sequential.
This commit also makes full use of `mock.Mock` in the unit tests to
provide a more clear testing flow.
The old payment lifecycle is removed due to it's not "unit" -
maintaining these tests probably takes as much work as the actual
methods being tested, if not more so. Moreover, the usage of the old
mockers in current payment lifecycle test is removed as it re-implements
other interfaces and sometimes implements it uniquely just for the
tests. This is bad as, not only we need to work on the actual interface
implementations and test them , but also re-implement them again in the
test without testing them!
This commit adds a new interface method `TerminalInfo` and changes its
implementation to return an `*HTLCAttempt` so it includes the route for
a successful payment. Method `GetFailureReason` is now removed as its
returned value can be found in the above method.
This commit adds a new struct, `stateStep`, to decide the workflow
inside `resumePayment`.
It also refactors `collectResultAsync` introducing a new channel
`resultCollected`. This channel is used to signal the payment
lifecycle that an HTLC attempt result is ready to be processed.
This commit makes sure we only fail attempt inside `handleSwitchErr` to
ensure the orders in failing payment and attempts. It refactors
`collectResult` to return `attemptResult`, and expands `handleSwitchErr`
to also handle the case where the attemptID is not found.
This commit refactors the `resumePayment` method by adding the methods
`checkTimeout` and `requestRoute` so it's easier to understand the flow
and reason about the error handling.
This commit removes the method `launchShard` and splits its original
functionality into two steps - first create the attempt, second send the
attempt. This enables us to have finer control over "which error is
returned from which system and how to handle it".
This commit starts handling switch error inside `sendAttempt` when an
error is returned from sending the HTLC. To make sure the updated
`HTLCAttempt` is always returned to the callsite, `handleSwitchErr` now
also returns a `attemptResult`.
This commit removes the `launchOutcome` and `shardResult` and uses
`attemptResult` instead. This struct is also used in `failAttempt` so we
can future distinguish critical vs non-critical errors when handling
HTLC attempts.
`handleSwitchErr` is now responsible for failing the given HTLC attempt
after deciding to fail the payment or not. This is crucial as
previously, we might enter into a state where the payment's HTLC has
already been marked as failed, and while we are marking the payment as
failed, another HTLC attempt can be made at the same time, leading to
potential stuck payments.
This commit removes the unclear abstraction `shardHandler` that's used
in our payment lifecycle. As we'll see in the following commits,
`shardHandler` is an unnecessary layer and everything can be cleanly
managed inside `paymentLifecycle`.
Finally, The LightningNode object is removed from ChannelEdgePolicy.
This is a step towards letting ChannelEdgePolicy reflect exactly the
schema that is on disk.
This is also nice because the `Node` object is not necessarily always
required when the ChannelEdgePolicy is loaded from the DB, so now it
only get's loaded when needed.
In preparation for the next commit which will remove the
`*LightningNode` from the `ChannelEdgePolicy` struct,
`FetchLightningNode` is modified to take in an optional transaction so
that it can be utilised in places where a transaction exists.
Having a `ForEachChannel` method on the `LightningNode` struct itself
results in a `kvdb.Backend` object needing to be stored within the
LightningNode struct. In this commit, this method is replaced with a
`ForEachNodeChannel` method on the `ChannelGraph` struct will perform
the same function without needing the db pointer to be stored within the
LightningNode. This change, the LightningNode struct more closely
represents the schema on disk.
The existing `ForEachNodeChannel` method on `ChannelGraph` is renamed to
`ForEachNodeDirectedChannel`. It performs a slightly different function
since it's call-back operates on Cached policies.
This commit updates route construction to backfill the fields
required for payment to blinded paths and set amount to forward
and expiry fields to zero for intermediate hops (as is instructed
in the route blinding specification).
We could attempt to do this in the first pass, but that loop
relies on fields like amount to forward and expiry to calculate
each hop backwards, so we keep it simple (stupid) and post
processes the blinded portion, since it's computationally cheap
and more readable.
Add the option to include a blinded route in a route request (exclusive
to including hop hints, because it's incongruous to include both), and
express the route as a chain of hop hints.
Using a chain of hints over a single hint to represent the whole path
allows us to re-use our route construction to fill in a lot of the
path on our behalf.
This commit introduces a single struct to hold all of the parameters
that are passed to FindRoute. This cleans up an already overloaded
function signature and prepares us for handling requests with blinded
routes where we need to perform some additional processing on our
para (such as extracting the target node from the blinded path).
When we introduce blinded routes, some of our hops are expected
to have zero amounts to forward in their hop payload. This commit
updates our hop fee logic to attribute the full blinded route's
fees to the introduction node. We can't actually know where/how
these fees are distributed, so we collect them all at the
introduction node.
With the addition of blinded routes, we now need to account for the
possibility that intermediate nodes payloads will not have an amount
and expiry set because that information is provided by the recipient
encrypted data blob. This commit updates our payload packing to only
optionally include those fields.
This commit adds the encrypted_data, blinding_point and total_amt_msat
tlvs to the known set of even tlvs for the onion payload. These TLVs
are added in two places (the onion payload and hop struct) because
lnd uses the same set of TLV types for both structs (and they
inherently represent the same thing).
Note: in some places, unit tests intentionally mimic the style
of older tests, so as to be more consistently readable.
This commit adds a representation of blinded payments, which include a
blinded path and aggregate routing parameters to be used in payment to
the path.
This commit refactors the params used in lifecycle to prefer
`HTLCAttempt` over `HTLCAttemptInfo`. This change is needed as
`HTLCAttempt` also wraps settled and failure info, which is useful in
the following commits.
This commit turns `MPPayment` into an interface inside `routing`. Having
this interface gives us the benefit to write more granular unit tests
inside payment lifecycle. As seen from the modified unit tests, several
hacky ways of testing the `SendPayment` method is now replaced by a mock
over `MPPayment`.
This commit adds a new method, `NeedWaitAttempts`, to properly decide
whether we need to wait for the outcome of htlc attempts based on the
payment's current state.
This commit moves the struct `paymentState` used in `routing` into
`channeldb` and replaces it with `MPPaymentState`. In the following
commit we'd see the benefit, that we don't need to pass variables back
and forth between the two packages. More importantly, this state is put
closer to its origin, and is strictly updated whenever a payment is read
from disk. This approach is less error-prone comparing to the previous
one, which both the `payment` and `paymentState` need to be updated at
the same time to make sure the data stay consistant in a parallel
environment.
This commit moves the creations of hop and htlcAdd message from
`createNewPaymentAttempt` to `sendPaymentAttempt` to clean up the code
and further pave the way to decomposite the lifecycle.
This commit applies the new method `Terminated`. A side effect from
using this method is, we can now save one query `fetchPayment` inside
`FetchInFlightPayments`.
This commit adds a new payment status, `StatusInitiated`, to properly
represent the state where a payment is newly created without attempting
any HTLCs. Since the `PaymentStatus` is a memory representation of a
given payment's status, the enum used for the statuses are directly
updated.
In this commit, we carry out a new notion introduced during a recent
spec meeting to use a feature bit plus 100 before the feature has been
finalized in the spec.
We split into the Final and Staging bits.
In this commit, we start to set _internally_ a new feature bit in the
channel announcements we generate. As these taproot channels can only be
unadvertised, this will never actually leak to the public network. The
funding manager will then set this field to allow the router to properly
validate these channels.
In this commit, we update the Sig type to support ECDSA and schnorr
signatures. We need to do this as the HTLC signatures will become
schnorr sigs for taproot channels. The current spec draft opts to
overload this field since both the sigs are actually 64 bytes in length.
The only consideration with this move is that callers need to "coerce" a
sig to the proper type if they need schnorr signatures.
In this commit, we eliminate some code duplication by removing the old
`HashMutex` struct as it just duplicates all the code with a different
type (uint64 and hash). We then make the main Mutex struct take a type
param, so the key can be parametrized when the struct is instantiated.
This commit adds a a new `MarshalOutPoint` helper in the `lnrpc` package
that can be used to convert a `wire.Outpoint` to an `lnrpc.Outpoint`.
By using this helper, we are less likely to forget to populate both the
string and byte form of the TXID.
The process how we calculate a total probability from the direct and
node probability is modified to give more importance to the direct
probability.
This is important if we know about a recent failure. We should not try
to send over this channel again if we know we can't. Otherwise this can
lead to infinite retrials over the channel, when the probability is
pinned at high probabilities by the other node results.
Having a capacity available is important for liquidity estimation.
* We add a dummy capacity to hop hints. Hop hints specify neither the
capacity nor a maxHTLC size, which is why we assume the channel to
have a high capacity relative to the amount we send.
* We add a capacity to local edges. These channels should always have a
capacity associated with them, even in the neutrino case (a fallback
to maxHTLC is not necessary). This is just for completeness, as the
probability calculation for local channels is done separately.
* The maximal reduction in the probability is limited to 0.5 (previously
~0.05), such that we don't get too low apriori probabilities.
Otherwise, this may lead to a too strong selection of large (and maybe
expensive) channels. A two-hop path would get total probability
penalties of:
- 1000PPM/(0.6*0.6) = 2778 PPM in the unsaturated case
- 1000PPM/(0.6*(0.6*0.5)) = 5556 PPM in the saturated case, where the
second hop is saturated
The difference in PPM of 2778 PPM should be enough to bias towards the
first path.
* The smearing factor is reduced. Previously we had to keep a higher
smearing factor in order to make the capacity factor not go to zero
for high amounts, to still give a fully saturated channel a chance.
This is not needed anymore due to the capping to 0.5. A lower value of
the smearing factor lets us more precisely choose a capacity fraction
and the capacity factor is more neutral when it comes to intermediate
amounts.
We set a conservative default value for the capacity fraction, which
still has the effect of discarding exhausted channels, giving a
noticeable effect when about 90% of the capacity is being used.
We make the capacity factor configurable via an lnd.conf routerrpc
apriori parameter. The capacity factor trades off increased success
probability with a reduced set of channel candidates, which may lead to
increased fees. To let users choose whether the factor is active or not,
we add a config setting where a capacity fraction of 1.0 disables the
factor. We limit the capacity fraction to values between 0.75 and 1.0.
Lower values may discard too many channels.
We require channel updates to have the max HTLC message flag set.
Several flows need to pass that check before channel updates are
forwarded to peers:
* after channel funding: `addToRouterGraph`
* after receiving channel updates from a peer:
`ProcessRemoteAnnouncement`
* after we update channel policies: `PropagateChanPolicyUpdate`
We rename `ChanUpdateOptionMaxHtlc` to `ChanUpdateRequiredMaxHtlc`
as with the latest changes it is now required.
Similarly, rename `validateOptionalFields` to
`ValidateChannelUpdateFields`, export it to use it in a later commit.
We use a more general `Estimator` interface for probability estimation
in missioncontrol.
The estimator is created outside of `NewMissionControl`, passed in as a
`MissionControlConfig` field, to facilitate usage of externally supplied
estimators.
Implements a new probability estimator based on a probability theory
framework.
The computed probability consists of:
* the direct channel probability, which is estimated based on a
depleted liquidity distribution model, formulas and broader concept derived
after Pickhardt et al. https://arxiv.org/abs/2103.08576
* an extension of the probability model to incorporate knowledge decay
after time for previous successes and failures
* a mixed node probability taking into account successes/failures on other
channels of the node (similar to the apriori approach)
We introduce a probability `Estimator` interface which is implemented by
the current apriori probability estimator. A second implementation, the
bimodal probability estimator follows.
* we rename the current probability estimator to be the "apriori"
probability estimator to distinguish from a different implementation
later
* the AprioriEstimator is exported to later be able to type switch
* getLocalPairProbability -> LocalPairProbabiltiy (later part of an
exported interface)
* getPairProbability -> getPairProbabiltiy (later part of an exported
interface)
The test cases in `TestUpdatePaymentState` run in parallel. One of the
parameters is a pointer and the value to the struct it points to gets
modified during the test.
The race condition was introduced in 8d49dfb07e
To test the fix run from the main folder `go test ./routing/. -race`
before this fix and after.
We multiply the apriori probability with a factor to take capacity into
account:
P *= 1 - 1 / [1 + exp(-(amount - cutoff)/smearing)]
The factor is a function value between 1 (small amount) and 0 (high
amount). The zero limit may not be reached exactly depending on the
smearing and cutoff combination. The function is a logistic function
mirrored about the y-axis. The cutoff determines the amount at which a
significant reduction in probability takes place and the smearing
parameter defines how smooth the transition from 1 to 0 is. Both, the
cutoff and smearing parameters are defined in terms of fixed fractions
of the capacity.
Extends the pathfinder with a capacity argument for later usage.
In tests, the inserted testCapacity has no effect, but will be used
later to estimate reduced probabilities from it.