We do not expect blinding errors from the final node:
1. If the introduction is the recipient, they should use regular errors.
2. Otherwise, nodes have no business sending this error when they are
not part of a blinded route.
This commit adds handling for route blinding errors that are reported
by the introduction node in a multi-hop blinded route. As the
introduction node is always responsible for handling blinded errors,
it is not penalized - only the final hop is penalized to discourage the
blinded route without filling up mission control with ephemeral
results.
If this error code is reported by a node that is not an introduction
node, we penalize the node because it is returning an error code that
it should not be using.
This commit adds handling for errors that originate after the
introduction node when making payment to a blinded route. This
indicates that the introduction node is not obeying the spec, so
it is punished for the violation.
It can happen that the receiver times out some htlcs of the set if it
took to long to complete. But because the sender's mission control is
now updated, it is worth to keep trying to send those shards again.
Previously we only penalized the outgoing connections of a failing node.
This turned out not to be sufficient, because the next route sometimes
went into the same failing node again to try a different outgoing
connection that wasn't yet known to mission control and therefore not
penalized before.
This commit adds success mission control
results for all hops along the route in
a mpp timeout and takes no action for
the final hop along the route. This is a
temporary measure to prevent the default
logic from penalizing the final node while
we decide how to penalize mpp timeouts.
An InvalidOnionPayload implies that the onion was successfully received
by the reporting node, but that they were unable to extract the
contents. Since we assume our own behavior is correct, this mostly
likely poins to an error in the reporter's implementation or that we
sent an unknown required type. Therefore we only penalize that single
hop, and consider the failure terminal if the receiver reported it.
This commit modifies the interpretation of node-level failures.
Previously only the failing node was marked. With this commit, also the
incoming and outgoing connections involved in the route are marked as
failed.
The change prepares for the removal of node-level failures in mission
control probability estimation.
Extends the invalid payment details failure with the new accept height
field. This allows sender to distinguish between a genuine invalid
details situation and a delay caused by intermediate nodes.
This commit modifies paymentLifecycle so that it not only feeds
failures into mission control, but successes as well.
This allows for more accurate probability estimates. Previously,
the success probability for a successful pair and a pair with
no history was equal. There was no force that pushed towards
previously successful routes.
This commit overhauls the interpretation of failed payments. It changes
the interpretation rules so that we always apply the strongest possible
set of penalties, without making assumptions that would hurt good nodes.
Main changes are:
- Apply different rule sets for intermediate and final nodes. Both types
of nodes have different sets of failures that we expect. Penalize nodes
that send unexpected failure messages.
- Distinguish between direct payments and multi-hop payments. For direct
payments, we can infer more about the performance of our peer because we
trust ourselves.
- In many cases it is impossible for the sender to determine which of
the two nodes in a pair is responsible for the failure. In this
situation, we now penalize bidirectionally. This does not hurt the good
node of the pair, because only its connection to a bad node is
penalized.
- Previously we always penalized the outgoing connection of the
reporting node. This is incorrect for policy related failures. For
policy related failures, it could also be that the reporting node
received a wrongly crafted htlc from its predecessor. By penalizing the
incoming channel, we surely hit the responsible node.
- FailExpiryTooSoon is a failure that could have been caused by any node
up to the reporting node by delaying forwarding of the htlc. We don't
know which node is responsible, therefore we now penalize all node pairs
in the route.
This commit moves the payment outcome interpretation logic into a
separate file. Also, mission control isn't updated directly anymore, but
results are stored in an interpretedResult struct. This allows the
mission control state to be locked for a minimum amount of time and
makes it easier to unit test the result interpretation.