When an HTLC has been failed, we track it up until the point there
exists no broadcastable commitment transaction which has the HTLC
present, at which point Channel returns the HTLCSource back to the
ChannelManager, which fails the HTLC backwards appropriately.
When an HTLC is fulfilled, however, we fulfill on the backwards path
immediately. This is great for claiming upstream HTLCs, but when we
want to track pending payments, we need to ensure we can check with
ChannelMonitor data to rebuild pending payments. In order to do so,
we need an event similar to the HTLC failure event, but for
fulfills instead.
Specifically, if we force-close a channel, we remove its off-chain
`Channel` object entirely, at which point, on reload, we may notice
HTLC(s) which are not present in our pending payments map (as they
may have received a payment preimage, but not fully committed to
it). Thus, we'd conclude we still have a retryable payment, which
is untrue.
This commit does so, informing the ChannelManager via a new return
element where appropriate of the HTLCSource corresponding to the
failed HTLC.
This allows us to read a `HashMap` that has values which may be
skipped if they are some backwards-compatibility type.
We also take this opportunity to fail deserialization if keys are
duplicated.
If we have a `ChannelMonitor` update from an on-chain event which
returns a `TemporaryFailure`, we block `MonitorEvent`s from that
`ChannelMonitor` until the update is persisted. This prevents
duplicate payment send events to the user after payments get
reloaded from monitors on restart.
However, if the event being avoided isn't going to generate a
PaymentSent, but instead result in us claiming an HTLC from an
upstream channel (ie the HTLC was forwarded), then the result of a
user delaying the event is that we delay getting our money, not a
duplicate event.
Because user persistence may take an arbitrary amount of time, we
need to bound the amount of time we can possibly wait to return
events, which we do here by bounding it to 3 blocks.
Thanks to Val for catching this in review.
ChannelMonitors now require that they be re-persisted before
MonitorEvents be provided to the ChannelManager, the exact thing
that test_dup_htlc_onchain_fails_on_reload was testing for when it
*didn't* happen. As such, test_dup_htlc_onchain_fails_on_reload is
now testing that we bahve correctly when the API guarantees are not
met, something we don't need to do.
Here, we adapt it to test the new API requirements through
ChainMonitor's calls to the Persist trait instead.
This resolves several user complaints (and issues in the sample
node) where startup is substantially delayed as we're always
waiting for the chain data to sync.
Further, in an upcoming PR, we'll be reloading pending payments
from ChannelMonitors on restart, at which point we'll need the
change here which avoids handling events until after the user
has confirmed the `ChannelMonitor` has been persisted to disk.
It will avoid a race where we
* send a payment/HTLC (persisting the monitor to disk with the
HTLC pending),
* force-close the channel, removing the channel entry from the
ChannelManager entirely,
* persist the ChannelManager,
* connect a block which contains a fulfill of the HTLC, generating
a claim event,
* handle the claim event while the `ChannelMonitor` is being
persisted,
* persist the ChannelManager (before the CHannelMonitor is
persisted fully),
* restart, reloading the HTLC as a pending payment in the
ChannelManager, which now has no references to it except from
the ChannelMonitor which still has the pending HTLC,
* replay the block connection, generating a duplicate PaymentSent
event.
In the next commit, we'll be originating monitor updates both from
the ChainMonitor and from the ChannelManager, making simple
sequential update IDs impossible.
Further, the existing async monitor update API was somewhat hard to
work with - instead of being able to generate monitor_updated
callbacks whenever a persistence process finishes, you had to
ensure you only did so at least once all previous updates had also
been persisted.
Here we eat the complexity for the user by moving to an opaque
type for monitor updates, tracking which updates are in-flight for
the user and only generating monitor-persisted events once all
pending updates have been committed.
In the next commit we'll need ChainMonitor to "see" when a monitor
persistence completes, which means `monitor_updated` needs to move
to `ChainMonitor`. The simplest way to then communicate that
information to `ChannelManager` is via `MonitorEvet`s, which seems
to line up ok, even if they're now constructed by multiple
different places.
Expand routing::Score::channel_penalty_msat to include the source and
target node ids of the channel. This allows scorers to avoid certain
nodes altogether if desired.
In order to make the scoring tests easier to read, only check the
relevant RouteHop fields. The remaining fields are tested elsewhere.
Expand the test to show the path used without scoring.
Failed payments may be retried, but calling get_route may return a Route
with the same failing path. Add a routing::Score trait used to
parameterize get_route, which it calls to determine how much a channel
should be penalized in terms of msats willing to pay to avoid the
channel.
Also, add a Scorer struct that implements routing::Score with a constant
constant penalty. Subsequent changes will allow for more robust scoring
by feeding back payment path success and failure to the scorer via event
handling.
As ChainMonitor will need to see those errors in a coming PR,
we need to return errors via Persister so that our ChainMonitor
chain::Watch implementation sees them.
Previously, if a Persister returned a TemporaryFailure error when
we tried to persist a new channel, the ChainMonitor wouldn't track
the new ChannelMonitor at all, generating a PermanentFailure later
when the updating is restored.
This fixes that by correctly storing the ChannelMonitor on
TemporaryFailures, allowing later update restoration to happen
normally.
This is (indirectly) tested in the next commit where we use
Persister to return all monitor-update errors.
test_simple_monitor_permanent_update_fail and
test_simple_monitor_temporary_update_fail both have a mode where
they use either chain::Watch or persister to return errors.
As we won't be doing any returns directly from the chain::Watch
wrapper in a coming commit, the chain::Watch-return form of the
test will no longer make sense.
Exposing a `RwLock<HashMap<>>` directly was always a bit strange,
and in upcoming changes we'd like to change the internal
datastructure in `ChainMonitor`.
Further, the use of `RwLock` and `HashMap` meant we weren't able
to expose the ChannelMonitors themselves to users in bindings,
leaving a bindings/rust API gap.
Thus, we take this opportunity go expose ChannelMonitors directly
via a wrapper, hiding the internals of `ChainMonitor` behind
getters. We also update tests to use the new API.
The interface for get_route will change to take a scorer. Using
get_route_and_payment_hash whenever possible allows for keeping the
scorer inside get_route_and_payment_hash rather than at every call site.
Replace get_route with get_route_and_payment_hash wherever possible.
Additionally, update get_route_and_payment_hash to use the known invoice
features and the sending node's logger.
This makes it more practical for users to track channels prior to
funding, especially if the channel fails because the peer rejects
it for a parameter mismatch.
We cannot expose ReadOnlyNetworkGraph::get_addresses as is in C as
it returns a list of references to an enum, which the bindings
dont support. Instead, we simply clone the result so that it
doesn't contain references.
During the event of a channel close, if the funding transaction
is yet to be broadcasted then a DiscardFunding event is issued
along with the ChannelClose event.
If we attempt to send a payment, but the HTLC cannot be send due to
local channel limits, we'll provide the user an error but end up
with an entry in our pending payment map. This will result in a
memory leak as we'll never reclaim the pending payment map entry.