If we claim an MPP payment and only persist some of the
`ChannelMonitorUpdate`s which include the preimage prior to
shutting down, we may be in a state where some of our
`ChannelMonitor`s have the preimage for a payment while others do
not.
This, it turns out, is actually mostly safe - on startup the
`ChannelManager` will detect a payment it considers unclaimed for
which a corresponding payment preimage exists in a `ChannelMonitor`,
and will go claim the other MPP parts. This works so long as the
`ChannelManager` has been persisted after the payment has been
received but prior to the `PaymentClaimable` event being processed
(and the claim itself occurring). This is not always true and
certainly not required by our API, but our
`lightning-background-processor` today does persist prior to event
handling, so this is generally true, subject to some race conditions.
In order to address this race we need to copy payment preimages
across channels irrespective of the `ChannelManager`'s payment
state, but this introduces another wrinkle - if one channel makes
substantial progress while other channel(s) are still waiting to
get the payment preimage into their `ChannelMonitor`(s), and the
`ChannelManager` hasn't been persisted after the payment was
received, we may end up without the preimage on disk.
Here, we address this issue with a new
`RAAMonitorUpdateBlockingAction` variant for this specific case. We
block persistence of an RAA `ChannelMonitorUpdate` which may remove
the preimage from disk until all channels have had the preimage
added to their `ChannelMonitor`.
We do this only in-memory (and not on disk) as we can recreate this
blocker during the startup re-claim logic.
This will enable us to claim MPP parts without using the
`ChannelManager`'s payment state in later work.
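The sketch below illustrates the idea with simplified, hypothetical
types (`MppClaimBlockers`, fixed-size ids) rather than LDK's actual
structures: each channel involved in an MPP claim holds back its RAA
`ChannelMonitorUpdate` until every part's `ChannelMonitor` has durably
stored the preimage.

```rust
use std::collections::{HashMap, HashSet};

type ChannelId = [u8; 32];
type PaymentHash = [u8; 32];

/// In-memory only, matching the text above: the blocker is rebuilt by the
/// startup re-claim logic rather than persisted.
#[derive(Default)]
struct MppClaimBlockers {
    /// For each in-flight MPP claim, the channels whose `ChannelMonitor` does
    /// not yet durably contain the preimage.
    awaiting_preimage: HashMap<PaymentHash, HashSet<ChannelId>>,
    /// For each channel, the claims it participates in.
    claims_per_channel: HashMap<ChannelId, HashSet<PaymentHash>>,
}

impl MppClaimBlockers {
    /// Called when we start claiming an MPP payment spread over `parts`.
    fn claim_started(&mut self, hash: PaymentHash, parts: &[ChannelId]) {
        self.awaiting_preimage.insert(hash, parts.iter().copied().collect());
        for chan in parts {
            self.claims_per_channel.entry(*chan).or_default().insert(hash);
        }
    }

    /// Called once a channel's preimage-carrying `ChannelMonitorUpdate` has
    /// been persisted.
    fn preimage_persisted(&mut self, hash: PaymentHash, chan: ChannelId) {
        let all_parts_done = match self.awaiting_preimage.get_mut(&hash) {
            Some(waiting) => { waiting.remove(&chan); waiting.is_empty() }
            None => return,
        };
        if all_parts_done {
            // Every part has the preimage on disk; unblock all channels.
            self.awaiting_preimage.remove(&hash);
            for claims in self.claims_per_channel.values_mut() {
                claims.remove(&hash);
            }
        }
    }

    /// An RAA `ChannelMonitorUpdate` may drop the preimage from disk, so hold
    /// it back while any claim this channel participates in is incomplete.
    fn can_persist_raa_update(&self, chan: ChannelId) -> bool {
        self.claims_per_channel.get(&chan).map_or(true, |claims| claims.is_empty())
    }
}
```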
Because we track pending `ChannelMonitorUpdate`s per-peer, we
really need to know what peer an HTLC came from when we go to claim
it on-chain, allowing us to look up any blocked actions in the
`PeerState`. While we could do this by moving the blocked actions
to some global "closed-channel update queue", it's simpler to do it
this way.
While this somewhat increases `ChannelMonitorUpdate` size (the
`HTLCSource` is included in many of them), which we should be
sensitive to, it will also allow us to (eventually) remove the
SCID -> peer + channel_id lookups we do when claiming or failing
today, which are somewhat annoying to deal with.
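As a rough illustration of why carrying the peer alongside the HTLC
source helps, the following sketch uses simplified, hypothetical types;
only the `counterparty_node_id` field and the fallback to the legacy
SCID lookup reflect the change described above.

```rust
use std::collections::HashMap;

type NodeId = [u8; 33];

/// Hypothetical, trimmed-down previous-hop data: the new piece is that the
/// counterparty's node id rides along with the HTLC source.
struct PreviousHopData {
    counterparty_node_id: Option<NodeId>, // `Option` so old serialized data still reads
    short_channel_id: u64,
}

struct PeerState {
    blocked_monitor_actions: Vec<String>, // placeholder for the real blocked actions
}

struct Manager {
    per_peer_state: HashMap<NodeId, PeerState>,
    // Legacy SCID -> peer lookup we'd like to eventually drop.
    scid_to_peer: HashMap<u64, NodeId>,
}

impl Manager {
    /// When claiming an HTLC that resolved on-chain, prefer the node id stored
    /// with the HTLC source, falling back to the SCID lookup only for data
    /// written before the field existed.
    fn peer_for_claim(&self, hop: &PreviousHopData) -> Option<&PeerState> {
        let node_id = hop
            .counterparty_node_id
            .or_else(|| self.scid_to_peer.get(&hop.short_channel_id).copied())?;
        self.per_peer_state.get(&node_id)
    }
}
```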
In some cases, we have variants of an enum serialized using
`impl_writeable_tlv_based_enum_upgradable` which we don't want to
write/read. Here we add support for such variants by writing them
using the (odd) type 255 without any contents and using
`MaybeReadable` to decline to read them.
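The hand-rolled sketch below shows the convention in isolation rather
than the `impl_writeable_tlv_based_enum_upgradable` macro itself: the
skipped variant is written as the odd type 255 with no contents, and a
`MaybeReadable`-style reader declines it by returning `Ok(None)`.

```rust
/// Illustrative only: a hand-rolled version of the convention. Real TLV
/// records also carry a length; it is omitted here for brevity.
enum Action {
    Persisted { id: u64 }, // a variant we do serialize
    InMemoryOnly,          // a variant we never want written to or read from disk
}

fn write(action: &Action, out: &mut Vec<u8>) {
    match action {
        // Even types are "required": a reader that doesn't know them must fail.
        Action::Persisted { id } => {
            out.push(0);
            out.extend_from_slice(&id.to_be_bytes());
        }
        // Odd type 255 with no contents: readers that understand it decline it,
        // and readers that don't may skip it because odd types are optional.
        Action::InMemoryOnly => out.push(255),
    }
}

/// A `MaybeReadable`-style read: `Ok(None)` means "these bytes are valid but
/// there is nothing to hand back", which is how the unread variant is declined.
fn maybe_read(bytes: &[u8]) -> Result<Option<Action>, &'static str> {
    match bytes.first().copied() {
        Some(0) => {
            let id_bytes: [u8; 8] = bytes
                .get(1..9)
                .ok_or("truncated")?
                .try_into()
                .map_err(|_| "truncated")?;
            Ok(Some(Action::Persisted { id: u64::from_be_bytes(id_bytes) }))
        }
        // Type 255 (and any other unknown odd type) is ignorable.
        Some(t) if t % 2 == 1 => Ok(None),
        Some(_) => Err("unknown required (even) type"),
        None => Err("empty"),
    }
}
```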
Add a trait for handling async payments messages to OnionMessenger. This allows
users to either provide their own custom handling for async payments messages
or rely on a version provided by LDK.
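A hypothetical, simplified shape of such a handler trait is sketched
below; the actual trait, message types, and method names in LDK may
differ.

```rust
// Placeholder message types for the two async-payments onion messages.
struct HeldHtlcAvailable;
struct ReleaseHeldHtlc;

/// Hypothetical handler trait: users implement this to customize handling of
/// async payments messages received via the onion messenger.
trait AsyncPaymentsMessageHandler {
    /// Called when a peer tells us an HTLC is being held for an async payment.
    fn held_htlc_available(&self, msg: HeldHtlcAvailable);
    /// Called when the recipient asks us to release a held HTLC.
    fn release_held_htlc(&self, msg: ReleaseHeldHtlc);
}

/// Users who don't care about async payments can plug in a no-op handler,
/// mirroring how custom handlers may be swapped for an LDK-provided one.
struct IgnoringHandler;
impl AsyncPaymentsMessageHandler for IgnoringHandler {
    fn held_htlc_available(&self, _msg: HeldHtlcAvailable) {}
    fn release_held_htlc(&self, _msg: ReleaseHeldHtlc) {}
}
```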
This change implements non-strict forwarding, allowing the node to
forward an HTLC along a channel other than the one specified
by short_channel_id in the onion payload, so long as the receiver has
the same node public key intended by short_channel_id
([BOLT 04, Non-strict Forwarding](57ce4b1e05/04-onion-routing.md#non-strict-forwarding)).
This can improve payment reliability when there are multiple channels
with the same peer e.g. when outbound liquidity is replenished by
opening a new channel.
The implemented forwarding strategy now chooses the channel with the
lowest outbound liquidity that can still forward the HTLC, in order to
maximize the probability of being able to successfully forward a
subsequent HTLC.
Fixes #1278.
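A minimal sketch of the selection logic, using illustrative types and
field names rather than LDK's: among channels to the same counterparty
as the SCID from the onion, pick the one with the least outbound
liquidity that can still carry the HTLC.

```rust
/// Simplified stand-in for a channel's forwarding-relevant state.
struct Channel {
    short_channel_id: u64,
    counterparty_node_id: [u8; 33],
    outbound_capacity_msat: u64,
}

/// Non-strict forwarding: given the SCID from the onion and the HTLC amount,
/// forward over any channel to the same counterparty, preferring the one with
/// the least outbound liquidity that can still carry the HTLC so that larger
/// channels stay available for larger, later HTLCs.
fn select_forwarding_channel<'a>(
    channels: &'a [Channel],
    onion_scid: u64,
    amt_msat: u64,
) -> Option<&'a Channel> {
    let intended = channels.iter().find(|c| c.short_channel_id == onion_scid)?;
    channels
        .iter()
        .filter(|c| c.counterparty_node_id == intended.counterparty_node_id)
        .filter(|c| c.outbound_capacity_msat >= amt_msat)
        .min_by_key(|c| c.outbound_capacity_msat)
}
```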
Currently, every block connection triggers the persistence of all
ChannelMonitors with an updated best_block. This approach poses
challenges for large node operators managing thousands of channels.
Furthermore, it leads to a thundering herd problem
(https://en.wikipedia.org/wiki/Thundering_herd_problem), overwhelming
the storage with simultaneous requests.
To address this issue, we now persist ChannelMonitors at a
regular cadence, spreading their persistence across blocks to
mitigate spikes in write operations.
Outcome: After doing this, LDK's IO footprint should be reduced
by a factor of ~50. The processing time required to sync each block
will be significantly reduced, particularly for nodes with thousands
of channels, as write latency plays a significant role in this process.
As a result, the Node/ChainMonitor will be blocked for a shorter
duration, leading to further efficiency gains.
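One way to spread the writes, sketched with an assumed cadence constant
and a toy hash rather than LDK's actual keying: derive a stable
per-monitor offset from the channel id and only persist a
best_block-only update when the block height hits that offset.

```rust
/// Illustrative cadence; the actual constant and keying may differ.
const PERSIST_CADENCE_BLOCKS: u32 = 144;

/// Derive a stable per-monitor offset from its channel id so that writes are
/// spread roughly evenly across the cadence window instead of all landing on
/// the same block.
fn persist_offset(channel_id: &[u8; 32]) -> u32 {
    let mut acc = 0u32;
    for b in channel_id {
        acc = acc.wrapping_mul(31).wrapping_add(*b as u32);
    }
    acc % PERSIST_CADENCE_BLOCKS
}

/// On each connected block, only monitors whose offset matches the height (or
/// which have a real update pending) are persisted with the new best_block.
fn should_persist_on_block(channel_id: &[u8; 32], height: u32, has_pending_update: bool) -> bool {
    has_pending_update || height % PERSIST_CADENCE_BLOCKS == persist_offset(channel_id)
}
```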
When creating blinded paths, introduction nodes are limited to peers
with at least three channels to prevent easily guessing the recipient.
Relax this check when the recipient is unannounced since they won't be
in the NetworkGraph.
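An illustrative version of the relaxed check, with placeholder types
and a constant matching the "at least three channels" rule above; it is
not LDK's actual candidate-selection code.

```rust
struct Peer {
    node_id: [u8; 33],
    num_channels_in_graph: usize,
}

const MIN_PEER_CHANNELS: usize = 3;

/// Announced recipients only pick introduction nodes with at least three
/// channels (so the recipient isn't trivially guessable from the intro node's
/// few counterparties); unannounced recipients never appear in the
/// NetworkGraph, so the check is skipped for them.
fn eligible_introduction_nodes<'a>(
    peers: &'a [Peer],
    recipient_is_announced: bool,
) -> Vec<&'a Peer> {
    peers
        .iter()
        .filter(|p| !recipient_is_announced || p.num_channels_in_graph >= MIN_PEER_CHANNELS)
        .collect()
}
```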
DefaultRouter ignores blinded paths where the sender is the
introduction node. Similar to message paths, advance such paths by one
so that payments may be sent over them.
Similar to blinded paths for onion messages, if given a blinded payment
path where we are the introduction node, the path must be advanced by
one in order to use it.
When using `advance_path_by_one` while we are the introduction node,
any error would result in the first hop of the input blinded path
being removed. Instead, only remove the first hop on success.
Otherwise, the path will be invalid.
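A minimal sketch of the corrected ordering, with placeholder types and
a stubbed hop-unwrapping step standing in for the real cryptographic
work: the fallible part runs first, and the first hop is only dropped
once it succeeds.

```rust
struct BlindedHop;
struct BlindedPath {
    introduction_node_id: [u8; 33],
    hops: Vec<BlindedHop>,
}

/// Stub for the real step (re-deriving the blinding point and the next node
/// id); in practice this can fail.
fn unwrap_first_hop(path: &BlindedPath) -> Result<[u8; 33], ()> {
    if path.hops.is_empty() { return Err(()); }
    Ok([0u8; 33])
}

/// Do the fallible work first and only then mutate the path, so a failed
/// advance leaves the caller's blinded path intact instead of silently
/// dropping its first hop.
fn advance_path_by_one(path: &mut BlindedPath) -> Result<(), ()> {
    let next_introduction_node_id = unwrap_first_hop(path)?; // may fail; path untouched
    path.hops.remove(0);                                     // commit only on success
    path.introduction_node_id = next_introduction_node_id;
    Ok(())
}
```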
When creating blinded paths for receiving onion payments, allow using
the recipient's only peer as the introduction node when the recipient is
unannounced. This allows for sending payments without going through an
intermediary, which is useful for testing or when only connected to an
LSP with an empty NetworkGraph.
When creating blinded paths for receiving onion messages, allow using
the recipient's only peer as the introduction node when the recipient is
unannounced. This allows for sending messages without going through an
intermediary, which is useful for testing or when only connected to an
LSP with an empty NetworkGraph.
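An illustrative fallback covering both the payment and message cases,
again with placeholder types: candidate selection keeps the usual
channel-count requirement for announced recipients, while an
unannounced recipient with a single peer may use that peer directly.

```rust
struct Peer {
    node_id: [u8; 33],
    num_channels_in_graph: usize,
}

const MIN_PEER_CHANNELS: usize = 3;

/// Normally candidates need several channels in the graph, but an unannounced
/// recipient whose only peer is, say, an LSP (possibly with an empty
/// NetworkGraph) may use that peer directly rather than building no path at all.
fn introduction_candidates<'a>(
    peers: &'a [Peer],
    recipient_is_announced: bool,
) -> Vec<&'a Peer> {
    if !recipient_is_announced && peers.len() == 1 {
        return peers.iter().collect();
    }
    peers
        .iter()
        .filter(|p| p.num_channels_in_graph >= MIN_PEER_CHANNELS)
        .collect()
}
```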