Implement https://github.com/lightningnetwork/lightning-rfc/pull/666
Keep the global/local split in Commitments to avoid backwards incompatibility in the codec.
Remove allowMultiPart API field: we instead rely on the MPP feature being set in nodeParams.
That means MPP-enabled nodes need to update their reference.conf.
Rework features:
* Add types to allow cleaner dependency validation.
* Most of the time we don't care whether a feature is activated as optional or mandatory, which caused duplicate code. This is now handled more cleanly.
* It also paves the way to annotate features with the places they should be advertised (Init vs NodeAnn vs ChannelAnn vs invoice).
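
As a rough illustration of the typed-feature idea above, here is a minimal sketch in Python (eclair itself is written in Scala, and this is not its actual code). It assumes the BOLT 9 convention that each feature owns a pair of bits, the even one meaning mandatory and the odd one optional. The `Feature` class and the `has_feature` and `missing_dependencies` helpers are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Feature:
    name: str
    mandatory_bit: int  # even bit; the optional bit is the next odd one
    depends_on: Tuple["Feature", ...] = ()

    @property
    def optional_bit(self) -> int:
        return self.mandatory_bit + 1


# Bit assignments and dependencies as listed in BOLT 9.
VAR_ONION = Feature("var_onion_optin", 8)
PAYMENT_SECRET = Feature("payment_secret", 14, (VAR_ONION,))
BASIC_MPP = Feature("basic_mpp", 16, (PAYMENT_SECRET,))
ALL_FEATURES = (VAR_ONION, PAYMENT_SECRET, BASIC_MPP)


def has_feature(active_bits: Set[int], feature: Feature) -> bool:
    """True if the feature is set, whether as optional or as mandatory."""
    return feature.mandatory_bit in active_bits or feature.optional_bit in active_bits


def missing_dependencies(
    active_bits: Set[int], features: Iterable[Feature] = ALL_FEATURES
) -> List[Tuple[str, str]]:
    """Return (feature, dependency) pairs where the dependency is not set."""
    missing = []
    for feature in features:
        if not has_feature(active_bits, feature):
            continue
        for dependency in feature.depends_on:
            if not has_feature(active_bits, dependency):
                missing.append((feature.name, dependency.name))
    return missing
```

Callers that only care whether a feature is active use `has_feature` and never look at the individual bits, which is what removes the optional/mandatory duplication.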
This is safer for now since the splitting algorithm isn't working
well on nodes with a large number of channels and we don't
expect too many payments from Phoenix to non-Phoenix to
actually need MPP in the short term.
Mockito sometimes throws an unnecessary stubbing exception; it's unclear whether the test is faulty or Mockito has issues with our parallel setup.
Rewriting the switchboard tests without Mockito makes them more flexible.
If they randomly fail, we should get more useful data to help troubleshooting.
Start relaying trampoline payments with multi-part aggregation (disabled by default,
must be enabled with config).
Recovery after a restart is correctly handled, even if payments were being forwarded.
No DB schema update in this commit.
The trampoline UX will be somewhat bad because many improvements/polish are missing.
Some shortcuts were taken, a few hacks here and there need to be fixed, but nothing too scary.
Those improvements will be done in separate commits before the next release.
Randomization is necessary, otherwise if two peers attempt to reconnect
to each other in a synchronized fashion, they will enter a
disconnect-reconnect loop.
We already had randomization for the initial reconnection attempt, but
further reconnection attempts were using a deterministic schedule
following an exponential backoff curve.
Fixes #1238.
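
A minimal sketch of a randomized exponential backoff, written in Python for brevity. It is not the actual schedule used here: the `next_reconnect_delay` name, the base delay, the cap, and the choice of drawing from the upper half of the exponential window are all assumptions made for illustration.

```python
import random


def next_reconnect_delay(attempt, base_seconds=1.0, max_seconds=3600.0, rng=random):
    """Pick a random delay for the given reconnection attempt (0-based).

    The window grows exponentially with the attempt number and is capped at
    max_seconds. Drawing uniformly from the upper half of the window keeps the
    backoff shape while making it unlikely that two peers pick the same delay.
    """
    ceiling = min(max_seconds, base_seconds * (2 ** attempt))
    return rng.uniform(ceiling / 2, ceiling)
```

Because each peer draws its delay independently, two peers that disconnected at the same moment are unlikely to retry at the same moment on every later attempt.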
* Add a configurable time-out to onchain fee provider requests
We configure a timeout of 5 seconds, applicable to all fee providers. If a provider times out we switch to the next one in our list.
Our mobile app needs a feerate to start properly and currently waits too long when a fee provider is online but very slow to respond.
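
The fallback behaviour can be sketched as follows, in Python with `asyncio` rather than the project's own stack. The provider functions and `first_available_feerate` are hypothetical names; only the 5 second default comes from the description above.

```python
import asyncio


async def first_available_feerate(providers, timeout_seconds=5.0):
    """Query each provider in order; skip any that exceeds the timeout."""
    for provider in providers:
        try:
            return await asyncio.wait_for(provider(), timeout_seconds)
        except asyncio.TimeoutError:
            # Provider is online but too slow: move on to the next one.
            continue
    raise RuntimeError("no fee provider answered in time")


# Hypothetical providers used only to exercise the sketch.
async def slow_provider():
    await asyncio.sleep(1.0)
    return 1000


async def fast_provider():
    return 2500
```

With a short timeout, `asyncio.run(first_available_feerate([slow_provider, fast_provider], timeout_seconds=0.05))` skips the slow provider and returns the value from the fast one.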
We can't guarantee with the current algorithm that the last HTLC won't be
a small one (the leftovers).
If we see that happen in real scenarios, we'll need to add heuristics to avoid it.
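
To illustrate how leftovers can appear, here is a naive greedy split in Python. It is an assumption made for the example, not the splitting algorithm discussed above, and `split_payment` is a hypothetical name.

```python
def split_payment(amount_msat, channel_balances_msat):
    """Greedily fill the largest channels first and return the part sizes."""
    parts = []
    remaining = amount_msat
    for balance in sorted(channel_balances_msat, reverse=True):
        if remaining == 0:
            break
        if balance <= 0:
            continue
        part = min(balance, remaining)
        parts.append(part)
        remaining -= part
    if remaining > 0:
        raise ValueError("not enough balance to send the full amount")
    return parts


# The last part only carries what the first two could not cover.
parts = split_payment(100_000, [60_000, 39_500, 30_000])
# parts == [60000, 39500, 500]
```

A heuristic to avoid this could, for instance, rebalance the last two parts when the final one falls below some threshold, but that is outside the scope of this sketch.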
This allows us to only use logback.xml to control the log level.
From akka docs [1]:
> If you set the loglevel to a higher level than DEBUG, any DEBUG events
will be filtered out already at the source and will never reach the
logging backend, regardless of how the backend is configured.
> You can enable DEBUG level for akka.loglevel and control the actual
level in the SLF4J backend without any significant overhead, also for
production.
[1] https://doc.akka.io/docs/akka/current/logging.html
Using the `max()` aggregating function on outgoing payments'
timestamps, we can ensure that the non-aggregated columns
for the outgoing payments contain the most recent/pertinent data.
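
The following Python sketch shows the effect with an in-memory SQLite database. It assumes a SQLite backend: SQLite documents that when `max()` is the aggregate in a query, bare (non-aggregated) columns take their values from the row holding the maximum. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sent (parent_id TEXT, status TEXT, created_at INTEGER)"
)
conn.executemany(
    "INSERT INTO sent VALUES (?, ?, ?)",
    [
        ("p1", "FAILED", 10),     # first attempt
        ("p1", "SUCCEEDED", 20),  # later attempt for the same payment
        ("p2", "PENDING", 15),
    ],
)

# `status` is a bare column: with max(created_at) it is read from the most
# recent row of each group.
rows = conn.execute(
    "SELECT parent_id, status, max(created_at) "
    "FROM sent GROUP BY parent_id ORDER BY parent_id"
).fetchall()
# rows == [("p1", "SUCCEEDED", 20), ("p2", "PENDING", 15)]
```

This behaviour is specific to SQLite; other databases reject bare columns in aggregate queries or pick an arbitrary row.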
If a chain re-org happens and a new ShortChannelId is assigned,
the `Relayer` kept both entries (new and old).
This resulted in an incorrect balance because we effectively counted this channel twice.
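
A minimal Python sketch of the bookkeeping involved. The real `Relayer` is not structured like this: `ChannelIndex` and its methods are hypothetical and only show why the stale entry has to be dropped when a new ShortChannelId is assigned.

```python
class ChannelIndex:
    """Tracks channels by short channel id, keeping one entry per channel."""

    def __init__(self):
        self._by_short_id = {}  # short_channel_id -> channel_id
        self._short_id_of = {}  # channel_id -> current short_channel_id

    def assign(self, channel_id, short_channel_id):
        previous = self._short_id_of.get(channel_id)
        if previous is not None and previous != short_channel_id:
            # Re-org: forget the old id so the channel is not counted twice.
            del self._by_short_id[previous]
        self._by_short_id[short_channel_id] = channel_id
        self._short_id_of[channel_id] = short_channel_id

    def total_balance(self, balance_by_channel):
        return sum(balance_by_channel[cid] for cid in self._by_short_id.values())

    def __len__(self):
        return len(self._by_short_id)
```

Without the `del`, a channel whose id changed from one value to another would appear under both keys and `total_balance` would add its balance twice.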
While #1222 was being reviewed, a new unit test was added to OnionCodecsSpec.
It didn't cause any file conflict so GitHub didn't warn about merging #1222.
However this test needed to be updated to the new truncated int format.
The spec defines tu64 (and friends) without the length prefix.
Multi-part uses a tu64 without a length prefix inside the PaymentData record.
Our previous implementation only supported using tu64 alone in a TLV record.
We make this more flexible by separating the length encoding.
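
A Python sketch of the separation described above, not the project's codec. `encode_tu64` and `decode_tu64` handle only the truncated value, and the caller adds the length. The `encode_payment_data` helper is a hypothetical name; it assumes the BOLT 4 layout of the payment data record (type 8, a 32-byte payment secret followed by a tu64 total amount) and uses a single-byte length, which is a valid BigSize for values below 0xfd.

```python
def encode_tu64(value: int) -> bytes:
    """Big-endian encoding with leading zero bytes removed (0 is empty)."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value does not fit in 64 bits")
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def decode_tu64(data: bytes) -> int:
    """Decode a truncated uint64; the caller already knows how long it is."""
    if len(data) > 8:
        raise ValueError("tu64 uses at most 8 bytes")
    if data[:1] == b"\x00":
        raise ValueError("tu64 must not have leading zero bytes")
    return int.from_bytes(data, "big")


def encode_payment_data(payment_secret: bytes, total_msat: int) -> bytes:
    """TLV record whose value embeds a tu64 after a fixed-size field."""
    if len(payment_secret) != 32:
        raise ValueError("payment secret must be 32 bytes")
    value = payment_secret + encode_tu64(total_msat)
    # Only the enclosing record carries a length; the tu64 itself has none.
    return bytes([8, len(value)]) + value
```

The decoder of the enclosing record knows the tu64 length implicitly: it is whatever remains of the record after the 32-byte secret.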
MPP implies payment secret.
Avoid raising exceptions in PaymentInitiator: validate invoice instead of using a require.
This way senders always get a response.
We previously had some logic where we would fail incoming HTLCs
for which we were the final recipient when a channel would come online.
That made sense when we didn't have MPP, but with MPP we cannot do that.
There is a risk that we would be failing HTLCs that are considered received by the MPP FSM.
Instead we need to use the CommandBuffer when we are the final recipient.
This way pending commands cannot be lost and HTLCs are cleaned-up on restart.
This includes a bit of refactoring in `MultiPartPaymentLifecycle`. Note
that we can't use the `onTermination` handler to finish the spans,
because it is asynchronous and may only be called after a long time.
That's why we use a dedicated `myStop` function.
In Kamon 2.0, by default spans are automatically generated for tracked
actors, which we don't want because we define our own spans. That's why
there is an additional configuration in `application.conf`.
MPP split/retry improvements:
* Only use public channels when sending to remote node
* Don't retry when sending to direct peer
* Blacklist channels that are a bad route prefix
When paying a multi-part payment, we tell the PaymentLifecycle to use a route prefix that contains the first hop (for example a -> b via channel 1).
We need to also tell the router to ignore the nodes that are in the route prefix, otherwise when retrying it may try some completely dumb routes that have no chance of succeeding.
* Fix `allUpdates` API when used with the public key filter: the API now returns all updates for any channel on which the filter key has made an update
This is due to a callback being executed after the parent actor has been
cleaned up. We don't really care about the result anyway, so we can
safely ignore it, even though the issue only arises in tests.
The root problem here is that we are making references to actor methods
from a callback, which we shouldn't do, because whatever we reference
may have disappeared by the time the callback tries to access it. A
better pattern would be to `pipe` the results of the `Future` to
oneself, but that would require more work and possibly change the FSM,
which seems overkill for the issue at hand.
When an actor sends a message to itself as part of its class definition,
there is no guarantee that this message will be processed first. Relying
on that to set the default payment handler is problematic and causes
race conditions in tests.
Add support for multi-part payments (MPP).
We can now send and receive multi-part payments, with a somewhat basic splitting algorithm that will be refined based on real-world usage.
Compatibility with other implementations hasn't been tested yet as they don't have a branch ready.
This compatibility testing may reveal small details that need to be changed and may invalidate pending multi-part invoices.
* Check configuration for obsolete keys on startup
We now check the loaded configuration for obsolete keys (that have been moved to a new section) and throw an error if any are found, which will prevent eclair from starting.
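
A minimal Python sketch of such a startup check, assuming the configuration has been flattened to dotted keys. The key names and the `find_obsolete_keys` and `check_config` helpers are hypothetical and do not correspond to real eclair settings.

```python
# Hypothetical mapping from obsolete keys to their new location.
MOVED_KEYS = {
    "app.request-timeout": "app.network.request-timeout",
    "app.max-retries": "app.network.max-retries",
}


def find_obsolete_keys(config, moved_keys=MOVED_KEYS):
    """Return (old_key, new_key) pairs for every obsolete key still in use."""
    return [(old, new) for old, new in sorted(moved_keys.items()) if old in config]


def check_config(config, moved_keys=MOVED_KEYS):
    """Raise at startup if the configuration still uses obsolete keys."""
    obsolete = find_obsolete_keys(config, moved_keys)
    if obsolete:
        details = ", ".join(f"{old} (use {new})" for old, new in obsolete)
        raise RuntimeError(f"obsolete configuration keys found: {details}")
```

Failing fast at startup is stricter than logging a warning, but it avoids silently running with a default value when the user believes a setting is in effect.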