Default goes to stderr for LOG_UNUSUAL and higher.
We have to whitelist more cases in map_catchup, though, so we don't
spam the logs with perfectly-expected (but ignored) messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I'm not sure why this happens, and suspect it is caused by an issue
elsewhere, so add some verbose debugging and don't crash.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: https://github.com/ElementsProject/lightning/issues/8017
The updated API requires typed htables to explicitly state whether they
allow duplicates: in most cases we don't, but we've had issues in the
past.
This is a big patch, but mainly mechanical.
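For illustration, here is a minimal sketch of what the distinction looks
like at a definition site. The macro names HTABLE_DEFINE_NODUPS_TYPE /
HTABLE_DEFINE_DUPS_TYPE reflect my reading of the updated ccan API, so
treat the details below as assumptions:
```
#include <ccan/htable/htable_type.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch only: assumes the updated ccan/htable exposes separate
 * NODUPS/DUPS macros in place of the old HTABLE_DEFINE_TYPE. */
struct item {
	int key;
	int value;
};
static const int *item_key(const struct item *i) { return &i->key; }
static size_t key_hash(const int *key) { return (size_t)*key; }
static bool item_eq(const struct item *i, const int *key)
{
	return i->key == *key;
}

/* Most of our maps: at most one entry per key. */
HTABLE_DEFINE_NODUPS_TYPE(struct item, item_key, key_hash, item_eq,
			  item_map);

/* Only where duplicates are genuinely expected: */
HTABLE_DEFINE_DUPS_TYPE(struct item, item_key, key_hash, item_eq,
			item_multimap);
```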
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Using jsonrpc_request_sync, layers are loaded before we finish init,
so we can never be asked to create a layer before we've loaded it
(xpay creates a layer immediately on startup).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers are new this release.
Create lower-level versions of routines to create biases, layers,
constraints, etc., and only save the ones called from the public APIs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers were only added in this release
The ratio of the median of the fees and probability cost is overall not
a bad factor to combine these two features. This is what the
test_real_data shows.
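Roughly, the shape of the combination is (a sketch of the idea only; how
mu is folded into the weighting is not shown here):
```
k         = median(fee_cost over arcs) / median(prob_cost over arcs)
cost(arc) = fee_cost(arc) + k * prob_cost(arc)
```
i.e. the ratio of the medians rescales the probability cost so the two
terms are on a comparable scale before they are traded off.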
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The fee_fallback test would fail after fixing the computation of the
median. Now we can restore it by making the probability cost factor
1000x higher than the ratio of the median. This shows how hard it is to
combine fee and probability costs, and why the current approach is so
fragile.
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
Rusty: "We don't generally use NDEBUG in our code"
Instead, use a compile-time flag ASKRENE_UNITTEST to enable checks in
unit tests that we don't normally need in release code.
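The pattern is roughly this (UNITTEST_CHECK is an illustrative name, not
the actual macro):
```
#include <assert.h>

/* Illustration only: extra sanity checks compiled in for unit-test
 * builds, compiled out of release code. */
#ifdef ASKRENE_UNITTEST
#define UNITTEST_CHECK(cond) assert(cond)
#else
#define UNITTEST_CHECK(cond) ((void)0)
#endif
```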
Changelog-none
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
- use graph_max_num_arcs/nodes instead of tal_count in bound checks,
- don't use ccan/lqueue; use instead a minimalistic queue
implementation with an array (see the sketch after this list),
- add missing const qualifiers to temporary tal allocators,
- check preconditions with assert,
- remove inline specifier for static functions,
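For reference, a minimal array-backed FIFO of the kind described above
(an illustrative sketch, not the actual askrene code):
```
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: a bounded FIFO over a plain array, for uses
 * where at most max_size elements are ever pushed in total. */
struct queue {
	uint32_t *data;  /* array of max_size elements */
	size_t head;     /* next index to pop */
	size_t tail;     /* next index to push */
	size_t max_size;
};

static void queue_push(struct queue *q, uint32_t v)
{
	assert(q->tail < q->max_size);
	q->data[q->tail++] = v;
}

static uint32_t queue_pop(struct queue *q)
{
	assert(q->head < q->tail);
	return q->data[q->head++];
}

static bool queue_empty(const struct queue *q)
{
	return q->head == q->tail;
}
```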
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The calculation of the median values of probability and fee cost in the
linear approximation had a bug: it counted non-existing arcs.
Changelog-none: askrene: fix the median
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
We use an arc "array" in the graph structure, but not all arc indexes
correspond to real topological arcs. We must be careful when iterating
through all arcs, and check that they are enabled before operating on
them.
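For example (a sketch only: graph_max_num_arcs() and arc_enabled() stand
in for the real bound and enabled-check helpers):
```
/* Sketch: walk every arc slot, skipping indexes that don't correspond
 * to a real, enabled arc. */
for (uint32_t i = 0; i < graph_max_num_arcs(graph); i++) {
	if (!arc_enabled(graph, i))
		continue;
	/* ...operate on arc i... */
}
```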
Changelog-None: askrene: fix bug, not all arcs exist
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
Add a new function to compute an MCF using a more general description of
the problem. I call it mcf_refinement because it can start with a
feasible flow (though this is not necessary) and adapt it to achieve
optimality.
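For context, the textbook shape of such a refinement, in outline (one
standard approach; not necessarily what the new function does
internally):
```
/* Cycle-cancelling refinement, in outline:
 *
 *   given a flow f satisfying the demands (feasible),
 *   while the residual graph of f contains a negative-cost cycle C:
 *       push as much flow as possible around C;
 *   f is now a minimum-cost flow for its value.
 */
```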
Changelog-None: askrene: add a MCF refinement
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
It is just a copy-paste of "dijkstra", but the name reflects what it
actually is: not an implementation of the minimum-cost-path Dijkstra
algorithm, but a helper data structure.
I keep the old "dijkstra.h/c" files for the moment to avoid breaking the
current code.
Changelog-EXPERIMENTAL: askrene: add priorityqueue
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
This lets you place annotated biases on channels, to influence routing.
Uses include avoiding Tor nodes or slow channels, or expressing other
local preferences.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: askrene is new anyway.
Without knowing what method was called, we can't have useful general logging
methods, so go through the pain of adding "const char *method" everywhere,
and add:
1. ignore_and_complete - we're done when the JSON-RPC call returns.
2. log_broken_and_complete - we're done, but emit a BROKEN log.
3. plugin_broken_cb - if this happens, fail the plugin.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we used to allow cmd to be NULL, we had to hand the plugin
everywhere. We no longer do.
1. Various jsonrpc_ functions no longer need the plugin arg.
2. send_outreq no longer needs a plugin arg.
3. The init function takes a command, not a plugin.
4. Remove command_deprecated_in_nocmd_ok.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we know the total reservations on each hop, we can determine
probabilities more easily than by using flowset_probability(), which
has to replicate this collision detection.
We leave both in place for now, to check. The results are not
identical, due to slightly different calculation methods.
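For reference, the per-channel success probability under the usual
uniform-liquidity prior, with our own reservations folded into the
amount (a sketch of the idea; the exact formula used may differ):
```
P(channel with capacity c forwards x, given r already reserved by us)
    ~ (c + 1 - (x + r)) / (c + 1)
```
Multiplying this over the hops of a flow then gives the flow's success
probability without re-deriving which flows collide on which channels.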
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were trying to get the max capacity of a flow to see if we could add some
more sats, and hit an assertion:
tests/test_askrene.py:707:
```
DEBUG plugin-cln-askrene: notify msg info: Flow reduced to deliver 88070161msat not 90008000msat, because 107x1x0/1 has remaining capacity 88071042msat
DEBUG plugin-cln-askrene: notify msg info: Flow reduced to deliver 284138158msat not 284787000msat, because 108x1x0/1 has remaining capacity 284141000msat
**BROKEN** plugin-cln-askrene: Flow delivers 129565000msat but max only 56506138msat
INFO plugin-cln-askrene: Killing plugin: exited during normal operation
```
We need to *unreserve* our flow before asking for max capacity. We were
also missing a few less important cases where we altered flows without altering
the reservation, so fix those too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I noticed that increasing mu a little bit sometimes made a big difference,
because by completely ignoring fees we were choosing the worst of two channels
in some cases.
Start at 1% fees; this saves a lot on initial fees in this test!
Here are the new stats on mu levels:
96 mu=1
90 mu=10
41 mu=20
30 mu=30
24 mu=40
19 mu=50
22 mu=60
8 mu=70
95 mu=80
19 mu=90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `askrene` is now better at finding low-fee paths.
While the `k=8` value worked for the current main network tests with the
amounts in those tests, it wasn't robust across a wider range of values
(as demonstrated when other test changes broke tests!).
Time to do this properly: calculate the ratio at the time we combine them,
using median values.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Even after the previous fix, we still occasionally increase fees when
mu increases.
This is due to the difference between MCF's linear fees and the actual
fees, and is unavoidable, but add a check in case it somehow happens.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I noticed this in the logs:
plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%...
plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%...
We would expect increasing mu to *reduce* the fee!
Turns out that our linear fee is a terrible approximation, because I
was using a base_fee_penalty of 10.0.
|
| / __ <- real fee, with base: fee = base + propfee * amount.
| / __/
| _//
| __/
| __/_/
|/ _/
| _/ <- linearized fee: fee = linear * amount
|/
+-----------------------------------
These cross over where linear = propfee + base / amount. Assuming we
split the payment into 10 parts, this implies that the base_fee_penalty
should be 10 / amount (this gives a slight penalty in the normal case,
but that's ok).
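Spelling the crossover out (the same argument as above, in symbols):
```
real fee:        fee(x)     = base + propfee * x
linearized fee:  fee_lin(x) = linear * x
equal when:      linear     = propfee + base / x
with the payment split into ~10 parts, each part carries
x ~ amount / 10, so:
                 linear     ~ propfee + base * (10 / amount)
```
which is where base_fee_penalty = 10 / amount comes from.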
This gives better results, too: we get down to 650099 sats in fees, vs 801613
before.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During "test_real_data", then only successes with reduced fees were 92 on "mu=10", and only
1 on "mu=30": the rest went to mu=100 and failed.
I tried numerous approaches, and in the end, opted for the simplest:
The typical range of probability costs looks like:
min = 0, max = 924196240, mean = 10509.4, stddev = 1.9e+06
The typical range of linear fee costs looks like:
min = 0, max = 101000000, mean = 81894.6, stddev = 2.6e+06
This implies that a k factor of 8 (roughly the ratio of the means,
81894.6 / 10509.4, i.e. about 7.8) makes the two comparable.
This makes the two numbers comparable, and thus makes "mu" much more
effective. Here are the success counts at each mu value:
87 mu=0
90 mu=10
42 mu=20
24 mu=30
17 mu=40
19 mu=50
19 mu=60
11 mu=70
95 mu=80
19 mu=90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The current prob_cost_factor setting does not seem to make mu very
effective; in fact, it gives strange results:
plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%...
plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%...
We would expect increasing mu to *reduce* the fee!
As a first step, simplify (it can't be infinite, and the -1s are weird).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ask it again, but reduce fees by 1msat from the previous answer.
This is really nasty, as it frequently exercises the case where we
only go over fee when we do the refinement step.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tested with a really large gossmap (hacked to be 4GB), where we keep
retrying to minimize cost (calling minflow 11 times) and don't free
tmpctx.
Due to an issue with how gossmap estimates the index sizes, we ended
up running out of memory. This fixes it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>