Commit graph

60 commits

Author SHA1 Message Date
Rusty Russell
fba738b65e askrene: really fix race between layer creation and persistent layer loading.
Using jsonrpc_request_sync, layers are loaded before we finish init,
so we never can be asked to create a layer before we've loaded it
(xpay creates a layer immediately on startup).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers new this release.
2024-11-26 16:04:13 +10:30
Rusty Russell
af629e600e askrene: don't re-save layers as we restore them!
Create lower-level versions of routines to create biases, layers,
constraints, etc and only save the ones called from the public APIs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers were only added in this release
2024-11-26 16:04:13 +10:30
Rusty Russell
d85dcc0ce4 askrene: persistent layer support.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-08 21:48:55 +10:30
Rusty Russell
b2dcf7248d askrene: add askrene-bias-channel.
This lets you place annotated biases on channels, to influence routing.

Uses include avoiding TOR nodes, slow channels or other local preferences.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: askrene is new anyway.
2024-11-08 21:48:55 +10:30
Rusty Russell
c797b6fb20 libplugin: add method string to jsonrpc callbacks, implement generic helpers.
Without knowing what method was called, we can't have useful general logging
methods, so go through the pain of adding "const char *method" everywhere,
and add:

1. ignore_and_complete - we're done when jsonrpc returned
2. log_broken_and_complete - we're done, but emit BROKEN log.
3. plugin_broken_cb - if this happens, fail the plugin.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-07 17:04:35 +10:30
Rusty Russell
c5099b1647 libplugin: clean up API.
When we used to allow cmd to be NULL, we had to hand the plugin
everywhere.  We no longer do.

1. Various jsonrpc_ functions no longer need the plugin arg.
2. send_outreq no longer needs a plugin arg.
3. The init function takes a command, not a plugin.
4. Remove command_deprecated_in_nocmd_ok.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-07 17:04:35 +10:30
Rusty Russell
95c5fda79f askrene: remove flowset_probability() now refine step calculates it.
Now we've checked it gives the same answers, we can remove a lot of
work in flow.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
5501e4b13d askrene: use refine step to calculate flowset probability.
Since we know the total reservations on each hop, we can more easily
determine probabilities than using flowset_probability() which has to
replicate this collision detection.

We leave both in place for now, to check.  The results are not
identical, due to slightly different calculation methods.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
bcc8bd59c8 askrene: don't *completely* ignore fees to start.
I noticed that increasing mu a little bit sometimes made a big difference,
because by completely ignoring fees we were choosing the worst of two channels
in some cases.

Start at 1% fees; this saves a lot on initial fees in this test!

Here's the new stats on mu levels:

     96  mu=1
     90  mu=10
     41  mu=20
     30  mu=30
     24  mu=40
     19  mu=50
     22  mu=60
      8  mu=70
     95  mu=80
     19  mu=90

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `askrene` is now better at finding low-fee paths.
2024-10-15 09:58:04 +10:30
Rusty Russell
32aa79a1e2 askrene: debug and check we actually reduce fees when mu increase.
Even after the previous fix, we still occasionally increase fees when my increases.

This is due to the difference between MCF's linear fees, and actual fees, and
is unavoidable, but add a check if it somehow happens.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
08df93cb25 askrene: fix base fee.
I noticed this in the logs:

	plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%...
	plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%...

We would expect increasing mu to *reduce* the fee!

Turns out that our linear fee is a bad terrible approximation, because I
was using base_fee_penalty of 10.0.

 |
 |          /   __ <- real fee, with base: fee = base + propfee * amount.
 |         / __/
 |       _//
 |    __/
 | __/_/
 |/  _/
 | _/ <- linearized fee: fee = linear * amount
 |/
 +-----------------------------------

These cross over where linear = propfee + base / amount.  Assume we split the
payment into 10 parts, this implies that the base_fee_penalty should be 10 / amount
(this gives a slight penalty to the normal case, but that's ok).

This gives better results, too: we get down to 650099 sats in fees, vs 801613
before.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
6273adbe47 askrene: calculate prob_cost_factor using ratio of typical mainnet channel.
During "test_real_data", then only successes with reduced fees were 92 on "mu=10", and only
1 on "mu=30": the rest went to mu=100 and failed.

I tried numerous approaches, and in the end, opted for the simplest:

The typical range of probability costs looks likes:
	min = 0, max = 924196240, mean = 10509.4, stddev = 1.9e+06

The typical range of linear fee costs looks like:
	min = 0, max = 101000000, mean = 81894.6, stddev = 2.6e+06

This implies a k factor of 8 makes the two comparable.

This makes the two numbers comparable, and thus makes "mu" much more
effective.  Here are the number of different mu values we succeeded at:

     87  mu=0
     90  mu=10
     42  mu=20
     24  mu=30
     17  mu=40
     19  mu=50
     19  mu=60
     11  mu=70
     95  mu=80
     19  mu=90

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
83eee64fda pytest: test askrene with worse maxfee argument.
We ask it again, but reduce fees by 1msat from the previous answer.
This is really nasty, as it frequently exercises the case where we
only go over fee when we do the refinement step.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
1b82a3ad5b askrene: constrain to exact htlc_min/htlc_max values.
The fp16_t values are approximations (overestimate for htlc_max,
underestimate for htlc_min), so in the refinement step we should use
the exact values.

This also fixes a logic bug: flow_remaining_capacity returned the
total capacity, not the additional capacity!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `askrene` now honors exact htlc_maximum_msat limits.
2024-10-15 09:58:04 +10:30
Rusty Russell
22e7a57557 askrene: make auto.sourcefree a real layer, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `getroutes` now applies `auto.sourcefree` layer in the order specified, so doesn't alter channels changed in later layers.
2024-10-15 09:58:04 +10:30
Rusty Russell
3321ad5883 askrene: populate auto.localchans layer properly.
Rather than adding to the gossmap modifications directly, populate
the layer and have the normal layer application logic do it.

This is consistent when we query layers in the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-15 09:58:04 +10:30
Rusty Russell
1230f1b832 askrene: give notifications back to caller as we go.
And unify logging for better debugging.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `askrene` now has better logging, gives notifications of progress.
2024-10-15 09:58:04 +10:30
Rusty Russell
e3c9bc6d3a askrene: trivial changes to avoid -O3 compiler warnings.
The code is a bit too complex for gcc to track it:

```
In file included from ccan/ccan/tal/str/str.h:7,
                 from plugins/askrene/askrene.c:11:
plugins/askrene/askrene.c: In function ‘do_getroutes’:
ccan/ccan/tal/tal.h:324:23: error: ‘routes’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  324 | #define tal_count(p) (tal_bytelen(p) / sizeof(*p))
      |                       ^~~~~~~~~~~
plugins/askrene/askrene.c:476:24: note: ‘routes’ was declared here
  476 |         struct route **routes;
      |                        ^~~~~~
plugins/askrene/askrene.c:475:29: error: ‘amounts’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  475 |         struct amount_msat *amounts;
      |                             ^~~~~~~
plugins/askrene/askrene.c:488:69: error: ‘probability’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  488 |         json_add_u64(response, "probability_ppm", (u64)(probability * 1000000));
      |                                                        ~~~~~~~~~~~~~^~~~~~~~~~
cc plugins/askrene/dijkstra.c
cc1: all warnings being treated as errors
```

On my local machine, it also warns in param_dev_channel, so I fixed that too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
630ec6a566 askrene: give better feedback when we can't find a suitable route.
This turns out to be critical for users: also stops them from
bothering us when their node is offline or has insufficient capacity!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
e54c0f8ded askrene: don't replace constraints, simply accumulate.
Lagrang3 points out it's less useful (when we time them out), and probably
a premature optimization anyway.

Suggested-by: Lagrang3 <lagrang3@protonmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
c307b77d2f askrene: split askrene-create-channel into create-channel and update-channel.
This allows for explicit partial updates to channels (e.g. just change
fees, or just disable) without haveing to set the other fields.

This generalizes askrene-disable-channel, which is removed.

We also take the chance to use the proper BOLT 7 terms in the API:

- htlc_minimum_msat
- htlc_maximum_msat
- cltv_expiry_delta
- fee_base_msat
- fee_proportional_millionths

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
5052f0763f gossmap: keep capacity for locally-generated channels as well.
It was weird not to have a capacity associated with localmods channels, and
fixing it has some very nice side effects.

Now the gossmap_chan_get_capacity() call never fails (we prevented reading
of channels from gossmap in the partially-written case already), so we
make it return the capacity.  We do this in msat, because that's what
all the callers want.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
a60063e763 common/gossmods_listpeerchannels: include channel capacity in callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
a65e325b13 gossmap: implement partial updates.
This is actually what we want in several places: to only override one or
two fields in a channel_update.

We add a gossmap_local_setchan() with a similar API to the old
gossmap_local_updatechan(), for the case where we want to set every
field.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
3253623785 gossmods_from_listpeerchannels: use correct type for cltv_delta.
Doesn't matter now, but will with the next change where we want to
pass a pointer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
fcc0d2bad8 askrene: remove unused parameter in layer_add_localmods.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
321ec0875f askrene: rework constraints to exist in pairs.
This is a bit more efficient, but moreover the JSONRPC API is more
logical this way.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
3c5c22b17a askrene: change inform interface, take into account reserve.
Lagrang3 points out that if we hit a maximum, we should take into account
the reserve.  This is true, but it's hard for the caller to do, so change
the API to be slightly higher level.

Tell "inform" what happened, and it adjust the constraints appropriately.
This makes the least assumptions possible (a reserve does *not* mean that
the capacity was actually used at that time).

We also add a mode to say "this succeeded": for now this does nothing,
but it could reduce both min/max capacities, and add capacity in the
other direction.  This is useful for future payments, but not as useful
for the current one.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
d50838b60f askrene: implement listreservations
And actually write tests!

Suggested-by: Lagrang3 <lagrang3@protonmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
0398d2ff73 askrene: remember individual reservations, for better debugging.
Suggested-by: Lagrang3 <lagrang3@protonmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
10dc40a895 askrene: clean up reserve array handling.
I got confused, as we had a struct containing two arrays.  Simply expose the
reserve_hop struct and use arrays directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
2df53c32c2 askrene: make route_query contain pointer to the command.
This is important for errors and feedback.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
b88f4cb854 askrene: askrene-create-layer and askrene-remove-layer.
It's generally better to be explicit with these things: currently typos
would be ignored.  But it's also much easier to clean up entire layers
as we use them for temporary (per-payment) effects.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
29cc227a53 askrene: use short_channel_id_dir in API.
It's generally much more convenient, and it's already present in
other APIs.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Lagrang3
33404b03a0 add askrene-disable-channel
Changelog-EXPERIMENTAL: askrene: add askrene-disable-channel RPC
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
2024-10-04 11:27:53 +09:30
Rusty Russell
f46219b505 common: round out the short_channel_id_dir JSON routines.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-10-04 11:27:53 +09:30
Rusty Russell
8c551ae48a askrene: move flow refining code to its own file.
askrene.c was getting quite long, and this is self-contained.

The only code change is a convenience accessor for the per-htlc-cost
hash table.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
9d966b46a7 askrene: take into account the reduction in "spendable" with additional HTLCs.
"spendable" is for a single HTLC: if we own the channel, this amount
decreases with every HTLC, as we have to pay fees.  We have access to this since
we call listpeerchannels anyway, so we can calculate the additional costs and
use it in the refining phase.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
db29a2d6b5 askrene: don't have get_flow_paths() handle htlc_max, htlc_min and extra millisats.
We don't actually hit the htlc_max cases, since the flow code already
constrains us to that.

And handling htlc_min is better done in the caller, where diagnostics
are better (basically, we should eliminate them, and if that means no
route, give a clear error message).

And the refinement step can handle any extra millisats from rounding.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
f0331cd82e askrene: add a "refining" step to add fees and handle corner cases.
This is the root cause of the problem worked around in 50949b7b9c
"askrene: hack in some padding so we don't overflow capacities."

When adding fees to flows, we didn't recheck the boundary conditions: in
renepay this is done by routebuilder.

Fortunately, we can use our "reservations" infrastructure to temporarily
use capacity as we process flows, so we handle the cases where they are
not independent correclty.

My assumption is that the resulting errors are small, so we divide
them between the remaining flows based on highest-to-least
probability.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
5883aa85ca askrene: rename struct flow amount to delivers.
This is clearer: it's the final amount, not the amount we send!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
829954ac71 askrene: remove struct flow probability member.
Simply calculate it when we need it, which means we don't have to keep it
up-to-date as we tweak the flow.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
1b8a64f5e6 askrene: apply "auto.sourcefree" to channels created in layers, too.
We had a workaround for channels added by "auto.local", but instead we
should make it work properly.

I didn't do this before because we can't manipulate the localmods while
they're applied, but it's simple to do it in two stages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
679f46f733 common/amount: rename amount_sat_zero/amount_msat_zerp -> amount_sat_is_zero/amount_msat_is_zero.
I used `amount_msat_eq(x, AMOUNT_MSAT(0))` because I forgot this
function existed.  I probably missed it because the name is surprising,
so add "is" in there to make it clear it's a boolean function.

You'll note almost all the places which did use it are Eduardo's and
Lisa's code, so maybe it's just me.

Fix up a few places which I could use it, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-09-19 12:16:53 +09:30
Rusty Russell
975326ab5d askrene: round capacity *down* when converting to fp16.
Conversion is lossy, and we don't want to spend more than the channel,
so it's conservative to round down here.

This doesn't actually help our test though!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-08-23 18:52:15 +09:30
Rusty Russell
7fb7234da1 askrene: change finalcltv to final_cltv, and return it in response.
You need to know it to make an onion, and in theory if we decided to
fuzz it could be different for different paths.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-08-19 10:12:51 -07:00
Lagrang3
5073942eef askrene: memleak: scan reserved htable
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
2024-08-13 09:12:40 +09:30
Lagrang3
cf3375c701 askrene: change reserve_hash for reserve_htable
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
2024-08-13 09:12:40 +09:30
Lagrang3
5ef5739e27 askrene: reserve: fix assertion
Fixes bad guide for json_scan.

Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
2024-08-12 13:17:09 -07:00
Rusty Russell
f8b259d5e9 askrene: add "auto.localchans" layer.
This populates information on both topology (i.e. unannounced channels) and capacity for the local node using `listpeerchannels`.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-08-07 20:35:30 +09:30