Default goes to stderr for LOG_UNUSUAL and higher.
We have to whitelist more cases in map_catchup, though, so we don't
spam the logs with perfectly-expected (but ignored) messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I'm not sure why this happens, and suspect it is caused by an issue
elsewhere, so add some verbose debugging and don't crash.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: https://github.com/ElementsProject/lightning/issues/8017
The updated API requires typed htables to explicitly state whether they
allow duplicates: in most cases we don't, but we've had issues in the
past.
This is a big patch, but mainly mechanical.
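For illustration, here is a minimal sketch of what the distinction looks
like at a definition site. The macro names HTABLE_DEFINE_NODUPS_TYPE /
HTABLE_DEFINE_DUPS_TYPE reflect my reading of the updated ccan API, so
treat the details below as assumptions:
```
#include <ccan/htable/htable_type.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch only: assumes the updated ccan/htable exposes separate
 * NODUPS/DUPS macros in place of the old HTABLE_DEFINE_TYPE. */
struct item {
	int key;
	int value;
};
static const int *item_key(const struct item *i) { return &i->key; }
static size_t key_hash(const int *key) { return (size_t)*key; }
static bool item_eq(const struct item *i, const int *key)
{
	return i->key == *key;
}

/* Most of our maps: at most one entry per key. */
HTABLE_DEFINE_NODUPS_TYPE(struct item, item_key, key_hash, item_eq,
			  item_map);

/* Only where duplicates are genuinely expected: */
HTABLE_DEFINE_DUPS_TYPE(struct item, item_key, key_hash, item_eq,
			item_multimap);
```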
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Using jsonrpc_request_sync, layers are loaded before we finish init,
so we can never be asked to create a layer before we've loaded it
(xpay creates a layer immediately on startup).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers are new this release.
Create lower-level versions of routines to create biases, layers,
constraints, etc., and only save the ones called from the public APIs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: persistent layers were only added in this release
The ratio of the median of the fees and probability cost is overall not
a bad factor to combine these two features. This is what the
test_real_data shows.
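Roughly, the shape of the combination is (a sketch of the idea only; how
mu is folded into the weighting is not shown here):
```
k         = median(fee_cost over arcs) / median(prob_cost over arcs)
cost(arc) = fee_cost(arc) + k * prob_cost(arc)
```
i.e. the ratio of the medians rescales the probability cost so the two
terms are on a comparable scale before they are traded off.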
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The fee_fallback test would fail after fixing the computation of the
median. Now we can restore it by making the probability cost factor
1000x higher than the ratio of the median. This shows how hard it is to
combine fee and probability costs, and why the current approach is so
fragile.
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
Rusty: "We don't generally use NDEBUG in our code"
Instead, use a compile-time flag ASKRENE_UNITTEST to enable checks in
unit tests that we don't normally need in release code.
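The pattern is roughly this (UNITTEST_CHECK is an illustrative name, not
the actual macro):
```
#include <assert.h>

/* Illustration only: extra sanity checks compiled in for unit-test
 * builds, compiled out of release code. */
#ifdef ASKRENE_UNITTEST
#define UNITTEST_CHECK(cond) assert(cond)
#else
#define UNITTEST_CHECK(cond) ((void)0)
#endif
```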
Changelog-none
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
- use graph_max_num_arcs/nodes instead of tal_count in bound checks,
- don't use ccan/lqueue; use instead a minimalistic queue
implementation with an array (see the sketch after this list),
- add missing const qualifiers to temporary tal allocators,
- check preconditions with assert,
- remove inline specifier for static functions,
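For reference, a minimal array-backed FIFO of the kind described above
(an illustrative sketch, not the actual askrene code):
```
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: a bounded FIFO over a plain array, for uses
 * where at most max_size elements are ever pushed in total. */
struct queue {
	uint32_t *data;  /* array of max_size elements */
	size_t head;     /* next index to pop */
	size_t tail;     /* next index to push */
	size_t max_size;
};

static void queue_push(struct queue *q, uint32_t v)
{
	assert(q->tail < q->max_size);
	q->data[q->tail++] = v;
}

static uint32_t queue_pop(struct queue *q)
{
	assert(q->head < q->tail);
	return q->data[q->head++];
}

static bool queue_empty(const struct queue *q)
{
	return q->head == q->tail;
}
```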
Changelog-None
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
The calculation of the median values of probability and fee cost in the
linear approximation had a bug: it counted non-existing arcs.
Changelog-none: askrene: fix the median
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
We use an arc "array" in the graph structure, but not all arc indexes
correspond to real topological arcs. We must be careful when iterating
through all arcs, and check that they are enabled before operating on
them.
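For example (a sketch only: graph_max_num_arcs() and arc_enabled() stand
in for the real bound and enabled-check helpers):
```
/* Sketch: walk every arc slot, skipping indexes that don't correspond
 * to a real, enabled arc. */
for (uint32_t i = 0; i < graph_max_num_arcs(graph); i++) {
	if (!arc_enabled(graph, i))
		continue;
	/* ...operate on arc i... */
}
```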
Changelog-None: askrene: fix bug, not all arcs exist
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
Add a new function to compute an MCF using a more general description of
the problem. I call it mcf_refinement because it can start with a
feasible flow (though this is not necessary) and adapt it to achieve
optimality.
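For context, the textbook shape of such a refinement, in outline (one
standard approach; not necessarily what the new function does
internally):
```
/* Cycle-cancelling refinement, in outline:
 *
 *   given a flow f satisfying the demands (feasible),
 *   while the residual graph of f contains a negative-cost cycle C:
 *       push as much flow as possible around C;
 *   f is now a minimum-cost flow for its value.
 */
```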
Changelog-None: askrene: add a MCF refinement
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
It is just a copy-paste of "dijkstra", but the name reflects what it
actually is: not an implementation of the minimum-cost-path Dijkstra
algorithm, but a helper data structure.
I keep the old "dijkstra.h/c" files for the moment to avoid breaking the
current code.
Changelog-EXPERIMENTAL: askrene: add priorityqueue
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
This lets you place annotated biases on channels, to influence routing.
Uses include avoiding Tor nodes or slow channels, or expressing other
local preferences.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: askrene is new anyway.
Without knowing what method was called, we can't have useful general logging
methods, so go through the pain of adding "const char *method" everywhere,
and add:
1. ignore_and_complete - we're done when the JSON-RPC call returns.
2. log_broken_and_complete - we're done, but emit a BROKEN log.
3. plugin_broken_cb - if this happens, fail the plugin.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we used to allow cmd to be NULL, we had to hand the plugin
everywhere. We no longer do.
1. Various jsonrpc_ functions no longer need the plugin arg.
2. send_outreq no longer needs a plugin arg.
3. The init function takes a command, not a plugin.
4. Remove command_deprecated_in_nocmd_ok.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we know the total reservations on each hop, we can determine
probabilities more easily than by using flowset_probability(), which
has to replicate this collision detection.
We leave both in place for now, to check. The results are not
identical, due to slightly different calculation methods.
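For reference, the per-channel success probability under the usual
uniform-liquidity prior, with our own reservations folded into the
amount (a sketch of the idea; the exact formula used may differ):
```
P(channel with capacity c forwards x, given r already reserved by us)
    ~ (c + 1 - (x + r)) / (c + 1)
```
Multiplying this over the hops of a flow then gives the flow's success
probability without re-deriving which flows collide on which channels.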
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were trying to get the max capacity of a flow to see if we could add some
more sats, and hit an assertion:
tests/test_askrene.py:707:
```
DEBUG plugin-cln-askrene: notify msg info: Flow reduced to deliver 88070161msat not 90008000msat, because 107x1x0/1 has remaining capacity 88071042msat
DEBUG plugin-cln-askrene: notify msg info: Flow reduced to deliver 284138158msat not 284787000msat, because 108x1x0/1 has remaining capacity 284141000msat
**BROKEN** plugin-cln-askrene: Flow delivers 129565000msat but max only 56506138msat
INFO plugin-cln-askrene: Killing plugin: exited during normal operation
```
We need to *unreserve* our flow before asking for max capacity. We were
also missing a few less important cases where we altered flows without altering
the reservation, so fix those too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I noticed that increasing mu a little bit sometimes made a big difference,
because by completely ignoring fees we were choosing the worst of two channels
in some cases.
Start at 1% fees; this saves a lot on initial fees in this test!
Here are the new stats on mu levels:
96 mu=1
90 mu=10
41 mu=20
30 mu=30
24 mu=40
19 mu=50
22 mu=60
8 mu=70
95 mu=80
19 mu=90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: `askrene` is now better at finding low-fee paths.
While the `k=8` value worked for the current main network tests with the
amounts in those tests, it wasn't robust across a wider range of values
(as demonstrated when other test changes broke tests!).
Time to do this properly: calculate the ratio at the time we combine them,
using median values.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Even after the previous fix, we still occasionally increase fees when
mu increases.
This is due to the difference between MCF's linear fees and the actual
fees, and is unavoidable, but add a check in case it somehow happens.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I noticed this in the logs:
plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%...
plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%...
We would expect increasing mu to *reduce* the fee!
Turns out that our linear fee is a terrible approximation, because I
was using a base_fee_penalty of 10.0.
|
| / __ <- real fee, with base: fee = base + propfee * amount.
| / __/
| _//
| __/
| __/_/
|/ _/
| _/ <- linearized fee: fee = linear * amount
|/
+-----------------------------------
These cross over where linear = propfee + base / amount. Assuming we
split the payment into 10 parts, this implies that the base_fee_penalty
should be 10 / amount (this gives a slight penalty in the normal case,
but that's ok).
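Spelling the crossover out (the same argument as above, in symbols):
```
real fee:        fee(x)     = base + propfee * x
linearized fee:  fee_lin(x) = linear * x
equal when:      linear     = propfee + base / x
with the payment split into ~10 parts, each part carries
x ~ amount / 10, so:
                 linear     ~ propfee + base * (10 / amount)
```
which is where base_fee_penalty = 10 / amount comes from.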
This gives better results, too: we get down to 650099 sats in fees, vs 801613
before.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During "test_real_data", then only successes with reduced fees were 92 on "mu=10", and only
1 on "mu=30": the rest went to mu=100 and failed.
I tried numerous approaches, and in the end, opted for the simplest:
The typical range of probability costs looks like:
min = 0, max = 924196240, mean = 10509.4, stddev = 1.9e+06
The typical range of linear fee costs looks like:
min = 0, max = 101000000, mean = 81894.6, stddev = 2.6e+06
This implies that a k factor of 8 (roughly the ratio of the means,
81894.6 / 10509.4, i.e. about 7.8) makes the two comparable.
This makes the two numbers comparable, and thus makes "mu" much more
effective. Here are the success counts at each mu value:
87 mu=0
90 mu=10
42 mu=20
24 mu=30
17 mu=40
19 mu=50
19 mu=60
11 mu=70
95 mu=80
19 mu=90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The current prob_cost_factor setting does not seem to make mu very
effective; in fact, it gives strange results:
plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%...
plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%...
We would expect increasing mu to *reduce* the fee!
As a first step, simplify (it can't be infinite, and the -1s are weird).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ask it again, but reduce fees by 1msat from the previous answer.
This is really nasty, as it frequently exercises the case where we
only go over fee when we do the refinement step.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tested with a really large gossmap (hacked to be 4GB), where we keep
retrying to minimize cost (calling minflow 11 times) and don't free
tmpctx.
Due to an issue with how gossmap estimates the index sizes, we ended
up running out of memory. This fixes it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>