core-lightning

mirror of https://github.com/ElementsProject/lightning.git synced 2024-11-19 09:54:16 +01:00

Author	SHA1	Message	Date
Rusty Russell	2a0f09fc2d	askrene: calculate `k` value dynamically, using medians. While the `k=8` value worked for the current main network tests with the amounts in those tests, it wasn't robust across a wider range of values (as demonstrated when other test changes broke tests!). Time to do this properly: calculate the ratio at the time we combine them, using median values. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	32aa79a1e2	askrene: debug and check we actually reduce fees when mu increases. Even after the previous fix, we still occasionally increase fees when mu increases. This is due to the difference between MCF's linear fees and actual fees, and is unavoidable, but add a check if it somehow happens. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	08df93cb25	askrene: fix base fee. I noticed this in the logs: plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%... plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%... We would expect increasing mu to reduce the fee! Turns out that our linear fee is a terrible approximation, because I was using base_fee_penalty of 10.0. [Graph of fee against amount: the real fee, with base (fee = base + propfee * amount), and the linearized fee through the origin (fee = linear * amount).] These cross over where linear = propfee + base / amount. Assume we split the payment into 10 parts: this implies that the base_fee_penalty should be 10 / amount (this gives a slight penalty to the normal case, but that's ok). This gives better results, too: we get down to 650099 sats in fees, vs 801613 before. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	6273adbe47	askrene: calculate prob_cost_factor using ratio of typical mainnet channel. During "test_real_data", the only successes with reduced fees were 92 on "mu=10", and only 1 on "mu=30": the rest went to mu=100 and failed. I tried numerous approaches, and in the end, opted for the simplest: The typical range of probability costs looks like: min = 0, max = 924196240, mean = 10509.4, stddev = 1.9e+06 The typical range of linear fee costs looks like: min = 0, max = 101000000, mean = 81894.6, stddev = 2.6e+06 This implies a k factor of 8 makes the two numbers comparable, and thus makes "mu" much more effective. Here are the number of different mu values we succeeded at: 87 mu=0 90 mu=10 42 mu=20 24 mu=30 17 mu=40 19 mu=50 19 mu=60 11 mu=70 95 mu=80 19 mu=90 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	4897286c25	mcf: simplify mu -> cost translation. The current prob_cost_factor setting does not seem to make mu very effective, in fact, it gives strange results: plugin-cln-askrene: notify msg unusual: The flows had a fee of 151950msat, greater than max of 53697msat, retrying with mu of 10%... plugin-cln-askrene: notify msg unusual: The flows had a fee of 220126msat, greater than max of 53697msat, retrying with mu of 20%... We would expect increasing mu to reduce the fee! As a first step, simplify (it can't be infinite, and the -1 are weird). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	83eee64fda	pytest: test askrene with worse maxfee argument. We ask it again, but reduce fees by 1msat from the previous answer. This is really nasty, as it frequently exercises the case where we only go over fee when we do the refinement step. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	55fc7fc2e5	pytest: test askrene on real network data. I checked the failures, they seem real. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	f17c5f5a6b	askrene: don't use tmpctx in minflow() I tested with a really large gossmap (hacked to be 4GB): when we keep retrying to minimize cost (calling minflow 11 times), we don't free tmpctx. Due to an issue with how gossmap estimates the index sizes, we ended up running out of memory. This fixes it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Lagrang3	bd8cc1fb1f	askrene: detect and cancel flow cycles Flow cycles can occur if we have zero arc costs. The previous path construction from the flow in the network assumed the absence of such cycles and would enter an infinite loop if it hit one. With this patch we add cycle detection and removal during the path construction phase. Reported-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Lagrang3 <lagrang3@protonmail.com> Changelog-EXPERIMENTAL: `askrene` infinite loop fixed	2024-10-15 09:58:04 +10:30
Rusty Russell	d08a3bb9e6	askrene: don't give up if we hit htlc_max and have no other flows. This happens in the coming "real network" test! We add fees and hit htlc_max, but don't have another flow to add to. Rather than MCF again, we split the flow into two. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	1b82a3ad5b	askrene: constrain to exact htlc_min/htlc_max values. The fp16_t values are approximations (overestimate for htlc_max, underestimate for htlc_min), so in the refinement step we should use the exact values. This also fixes a logic bug: flow_remaining_capacity returned the total capacity, not the additional capacity! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: `askrene` now honors exact htlc_maximum_msat limits.	2024-10-15 09:58:04 +10:30
Rusty Russell	0baac77a1c	gossmap: allow gossmap_chan_get_update_details on locally-modified channels. In particular, this lets you find the exact htlc_maximum_msat/htlc_minimum_msat values. This means we actually create real channel_updates for local mods, which requires a second "local" scratch region. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	4ee9d1d2f2	gossmap: include cltv_expiry_delta in gossmap_chan_get_update_details for completeness. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	22e7a57557	askrene: make `auto.sourcefree` a real layer, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: `getroutes` now applies `auto.sourcefree` layer in the order specified, so doesn't alter channels changed in later layers.	2024-10-15 09:58:04 +10:30
Rusty Russell	3321ad5883	askrene: populate auto.localchans layer properly. Rather than adding to the gossmap modifications directly, populate the layer and have the normal layer application logic do it. This is consistent when we query layers in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	1230f1b832	askrene: give notifications back to caller as we go. And unify logging for better debugging. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: `askrene` now has better logging, gives notifications of progress.	2024-10-15 09:58:04 +10:30
Rusty Russell	ca023f2b5e	pyln-testing: understand `gossip_store_file` arg in get_nodes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-15 09:58:04 +10:30
Rusty Russell	7d02d5749a	gossipd: at startup don't send remote channel_update as init update from us. The "init_cupdate" message is for gossipd to tell lightningd about our own latest channel_update messages, not the remote ones! The "remote_channel_update" message is for messages from the peer. This appeared as an occasional BROKEN message in CI: ``` BROKEN 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-chan#4: gossipd gave us channel_update for channel in gossip_state CGOSSIP_NEED_PEER_SIGS ``` Where we had sent (and not received) announcement_signatures, and restarted: the peer had meanwhile sent us their channel_announcement and channel_update. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: Protocol: we could get confused on restart and not re-transmit our own channel_updates.	2024-10-14 16:58:49 +01:00
Rusty Russell	4e6bac6d36	connectd: fix double-free crash on connection timeout. tmpctx may not get cleaned immediately, so the timeout (a child of the struct early_peer at this point) can still outlast the conn. Do the clearer thing, and explicitly free the timeout. Changelog-Fixed: connectd: crash on erroneous timeout. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-14 16:36:58 +01:00
Se7enZ	f9e28b9bfa	keysend: Add `maxfee` to keysend for consistency with pay. ([#7227]) Changelog-Added: keysend: Add `maxfee` to keysend for consistency with pay. ([#7227])	2024-10-14 11:58:00 +02:00
Chris Guida	c7189f644a	make: Change `mv` to `mv -f` when replacing tools/headerversions file Whenever we build without `make clean` first, this file gets overwritten. It asks the user for input because it is not run with `-f`. This causes the build to hang until the user inputs `y`. Here is an example build: ``` $ make -j7 && sudo make install CC: cc -DCLN_NEXT_VERSION="v24.08" -DPKGLIBEXECDIR="/usr/local/libexec/c-lightning" -DBINDIR="/usr/local/bin" -DPLUGINDIR="/usr/local/libexec/c-lightning/plugins" -DCCAN_TAL_NEVER_RETURN_NULL=1 -Wall -Wundef -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wold-style-definition -Werror -Wno-maybe-uninitialized -Wshadow=local -std=gnu11 -g -fstack-protector-strong -Og -I ccan -I external/libwally-core/include/ -I external/libwally-core/src/secp256k1/include/ -I external/jsmn/ -I external/libbacktrace/ -I external/gheap/ -I external/build-x86_64-linux-gnu/libbacktrace-build -I external/libsodium/src/libsodium/include -I external/libsodium/src/libsodium/include/sodium -I external/build-x86_64-linux-gnu/libsodium-build/src/libsodium/include -I . -I/usr/local/include -I/usr/include/postgresql -DSHACHAIN_BITS=48 -DJSMN_PARENT_LINKS -DCOMPAT_V052=1 -DCOMPAT_V060=1 -DCOMPAT_V061=1 -DCOMPAT_V062=1 -DCOMPAT_V070=1 -DCOMPAT_V072=1 -DCOMPAT_V073=1 -DCOMPAT_V080=1 -DCOMPAT_V081=1 -DCOMPAT_V082=1 -DCOMPAT_V090=1 -DCOMPAT_V0100=1 -DCOMPAT_V0121=1 -c -o LD: cc -Og config.vars -Lexternal/build-x86_64-linux-gnu -lwallycore -lsecp256k1 -ljsmn -lbacktrace -lsodium -L/usr/local/include -lm -lsqlite3 -L/usr/lib/x86_64-linux-gnu -lpq -o mv: replace 'tools/headerversions', overriding mode 0755 (rwxr-xr-x)? 
cc lightningd/test/run-check_node_announcement.c cc lightningd/test/run-find_my_abspath.c cc lightningd/test/run-invoice-select-inchan.c cc lightningd/test/run-jsonrpc.c cc lightningd/test/run-log_filter.c cc lightningd/test/run-log-pruning.c cc lightningd/test/run-shuffle_fds.c ld lightningd/test/run-find_my_abspath ld lightningd/test/run-log-pruning ld lightningd/test/run-check_node_announcement ld lightningd/test/run-log_filter ld lightningd/test/run-jsonrpc ld lightningd/test/run-shuffle_fds ld lightningd/test/run-invoice-select-inchan ^Cmake: *** [tools/Makefile:16: tools/headerversions] Interrupt ``` One workaround is to just know that you need to enter `y` here. But the best solution is probably to update the Makefile like so. Changelog-None	2024-10-14 15:00:36 +10:30
Jesse de Wit	8330e3a0df	pay-plugin: set gossmods directly Multiple places in the pay lifecycle depend on mods to be set. By setting the mods directly after the first listpeerchannels call, subsequent calls to listpeerchannels are avoided. Changelog-Fixed: pay-plugin: only call listpeerchannels once during a payment lifecycle.	2024-10-08 19:26:14 +02:00
Jesse de Wit	281c639b57	pay-plugin: direct_pay only destination channels The direct_pay payment modifier would query all peer channels, while only the channels of the given peer suffices. Changelog-None	2024-10-08 19:26:14 +02:00
Tommy Volk	6bd5933a50	chore: bump rust bitcoin to 0.31	2024-10-08 18:56:05 +02:00
ShahanaFarooqui	910c014183	ci: Prebuild action fails if Changelog is missing from all commits of a PR Changelog-None.	2024-10-08 18:31:40 +02:00
Rusty Russell	d067066b17	common/gossmap: use u64 for all offsets. Since we don't compact the gossmap on the fly (FIXME!) we can easily surpass 4GB in the gossmap, and 32 bit offsets are not sufficient. I'm a bit surprised we don't crash immediately, but we've definitely seen issues. Changelog-Fixed: gossipd: crash errors with large gossip_store (>4GB) growth on longer-running nodes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-08 09:50:17 +02:00
Rusty Russell	ebf784ef9c	gossipd: use u64 for the one offset we don't. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-08 09:50:17 +02:00
Jesse de Wit	cc1362ead3	libplugin-pay: use map for channel hints For nodes with many channels this is a tremendous improvement in pay performance. PR #7611 improves payment performance from 15 seconds to 13.5 seconds on one of our nodes. This commit improves payment performance from 13.5 seconds to 5.7 seconds. Changelog-Fixed: Improved pathfinding speed for large nodes.	2024-10-07 15:16:46 +02:00
Christian Decker	a6a7dd8f71	pay: Switch to msat for total_capacity This minimizes the need to convert back and forth from and to sat values, and it also removes a new instance of sats in the public interface (`channel_hints`). Suggested-By: Rusty Russell <@rustyrussell>	2024-10-07 14:05:47 +02:00
Christian Decker	ddc199ff41	route: Re-add the assertion that we're one side of a channel	2024-10-07 14:05:47 +02:00
Christian Decker	3845b84dbc	pay: Simplify the `channel_hint` update logic We were ignoring more reliable information because of the stricter-is-better logic. This meant that we were also ignoring local channel information, despite knowing better. By simplifying the logic to pick the one with the newer timestamp we ensure that later observations always trump earlier ones. There is a bit of interplay between the relaxation of the constraints updating the timestamp, and comparing to real observation timestamps, but that should not impact us for local channels.	2024-10-07 14:05:47 +02:00
Christian Decker	db36449408	pytest: Fix up the `test_sendpay_grouping` test It was failing because the channel_hint from one attempt would prevent us from retrying. By changing the amounts so that the channel_hints do not concern them (value smaller than estimate) we can make things work as before again.	2024-10-07 14:05:47 +02:00
Christian Decker	e839c0ebcc	test: Fix up the `test_pay_routeboost` test	2024-10-07 14:05:47 +02:00
Christian Decker	63e663ec9c	pytest: Fix up the `test_mutual_connect_race` A failing payment would doom all subsequent ones. Now we step down the amount by a single satoshi so any prior channel_hints do not doom the payment outright. Changelog-None	2024-10-07 14:05:47 +02:00
Christian Decker	f6d410d924	pay: Remove use of temporary local `channel_hint` We were using `channel_hint` to temporarily tweak the graph inside of a payment. However, these ad-hoc `channel_hints` are stickier than their predecessors, in that they outlive the payment attempt itself, and interfere with later ones. Changelog-Changed: pay: Discarding an overly long or expensive route does not blocklist channels anymore.	2024-10-07 14:05:47 +02:00
Christian Decker	0a62416c4b	pay: Inject `channel_hint`s we receive via plugin notifications Making sure that we don't accidentally create an endless loop.	2024-10-07 14:05:47 +02:00
Christian Decker	30d2a57f50	pay: Log when and why we exclude a channel from the route	2024-10-07 14:05:47 +02:00
Christian Decker	91ffa8e424	pay: Add `channel_hint_set_count` primitive	2024-10-07 14:05:47 +02:00
Christian Decker	30a2933a94	pay: Add a hysteresis for channel_hint relaxation If we have a large channel, fail to send through a small amount, and then add a `channel_hint`, then it can happen that the call to `channel_hint_set_update` is already late enough that it refills the 1msat we removed from the attempted amount, thus making the edge we just excluded eligible again, which can lead to an infinite loop. We slow down the updating of the channel_hints to once every hysteresis timeout. This allows us to set tight constraints, while not incurring the accidental relaxation issue.	2024-10-07 14:05:47 +02:00
Christian Decker	50a0321759	pay: Use the global `channel_hint_set` and remember across payments	2024-10-07 14:05:47 +02:00
Christian Decker	603a70e7e2	pytest: Test that we remember learnt channel hints across payments	2024-10-07 14:05:47 +02:00
Christian Decker	3ad0085478	route: Change the type of the funding capacity to `amount_sat` Keeping it in `amount_msat` made the comparisons easier, but it was the wrong type for this.	2024-10-07 14:05:47 +02:00
Christian Decker	f803af782a	route: Use safe `amount_sat_to_msat` conversion Suggested-by: Rusty Russell <@rustyrussell>	2024-10-07 14:05:47 +02:00
Christian Decker	904eb3795c	pay: Subscribe to the `channel_hint_update` notifications	2024-10-07 14:05:47 +02:00
Christian Decker	29df2c9f40	route: Simplify direction	2024-10-07 14:05:47 +02:00
Christian Decker	cf09314b3b	pay: Rename overall_capacity to just capacity Suggested-by: Rusty Russell <@rustyrussell>	2024-10-07 14:05:47 +02:00
Christian Decker	37a204df41	plugin: Split out the `struct channel_hint` handling We're getting serious about how we manage the channel_hints, so let's give them a proper home.	2024-10-07 14:05:47 +02:00
Christian Decker	1144088e14	make: Weaken overly aggressive check-amount-access test Changelog-None	2024-10-07 14:05:47 +02:00
Christian Decker	b897b4365d	pay: Make the `channel_hint`s global We attach the hints to the plugin, so they get shared across multiple payments.	2024-10-07 14:05:47 +02:00
Christian Decker	5225218094	pay: Use the total_msat amount as the upper limit for channel_hints	2024-10-07 14:05:47 +02:00
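
The crossover described in commit 08df93cb25 (linear = propfee + base / amount) can be checked numerically. The following is a minimal sketch, not askrene code: the function names and the channel parameters (1000 msat base fee, 100 ppm proportional fee, 10,000,000 msat payment) are assumptions chosen for illustration. Only the fee formulas and the "split into 10 parts" penalty of 10 / amount come from the commit message.

```python
def real_fee(base_msat, prop_ppm, amount_msat):
    """Actual fee: fee = base + propfee * amount."""
    return base_msat + prop_ppm * amount_msat / 1_000_000


def linear_coeff(base_msat, prop_ppm, base_fee_penalty):
    """Linearized per-msat fee: linear = propfee + base * base_fee_penalty."""
    return prop_ppm / 1_000_000 + base_msat * base_fee_penalty


# Hypothetical values, for illustration only.
base, ppm = 1_000, 100
total = 10_000_000          # payment amount in msat
part = total // 10          # the commit assumes a split into 10 parts

# Penalty of 10 / amount, as proposed in the commit message.
penalty = 10 / total
linear = linear_coeff(base, ppm, penalty)

# The two fee models agree at the size of one part...
fee_linear_at_part = linear * part
fee_real_at_part = real_fee(base, ppm, part)

# ...the linear model underestimates below it and overestimates above it.
fee_linear_small = linear * (part // 2)
fee_real_small = real_fee(base, ppm, part // 2)
fee_linear_large = linear * (part * 2)
fee_real_large = real_fee(base, ppm, part * 2)
```

With a fixed penalty of 10.0 the base-fee term would be far larger than the proportional term for any realistic amount, which is consistent with the commit's observation that the old linear fee was a poor approximation.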
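
Commit bd8cc1fb1f adds cycle detection and removal while building paths from a flow. The sketch below shows one way such a step could look; it is not the askrene implementation. The dictionary representation of the flow and the name `extract_path` are hypothetical, and the flow is assumed to conserve flow at every intermediate node.

```python
def extract_path(flow, source, sink):
    """Walk positive-flow arcs from source to sink, cancelling any cycle met.

    flow maps (u, v) -> amount and is modified in place.
    Returns the list of nodes on a cycle-free path, or None if the walk
    reaches a node with no outgoing flow.
    """
    path = [source]
    index = {source: 0}                  # node -> position in path
    while path[-1] != sink:
        u = path[-1]
        nxt = next((v for (a, v), f in flow.items() if a == u and f > 0), None)
        if nxt is None:
            return None
        if nxt in index:
            # Revisiting a node: path[i:] plus the arc (u, nxt) forms a cycle.
            i = index[nxt]
            cycle = list(zip(path[i:], path[i + 1:] + [nxt]))
            delta = min(flow[arc] for arc in cycle)
            for arc in cycle:
                flow[arc] -= delta       # at least one arc drops to zero
            for node in path[i + 1:]:
                del index[node]
            path = path[:i + 1]          # resume the walk from the repeated node
            continue
        index[nxt] = len(path)
        path.append(nxt)
    return path


# Example: a 2-unit cycle a -> b -> a sits on top of the s -> a -> t flow.
flow = {("s", "a"): 5, ("a", "b"): 2, ("b", "a"): 2, ("a", "t"): 5}
path = extract_path(flow, "s", "t")
```

Each cancellation zeroes at least one arc, so the walk cannot loop forever, which is the failure mode the commit describes.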
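
The hysteresis from commit 30a2933a94 can be illustrated with a small model. This is a sketch under stated assumptions, not the pay plugin's code: the `ChannelHint` fields, the linear refill rule, and both constants (`RELAX_SECS`, `HYSTERESIS_SECS`) are hypothetical. It only demonstrates the idea that relaxation is skipped until a hysteresis timeout has passed, so a freshly tightened hint is not immediately refilled.

```python
from dataclasses import dataclass

RELAX_SECS = 7200        # hypothetical: time for a hint to relax fully
HYSTERESIS_SECS = 60     # hypothetical: minimum delay between relaxations


@dataclass
class ChannelHint:
    capacity_msat: int   # total channel capacity
    estimate_msat: int   # upper bound we currently believe is sendable
    timestamp: float     # when estimate_msat was last set or relaxed


def relax(hint, now):
    """Relax the estimate toward capacity, at most once per hysteresis window.

    Returns True if the hint was changed.
    """
    elapsed = now - hint.timestamp
    if elapsed < HYSTERESIS_SECS:
        return False     # too soon: keep the tight constraint
    refill = int(hint.capacity_msat * elapsed / RELAX_SECS)
    hint.estimate_msat = min(hint.capacity_msat, hint.estimate_msat + refill)
    hint.timestamp = now
    return True


# A 1000 msat attempt failed on a large channel, so the estimate is 999 msat.
hint = ChannelHint(capacity_msat=10**9, estimate_msat=999, timestamp=0.0)

# 10 ms later a naive refill would add over 1000 msat and re-enable the edge;
# with hysteresis the hint is left untouched.
changed_early = relax(hint, now=0.01)
estimate_early = hint.estimate_msat

# Once the hysteresis timeout has passed, relaxation proceeds.
changed_late = relax(hint, now=60.0)
estimate_late = hint.estimate_msat
```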


15391 Commits