1. When we add a shadow amount, we were using the wrong channel for
the fee calculation.
2. Similarly, when calculating the delay amount.
The result is that we can get WIRE_INCORRECT_CLTV_EXPIRY repeatedly
from nodes.
Reported-by: https://github.com/SjorsFixes: #6620
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changlog-Experimental: Fixed: `renepay` handles ctlv correctly when it varies along a path.
```
Flow 391: amount=23528000msat prob=0.000 fees=1023msat delay=140 path=-2471854x37x4/1(min=max=23528783msat)->-2414928x98x0/0->
Flow 391: Failure of 23529023msat for 2471854x37x4/1 capacity [23528783msat,23528783msat] -> [23528783msat,23528783msat]
```
We added fees and went over capacity! This screams of a deeper logic
bug, but renepay is experimental and it's release day so hack around
it for now...
Reported-by: https://github.com/daywalker90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Eduardo is on holiday right now, but he pinged me asking for this. It
makes some sense, and using half the *failed value* covers the case where
it's less than half what we expected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It now looks like (for test_hardmpp):
```
# we have computed a set of 1 flows with probability 0.328, fees 0msat and delay 23
# Flow 1: amount=1800000000msat prob=0.328 fees=0msat delay=12 path=-103x2x0/1(min=max=4294967295msat)->-103x5x0/0->-103x3x0/1->
# Flow 1: Failed at node #1 (WIRE_TEMPORARY_CHANNEL_FAILURE): failed: WIRE_TEMPORARY_CHANNEL_FAILURE (reply from remote)
# Flow 1: Failure of 1800000000msat for 103x5x0/0 capacity [0msat,3000000000msat] -> [0msat,1799999999msat]
# we have computed a set of 2 flows with probability 0.115, fees 0msat and delay 23
# Flow 2: amount=500000000msat prob=0.475 fees=0msat delay=12 path=-103x6x0/0(min=max=4294967295msat)->-103x1x0/1->-103x4x0/1->
# Flow 3: amount=1300000000msat prob=0.242 fees=0msat delay=12 path=-103x2x0/1(min=max=4294967295msat)->-103x5x0/0(max=1799999999msat)->-103x3x0/1->
# Flow 3: Failed at node #1 (WIRE_TEMPORARY_CHANNEL_FAILURE): failed: WIRE_TEMPORARY_CHANNEL_FAILURE (reply from remote)
# Flow 3: Failure of 1300000000msat for 103x5x0/0 capacity [0msat,1799999999msat] -> [0msat,1299999999msat]
# we have computed a set of 2 flows with probability 0.084, fees 0msat and delay 23
# Flow 4: amount=260000000msat prob=0.467 fees=0msat delay=12 path=-103x6x0/0(500000000msat in 1 htlcs,min=max=4294967295msat)->-103x1x0/1(500000000msat in 1 htlcs)->-103x4x0/1(500000000msat in 1 htlcs)->
# Flow 5: amount=1040000000msat prob=0.179 fees=0msat delay=12 path=-103x2x0/1(min=max=4294967295msat)->-103x5x0/0(max=1299999999msat)->-103x3x0/1->
# Flow 5: Failed at node #1 (WIRE_TEMPORARY_CHANNEL_FAILURE): failed: WIRE_TEMPORARY_CHANNEL_FAILURE (reply from remote)
# Flow 5: Failure of 1040000000msat for 103x5x0/0 capacity [0msat,1299999999msat] -> [0msat,1039999999msat]
# we have computed a set of 2 flows with probability 0.052, fees 0msat and delay 23
# Flow 6: amount=120000000msat prob=0.494 fees=0msat delay=12 path=-103x6x0/0(760000000msat in 2 htlcs,min=max=4294967295msat)->-103x1x0/1(760000000msat in 2 htlcs)->-103x4x0/1(760000000msat in 2 htlcs)->
# Flow 7: amount=920000000msat prob=0.105 fees=0msat delay=12 path=-103x2x0/1(min=max=4294967295msat)->-103x5x0/0(max=1039999999msat)->-103x3x0/1->
# Flow 7: Success
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I am doing to add more more debugging, but sent here is 0.
Document that clearly, and put a real value in sent.
Also: since we already sub 1 msat from x, amount_msat_less_eq should
be amount_msat_less (it may be equal to our min, in theory).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
Aug 18 13:45:13 lightningd: 0x7fa921f8ffcf ???
Aug 18 13:45:13 lightningd: ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
Aug 18 13:45:13 lightningd: 0x55b3bb54e6d3 pay_flow_finished_adding_gossip
Aug 18 13:45:13 lightningd: plugins/renepay/pay_flow.c:675
Aug 18 13:45:13 lightningd: 0x55b3bb54af25 addgossip_done
Aug 18 13:45:13 lightningd: plugins/renepay/pay.c:171
```
The assert we fail is almost certainly due to the flow being freed:
```
struct pf_result *pay_flow_finished_adding_gossip(struct pay_flow *pf)
{
assert(pf->state == PAY_FLOW_FAILED_GOSSIP_PENDING);
```
Reported-by: https://github.com/daywalker90Fixes: #6567
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a fascinating bug report which suggests this happens on local channels,
implying spendable_msat is wrong?
See-also: #6567
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Compiler can't tell that we always set have_state[PAY_FLOW_FAILED_FINAL]
when we set this:
```
plugins/renepay/payment.c: In function ‘payment_reconsider’:
plugins/renepay/payment.c:287:25: error: ‘final_error’ may be used uninitialized [-Werror=maybe-uninitialized]
287 | payment_fail(payment, final_error, "%s", final_msg);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
plugins/renepay/payment.c:194:30: note: ‘final_error’ was declared here
194 | enum jsonrpc_errcode final_error, ecode;
| ^~~~~~~~~~~
plugins/renepay/payment.c:287:25: error: ‘final_msg’ may be used uninitialized [-Werror=maybe-uninitialized]
287 | payment_fail(payment, final_error, "%s", final_msg);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
plugins/renepay/payment.c:195:21: note: ‘final_msg’ was declared here
195 | const char *final_msg;
| ^~~~~~~~~
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We cannot carry pointers into the gossmap across localmod addition
or removal.
We didn't notice because the map->chan_arr is not normally resized,
but if we change gossmap.c line 689 to only allocate 1 to start, we see this:
```
VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all plugins/renepay/test/run-mcf > /dev/null
==2349744== Invalid read of size 4
==2349744== at 0x1788C2: gossmap_chan_scid (gossmap.c:558)
==2349744== by 0x1872A2: get_chan_extra_half_by_chan (flow.c:346)
==2349744== by 0x187797: remove_completed_flow (flow.c:488)
==2349744== by 0x187927: remove_completed_flow_set (flow.c:518)
==2349744== by 0x18DF4D: main (run-mcf.c:393)
==2349744== Address 0x4b80f38 is 88 bytes inside a block of size 136 free'd
==2349744== at 0x4848C63: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2349744== by 0x173D71: tal_resize_ (tal.c:744)
==2349744== by 0x177E36: next_free_chan (gossmap.c:336)
==2349744== by 0x177ED3: new_channel (gossmap.c:351)
==2349744== by 0x178441: add_channel (gossmap.c:458)
==2349744== by 0x1798D4: gossmap_apply_localmods (gossmap.c:904)
==2349744== by 0x18DEDB: main (run-mcf.c:388)
==2349744== Block was alloc'd at
==2349744== at 0x4848C63: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2349744== by 0x173D71: tal_resize_ (tal.c:744)
==2349744== by 0x177E36: next_free_chan (gossmap.c:336)
==2349744== by 0x177ED3: new_channel (gossmap.c:351)
==2349744== by 0x178441: add_channel (gossmap.c:458)
==2349744== by 0x178B6D: map_catchup (gossmap.c:635)
==2349744== by 0x178F45: load_gossip_store (gossmap.c:697)
==2349744== by 0x179D71: gossmap_load (gossmap.c:978)
==2349744== by 0x18D22F: main (run-mcf.c:295)
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not there if it's a local error:
```
{
"code": 202,
"message": "Parsing '{message:%,data:{erring_index:%,failcode:%,raw_message:': object does not have member raw_message"
}
```
Reported-by: https://github.com/daywalker90Fixes: #6553
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Unifies the pay_flow resolve functions, and moves remove_htlc_payflow
and commit_htlc_payflow to the top.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to make sure that on every path, we terminate the flow. The simplest
way to do this is encourage the pattern "return pay_flow_xxx(flow)".
Indeed, this caught a few places I missed!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The main function here is payment_reconsider:
* Each payment has a list of pay_flow.
* This is populated in try_paying(), calling add_payflows & sendpay_new_flows.
* When we get a notification, we resolve a pay_flow using one of the pay_flow_failedxxx
or pay_flow_succeeded functions.
* They call payment_reconsider() which cleans up finished flows decides what to do:
often calling try_paying again.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not required, but it should be there so we might as well use it
(though we sometimes don't put one in, esp if it's a private channel).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Avoids a gratuitous "ctx" field, and the simplified declaration
is now understood by `make update-mocks`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Treat it just like "PAY_TRY_OTHER_ROUTE", except it is from the final node:
this means we correctly process that it "succeeded".
Add a test: this crashes sometimes, but it's cleaned up soon...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As recommended by your TODO, a bit simpler: we also make the hash function
return a ptr rather than the (now rather large) struct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use json_scan(), and use the new pay_flow_from_notification() routine.
Also, the tal_dup_or_null can be tal_dup, since &preimage is never NULL.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are a few fields in `struct renepay` which are genuinely
transient, but it makes the code much harder to follow than simply
having a single structure.
More cleanups will follow, but this is the minimal set.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The general pattern for xxx_new is that it should populate all
fields, for encapsulation and so you never can have a half-formed
object.
This means a fair bit of work for now, but it pays off in the next
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
You cannot refresh the gossmap with localmods applied, nor apply localmods
when others have applied localmods in the same process.
There are optimizations we could do, but for now always apply/unapply before
querying gossmap.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes 32-bit builds:
```
In file included from plugins/renepay/pay.c:5:
./plugins/renepay/pay_flow.h: In function 'fmt_payflow_key':
./plugins/renepay/pay_flow.h:54:17: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
54 | "key: groupid=%ld, partid=%ld, payment_hash=%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55 | k->groupid,k->partid,
| ~~~~~~~~~~
| |
| u64 {aka long long unsigned int}
```
etc
It's expressed in bits, but really it's clearer as a quantity, given
how it's used.
Suggested-by: @Lagrang3
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>