On `dev-memleak`, if someone is using rpc_command_hook, we'll call
it when the hook returns, but it will see these contexts as a leak.
So attach them to tmpctx (which is excluded from leak detection).
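A minimal sketch of the idea, assuming the usual tal/tmpctx conventions (`hook_ctx` below is a placeholder name for the contexts in question):
```c
/* Sketch: reparent the hook's pending contexts onto tmpctx, which is
 * freed every event-loop turn and excluded from leak detection. */
tal_steal(tmpctx, hook_ctx);
```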
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Without knowing what method was called, we can't have useful general logging
methods, so go through the pain of adding "const char *method" everywhere,
and add (rough shapes sketched after the list):
1. ignore_and_complete - we're done when the jsonrpc call returns.
2. log_broken_and_complete - we're done, but emit BROKEN log.
3. plugin_broken_cb - if this happens, fail the plugin.
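Roughly, the first two look something like the following; this is a sketch only, and the exact libplugin callback signatures (argument order, the completion helper) are assumptions here:
```c
/* Sketch: assumed libplugin-style callback arguments, now including method. */
static struct command_result *ignore_and_complete(struct command *cmd,
                                                  const char *method,
                                                  const char *buf,
                                                  const jsmntok_t *result,
                                                  void *arg)
{
        /* The jsonrpc call finished; we don't care about the result. */
        return command_done();  /* assumed completion helper */
}

static struct command_result *log_broken_and_complete(struct command *cmd,
                                                      const char *method,
                                                      const char *buf,
                                                      const jsmntok_t *result,
                                                      void *arg)
{
        /* Same as above, but complain loudly first. */
        plugin_log(cmd->plugin, LOG_BROKEN, "%s failed: %.*s", method,
                   json_tok_full_len(result), json_tok_full(buf, result));
        return command_done();  /* assumed completion helper */
}
```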
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we allowed cmd to be NULL, we had to hand the plugin pointer
around everywhere. We no longer do.
1. Various jsonrpc_ functions no longer need the plugin arg.
2. send_outreq no longer needs a plugin arg (see the call-site sketch after this list).
3. The init function takes a command, not a plugin.
4. Remove command_deprecated_in_nocmd_ok.
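For example, a typical call site changes roughly like this (a sketch; the exact signatures may differ, and `cb`, `errcb`, `cbdata` are placeholders):
```c
/* Before: every outgoing request needed the plugin handle. */
req = jsonrpc_request_start(cmd->plugin, cmd, "listpeerchannels",
                            cb, errcb, cbdata);
return send_outreq(cmd->plugin, req);

/* After: the command is enough. */
req = jsonrpc_request_start(cmd, "listpeerchannels", cb, errcb, cbdata);
return send_outreq(req);
```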
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were dereferencing the first character of the id (always '"'), which meant
everything was id 34.
Before:
```
plugin-pay: cmd 34 partid 5
```
After:
```
cmd pytest:pay#62/cln:pay#105 partid 0
```
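A minimal standalone illustration of the bug (the id string is made up):
```c
#include <stdio.h>

int main(void)
{
        /* JSON ids arrive as quoted strings, so the first byte is '"' (ASCII 34). */
        const char *id = "\"pytest:pay#62/cln:pay#105\"";

        printf("cmd %d partid 5\n", *id); /* buggy: always prints "cmd 34 ..." */
        printf("cmd %s partid 0\n", id);  /* intended: prints the actual id string */
        return 0;
}
```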
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: `pay`: debug logging now uses correct JSON ids.
This means we replace p->cmd with an auxiliary command after we've
finished, so we have a valid command to use.
It also means we weave `struct command_result` returns back through
all the callers.
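The weaving is mechanical: steps that used to return void now hand the result back up the chain. A sketch with made-up names (only send_outreq is a real libplugin call):
```c
/* Sketch: a payment step that propagates the command_result from
 * whatever it does next, instead of returning void. */
static struct command_result *payment_next_step(struct command *cmd,
                                                struct payment *p)
{
        if (payment_is_finished(p))
                return payment_finish(cmd, p);
        return send_outreq(next_request(cmd, p));
}
```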
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This makes no functional changes, but makes the next changes easier.
We short-cut the "we are a child" case and de-indent the main
cases.
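The shape of the change is the usual early-return refactor (the names below are illustrative, not the real function):
```c
/* Sketch: handle the child case up front and return, so the main
 * cases lose one level of indentation. */
if (we_are_child) {
        handle_child_case();
        return;
}

/* Main cases, now at the top level of the function. */
handle_main_cases();
```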
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is cleaner: everything can now be associated with a command
context.
You're supposed to eventually dispose of it using timer_complete().
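A rough usage sketch; only timer_complete() comes from this commit, and whether it takes the command as an argument is an assumption here:
```c
/* Sketch: a timer callback tied to a command; when the work is done,
 * the timer is disposed of via timer_complete(). */
static struct command_result *poll_cb(struct command *cmd, void *unused)
{
        /* ...do the periodic work, possibly rearming the timer... */
        return timer_complete(cmd);     /* assumed to take the command */
}
```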
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes we want to clean up *after* a command has completed, but
we're moving to a model where all libplugin operations require a
`struct command`. This adds `aux_command` to create an
independent-lifetime command with the same id.
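Usage might look something like this; only `aux_command` itself comes from this commit, the rest (callback names, request signature) is illustrative:
```c
/* Sketch: an aux command shares cmd's id but has its own lifetime, so
 * cleanup can still issue JSON-RPC requests after cmd completes.
 * cleanup_done/cleanup_failed are hypothetical callbacks. */
struct command *aux = aux_command(cmd);
struct out_req *req = jsonrpc_request_start(aux, "delpay",
                                            cleanup_done, cleanup_failed,
                                            NULL);
return send_outreq(req);
```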
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I started getting "WIRE_TEMPORARY_CHANNEL_FAILURE: Too many HTLCs" after
two hundred xpay attempts.
This was nice (it found some bugs in injectpaymentonion's handling of
local errors, and in xpay's reporting), but shouldn't happen.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Start with a random capacity (linear probability), and remember in-progress
payments so we can simulate capacity properly.
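A sketch of the idea, using CLN's amount helpers; the struct and function names here are made up:
```c
#include <common/amount.h>

/* Hypothetical per-channel state for the simulated network. */
struct sim_channel {
        struct amount_msat capacity;    /* drawn at random when created */
        struct amount_msat inflight;    /* sum of in-progress HTLC amounts */
};

/* An HTLC fits only if in-flight + new amount stays within capacity. */
static bool sim_can_add_htlc(const struct sim_channel *c,
                             struct amount_msat amt)
{
        struct amount_msat total;
        if (!amount_msat_add(&total, c->inflight, amt))
                return false;
        return amount_msat_less_eq(total, c->capacity);
}
```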
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note the impedance mismatch between sendpay and getroutes: we have to shift
amounts and delays by 1.
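Very roughly, the conversion looks like the following; the field names and the direction of the shift are assumptions for illustration, not taken from the code:
```c
/* Sketch: sendpay's route entry i takes the amount/delay from the next
 * getroutes hop, so everything shifts by one; the final entry uses the
 * amount/CLTV actually delivered to the destination. */
for (size_t i = 0; i < num_hops; i++) {
        bool last = (i + 1 == num_hops);
        route[i].amount_msat = last ? final_amount : hops[i + 1].amount_msat;
        route[i].delay = last ? final_cltv : hops[i + 1].delay;
}
```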
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Our gossmap_store uncompressor generates nodeids with well-known
privkeys, so we can decrypt and respond to HTLCs sent to such nodes.
By replacing channeld with a fake, we can connect a node to another
node; then, once the channel is established, the fake lets payments be
sent into the generated network and responds appropriately.
This minimal version handles MPP timeouts, but doesn't insert any
delays or runtime capacity for channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: Testing only
secp256k1_ctx is used by pubkey_from_node_id. Don't try to pick and
choose where to initialize secp256k1_ctx; just always do it.
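For illustration, unconditional setup looks like this; the surrounding helper is hypothetical, while the secp256k1 calls are the standard library API:
```c
#include <secp256k1.h>

/* CLN keeps a single global secp256k1 context. */
secp256k1_context *secp256k1_ctx;

/* Hypothetical setup helper: create the context unconditionally at
 * startup instead of only on code paths known to need it. */
static void setup_secp256k1(void)
{
        secp256k1_ctx = secp256k1_context_create(SECP256K1_CONTEXT_SIGN
                                                 | SECP256K1_CONTEXT_VERIFY);
}
```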
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If you change branches and have a generated .md file, index.rst
will pick it up. Use the Makefile variable, not the contents of
the filesystem!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This removes an unnecessary check for an existing description field when
description_hash is provided in the invoice. The bolt11_decode function
already checks the description against the hash if both are provided.
Changelog-Fixed: renepay: allow paying BOLT11 invoices with description_hash; the description field is now optional.
Signed-off-by: Lagrang3 <lagrang3@protonmail.com>
Now that pay learns, it sometimes learns not to try again:
```
> assert(len(l1.rpc.listpays()['pays']) == 2)
E AssertionError: assert 1 == 2
E + where 1 = len([{'amount_sent_msat': 0, 'bolt11': 'lnbcrt1pnjj7mysp5tfx8n6nyx7ehszgqn7gqm2r6n079p22u2yddtg797ka3pa9557tspp5f89z6genjqrl3knymvav9ajwcxrm5w7arxux06rrhjux88derjyqdq8v3jhxccxqyjw5qcqp9rzjqgkjyd3q5dv6gllh77kygly9c3kfy0d9xwyjyxsq2nq3c83u5vw4jqqqvuqqqqsqqqqqqqqpqqqqqzsqqc9qxpqysgqcuyr7qlyctf9w96fqg4wetqt7t5v938dagmv0r777n902utjufujzjxl3289r97yngft966zly3ehxfp469dh3lq0hkv6r684snvunqppuyvsl', 'created_at': 1730771812, 'destination': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d', ...}])
tests/test_pay.py:5147: AssertionError
```
We fix this by creating a fresh channel, so it will try.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can see peer_in before the state changes:
```
l3.daemon.wait_for_log("peer_in WIRE_ERROR")
> assert only_one(l3.rpc.listpeerchannels(l2.info['id'])['channels'])['state'] == 'AWAITING_UNILATERAL'
E AssertionError: assert 'CHANNELD_NORMAL' == 'AWAITING_UNILATERAL'
E - AWAITING_UNILATERAL
E + CHANNELD_NORMAL
```
From the logs, there is a 0.2-second gap between them:
```
lightningd-3 2024-11-05T01:58:41.695Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: peer_in WIRE_ERROR
lightningd-3 2024-11-05T01:58:41.726Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: billboard perm: Received ERROR channel cecf36a62a09d4f1bdb42aa61d0770964bf3b245b8943a3e5b86dafc572f63d1: Forcibly closed by `close` command timeout
lightningd-3 2024-11-05T01:58:41.745Z UNUSUAL 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Peer permanent failure in CHANNELD_NORMAL: channeld: received ERROR channel cecf36a62a09d4f1bdb42aa61d0770964bf3b245b8943a3e5b86dafc572f63d1: Forcibly closed by `close` command timeout (reason=protocol)
lightningd-3 2024-11-05T01:58:41.890Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: We have 3 anchor points to use
lightningd-3 2024-11-05T01:58:41.897Z DEBUG lightningd: Broadcasting txid ff2f44b37d96a81fcaa2a4b11746a06be70f3f800fbce941e61a47abb61f70c0
lightningd-3 2024-11-05T01:58:41.901Z DEBUG lightningd: sendrawtransaction: 02000000000101cecf36a62a09d4f1bdb42aa61d0770964bf3b245b8943a3e5b86dafc572f63d1000000000096b64c80044a01000000000000220020525df7a97bd0506c9ec41ee4e5f095e6e5316db01846a0a687404628017e88494a010000000000002200206db2ec9041ba3ccb6309dcad26015f32637e6869be09d3c3d3a1cb1439296f0b400d030000000000220020c2468acf761754e2533fcb13235326d1f1173b697b132048d6409be7818ed9a96a1f0c0000000000220020b1b561b95c1bccd50fb21bd417a95cabbc3efc351b1353063fe5e9f185d21c8a0400473044022036ab981a2642527c2019a4d15847f6b2fb1d0b92f1a2faff463ab109ed70c4d002205eeb0fcd610a9dba477ee7d66f02d0384080383fbb003d4ee423bdae0970b0430147304402202f7de3481ce00478e539cd9d6bac95ec3c80ec906c46ad2561118dca321b1458022048efd1aa04ff5fafbb99d14f7c9671072b035cbd2c031547793eb0cb6a62302d0147522102d595ae92b3544c3250fb772f214ad8d4c51425033740a5bcc357190add6d7e7a2102d6063d022691b2490ab454dee73a57c6ff5d308352b461ece69f3c284f2c241252ae6e14d920
lightningd-3 2024-11-05T01:58:41.907Z INFO 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: State changed from CHANNELD_NORMAL to AWAITING_UNILATERAL
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The original complaint which caused my investigation was the 100% CPU
consumption of connectd, which we traced to the queue to gossipd.
However, the issue is not really connectd's overproduction, but
gossipd's underconsumption, probably caused by its own queueing issues
with the trace messages to lightningd, which the prior patch fixed.
Nonetheless, gossipd *can* get busy, and if we were to ask multiple
nodes for full gossip, we could see a few hundred thousand messages
come in at once. Hence I'm increasing the warning limit to 250,000
messages.
This commit is also where we attach the Changelog message, even
though it was really "common/msg_queue: use membuf for greater efficiency"
and "gossipd: fix excessive msg_queue length from status_trace()" which
solved the problem.
Here's the backtrace from a previous debug patch:
```
lightning_connectd: msg_queue length excessive (version v24.08.1-17-ga780ad4-modded)
0x5580534051f0 send_backtrace
common/daemon.c:33
0x55805340bd5b do_enqueue
common/msg_queue.c:66
0x55805340bde5 msg_enqueue
common/msg_queue.c:82
0x5580534057ce daemon_conn_send
common/daemon_conn.c:161
0x5580533fe3ff handle_gossip_in
connectd/multiplex.c:624
0x5580533ff23b handle_message_locally
connectd/multiplex.c:763
0x5580533ff2d6 read_body_from_peer_done
connectd/multiplex.c:1112
```
Reported-by: https://github.com/JssDWt
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: `connectd` and `gossipd` message queues are much more efficient.
When this (very spammy) "handle_recv_gossip" message was changed
from debug to trace, the suppression code wasn't updated: we suppress
overly active debug messages, but not trace messages.
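The shape of the fix is just to include trace in the existing rate-limiting check; the helper name and the assumption that LOG_TRACE sorts below LOG_DBG are mine, not from the patch:
```c
/* Sketch: where the suppression logic previously singled out LOG_DBG,
 * treat LOG_TRACE the same way. */
if (level <= LOG_DBG)
        maybe_suppress_spam(log_entry); /* hypothetical helper */
```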
This is the backtrace from an earlier version of the "too large queue"
patch:
```
lightning_gossipd: msg_queue length excessive (version v24.08.1-17-ga780ad4-modded)
0x557e521e833f send_backtrace
common/daemon.c:33
0x557e521eefb9 do_enqueue
common/msg_queue.c:66
0x557e521ef043 msg_enqueue
common/msg_queue.c:82
0x557e521e891d daemon_conn_send
common/daemon_conn.c:161
0x557e521f14f0 status_send
common/status.c:90
0x557e521f1804 status_vfmt
common/status.c:169
0x557e521f1433 status_fmt
common/status.c:180
0x557e521de7c6 handle_recv_gossip
gossipd/gossipd.c:206
0x557e521de9f5 connectd_req
gossipd/gossipd.c:307
0x557e521e862d handle_read
common/daemon_conn.c:35
```