We spend quite a bit of time in libsecp256k1 moving them to and from
DER encoding. With a bit of care, we can transfer the raw bytes from
gossipd and manually decode them so a malformed one can't make us
abort().
Before:
real 0m0.629000-0.695000(0.64985+/-0.019)s
After:
real 0m0.359000-0.433000(0.37645+/-0.023)s
At this point, the main issues are 11% of time spent in ccan/io's
backend_wake (I tried using a hash table there, but that actually makes
the small-number-of-fds case slower), and 65% of gossipd's time is
in marshalling the response (all those tal_resize add up!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My test case is a mainnet gossip store with 22107 channels, and
time to do `lightning-cli listchannels`:
Before: `lightning-cli listchannels` DEVELOPER=0
real 0m1.303000-1.324000(1.3114+/-0.0091)s
After:
real 0m0.629000-0.695000(0.64985+/-0.019)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a very ugly one-liner; really ccan/io should have an io_replan
for this, but it would have to be written carefully as it makes
assumptions currently about plans not changing. In this case, we know
it's in io_write, and we're just moving a pointer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Such an API is required for when we stream it directly. Almost all our
handlers fit this pattern already, or nearly do.
We remove new_json_result() in favor of explicit json_stream_success()
and json_stream_fail(), but still allowing command_fail() if you just
want a simple all-in-one fail wrapper.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This isn't a big change, since we basically dump the entire JSON
resuly string into the membuf then write it out, but it's prep for the
next changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We occasionaly had a travis hang in test_multirpc, and it's due to a
thinko in the prior patch: if a command completes immediately, it will
do the wake before we go to sleep. That means we don't digest the
rest of the buffer until the next write.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a DoS if we keep reading commands and don't insist the client
read the responses.
My initial implementation simply removed the io_duplex, but that
doesn't work if we want to inject notifications in the stream (as we
will eventually want to do), so we operate it as duplex but have each
side wake the other when it's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My test case is a mainnet gossip store with 22107 channels, and
time to do `lightning-cli listchannels`:
Before: `lightning-cli listchannels` DEVELOPER=0
real 0m1.396000-1.409000(1.4022+/-0.005)s
After:
real 0m1.307000-1.320000(1.3128+/-0.005)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's the only user of them, and it's going to get optimized.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
gossip.pydiff --git a/common/test/run-json.c b/common/test/run-json.c
index 956fdda35..db52d6b01 100644
Some people were alarmed that the state was set to "Loaded from
database" indefinitely. Saying that we are trying to reconnect may be
more informative.
And use wallet_forward_status_in_db() everywhere in db code.
And clean up extra CHANGELOG.md entry (looks like rebase error?)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The left join should make sure we still get the results but
referencing the fields and/or attempting to write them to the JSON-RPC
result will cause unforeseen problems. So just omit if we forgot
something.
We initialize it to 30 seconds, but it's *always* overridden by the
gossip_init message (and usually to 60 seconds, so it's doubly
misleading).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd provided a generic "get endpoints of this scid" and we only
use it in one place: to look up htlc forwards. But lightningd just
assumed that one would be us.
Instead, provide a simpler API which only returns the peer node
if any, and now we handle it much more gracefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Right now, the `config` file is read *after* the configuration working directory is moved to in the software. However one configuration option `lightning-dir` settable in the `config` file sets this working directory. As the directory is already opened (which defaults to `$HOME/.lightning`) before the configuration is read, the configured directory will not be used.
This patch parses the configuration file before opening the working directory, fixing this bug.
[ Update CHANGELOG.md and man pages -- RR ]
It went something like:
niftynei: Hey, cppcheck complains this might be NULL, so I put in a check.
rusty: cppcheck is dumb. Make it an assert("Rusty always right!").
niftynei: You seem certain of this so I shall do that.
https://github.com/ElementsProject/lightning/pull/1994
...
renepickhardt: I asked fiatjaf to run
`lightning-cli sendpay "[{'id':'02db8f487fcc0a'}]" 4efe0ba89b`
and his node crashed!
rusty: grep Assertion logs/*
lightningd/jsonrpc.c:326: connection_complete_error: Assertion `Rusty is always right!' failed.
It turns out that in the 'can't parse' error case, we hand NULL cmd to
connection_compete_error.
Next time, less asserting, more grepping!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit of overkill now that we simply accumulate the entire
JSON response in the buffer before flushing, but when we move to
streamed responses it allows us to have a single command that has
exclusive access to the out direction of the JSON-RPC connection.
This is the source of failure in the test_restart_many_payments stress
test: we don't commit the outgoing HTLC immediately, instead waiting for
gossip to tell us the peer for the outgoing channel, then waiting for
that channeld to tell is it's committed. The result was incoming HTLCs
with no outgoing.
I initially pushed the HTLCs through that same path, but of course
(since peers are not connected yet!) the only result was that we failed
these HTLCs immediately. So I chose the far simpler course of just
failing them directly.
To reproduce this, I had to increase the test_restart_many_payments
num to 10, and run it with nice -20 taskset -c 0.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During tests, this is half our log! And Travis truncates it if we get
a failure in test_restart_many_payments.
Interestingly, test_logging had a bug which relied on this spam :)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of two code paths that return different help objects, simplify things by
always returning the full help object. This not only includes description and
the command name, but the verbose description as well.
Signed-off-by: William Casarin <jb55@jb55.com>
If another channel has set the optional `htlc_maximum_msat` field,
we should correctly parse that field and respect it when drawing up
routes for payments.
Noted by @cdecker, the term 'local' is grossly overused, and the hout
preimage is basically only used as a sanity check (though I've just put
a FIXME there for now).
Also eliminated spurious blank line which crept into wallet.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't expect payment or payment->route_channels to be NULL without an
old db, but putting an assert there reveals that we try to fail an HTLC
which has already succeeded in 'test_onchain_unwatch'.
Obviously we only want to fail an HTLC which goes onchain if we don't
already have the preimage!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
failoutchannel tells us which channel to send an update for (specifically
for temporary_channel_failure); but we don't save it into the db. It's
not even clear we should, since it's a corner case and the channel might
not even exist when we come back.
So on db restore, change such errors to WIRE_TEMPORARY_NODE_FAILURE
which doesn't need an update.
We also don't memset it to 0 in the normal case (we only access if it
failcode has the UPDATE bit set) so valgrind will trigger if we're
wrong.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't save them to the database, so fix things up as we load them.
Next patch will actually save them into the db, and this will become
COMPAT code.
Also: call htlc_in_check() with NULL on db load, as otherwise it aborts
internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we need to check when we've altered the state, so the checks
are moved to the callers of htlc_in_update_state and htlc_out_update_state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
globalfeatures should not be accessed if we haven't received a
channel_update. Treat it like the other fields which are only
initialized and marshalled/unmarshalled if the timestamp is positive.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And use ARRAY_SIZE() everywhere which will break compile if it's not a
literal array, plus assertions that it's the same length.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We split json_invoice(), as it now needs to round-trip to the gossipd,
and uniqueness checks need to happen *after* gossipd replies to avoid
a race.
For every candidate channel gossipd gives us, we check that it's in
state NORMAL (not shutting down, not still waiting for lockin), that
it's connected, and that it has capacity. We then choose one with
probability weighted by excess capacity, so larger channels are more
likely.
As a side effect of this, we can tell if an invoice is unpayble (no
channels have sufficient incoming capacity) or difficuly (no *online*
channels have sufficient capacity), so we add those warnings.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For routeboost, we want to select from all our enabled channels with
sufficient incoming capacity. Gossipd knows which are enabled (ie. we
have received a `channel_update` from the peer), but doesn't know the
current incoming capacity.
So we get gossipd to give us all the candidates, and lightningd
selects from those.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>