This fixes block parsing on testnet; specifically, non-standard tx versions.
We hit a type bug in libwally (wallt_get_secp_context()) which I had to
work around for the moment, and the updated libsecp adds an optional hash
function arg to the ECDH function.
Fixes: #2563
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Don't turn them to/from pubkeys implicitly. This means nodeids in the store
don't get converted, but bitcoin keys still do.
MCP results from 5 runs, min-max(mean +/- stddev):
store_load_msec:33934-35251(34531.4+/-5e+02)
vsz_kb:2637488
store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
listnodes_sec:1.020000-1.290000(1.146+/-0.086)
listchannels_sec:51.110000-58.240000(54.826+/-2.5)
routing_sec:30.000000-33.320000(30.726+/-1.3)
peer_write_all_sec:50.370000-52.970000(51.646+/-1.1)
MCP notable changes from previous patch (>1 stddev):
-store_load_msec:46184-47474(46673.4+/-4.5e+02)
+store_load_msec:33934-35251(34531.4+/-5e+02)
-vsz_kb:2638880
+vsz_kb:2637488
-store_rewrite_sec:46.750000-48.280000(47.512+/-0.51)
+store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tried to just do gossipd, but it was uncontainable, so this ended up being
a complete sweep.
We didn't get much space saving in gossipd, even though we should save
24 bytes per node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is what all of this has been working towards: ripping out the handwoven
transaction handling. By removing the custom parsing we can finally switch
over to using `wally_tx` as sole representation of transactions in
memory. The commit is a bit larger but it's mostly removing setters and old
references to the input and output fields.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The `wally_tx_input`s do not keep track of their input value, which means we
need to track them ourselves if we try to sign these transactions at a later
point in time.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
1. Rename channel_funding_locked to channel_funding_depth in
channeld/channel_wire.csv.
2. Add minimum_depth in struct channel in common/initial_channel.h and
change corresponding init function: new_initial_channel().
3. Add confirmation_needed in struct peer in channeld/channeld.c.
4. Rename channel_tell_funding_locked to channel_tell_depth.
5. Call channel_tell_depth even if depth < minimum, and still call
lockin_complete in channel_tell_depth, iff depth > minimum_depth.
6. channeld ignore the channel_funding_depth unless its >
minimum_depth(except to update billboard, and set
peer->confirmation_needed = minimum_depth - depth).
We need to do it in various places, but we shouldn't do it lightly:
the primitives are there to help us get overflow handling correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Basically we tell it that every field ending in '_msat' is a struct
amount_msat, and 'satoshis' is an amount_sat. The exceptions are
channel_update's fee_base_msat which is a u32, and
final_incorrect_htlc_amount's incoming_htlc_amt which is also a
'struct amount_msat'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a side-effect of using amount_msat in gossipd/routing.c, we explicitly
handle overflows and don't need to pre-prune ridiculous-fee channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to just throw htlcs into the channel with a flag to tell it to
ignore overflow. Instead, we can insert them in order (which is the same as
id order) which always must be valid.
This helps when we turn the balance into a struct amount_msat which will get
upset with overflows.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They're generally used pass-by-copy (unusual for C structs, but
convenient they're basically u64) and all possibly problematic
operations return WARN_UNUSED_RESULT bool to make you handle the
over/underflow cases.
The new #include in json.h means we bolt11.c sees the amount.h definition
of MSAT_PER_BTC, so delete its local version.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LND seems to do this occasionally, though fixed in new versions. Workaround
in the meantime.
I tested this by hacking our code to send it prematurely, and this worked.
Fixes: #2219
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This solves (or at least reduces probability of) a deadlock in channeld
when there is lot of gossip traffic, see issue #2286. That issue is
almost identical to #1943 (deadlock in openingd) and so is the fix.
Spurious errors were occuring around checking the provided
current commitment point from the peer on reconnect when
option_data_loss_protect is enabled. The problem was that
we were using an inaccurate measure to screen for which
commitment point to compare the peer's provided one to.
This fixes the problem with screening, plus makes our
data_loss test a teensy bit more robust.
Christian and I both unwittingly used it in form:
*tal_arr_expand(&x) = tal(x, ...)
Since '=' isn't a sequence point, the compiler can (and does!) cache
the value of x, handing it to tal *after* tal_arr_expand() moves it
due to tal_resize().
The new version is somewhat less convenient to use, but doesn't have
this problem, since the assignment is always evaluated after the
resize.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is based on Christian's change, but removes all trace of the old codes.
I've proposed another spec change which removes this code altogether:
https://github.com/lightningnetwork/lightning-rfc/pull/544
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
This is mainly just copying over the copy-editing from the
lightning-rfc repository.
[ Split to just perform changes prior to the UNKNOWN_PAYMENT_HASH change --RR ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
Fortunately, we can calculate the sha256 ourselves, so the
outgoing channeld doesn't need to tell us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When the next node tells us the onion is malformed, we now actually
report the failcode to lightningd (rather than generating an invalid
error as we do now).
We could generate the onion at this point, but except we don't know
the shared secret; we'd have to plumb that through from the incoming
channeld's HTLC.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This covers all the cases where an onion can be malformed; this means
we know in advance that it's bad. That allows us to distinguish two
cases: where lightningd rejects the onion as bad, and where the next
peer rejects the next onion as bad. Both of those (will) set failcode
to one of the BADONION values.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's more natural than using a zero-secret when something goes wrong.
Also note that the HSM will actually kill the connection if the ECDH
fails, which is fortunately statistically unlikely.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Funder can't spend the fee it needs to pay for the commitment transaction:
we were not converting to millisatoshis, however!
This breaks our routeboost test, which no longer has sufficient funds
to make payment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have an incompatibility with lnd it seems: I've lost channels on
reconnect with 'sync error'. Since I never got this code to be reliable,
disable it for next release since I suspect it's our fault :(
And reenable the check which didn't work, for others to untangle.
I couldn't get option_data_loss_protect to be reliable, and I disabled
the check. This was a mistake, I should have either spent even more
time trying to get to the bottom of this (especially, writing test
vectors for the spec and testing against other implementations).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is prep work for when we sign htlc txs with
SIGHASH_SINGLE|SIGHASH_ANYONECANPAY.
We still deal with raw signatures for the htlc txs at the moment, since
we send them like that across the wire, and changing that was simply too
painful (for the moment?).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We only use them for re-transmitting the last commitment tx,
and the HSM signs them sync so it's straight-line code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This simplifies lifetime assumptions. Currently all callers keep the
original around, but everything broke when I changed that in the next
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They were not universally used, and most are trivial accessors anyway.
The exception is getting the channel reserve: we have to multiply by 1000
as well as flip direction, so keep that one.
The BOLT quotes move to `struct channel_config`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We probably also want to call secp_randomise/wally_secp_randomize here
too, and since these calls all call setup_tmpctx, it probably makes
sense to have a helper function to do all that. Until thats done, I
modified the tests so grepping will show the places where the sequence
of calls is repeated.
Signed-off-by: Jon Griffiths <jon_p_griffiths@yahoo.com>
This avoids some very ugly switch() statements which mixed the two,
but we also take the chance to rename 'towire_gossip_' to
'towire_gossipd_' for those inter-daemon messages; they're messages to
gossipd, not gossip messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This way there's no need for a context pointer, and freeing a msg_queue
frees its contents, as expected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was suggested by Pierre-Marie as the solution to the 'same HTLC,
different CLTV' signature mismatch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Have c-lightning nodes send out the largest value for
`htlc_maximum_msat` that makes sense, ie the lesser of
the peer's max_inflight_htlc value or the total channel
capacity minus the total channel reserve.
LND does this, and we get upset with it. I had assumed we would only
do this after funding_locked (since we don't consider the channel
shortid stable until that point), but TBH 6 confirms is probably
enough.
Fixes: #1985
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Under stress, it fails (test_restart_many_payments, the next test).
I suspect a deep misunderstanding in the comparison code, will chase
separately.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We do this a lot, and had boutique helpers in various places. So add
a more generic one; for convenience it returns a pointer to the new
end element.
I prefer the name tal_arr_expand to tal_arr_append, since it's up to
the caller to populate the new array entry.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That matches the other CSV names (HSM was the first, so it was written
before the pattern emerged).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@renepickhardt: why is it actually lightningd.c with a d but hsm.c without d ?
And delete unused gossipd/gossip.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When in this state, we send a canned error "Awaiting unilateral close".
We enter this both when we drop to chain, and when we're trying to get
them to drop to chain due to option_data_loss_protect.
As this state (unlike channel errors) is saved to the database, it means
we will *never* talk to a peer again in this state, so they can't
confuse us.
Since we set this state in channel_fail_permanent() (which is the only
place we call drop_to_chain for a unilateral close), we don't need to
save to the db: channel_set_state() does that for us.
This state change has a subtle effect: we return WIRE_UNKNOWN_NEXT_PEER
instead of WIRE_TEMPORARY_CHANNEL_FAILURE as soon as we get a failure
with a peer. To provoke a temporary failure in test_pay_disconnect we
take the node offline.
Reported-by: Christian Decker @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Firstly, if they claim to know a future value, we ask the HSM; if
they're right, we tell master what the per-commitment-secret it gave
us (we have no way to validate this, though) and it will not broadcast
a unilateral (knowing it will cause them to use a penalty tx!).
Otherwise, we check the results they sent were valid. The spec says
to do this (and close the channel if it's wrong!), because otherwise they
could continually lie and give us a bad per-commitment-secret when we
actually need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For option_data_loss_protect, the peer can prove to us that it's ahead;
it gives us the (hopefully honest!) per_commitment_point it will use,
and we make sure we don't broadcast the commitment transaction we have.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ignore incoming for now, but this means we advertize the option and
we send the required fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We quote BOLT 2 on *local* above the *remote* checks (we quote it
again below when we do the local checks).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. l1 update_fee -> l2
2. l1 commitment_signed -> l2 (using new feerate)
3. l1 <- revoke_and_ack l2
4. l1 <- commitment_signed l2 (using new feerate)
5. l1 -> revoke_and_ack l2
When we break the connection after #3, the reconnection causes #4 to
be retransmitted, but it turns out l1 wasn't telling the master to set
the local feerate until it received the commitment_signed, so on
reconnect it uses the old feerate, with predictable results (bad
signature).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now sending a ping makes sense: it should force the other end to send
a reply, unblocking the commitment process.
Note that rather than waiting for a reply, we're actually spinning on
a 100ms loop in this case. But it's simple and it works.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This doesn't do much (though we might get an error before we send the
commitment_signed), but it's infrastructure for the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were adding channels without their capacity, and eventually annotated them
when we exchanged `channel_update`s. This worked as long as we weren't
considering the channel capacity, but would result in local-only channels to be
unusable once we start checking.
Also means we simplify the handle_gossip_msg() since everyone wants it to
use sync_crypto_write().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is clearer and neater, and even slightly more efficient, since
read_peer_msg() was calling poll() again on gossipfd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tal_count() is used where there's a type, even if it's char or u8, and
tal_bytelen() is going to replace tal_len() for clarity: it's only needed
where a pointer is void.
We shim tal_bytelen() for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use these for receiving arrays at init time, we should also use them
for fulfull/fail of HTLCs in normal operation. That we we benefit from all
those assertions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The master tells us the short_channel_id of the outgoing channel, and
channeld is supposed to get the corresponding channel_update from gossipd.
Instead, it got the channel_update for the *local* channel and ignored
that one.
The master tells us the short_channel_id of the outgoing channel when
failing an HTLC, but channeld didn't store it anywhere. It also
didn't tell channeld the short_channel_id in the case where we're
reconnecting and it's feeding us an array of failed htlcs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>