Since `fundchannel` now supports the 'close_to' argument, we can remove
all the logic needed to call fundchannel_start here.
Underneath, we're still calling `fundchannel_start`, we're just one (or
two, if you count multifundchannel) call levels away from it now.
We had one report of this, and then Eugene and Roasbeef of Lightning
Labs confirmed it; they saw misordered HTLCs on reconnection too.
Since we didn't enforce this when we receive HTLCs, we never noticed :(
Fixes: #3920
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: fixed retransmission order of multiple new HTLCs (causing channel close with LND)
We didn't care, but other implementations (particularly lnd) do. And it
does violate the spec.
(We need to use skip not xfail on the test which catches this, since
xfail doesn't seem to stop errors reported by cleanup)
(Includes Christian's typo fix!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Marking spent means if the transaction doesn't confirm for some
reason, the user will need to force a rescan to find the funds. Now
we have timed reservations, reserving for (an additional) 12 hours
should be sufficient.
We also take this opportunity (now we have our own callback path)
to record the tx in the wallet only on success.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some minor phrasing differences cause test changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: txprepare reservations stay across restarts: use fundpsbt/reservepsbt/unreservepsbt
Changelog-Removed: txprepare `destination` `satoshi` argument form removed (deprecated v0.7.3)
With a feerate of 7500perkw and subtracting 660 sats for anchors, a
20,000 sat channel has capacity about 9800 sat, below our default:
You gave bad parameters: channel capacity with funding 20000sat, reserves 546sat/546sat, max_htlc_value_in_flight_msat is 18446744073709551615msat, channel capacity is 9818sat, which is below 10000000msat
So bump channel amounts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And document exactly what it does: insist that an HTLC can pass of
this value (module assumptions of feerate).
Note that we remove the "is_opener" test from the capacity calculation
for anchor fees: it doesn't matter which side it is, someone has to pay
for anchor fees to it deducts from capacity.
This change breaks the test, which we rewrite.
Changelog-Changed: config: `min-capacity-sat` is now stricter about checking usable capacity of channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's actually not possible to currently tell if you're using anchor_outputs
with a peer (since it depends on whether you both supported it at *channel open*).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-added: JSON-RPC: `listpeers` shows `features` list for each channel.
Anchor outputs break many assumptions in our tests:
1. Remove some hardcoded numbers in favor of a fee calc, so we only have to
change in one place.
FIXME: This should also be done for elements!
2. Do binary search to get feerate for a given closing fee.
3. Don't assume output #0: anchor outputs perturb them.
4. Don't assume we can make 1ksat channels (anchors cost 660 sats!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And when it's set, and we're SLOW_MACHINE, simply disable valgrind.
Since Travis (SLOW_MACHINE=1) only does VALGRIND=1 DEVELOPER=1 tests,
and VALGRIND=0 DEVELOPER=0 tests, it was missing tests which needed
DEVELOPER and !VALGRIND.
Instead, this demotes them to non-valgrind tests for SLOW_MACHINEs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I started replacing all get_node() calls, but got bored, so then just did the
tests which call get_node() 3 times or more.
Ends up not making a measurable speed difference, but it does make some
things neater and more standard.
Times with SLOW_MACHINE=1 (given that's how Travis tests):
Time before (non-valgrind):
393 sec (had 3 failures?)
Time after (non-valgrind):
410 sec
Time before (valgrind):
890 seconds (had 2 failures)
Time after (valgrind):
892 sec
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I thought this was timing out because I made it slow with the
change to txprepare as a plugin. In fact, it was timing out
because sometimes gossip comes so fast it gets suppressed
and we never get the log messags.
Still, before this it took 98 seconds under valgrind and
24 under non-valgrind, so it's an improvement to time as
well as robustness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `fundchannel_cancel` will now succeed even when executed while a `fundchannel_complete` is ongoing; in that case, it will be considered as cancelling the funding *after* the `fundchannel_complete` succeeds.
Let me introduce the concept of "Sequential Consistency":
All operations on parallel processes form a single total order agreed upon by all processes.
So for example, suppose we have parallel invocations of `fundchannel_complete` and `fundchannel_cancel`:
+--[fundchannel_complete]-->
|
--[fundchannel_start]-+
|
+--[fundchannel_cancel]---->
What "Sequential Consistency" means is that the above parallel operations can be serialized as a single total order as:
--[fundchannel_start]--[fundchannel_complete]--[fundchannel_cancel]-->
Or:
--[fundchannel_start]--[fundchannel_cancel]--[fundchannel_complete]-->
In the first case, `fundchannel_complete` succeeds, and the `fundchannel_cancel` invocation also succeeds, sending an `error` to the peer to make them forget the chanel.
In the second case, `fundchannel_cancel` succeeds, and the succeeding `fundchannel_complete` invocation fails, since the funding is already cancelled and there is nothing to complete.
Note that in both cases, `fundchannel_cancel` **always** succeeds.
Unfortunately, prior to this commit, `fundchannel_cancel` could fail with a `Try fundchannel_cancel again` error if the `fundchannel_complete` is ongoing when the `fundchannel_cancel` is initiated.
This violates Sequential Consistency, as there is no single total order that would have caused `fundchannel_cancel` to fail.
This commit is a minimal patch which just reschedules `fundchannel_cancel` to occur after any `fundchannel_complete` that is ongoing.
I noticed the following in logs for tests/test_connection.py::test_feerate_stress:
```
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Failing HTLC 18446744073709551615 due to peer death
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: local_routing_failure: 8194 (WIRE_TEMPORARY_NODE_FAILURE)
```
This is because it reports the (transient) node_failure error, because
our channel_failure message is incomplete. Fix this wart up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we don't wait for close tx to reach mempool, it might not get to
depth 100, and we don't get 'onchaind complete, forgetting peer'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously we've used the term 'funder' to refer to the peer
paying the fees for a transaction; v2 of openchannel will make
this no longer true. Instead we rename this to 'opener', or the
peer sending the 'open_channel' message, since this will be universally
true in a dual-funding world.
The new `keysend` plugin modifies the node features that we send to
peers. This commit breaks out the 'expected_features' we use for tests
to encompass this differentiation.
Sending update_fee immediately after channel establishment seems to
upset LND, so work around it by deferring it. The reason we increase
the fee after establishment is because now we might need to close the
channel in a hurry due to htlcs, but until there are htlcs that's
unnecessary.
Fixes: #3596
Changelog-Changed: Added workaround for lnd rejecting our commitment_signed when we send an update_fee after channel confirmed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that now we check capacity once we've figured out which peer, which
broke a test (we returned "unknown peer" instead of "capacity exceeded"),
so we rework that too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A CONSERVATIVE/3 target for them.
Some noisy changes to the tests as we had to update the estimatesmartfee
mock.
Changelog-Changed: We now use a higher feerate for resolving onchain HTLCs and for penalty transactions
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: JSON: `fundchannel` and `fundchannel_start` `satoshi` parameter removed (renamed to `amount` in 0.7.3).
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.
We also need to adjust the spendable_msat calculation, and update
various tests which try to unbalance channels. We eliminate
the now-redundant test_channel_drainage entirely.
Changelog-Fixed: Corner case where channel could become unusable (https://github.com/lightningnetwork/lightning-rfc/issues/728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
restrict fundchannel_cancel usage to only the opener side
Changelog-Changed: Only the opener of a fundchannel can cancel the channel open with fundchannel_cancel
it's that time of year (merry xmas!)
enables the ability to push_msat on fundchannel
Changelog-Added: RPC: `fundchannel` and `fundchannel_start` can now accept an optional parameter, `push_msat`, which will gift that amount of satoshis to the peer at channel open.
We still close the channel if we *send* an error, but we seem to have hit
another case where LND sends an error which seems transient, so this will
make a best-effort attempt to preserve our channel in that case.
Some test have to be modified, since they don't terminate as they did
previously :(
Changelog-Changed: quirks: We'll now reconnect and retry if we get an error on an established channel. This works around lnd sending error messages that may be non-fatal.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is an intermediary step: we still don't save it to the database,
but we do use the fee_states struct to track it internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The spec is (RSN!) going to explicitly denote where each feature should
be presented, so create that infrastructure.
Incorporate the new proposed bolt11 features, which need this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: .lightningd plugins and files moved into <network>/ subdir
Changelog-changed: WARNING: If you don't have a config file, you now may need to specify the network to lightning-cli
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rounds out the application of `upfront_shutdown_script`, allowing
an accepting node to specify a close_to address.
Prior to this, only the opening node could specify one.
Changelog-Added: Plugins: Allow the 'accepter' to specify an upfront_shutdown_script for a channel via a `close_to` field in the openchannel hook result
Spaces just make life a little harder for everyone.
(Plus, fix documentation: it's 'jsonrpc' not 'json' subsystem).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This simplifies our tests, too, since we don't need a magic option to
enable io logging in subdaemons.
Note that test_bad_onion still takes too long, due to a separate minor
bug, so that's marked and left dev-only for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Printed form is always "[<nodeid>-]<prefix>: <string>"
2. "jcon fd %i" becomes "jsonrpc #%i".
3. "jsonrpc" log is only used once, and is removed.
4. "database" log prefix is use for db accesses.
5. "lightningd(%i)" becomes simply "lightningd" without the pid.
6. The "lightningd_" prefix is stripped from subd log prefixes, and pid removed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: Logging: formatting made uniform: [NODEID-]SUBSYSTEM: MESSAGE
Changelog-removed: `lightning_` prefixes removed from subdaemon names, including in listpeers `owner` field.
A log can have a default node_id, which can be overridden on a per-entry
basis. This changes the format of logging, so some tests need rework.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a 'upfront_shutdown_script' was specified, show the address +
scriptpubky in `listpeers`
Changelog-added: JSON API: `listpeers` channels now include `close_to` and `close_to_addr` iff a `close_to` address was specified at channel open
Reduce test_feerate_stress iterations, and simply don't run
test_pay_retry under VALGRIND with SLOW_MACHINE at all.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A long time ago (93dcd5fed7), I
simplified the htlc reload code so it adjusted the amounts for HTLCs
in id order. As we presumably allowed them to be added in that order,
this avoided special-casing overflow (which was about to deliberately
be made harder by the new amount_msat code).
Unfortunately, htlc id order is not canonical, since htlc ids are
assigned consecutively in both directions! Concretely, we can have two HTLCs:
HTLC #0 LOCAL->REMOTE: 500,000,000 msat, state RCVD_REMOVE_REVOCATION
HTLC #0 REMOTE->LOCAL: 10,000 msat, state SENT_ADD_COMMIT
On a new remote-funded channel, in which we have 0 balance, these
commits *only* work in this order. Sorting by HTLC ID is not enough!
In fact, we'd have to worry about redemption order as well, as that
matters.
So, regretfully, we offset the balances halfway to UINT64_MAX, then check
they didn't underflow at the end. This loses us this one sanity check,
but that's probably OK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Feerate changes are asymmetric, as they can only be sent by the funder.
For FUNDER, the remote feerate is set when upon send of
commitment_signed, and the local feerate is set on receipt of
revoke_and_ack.
For non-funder, the local feerate is set on receipt of
commitment_signed, and the remote feerate set on send of
revoke_and_ack. In our code, these two happen together.
channeld gets this right, but lightningd ignored the funder/fundee
distinction, and as a result, receipt of a commitment_signed by the
funder altered fees in the database. If there was a reconnection
event or restart, then these (incorrect) values would be used, causing
us to complain about a 'Bad commit_sig signature' and close the
channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A 'Bad commit_sig signature' was reported by @Javier on Telegram and
@DarthCoin. This was between two c-lightning peers, so definitely our fault.
Analysis of this message revealed the signature was using the wrong
feerate. I finally managed to make a test case which triggered this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Check behavior for user supplied upfront_shutdown_script via close_to
Header from folded patch 'fix__return__not__iff_well_close_to_the_provided_addr.patch':
fix: return not iff we'll close to the provided addr
This is mainly an internal-only change, especially since we don't
offer any globalfeatures.
However, LND (as of next release) will offer global features, and also
expect option_static_remotekey to be a *global* feature. So we send
our (merged) feature bitset as both global and local in init, and fold
those bitsets together when we get an init msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It sometimes fail with a bad_gossip error because the sending node
might not have found out about the channel when it gets a
channel_update. Make sure the whole network knows everything before
we start.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since elements addresses look quite different from the bitcoin mainnet
addresses I just added a sample to the chainparams fixture. In addition I
extracted some of the fixed strings to reference chainparams instead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
It only had an effect if the peer didn't support option_gossip_queries, but
still, we don't want a gossip blast any more.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The test was implicitly relying on us selecting the larger output and then not
touching the smaller, leaving it there for the final `withdraw` to claim. This
ordering of UTXOs is not guaranteed, and in particular can fail when switching
DB backends. To stabilize we just need to make sure to select the change
output as well.
It's generally clearer to have simple hardcoded numbers with an
#if DEVELOPER around it, than apparent variables which aren't, really.
Interestingly, our pruning test was always kinda broken: we have to pass
two cycles, since l2 will refresh the channel once to avoid pruning.
Do the more obvious thing, and cut the network in half and check that
l1 and l3 time out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After switching to a plugin, we verify that we can fund a channel
before we check to contact a peer. We'll need to have a funded wallet
to pass the check in this test that verifies that 'fundchannel' cannot
be called for a peer after fundchannel_start is.
For now, we can't fully ensure that the broadcast was catched from a third pary. Only when the transaction (broadcast by a third pary) is onchain, we can catch it.
531c8d7d9b
In this one, we always send my_current_per_commitment_point, though it's
ignored. And we have our official feature numbers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As per BOLT02 #message-retransmission :
if `next_commitment_number` is 1 in both the `channel_reestablish` it sent and received:
- MUST retransmit `funding_locked`
I don't remember ever seeing a bug which only showed up in VALGRIND=1 with developer
mode disabled, so don't test that, and spread out the other test more evenly.
In addition, disable the worst-performing tests in DEVELOPER=0 mode.
Here timings from my build machine: the worst 6 (- DEVELOPER=0 VALGRIND=0)
with the same tests (+ DEVELOPER=1 VALGRIND=1)
-452.42s call tests/test_pay.py::test_channel_spendable
+87.69s call tests/test_pay.py::test_channel_spendable
-335.66s call tests/test_gossip.py::test_gossip_store_compact_on_load
+47.41s call tests/test_gossip.py::test_gossip_store_compact_on_load
-332.07s call tests/test_connection.py::test_opening_tiny_channel
+89.71s call tests/test_connection.py::test_opening_tiny_channel
-331.97s call tests/test_pay.py::test_channel_spendable_large
+56.23s call tests/test_pay.py::test_channel_spendable_large
-305.28s call tests/test_invoices.py::test_invoice_routeboost
+37.57s call tests/test_invoices.py::test_invoice_routeboost
-284.28s call tests/test_plugin.py::test_htlc_accepted_hook_forward_restart
+49.12s call tests/test_plugin.py::test_htlc_accepted_hook_forward_restart
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to change the API, so this makes the tests still work
across the transition (and, as a bonus, tests our backwards compat
shim).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This test is spawning 100 nodes concurrently, which is a lot even when not
running with `valgrind`, especially when executing tests in parallel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Instead of taking over the ->cmd pointer, append ourselves to a list
of cancels. This fixes the test_funding_cancel_race.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And clean up some dev ones which actually happen (mainly by calling
channel_fail_permanent which logs UNUSUAL, rather than
channel_internal_error which logs BROKEN).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@pm47 gave a great bug report showing c-lightning sending the same
UPDATE_FEE over and over, with the final surprise result being that we
blamed the peer for sending us multiple empty commits!
The spam is caused by us checking "are we at the desired feerate?" but
then if we can't afford the desired feerate, setting the feerate we
can afford, even though it's a duplicate. Doing the feerate cap before
we test if it's what we have already eliminates this.
But the empty commits was harder to find: it's caused by a heuristic in
channel_rcvd_revoke_and_ack:
```
/* For funder, ack also means time to apply new feerate locally. */
if (channel->funder == LOCAL &&
(channel->view[LOCAL].feerate_per_kw
!= channel->view[REMOTE].feerate_per_kw)) {
status_trace("Applying feerate %u to LOCAL (was %u)",
channel->view[REMOTE].feerate_per_kw,
channel->view[LOCAL].feerate_per_kw);
channel->view[LOCAL].feerate_per_kw
= channel->view[REMOTE].feerate_per_kw;
channel->changes_pending[LOCAL] = true;
}
```
We assume we never send duplicates, so we detect an otherwise-empty
change using the difference in feerates. If we don't set this flag,
we will get upset if we receive a commitment_signed since we consider
there to be no changes to commit.
This is actually hard to test: the previous commit adds a test which
spams update_fee and doesn't trigger this bug, because both sides
use the same "there's nothing outstanding" logic.
Fixes: #2701
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Remote node may (incorrectly) not send announcement_signatures when
reconnecting, so we we use a copy and can still re-announce.
Also checks that we still send our announcement_signatures when reconnecting.
The old value of 1000 sat was too small to cover the dust reserves.
This lead to the situation when trying to open a channel with minimal
amount, the channels got refused because they were not able cover the
commitment fees.
For this reason the minimal capacity should be increased to i.e. 10k
satoshi, as the technical minimum that also accounts for fees and
reserves is somewhere around 6k sat.
New name is less confusing, and most people should be transitioning to
listpays rather than this anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to disallow using unconfirmed outputs by default, so making the
default 1 confirmation seems a good idea. This also matches `bitcoind`s
minimum confirmation requirement.
Arming however breaks some of our tests, so I used `minconf=0` for the
breaking tests and added a new test specifically for the `minconf` parameter
for `fundchannel`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
An uncommitted channel should not keep the peer in the db, since the
uncommitted channel isn't in the db itself.
Fixes: #2367
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- result fundchannel command now depends on successful or failed broadcast of the funding tx
- failure returns error code FUNDING_BROADCAST_FAIL
- don't fail the channel when broadcast failed, but keep in CHANNELD_AWAITING_LOCKIN
- after fixing the initial broadcast failure, the user could manually rebroadcast the tx and
keep the channel
openingd/opening_funder_finished:
- broadcast_tx callback function now handles both success and failure
jsonrpc: added error code FUNDING_BROADCAST_FAIL
manpage: added error code returned by fundchannel command
This makes the user more aware of broadcast failure, so it hopefully doesn't
try to broadcast new tx's that depend on its change_outputs. Some users have reported (see
issue #2171) a whole sequence of fundings failing, because each funding was using the change
output of the previous one, which would not confirm.
We actually produce an invalid JSON error at the moment: bitcoin-cli
complains "JSON value is not an integer as expected" rather than returning
the given error. Make our error a valid JSON RPC error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Spurious errors were occuring around checking the provided
current commitment point from the peer on reconnect when
option_data_loss_protect is enabled. The problem was that
we were using an inaccurate measure to screen for which
commitment point to compare the peer's provided one to.
This fixes the problem with screening, plus makes our
data_loss test a teensy bit more robust.
Currently only used by gossipd for channel elimination.
Also print them in canonical form (/[01]), so tests need to be
changed.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have an incompatibility with lnd it seems: I've lost channels on
reconnect with 'sync error'. Since I never got this code to be reliable,
disable it for next release since I suspect it's our fault :(
And reenable the check which didn't work, for others to untangle.
I couldn't get option_data_loss_protect to be reliable, and I disabled
the check. This was a mistake, I should have either spent even more
time trying to get to the bottom of this (especially, writing test
vectors for the spec and testing against other implementations).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because gossip in this case takes up to a minute, this test took 10
minutes. The workaround is to do the waiting-for-gossip all at once.
Now it takes 362 seconds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After Ubuntu 18.10 upgrade, lots of new flake8 warnings.
$ flake8 --version:
3.5.0 (mccabe: 0.6.1, pycodestyle: 2.4.0, pyflakes: 1.6.0) CPython 3.6.7rc1 on Linux
Note it seems that W503 warned about line breaks before binary
operators, and W504 complains about them after. I prefer W504, so
disable W503.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When the wrong key is used, the remote end simply hangs up.
We used to get a random errno, which tends to be "Operation now in progress."
Now it's defined to be 0, detect and provide a better error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't save them to the database, so fix things up as we load them.
Next patch will actually save them into the db, and this will become
COMPAT code.
Also: call htlc_in_check() with NULL on db load, as otherwise it aborts
internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>