If we're quick (or the node is slow) we end up reconnecting before our
counterparty has realized the state transition, resulting in an
unexpected re-establish.
There's a case where a dropped funding_locked will result in the peer
moving onto channeld, while we stay in dualopend. As we haven't
received their funding_locked, we retransmit tx_sigs, which channeld
will need to handle.
With the patch the peer drops it on the floor; the peer will resend
funding_locked on reconnect, which will correctly advance us to
channeld and CHANNELD_NORMAL
If we're already attempting to connect to a peer, we would ignore
new connection requests. This is problematic if your node has bad
connection details for the node -- you can't update it while inflight.
This patch appends new connection suggestions to the list of connections
to try.
Fixes#4154
Since `fundchannel` now supports the 'close_to' argument, we can remove
all the logic needed to call fundchannel_start here.
Underneath, we're still calling `fundchannel_start`, we're just one (or
two, if you count multifundchannel) call levels away from it now.
We had one report of this, and then Eugene and Roasbeef of Lightning
Labs confirmed it; they saw misordered HTLCs on reconnection too.
Since we didn't enforce this when we receive HTLCs, we never noticed :(
Fixes: #3920
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: fixed retransmission order of multiple new HTLCs (causing channel close with LND)
We didn't care, but other implementations (particularly lnd) do. And it
does violate the spec.
(We need to use skip not xfail on the test which catches this, since
xfail doesn't seem to stop errors reported by cleanup)
(Includes Christian's typo fix!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Marking spent means if the transaction doesn't confirm for some
reason, the user will need to force a rescan to find the funds. Now
we have timed reservations, reserving for (an additional) 12 hours
should be sufficient.
We also take this opportunity (now we have our own callback path)
to record the tx in the wallet only on success.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some minor phrasing differences cause test changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: txprepare reservations stay across restarts: use fundpsbt/reservepsbt/unreservepsbt
Changelog-Removed: txprepare `destination` `satoshi` argument form removed (deprecated v0.7.3)
With a feerate of 7500perkw and subtracting 660 sats for anchors, a
20,000 sat channel has capacity about 9800 sat, below our default:
You gave bad parameters: channel capacity with funding 20000sat, reserves 546sat/546sat, max_htlc_value_in_flight_msat is 18446744073709551615msat, channel capacity is 9818sat, which is below 10000000msat
So bump channel amounts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And document exactly what it does: insist that an HTLC can pass of
this value (module assumptions of feerate).
Note that we remove the "is_opener" test from the capacity calculation
for anchor fees: it doesn't matter which side it is, someone has to pay
for anchor fees to it deducts from capacity.
This change breaks the test, which we rewrite.
Changelog-Changed: config: `min-capacity-sat` is now stricter about checking usable capacity of channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's actually not possible to currently tell if you're using anchor_outputs
with a peer (since it depends on whether you both supported it at *channel open*).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-added: JSON-RPC: `listpeers` shows `features` list for each channel.
Anchor outputs break many assumptions in our tests:
1. Remove some hardcoded numbers in favor of a fee calc, so we only have to
change in one place.
FIXME: This should also be done for elements!
2. Do binary search to get feerate for a given closing fee.
3. Don't assume output #0: anchor outputs perturb them.
4. Don't assume we can make 1ksat channels (anchors cost 660 sats!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And when it's set, and we're SLOW_MACHINE, simply disable valgrind.
Since Travis (SLOW_MACHINE=1) only does VALGRIND=1 DEVELOPER=1 tests,
and VALGRIND=0 DEVELOPER=0 tests, it was missing tests which needed
DEVELOPER and !VALGRIND.
Instead, this demotes them to non-valgrind tests for SLOW_MACHINEs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I started replacing all get_node() calls, but got bored, so then just did the
tests which call get_node() 3 times or more.
Ends up not making a measurable speed difference, but it does make some
things neater and more standard.
Times with SLOW_MACHINE=1 (given that's how Travis tests):
Time before (non-valgrind):
393 sec (had 3 failures?)
Time after (non-valgrind):
410 sec
Time before (valgrind):
890 seconds (had 2 failures)
Time after (valgrind):
892 sec
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I thought this was timing out because I made it slow with the
change to txprepare as a plugin. In fact, it was timing out
because sometimes gossip comes so fast it gets suppressed
and we never get the log messags.
Still, before this it took 98 seconds under valgrind and
24 under non-valgrind, so it's an improvement to time as
well as robustness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `fundchannel_cancel` will now succeed even when executed while a `fundchannel_complete` is ongoing; in that case, it will be considered as cancelling the funding *after* the `fundchannel_complete` succeeds.
Let me introduce the concept of "Sequential Consistency":
All operations on parallel processes form a single total order agreed upon by all processes.
So for example, suppose we have parallel invocations of `fundchannel_complete` and `fundchannel_cancel`:
+--[fundchannel_complete]-->
|
--[fundchannel_start]-+
|
+--[fundchannel_cancel]---->
What "Sequential Consistency" means is that the above parallel operations can be serialized as a single total order as:
--[fundchannel_start]--[fundchannel_complete]--[fundchannel_cancel]-->
Or:
--[fundchannel_start]--[fundchannel_cancel]--[fundchannel_complete]-->
In the first case, `fundchannel_complete` succeeds, and the `fundchannel_cancel` invocation also succeeds, sending an `error` to the peer to make them forget the chanel.
In the second case, `fundchannel_cancel` succeeds, and the succeeding `fundchannel_complete` invocation fails, since the funding is already cancelled and there is nothing to complete.
Note that in both cases, `fundchannel_cancel` **always** succeeds.
Unfortunately, prior to this commit, `fundchannel_cancel` could fail with a `Try fundchannel_cancel again` error if the `fundchannel_complete` is ongoing when the `fundchannel_cancel` is initiated.
This violates Sequential Consistency, as there is no single total order that would have caused `fundchannel_cancel` to fail.
This commit is a minimal patch which just reschedules `fundchannel_cancel` to occur after any `fundchannel_complete` that is ongoing.
I noticed the following in logs for tests/test_connection.py::test_feerate_stress:
```
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Failing HTLC 18446744073709551615 due to peer death
DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: local_routing_failure: 8194 (WIRE_TEMPORARY_NODE_FAILURE)
```
This is because it reports the (transient) node_failure error, because
our channel_failure message is incomplete. Fix this wart up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we don't wait for close tx to reach mempool, it might not get to
depth 100, and we don't get 'onchaind complete, forgetting peer'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously we've used the term 'funder' to refer to the peer
paying the fees for a transaction; v2 of openchannel will make
this no longer true. Instead we rename this to 'opener', or the
peer sending the 'open_channel' message, since this will be universally
true in a dual-funding world.
The new `keysend` plugin modifies the node features that we send to
peers. This commit breaks out the 'expected_features' we use for tests
to encompass this differentiation.
Sending update_fee immediately after channel establishment seems to
upset LND, so work around it by deferring it. The reason we increase
the fee after establishment is because now we might need to close the
channel in a hurry due to htlcs, but until there are htlcs that's
unnecessary.
Fixes: #3596
Changelog-Changed: Added workaround for lnd rejecting our commitment_signed when we send an update_fee after channel confirmed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that now we check capacity once we've figured out which peer, which
broke a test (we returned "unknown peer" instead of "capacity exceeded"),
so we rework that too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A CONSERVATIVE/3 target for them.
Some noisy changes to the tests as we had to update the estimatesmartfee
mock.
Changelog-Changed: We now use a higher feerate for resolving onchain HTLCs and for penalty transactions
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: JSON: `fundchannel` and `fundchannel_start` `satoshi` parameter removed (renamed to `amount` in 0.7.3).
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.
We also need to adjust the spendable_msat calculation, and update
various tests which try to unbalance channels. We eliminate
the now-redundant test_channel_drainage entirely.
Changelog-Fixed: Corner case where channel could become unusable (https://github.com/lightningnetwork/lightning-rfc/issues/728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
restrict fundchannel_cancel usage to only the opener side
Changelog-Changed: Only the opener of a fundchannel can cancel the channel open with fundchannel_cancel
it's that time of year (merry xmas!)
enables the ability to push_msat on fundchannel
Changelog-Added: RPC: `fundchannel` and `fundchannel_start` can now accept an optional parameter, `push_msat`, which will gift that amount of satoshis to the peer at channel open.
We still close the channel if we *send* an error, but we seem to have hit
another case where LND sends an error which seems transient, so this will
make a best-effort attempt to preserve our channel in that case.
Some test have to be modified, since they don't terminate as they did
previously :(
Changelog-Changed: quirks: We'll now reconnect and retry if we get an error on an established channel. This works around lnd sending error messages that may be non-fatal.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is an intermediary step: we still don't save it to the database,
but we do use the fee_states struct to track it internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The spec is (RSN!) going to explicitly denote where each feature should
be presented, so create that infrastructure.
Incorporate the new proposed bolt11 features, which need this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: .lightningd plugins and files moved into <network>/ subdir
Changelog-changed: WARNING: If you don't have a config file, you now may need to specify the network to lightning-cli
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rounds out the application of `upfront_shutdown_script`, allowing
an accepting node to specify a close_to address.
Prior to this, only the opening node could specify one.
Changelog-Added: Plugins: Allow the 'accepter' to specify an upfront_shutdown_script for a channel via a `close_to` field in the openchannel hook result
Spaces just make life a little harder for everyone.
(Plus, fix documentation: it's 'jsonrpc' not 'json' subsystem).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This simplifies our tests, too, since we don't need a magic option to
enable io logging in subdaemons.
Note that test_bad_onion still takes too long, due to a separate minor
bug, so that's marked and left dev-only for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Printed form is always "[<nodeid>-]<prefix>: <string>"
2. "jcon fd %i" becomes "jsonrpc #%i".
3. "jsonrpc" log is only used once, and is removed.
4. "database" log prefix is use for db accesses.
5. "lightningd(%i)" becomes simply "lightningd" without the pid.
6. The "lightningd_" prefix is stripped from subd log prefixes, and pid removed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-changed: Logging: formatting made uniform: [NODEID-]SUBSYSTEM: MESSAGE
Changelog-removed: `lightning_` prefixes removed from subdaemon names, including in listpeers `owner` field.