It wasn't invalid due to a missing channel_update, but in fact was a
bad checksum due to a cut & paste bug. Fix that, and assert it's not
actually truncating.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If something went wrong and there was an old one, we were
appending to it!
Reported-by: @SimonVrouwe
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We might have channel_announcements which have no channel_update: normally
these don't get written into the store until there is one, but if the
store was truncated it can happen. We then get upset on compaction, since
we don't have an in-memory representation of the channel_announcement.
Similarly, we leave the node_announcement pending until after that
channel_announcement, leading to a similar case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I decided to try a faster implementation, only to find our crc32c was
not correct! Ouch.
I removed the crc32c functions from ccan/crc, and added a new crc32c
module which has the Mark Adler x86-64-optimized variants.
We bump gossip_store version again, since csums have changed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Remove gratuitous prints, add explanations of what's going on,
and demonstrate that we can add a final trimmed HTLC but not
a non-trimmed one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Subtracting both arbitrarily reduces our capacity, even for ourselves
since the routing logic uses this maximum.
I also changed 'advertise' to 'advertize', since we use american
spelling.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Turns out we needed more comprehensive testing; we ended up with three
separate tests. To avoid changing test_channel_drainage as we fix
spendable_msat, I substituted raw numbers there.
The first is a variation of the existing tests, testing we can't
exceed spendable_msat, and we can pay it, both ways.
The second is with a larger amount, which triggers a different problem.
The final is with a giant channel, which tests our 2^32-1 msat cap.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is where payment tests should go. Also mark it xfail for the moment,
and remove developer-only tag (propagating gossip is only 60 seconds, which
is OK).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There were several gossip breakages in master; bumping version means
upgrades get a clean store (not just those upgrading from stable version).
Fixes: #2719
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@pm47 gave a great bug report showing c-lightning sending the same
UPDATE_FEE over and over, with the final surprise result being that we
blamed the peer for sending us multiple empty commits!
The spam is caused by us checking "are we at the desired feerate?" but
then if we can't afford the desired feerate, setting the feerate we
can afford, even though it's a duplicate. Doing the feerate cap before
we test if it's what we have already eliminates this.
But the empty commits was harder to find: it's caused by a heuristic in
channel_rcvd_revoke_and_ack:
```
/* For funder, ack also means time to apply new feerate locally. */
if (channel->funder == LOCAL &&
(channel->view[LOCAL].feerate_per_kw
!= channel->view[REMOTE].feerate_per_kw)) {
status_trace("Applying feerate %u to LOCAL (was %u)",
channel->view[REMOTE].feerate_per_kw,
channel->view[LOCAL].feerate_per_kw);
channel->view[LOCAL].feerate_per_kw
= channel->view[REMOTE].feerate_per_kw;
channel->changes_pending[LOCAL] = true;
}
```
We assume we never send duplicates, so we detect an otherwise-empty
change using the difference in feerates. If we don't set this flag,
we will get upset if we receive a commitment_signed since we consider
there to be no changes to commit.
This is actually hard to test: the previous commit adds a test which
spams update_fee and doesn't trigger this bug, because both sides
use the same "there's nothing outstanding" logic.
Fixes: #2701
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- mock_rpc function now returns full JSON-RPC response, is much cleaner
- Since reached_announce_depth counting is fixed when starting
channeld, we don't need the 7th block to tell depth anymore.
Remote node may (incorrectly) not send announcement_signatures when
reconnecting, so we we use a copy and can still re-announce.
Also checks that we still send our announcement_signatures when reconnecting.
Broken by 909913c265, but since Travis
skips this test ("temporarily", according to the commit msg in January!)
it wasn't caught.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Create a plugin: ./lightning/tests/plugins/pretend_badlog.py
This plugin subscribes 'warning' notification and log the payload of
'warning';
2. Add a new test: tests/test_plugin.py::test_warning_notification
This test runs the plugin-pretend_badlog.py and check if 'warning'
notification can be normal triggered and subscribed.
We reserve inputs when we're going to send a transaction, but we don't
unreserve them if we crash. This is most graphically demonstrated by
the txprepare case, which makes it easier to trigger.
Instead, we should query bitcoind to see whether the tx made it out or
not, as we would do manually with dev-rescan-outputs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We fail this at the moment, since we rely on shutdown to do the cleanups
for us.
(Also had to fix the unclean shutdown path: the caller checks the rc unless
mayfail is set, and of course it's not zero since we just SIGTERM'd it).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It would always return bech32; fix that, and don't bother printing
it if they use the (new) 'all' parameter.
This API was introduced in 3e67c09d5e,
which means it wasn't in a release so no CHANGELOG entry necessary.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We didn't count some records before, so we could compare the two counters.
This is much simpler, and avoids reliance on bs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we intercept the peer's gossip_timestamp_filter request
in the per-peer subdaemon itself. The rest of the semantics are fairly
simple however.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(We don't increment the gossip_store version, since there are only a
few commits since the last time we did this).
This lets the reader simply filter messages; this is especially nice since
the channel_announcement timestamp is *derived*, not in the actual message.
This also creates a 'struct gossip_hdr' which makes the code a bit
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Keeping the uintmap ordering all the broadcastable messages is expensive:
130MB for the million-channels project. But now we delete obsolete entries
from the store, we can have the per-peer daemons simply read that sequentially
and stream the gossip itself.
This is the most primitive version, where all gossip is streamed;
successive patches will bring back proper handling of timestamp filtering
and initial_routing_sync.
We add a gossip_state field to track what's happening with our gossip
streaming: it's initialized in gossipd, and currently always set, but
once we handle timestamps the per-peer daemon may do it when the first
filter is sent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the high bit of the length field: this way we can still check
that the checksums are valid on deleted fields.
Once this is done, serially reading the gossip_store file will result
in a complete, ordered, minimal gossip broadcast. Also, the horrible
corner case where we might try to delete things from the store during
load time is completely gone: we only load non-deleted things.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to bump version again, and the code to upgrade it was
quite hairy (and buggy!). It's not worthwhile for such a
poorly-tested path: I will just add code to limit how much incoming
gossip we get to avoid flooding when we upgrade, however.
I also use a modern gossip_store version in our test_gossip_store_load
test, instead of relying on the upgrade path.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we first receive a channel_update, we write both the
channel_announcement and that channel_update to the store: we need
that first update so we can set the channel_announcement timestamp.
However, the channel_update can be replaced later. This means we can
have a channel_announcement, a node_update which relies on it, then
the channel_update later.
So move the "this applies to a pending announcement" check lower, where
gossip_store can use it too. Has a nice side-effect of avoiding
one lookup of the node id.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
First, we should have a channel_update so we actually do some compaction!
(Reported-by @SimonVrouwe). But we should also handle the cases where:
1. A channel_announcement is *not* directly followed by a
channel_update (happens when the channel_update is replaced).
2. A node_announcement predates a channel_update for the peer
(again, can happen once a channel_update is replaced).
3. A local/private channel_creation is not directly followed by an
update.
In addition, we might as well check that we can *load* such a store,
before compaction.
This checks the corner cases which occur in real gossip stores.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's possible that it hasn't got the node_announcement messages;
it will still list the nodes, however (the channel_announcement tells
it the nodes exist). Check for the alias field instead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Basically, any "Bad" message from gossipd is something we should look
at. This covers failures loading the gossip_store, too!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reorg changes short_channel_id after lockin of private channel, while
one node restarts.
test that:
- peer->depth_togo in billboard decrements
- reorg and scid change is detected by running node and restarting node
- both `old` and `new` scids are in rtable
Also added a comment to test_blockchaintrack to clarify.
Now without bitcoind restart.
bitcoin-cli `prioritisetransaction` came to the rescue!
Its argument `fee_delta` (apparently) lowers the txs _effective_ feerate
soo low that bitcoind wont mine it ... untill we raise it when we want
it to be mined.
Because the call (wallet_extract_owned outputs) that prints that line can happen
_before_ or _after_ confirmation in block, adding `CONFIRMED` in the later.
This brings up an interesting quirk though, in that we report "3
attempts", where we really should have done one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My raspberry pi node hung up on my other node:
lightning_openingd-... chan #1: Got bad message from gossipd: 0db1
This is because we didn't handle that message in one path.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We try to look up the funding tx, but it's already spent that to fund
the channel, so we need txindex if this test is to work reliably.
It's not clear to me why this *ever* worked, but if fails on my new
ThreadRipper build machine with valgrind:
> wallettx = l1.bitcoin.rpc.getrawtransaction(wallettxid, True)
...
E bitcoin.rpc.InvalidAddressOrKeyError: {'code': -5, 'message': 'No such mempool transaction. Use -txindex to enable blockchain transaction queries. Use gettransaction for wallet transactions.'}
/usr/lib/python3/dist-packages/bitcoin/rpc.py:231: InvalidAddressOrKeyError
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Save some overhead, plus gets us ready for giving subdaemons direct
store access. This is the first time we *upgrade* the gossip_store,
rather than just discarding.
The downside is that we need to add an extra message after each
channel_announcement, containing the channel capacity.
After:
store_load_msec:28337-30288(28975+/-7.4e+02)
vsz_kb:582304-582316(582306+/-4.8)
store_rewrite_sec:11.240000-11.800000(11.55+/-0.21)
listnodes_sec:1.800000-1.880000(1.84+/-0.028)
listchannels_sec:22.690000-26.260000(23.878+/-1.3)
routing_sec:2.280000-9.570000(6.842+/-2.8)
peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
Differences:
-vsz_kb:582320
+vsz_kb:582316
-listnodes_sec:2.100000-2.170000(2.118+/-0.026)
+listnodes_sec:1.800000-1.880000(1.84+/-0.028)
-peer_write_all_sec:51.600000-52.550000(52.188+/-0.34)
+peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For performance reasons we start the lightningd instances in
parallel. However, if we only assign the numeric ids (used for log-prefixes
and home directories) when we are already running in parallel, we are not
guaranteed to get the numeric ids matching the return value of `get_nodes` or
`line_graph`. With this patch we now select numeric ids before parallelizing
the start.
Signed-off-by: Christian Decker <@cdecker>
Here I add the test for this 5 local_failed case in this commit.
There 5 cases for FORWARD_LOCAL_FAILED status:
1. When Msater resolves the reply about the next peer infor(sent by Gossipd), and need handle unknown next peer failure in channel_resolve_reply();
2. When Master handle the forward process with the htlc_in and the id of next hop, it tries to drive a new htlc_out but fails in forward_htlc();
3. When we send htlc_out, Master asks Channeld to add a new htlc into the outgoing channel but Channeld fails. Master need handle and store this failure in rcvd_htlc_reply();
4. When Channeld receives a new revoke message, if the state of corresponding htlc is RCVD_ADD_ACK_REVOCATION, Master will tries to resolve onionpacket and handle the failure before resolving the next hop in peer_got_revoke();
5. When Onchaind finds the htlc time out or missing htlc, Master need handle these failure as FORWARD_LOCAL_FAILED in if it's forward payment case.
We were triggering a second exception in the directory cleanup step by
attempting to access a field that'd only be set upon entering the test code
itself. That error did not contribute to the problem resolution, so now we
check whether that field is set before accessing it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Fixes#2518
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Changelog-fixed: `minconf` no longer gets wrapped around for large values, which was causing funds with insufficient confirmations to be selected.
I misunderstood the API, this ended up nesting a result inside the JSON-RPC
result.
No concerns about backwards compatibility since this is so new.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We generated blocks to announce the channel, but it can also expire
the HTLC if the timing is wrong. We don't need to anyway, since we
fixed the FIXME; we store local unannounced channels for restoration
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The old value of 1000 sat was too small to cover the dust reserves.
This lead to the situation when trying to open a channel with minimal
amount, the channels got refused because they were not able cover the
commitment fees.
For this reason the minimal capacity should be increased to i.e. 10k
satoshi, as the technical minimum that also accounts for fees and
reserves is somewhere around 6k sat.
Refactored the weighted-reservoir-sampling algo to make it more straightforward.
It now uses the excess as fraction of capacity as weight. This favors channels that
are more _relatively_ unbalanced to be used for incoming payment.
Now passes test_invoice_routeboost_private() when using max fundamount=16777215.
make it a bit easier to track mutual channel closures by
adding broadcast txid to the listpeers billboard.
since lightningd manages the 'identity' of the closing tx we need
to send it back to closingd so it can update the billboard
appropriately.
We're going to make it async, so start by moving the core code into
invoice.c and having that directly call fail/success functions for the
htlc.
We add an extra check in fulfill_htlc() that the HTLC state is correct:
that can't happen now, but may once we're async.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is currently done higher up, in handle_channel_update(), but
that's one reason why handle_channel_update() has to do a channel
lookup. Moving the check down means handle_channel_update() can do a
minimal "get node id for this channel" so it can check the signature.
This helps, because the chan lookup semantics are changing in the next
few patches.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The user can explicitly create such things (within [] or ") as we paste
those cases literally, but not for the simple cases.
Fixes: #2550
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Plugins don't do it right anyway, and we're about to remove it from
lightningd. Produces same format as json_pp.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They don't clean up after themselves, so best we do it here (by this
point we've already done the pid check to make sure we're the only
lightningd here anyway).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, the assert when `--addr=/sockname` is used, and that it
doesn't clean up on restart, requiring manual deletion of the socket.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
gossipd in l1 might not have registered l2 reconnecting, thus considering
the channel local-disabled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lightning_connectd(19780): STATUS_FAIL_INTERNAL_ERROR: Failed to bind on 2 socket: Address family not supported by protocol
"Untested code is buggy code"
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We need to read in as a byte string, then decode into utf8 once we
have a marker. Otherwise we seem to mangle it horribly, and we
might have a bad utf8 string anyway.
2. We need to suppress the JSON \u escapes on output.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should be able to pass UTF-8 strings to and from plugins without
python turning them into JSON-\u escapes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than using LightningJSONDecoder's implicit "field name and
value ends in msat, try converting to Millisatoshi", we do it to
parameters using type annotations.
If you had a parameter which was an array or dict itself, we don't
delve into that, but that's probably OK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't, but we should, like we do for normal RPC. However, I chose
to use function annotations, rather than names-ending-in-'msat'
because it's more Pythony.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New name is less confusing, and most people should be transitioning to
listpays rather than this anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the same deprecation, but one level up. For the moment, we
still support invoices with a `h` field (where description will be
necessary) but that will be removed once this option is removed.
Note that I just changed pylightning without backwards compatibility,
since the field was unlikely to be used, but we could do something
more complex here?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular this matches the case of `their_unilateral/to_us` outputs, which
were missing their addresses so far.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We want to disallow using unconfirmed outputs by default, so making the
default 1 confirmation seems a good idea. This also matches `bitcoind`s
minimum confirmation requirement.
Arming however breaks some of our tests, so I used `minconf=0` for the
breaking tests and added a new test specifically for the `minconf` parameter
for `fundchannel`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
An uncommitted channel should not keep the peer in the db, since the
uncommitted channel isn't in the db itself.
Fixes: #2367
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is particularly interesting because we handle overflow during route
calculation now; this could happen in theory once we wumbo.
It fixes a thinko when we print out routehints, too: we want to print
them out literally, not print out the effect they have on fees (which
is in the route, which we also print).
This ABI change doesn't need a CHANGELOG, since paystatus is new since
release.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Little point having users handle the postfixes manually, this
translates them, and also allows Millisatoshi to be used wherever an
'int' would be previously.
There are also helpers to create the formatting in a way c-lightning's
JSONRPC will accept.
All standard arithmetic operations with integers work.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had occasional failures, because the fuzz could overwhelm the difference
in routes. Increasing the amount to 2,000,000 millisatoshis makes the
riskfactor 53msat (2000000 * 14 * 10 / 5259600) which is always greater
than the worst-case fuzz of 5% on the fee of 1002msat.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I got a spurious failure because the final node gave a CLTV error and
so it decided to use a different channel. It should probably handle
this corner case better, but meanwhile make the test robust.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was waiting for a remote channel, but not for all the interesting
channels we want to check. It can sometimes happen that further away
channels are added before closer ones are added, depending on
propagation path, flush timers and bitcoind poll timers. This now just
checks for all channels, which also reduces the ambiguity of whether
we selected a path solely because we were lacking alternatives.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Travis timed out.
Waiting for three fundchannel commands depends on the bitcoind polling
interval (30 seconds), and then waiting for gossip propagation
requires two propagation intervals (120 seconds).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Up until now, riskfactor was useless due to implementation bugs, and
also the default setting is wrong (too low to have an effect on
reasonable payment scenarios).
Let's simplify the definition (by assuming that P(failure) of a node
is 1), to make it a simple percentage. I examined the current network
fees to see what would work, and under this definition, a default of
10 seems reasonable (equivalent to 1000 under the old definition).
It is *this* change which finally fixes our test case! The riskfactor
is now 40msat (1500000 * 14 * 10 / 5259600 = 39.9), comparable with
worst-case fuzz is 50msat (1001 * 0.05 = 50).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The test sometimes passes: our routing logic always chooses between
the shorter of two equal-cost routes (because we compare best with <
not <=).
By adding another hop, we add more noise, and by making the alternate
route fee 0 we provide the worst case.
But to be fair, we make the amount of the payment ~50c (15,000,000
msat), and increase our cltv-delay to 14 and fee-base 1000 to match
mainnet. The final patch shows the effect of this choice.
Otherwise our risk penalty is completely in the noise on
mainnet which has the vast majority of fees set at 1000msat + 1ppm.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the direct cause of the failure of the original
test_pay_direct test and it makes sense: invoice routehints may not be
necessary, so try without them *first* rather than last.
We didn't mention the use of routehints in CHANGELOG at all yet, so
do that now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- result fundchannel command now depends on successful or failed broadcast of the funding tx
- failure returns error code FUNDING_BROADCAST_FAIL
- don't fail the channel when broadcast failed, but keep in CHANNELD_AWAITING_LOCKIN
- after fixing the initial broadcast failure, the user could manually rebroadcast the tx and
keep the channel
openingd/opening_funder_finished:
- broadcast_tx callback function now handles both success and failure
jsonrpc: added error code FUNDING_BROADCAST_FAIL
manpage: added error code returned by fundchannel command
This makes the user more aware of broadcast failure, so it hopefully doesn't
try to broadcast new tx's that depend on its change_outputs. Some users have reported (see
issue #2171) a whole sequence of fundings failing, because each funding was using the change
output of the previous one, which would not confirm.
We actually produce an invalid JSON error at the moment: bitcoin-cli
complains "JSON value is not an integer as expected" rather than returning
the given error. Make our error a valid JSON RPC error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It is suddenly timing out a lot and is breaking master, so we
temporarily disable it until it is fixed.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We were restarting the with the nodes before, which was causing some
port contention. This is more natural since `bitcoind` will take care
of terminating all proxies it returned.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Internally libplugin turns ' into ", which causes these messages to produce
bad JSON.
The real fix is to remove the '->" convenience substitution and port the
JSON creation APIs into common/ from lightningd/
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Spurious errors were occuring around checking the provided
current commitment point from the peer on reconnect when
option_data_loss_protect is enabled. The problem was that
we were using an inaccurate measure to screen for which
commitment point to compare the peer's provided one to.
This fixes the problem with screening, plus makes our
data_loss test a teensy bit more robust.
So add a new 'strategy' field. This makes it clearer what is going
on, currently one of:
* "Initial attempt"
* "Excluded channel <scid>"
* "Removed route hint"
* "Excluded expensive channel <scid>"
* "Excluded delaying channel <scid>"
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We sanitize the routes: firstly, we assume appending so eliminate the
first hop if the route points straight to us. Secondly, eliminate empty
hints. Thirdly, trim overlong hints.
Then we just use the first route hint.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the 'exclude' option to getroute for successive attempts. This
is more robust than having gossipd disable for some limited time.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I wrote this sync first, then rewrote async, then developed libplugin.
But committing all that just wastes reviewer time, so I present it as
if it was always asnc and using the library helper.
Currently the command it registers is 'pay2', but when it's complete
we'll remove the internal 'pay' and rename it. This does a single
'getroute/sendpay' call. No retries, no options.
Shockingly, this by itself is almost sufficient to pass our current test
suite with `pay`->`pay2`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Don't do this:
(gdb) bt
#0 0x00007f37ae667c40 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
#1 0x00007f37ae668b38 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
#2 0x00007f37ae669907 in deflate () from /lib/x86_64-linux-gnu/libz.so.1
#3 0x00007f37ae674c65 in compress2 () from /lib/x86_64-linux-gnu/libz.so.1
#4 0x000000000040cfe3 in zencode_scids (ctx=0xc1f118, scids=0x2599bc49 "\a\325{", len=176320) at gossipd/gossipd.c:218
#5 0x000000000040d0b3 in encode_short_channel_ids_end (encoded=0x7fff8f98d9f0, max_bytes=65490) at gossipd/gossipd.c:236
#6 0x000000000040dd28 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=8) at gossipd/gossipd.c:576
#7 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=16) at gossipd/gossipd.c:595
#8 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=32) at gossipd/gossipd.c:596
#9 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=64) at gossipd/gossipd.c:595
#10 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=128) at gossipd/gossipd.c:596
#11 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=256) at gossipd/gossipd.c:595
#12 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=512) at gossipd/gossipd.c:595
#13 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=1024) at gossipd/gossipd.c:595
#14 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2047) at gossipd/gossipd.c:596
#15 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4095) at gossipd/gossipd.c:595
#16 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8191) at gossipd/gossipd.c:595
#17 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16382) at gossipd/gossipd.c:595
#18 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=32764) at gossipd/gossipd.c:595
#19 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=65528) at gossipd/gossipd.c:595
#20 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=131056) at gossipd/gossipd.c:595
#21 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=262112) at gossipd/gossipd.c:595
#22 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=524225) at gossipd/gossipd.c:595
#23 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=1048450) at gossipd/gossipd.c:595
#24 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2096900) at gossipd/gossipd.c:595
#25 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4193801) at gossipd/gossipd.c:595
#26 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8387603) at gossipd/gossipd.c:595
#27 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16775207) at gossipd/gossipd.c:595
#28 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=33550414) at gossipd/gossipd.c:596
#29 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=67100829) at gossipd/gossipd.c:595
#30 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=134201659) at gossipd/gossipd.c:595
#31 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=268403318) at gossipd/gossipd.c:595
#32 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=536806636) at gossipd/gossipd.c:595
#33 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=1073613273) at gossipd/gossipd.c:595
#34 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=2147226547) at gossipd/gossipd.c:595
#35 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=4294453094) at gossipd/gossipd.c:595
#36 0x000000000040df26 in handle_query_channel_range (peer=0x3868fc8, msg=0x37e0678 "\001\ao\342\214\n\266\361\263r\301\246\242F\256c\367O\223\036\203e\341Z\b\234h\326\031") at gossipd/gossipd.c:625
The cause was that converting a block number to an scid truncates it
at 24 bits. When we look through the index from (truncated number) to
(real end number) we get every channel, which is too large to encode,
so we iterate again.
This fixes both that problem, and also the issue that we'd end up
dividing into many empty sections until we get to the highest block
number. Instead, we just tack the empty blocks on to then end of the
final query.
(My initial version requested 0xFFFFFFFE blocks, but the dev code
which records what blocks were returned can't make a bitmap that big
on 32 bit).
Reported-by: George Vaccaro
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make the two channels adjacent, and specify exactly the number of
divide-and-conquer steps there are.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to have a bug where decoderawtransaction would fail, fixed in
fedcfd661 (pytest: hand 'True' to decoderawtransaction so it doesn't
get confused.).
So we can remove the fallback decode, and might as well extract the
ugliness into a helper function.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently only used by gossipd for channel elimination.
Also print them in canonical form (/[01]), so tests need to be
changed.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is based on Christian's change, but removes all trace of the old codes.
I've proposed another spec change which removes this code altogether:
https://github.com/lightningnetwork/lightning-rfc/pull/544
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
Since we are planning to release a bug fix release, and the plugin
subsystem is not yet complete, it is better to make plugin support
opt-in while we continue testing.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fortunately, we can calculate the sha256 ourselves, so the
outgoing channeld doesn't need to tell us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The node which sent the error is doing so because the following
one sent WIRE_UPDATE_FAIL_MALFORMED_HTLC.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Funder can't spend the fee it needs to pay for the commitment transaction:
we were not converting to millisatoshis, however!
This breaks our routeboost test, which no longer has sufficient funds
to make payment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have an incompatibility with lnd it seems: I've lost channels on
reconnect with 'sync error'. Since I never got this code to be reliable,
disable it for next release since I suspect it's our fault :(
And reenable the check which didn't work, for others to untangle.
I couldn't get option_data_loss_protect to be reliable, and I disabled
the check. This was a mistake, I should have either spent even more
time trying to get to the bottom of this (especially, writing test
vectors for the spec and testing against other implementations).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Both of these plugins will fail in interesting ways, and we should
still handle them correctly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
It wasn't JSON formatted either so there was no nice pretty-printing
way. This jsonifies and pretty prints it.
Signed-off-by: Christian Decker <@cdecker>
We no longer use `dev-override-fees` to set the fees, rather we
instrument the `bitcoind` proxy to return the desired feerates.
Signed-off-by: Christian Decker <@cdecker>
pytest was an indirect dependency so far, making that one
explicit, and the timeout plugin should allow us to kill a stuck test
before travis kills it, and thus allow us to see where it got stuck.
Signed-off-by: Christian Decker <@cdecker>
Because gossip in this case takes up to a minute, this test took 10
minutes. The workaround is to do the waiting-for-gossip all at once.
Now it takes 362 seconds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Unlike other daemons, closingd doesn't listen to the master, but runs
simply to its own beat. So instead of responding to the JSON dev_memleak
command, we always check for memory leaks, and make sure that the
python tests fail if they see MEMLEAK in the logs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now keep multiple commands for a json_connection, and an array of
json_streams.
When a command wants to write something, we allocate a new json_stream
at the end of the array.
We always output from the first available json_stream; once that
command has finished, we free that and move to the next. Once all are
done, we wake the reader.
This means we won't read a new command if output is still pending, but
as most commands don't start writing until they're ready to write
everything, we still get command parallelism.
In particular, you can now 'waitinvoice' and 'delinvoice' and it will
work even though the 'waitinvoice' blocks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to keep the remaining buffer, and we need to try to parse it
before we read the next. I first tried keeping it in the object, but
its lifetime is that of the *socket*, which we actually reopen for
every command.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was hanging sometimes in travis, but actually checking the result
of the commands makes it *always* hang. We remove the waitinvoice
which will not return.
ZmnSCPxj points out that this behavior, introduced in
ce0bd7abd3, is a regression: it would be
nice to be able to cancel a waitinvoice. But that fix is more complex,
and will have to be another PR.
This test will now hang, but it's OK: we're about to fix it!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When developing in regtest or testnet it is really inconvenient to
have to fake traffic and generate blocks just to get estimatesmartfee
to return a valid estimate. This just sets the minfee if bitcoind
doesn't return a valid estimate.
Reported-by: Rene Pickhardt <@renepickhardt>
Signed-off-by: Christian Decker <@cdecker>
In one case we can reduce, in the others we eliminated if VALGRIND.
Here are the ten slowest tests on my laptop:
469.75s call tests/test_closing.py::test_closing_torture
243.61s call tests/test_closing.py::test_onchain_multihtlc_our_unilateral
222.73s call tests/test_closing.py::test_onchain_multihtlc_their_unilateral
217.80s call tests/test_closing.py::test_closing_different_fees
146.14s call tests/test_connection.py::test_dataloss_protection
138.93s call tests/test_connection.py::test_restart_many_payments
129.66s call tests/test_gossip.py::test_gossip_persistence
128.73s call tests/test_connection.py::test_no_fee_estimate
122.46s call tests/test_misc.py::test_htlc_send_timeout
118.79s call tests/test_closing.py::test_onchain_dust_out
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
generate was deprecated some time ago, so we added the generate_block()
helper. But many calls crept back in, and git master refuses it.
(test_blockchaintrack relied on the return value, so make generate_block
return the list of blocks).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After Ubuntu 18.10 upgrade, lots of new flake8 warnings.
$ flake8 --version:
3.5.0 (mccabe: 0.6.1, pycodestyle: 2.4.0, pyflakes: 1.6.0) CPython 3.6.7rc1 on Linux
Note it seems that W503 warned about line breaks before binary
operators, and W504 complains about them after. I prefer W504, so
disable W503.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Occasional failure in test_fulfill_incoming_first where the channel
closed before the final message from dev_disonnect was read. Cause
was the peer writing a gossip msg and failing due to ECONNRESET, before
it read the final message.
(Managed to reproduce under strace -f, FTW).
This is really a symptom of the fact that line_graph's announce=True
didn't wait for node announcements. Let's do that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we have multiple HTLCs with the same preimage and the same CLTV,
it doesn't matter what order we treat them (they're literally
identical). But when we offer HTLCs with the same preimage but
different CLTVs, the commitment tx outputs look identical, but the
HTLC txs are different: if we simply take the first HTLC which matches
(and that's not the right one), the HTLC signature we got from them
won't match. As we rely on the signature matching to detect the fee
paid, we get:
onchaind: STATUS_FAIL_INTERNAL_ERROR: grind_fee failed
So we alter match_htlc_output() to return an array of all matching
HTLC indices, which can have more than one entry for offered HTLCs.
If it's our commitment, we loop through until one of the HTLC
signatures matches. If it's their commitment, we choose the HTLC with
the largest CLTV: we're going to ignore it once that hits anyway, so
this is the most conservative approach. If it's a penalty, it doesn't
matter since we steal all HTLC outputs the same independent of CLTV.
For accepted HTLCs, the CLTV value is encoded in the witness script,
so this confusion isn't possible. We nonetheless assert that the
CLTVs all match in that case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We set up HTLCs with the same preimage and both different and same
CLTVs in both directions, then make sure that onchaind is OK and that
the HTLCs are failed without causing downstream failure.
We do this for both our-unilateral and their-unilateral cases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Create a second HTLC with a different CTLV but same preimage; onchaind
uses the wrong signature and fails to grind it.
Reported-by: molz (#c-lightning)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was suggested by Pierre-Marie as the solution to the 'same HTLC,
different CLTV' signature mismatch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My test case is a mainnet gossip store with 22107 channels, and
time to do `lightning-cli listchannels`:
Before: `lightning-cli listchannels` DEVELOPER=0
real 0m1.303000-1.324000(1.3114+/-0.0091)s
After:
real 0m0.629000-0.695000(0.64985+/-0.019)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This also highlights the danger of searching the logs: that error
appeared previously in the logs, so we didn't notice that the actual
withdraw call gave a different error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And use wallet_forward_status_in_db() everywhere in db code.
And clean up extra CHANGELOG.md entry (looks like rebase error?)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Adapts the `test_forward_stats` test to include checks for the
`forwarded_payments` table. Will add checks for the `listforwardings`
RPC call next.
Signed-off-by: Christian Decker <@cdecker>
When the wrong key is used, the remote end simply hangs up.
We used to get a random errno, which tends to be "Operation now in progress."
Now it's defined to be 0, detect and provide a better error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Have c-lightning nodes send out the largest value for
`htlc_maximum_msat` that makes sense, ie the lesser of
the peer's max_inflight_htlc value or the total channel
capacity minus the total channel reserve.
We don't create unannouncable channels, but other implementations can.
Not only is it rude to expose these via invoices, it's probably not
useable anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is an issue with the way we retrieve the channel accounting info
that will result in always showing 0 for all stats. This tests for it
and the next commit will fix it.
During tests, this is half our log! And Travis truncates it if we get
a failure in test_restart_many_payments.
Interestingly, test_logging had a bug which relied on this spam :)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't save them to the database, so fix things up as we load them.
Next patch will actually save them into the db, and this will become
COMPAT code.
Also: call htlc_in_check() with NULL on db load, as otherwise it aborts
internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Usually, we only close an incoming HTLC once the outgoing HTLC is completely
resolved. However, we short-cut this in the FULFILL case: we have the
preimage, so might as well use it immediately (in fact, we wait for it to
be committed, but we don't need to in theory).
As a side-effect of this, our assumption that every outgoing HTLC has
a corresponding incoming HTLC is incorrect, and this test (xfail) tickles
that corner case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We split json_invoice(), as it now needs to round-trip to the gossipd,
and uniqueness checks need to happen *after* gossipd replies to avoid
a race.
For every candidate channel gossipd gives us, we check that it's in
state NORMAL (not shutting down, not still waiting for lockin), that
it's connected, and that it has capacity. We then choose one with
probability weighted by excess capacity, so larger channels are more
likely.
As a side effect of this, we can tell if an invoice is unpayble (no
channels have sufficient incoming capacity) or difficuly (no *online*
channels have sufficient capacity), so we add those warnings.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to slow down the invoice rpc (esp. under valgrind), which
breaks the delicate timing of the autocleaninvoice test.
Change that so the autocleaner (and timestamp) starts after the invoices
are added.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Travis failures:
valgrind: m_scheduler/sema.c:104 (vgModuleLocal_sema_down): Assertion 'sema->owner_lwpid != lwpid' failed.
host stacktrace:
==1296== at 0x38083F48: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x38084064: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x380841F1: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x38135DAE: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x380D328D: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x3809A4AC: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x3809AE43: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==1296== by 0x380988CF: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
sched status:
running_tid=0
Thread 1: status = VgTs_WaitSys (lwpid 1296)
==1296== at 0x5729730: __poll_nocancel (syscall-template.S:84)
==1296== by 0x4348DF: daemon_poll (daemon.c:78)
==1296== by 0x4169E7: io_poll_lightningd (lightningd.c:543)
==1296== by 0x471ECD: io_loop (poll.c:282)
==1296== by 0x416E06: main (lightningd.c:744)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a lot of infrastructure to delay local channel_updates to
avoid spamming on each peer reconnect; we had to keep tracking of
pending ones though, in case we needed the very latest for sending an
error when failing an HTLC.
Instead, it's far simpler to set the local_disabled flag on a channel
when we disconnect, but only send a disabling channel_update if we
actually fail an HTLC.
Note: handle_channel_update() TAKES update (due to tal_arr_dup), but we
didn't use that before. Now we do, add annotation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We trade channel_update before channel_announce makes the channel
public, and currently forget them when we finally get the
channel_announce. We should instead apply them, and not rely on
retransmission (which we remove in the next patch!).
This earlier channel_update means test_gossip_jsonrpc triggers too
early, so have that wait for node_announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The help command now adds command usage to its output by calling each
command handler in CMD_USAGE mode.
Instead of seeing, for example:
decodepay
Decode {bolt11}, using {description} if necessary
we see:
decodepay bolt11 [description]
Decode {bolt11}, using {description} if necessary
Signed-off-by: Mark Beckwith <wythe@intrig.com>
Incrementing version number means stores which were prior to the previous
commit will be removed, and refreshed. The simplest fix, if not the most
efficient.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's no reason for the db to ever return non-NULL if it's spent. And there's
only one caller, for which that is definitely true.
Suggested-by: @cdecker
Fixes: #1934
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We extract the tx from the logs, and then we wait until that hits
the mempool. This is more reliable than 'sendrawtx' in the logs,
which might catch a previous sendrawtx; it's also more explicit
that we expect that tx exactly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
BOLT 7's been updated to split the flags field in `channel_update`
into two: `channel_flags` and `message_flags`. This changeset does the
minimal necessary to get to building with the new flags.
We currently just ignore them. This is one reason the hsm (in some places)
explicitly calls log_broken so we get some idea.
This was the only subdaemon which had a NULL msgcb and msgname, so eliminate
those checks in subd.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Got a spurious failure in test_no_fee_estimate; we fired too soon from the logs (presumably
we raced in on the first response, but estimatesmartfee gets called 3 times).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a simple reverse proxy that `bitcoin-cli` can talk to when invoked by
`lightningd`. It allows us to trace `bitcoin-cli` calls, and intercept calls to
mock the replies, better than the current bash-script based method.
It's an array: we were only saving the single element; if there was more than
one changed HTLC we'd get a bad signature!
The report in #1907 is probably caused by the other side re-requesting
something we considered already finalized; to avoid this particular error,
we should set the field to NULL if there's no last_sent_commit.
I'm increasingly of the opinion we want to just save all the update
packets to the db and blast them out, instead of doing this
second-guessing dance.
Fixes: #1907
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These happen after we compact the store; every log I've seen of a
restart on a real node has a message about truncating the store,
because node_announcements predate channel_announcements.
I extracted one such case from testnet, and reduced it to test here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Wait for a 'sendrawtransaction' *after* the dev-fail message; don't be
fooled by a previous one.
2. Turning on estimate fee sets fees exactly; just wait for it to be processed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's probably unnecessary to have this weird way of injecting results
now we have explicit feerate args.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And, reluctantly, default to bitcoind style.
"It's wrong to be right too soon."
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We could refine this later (based on existing wallet, for example), but
this gives some estimate.
[ Rename onchain_estimates -> onchain_fee_estimates Suggested-by: @SimonVrouwe ]
[ Factor of 1000 fix Reported-by: @SimonVrouwe ]
Suggested-by: @molxyz
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't know what our peer is doing, but if we see those values, maybe
they did too, and for longer. And add the min/max acceptable values
into our JSON API.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is useful mainly in the case where bitcoind is not giving estimates,
but can also be used to bias results if you want.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And no more filtering out messages, as we should no longer spam the
logs with them (the 'Connected json input' one was removed some time
ago).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The comment was wrong: the channel being locked in was triggering
the fee update and hence the disconnect. But that can actually
happen before fund_channel returns, as that waits for the gossipd
to see the channel active.
Best to do the fee update manually, so it's exactly what we want.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If feerates change, L2 sends L3 a commit for that, which causes us to
fail the assert (which says we won't send a commitment_signed).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use feerate in several places, and each one really should react
differently when it's not available (such as when bitcoind is still
catching up):
1. For general fee-enforcement, we use the broadest possible limits.
2. For closingd, we use it as our opening negotiation point: just use half
the last tx feerate.
3. For onchaind, we can use the last tx feerate as a guide for our own txs;
it might be too high, but at least we know it was sufficient to be mined.
4. For withdraw and fund_channel, we can simply refuse.
Fixes: #1836
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Manipulate fees via fake-bitcoin-cli. It's not quite the same, as
these are pre-smoothing, so we need a restart to override that where
we really need an exact change. Or we can wait until it reaches a
certain value in cases we don't care about exact amounts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't respond to fee changes until we're locked in: make sure we catch
up at that point.
Note that we use NORMAL fees during opening, but IMMEDIATE after, so
this often sends a fee update. The tests which break, we set those
feerates to be equal.
This (sometimes) changes the behavior of test_permfail, as we now
get an immediate commit, so that is fixed too so we always wait for
that to complete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a noop if we're opening a new channel (channel_fees_can_change(channel)
is false until funding locked in), but important if we're restarting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't update a channel's feerate on reestablishment: we insert a restart
in test_onchain_different_fees() (which we'll need soon anyway) to show it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When in this state, we send a canned error "Awaiting unilateral close".
We enter this both when we drop to chain, and when we're trying to get
them to drop to chain due to option_data_loss_protect.
As this state (unlike channel errors) is saved to the database, it means
we will *never* talk to a peer again in this state, so they can't
confuse us.
Since we set this state in channel_fail_permanent() (which is the only
place we call drop_to_chain for a unilateral close), we don't need to
save to the db: channel_set_state() does that for us.
This state change has a subtle effect: we return WIRE_UNKNOWN_NEXT_PEER
instead of WIRE_TEMPORARY_CHANNEL_FAILURE as soon as we get a failure
with a peer. To provoke a temporary failure in test_pay_disconnect we
take the node offline.
Reported-by: Christian Decker @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we don't try to unilaterally close after a restart, *and*
we can tell onchaind to try to use the point to recover funds when the
peer unilaterally closes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Firstly, if they claim to know a future value, we ask the HSM; if
they're right, we tell master what the per-commitment-secret it gave
us (we have no way to validate this, though) and it will not broadcast
a unilateral (knowing it will cause them to use a penalty tx!).
Otherwise, we check the results they sent were valid. The spec says
to do this (and close the channel if it's wrong!), because otherwise they
could continually lie and give us a bad per-commitment-secret when we
actually need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tests were failing when in the same thread after a test which set
log_all_io=True, because SIGUSR1 seemed to be turning logging *off*.
This is due to Python using references not copies for assignment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is required for the next test, which has to log messages from channeld
as soon as it starts (so might be too late if it sends SIGUSR1).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ignore incoming for now, but this means we advertize the option and
we send the required fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
peer features are only kept for connected peers (as they can change),
but we didn't update them on reconnect. The main effect was that
after a restart we displayed the features as empty, even after
reconnect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. l1 update_fee -> l2
2. l1 commitment_signed -> l2 (using new feerate)
3. l1 <- revoke_and_ack l2
4. l1 <- commitment_signed l2 (using new feerate)
5. l1 -> revoke_and_ack l2
When we break the connection after #3, the reconnection causes #4 to
be retransmitted, but it turns out l1 wasn't telling the master to set
the local feerate until it received the commitment_signed, so on
reconnect it uses the old feerate, with predictable results (bad
signature).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Useful it we want to intercept bitcoin-cli first.
We move the getinfo() caching into start(), as that's when we can actually
use RPC.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to use it to override specific commands. It's non-valgrinded
already since we use '--trace-children-skip=*bitcoin-cli*' so the overhead
should be minimal.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We got an index error, because status had only one field (onchaind not
started yet).
> wait_for(lambda: only_one(p.rpc.listpeers(l1.info['id'])['peers'][0]['channels'])['status'][1] == 'ONCHAIN:Tracking mutual close transaction')
E IndexError: list index out of range1
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Normal wallet txs get reconfirmed as blocks come in, but ones which need
closeinfo are more fragile, so we do it manually using txwatch for them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In one case, the channel_update which we expected to activate the channel
from l2 was suppressed as redundant. This is certainly valid, so just
check the results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Waiting for three node_announcments isn't always enough, since l2 can
publish two of them (an independent bug). Do the more Right Thing and
just wait for 30 seconds of no input...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. connect convenience variable for improved readabilty.
2. a comment explaining that timer is on channel, not HTLC.
3. use modern python style in test_htlc_send_timeout
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now sending a ping makes sense: it should force the other end to send
a reply, unblocking the commitment process.
Note that rather than waiting for a reply, we're actually spinning on
a 100ms loop in this case. But it's simple and it works.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently, if we don't realize a TCP connection is down, we almost
certainly don't find out until *after* we're sent the
commitment_signed message, in which case we cannot fail the incoming
HTLC.
This test demonstrates that. Note the 30 second sleep: we should really
run Travis tests in parallel!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we have an address hint, we start with that, but we'll use
node_announcement information if required.
Note: we (ab)use the address hint when restoring from the database
or reconnecting, even if the connection was *incoming*. That meant
that the recipient of a connection would *never* manage to connect out.
We still don't take multiple addresses from the DNS seeds: I assume we
should, since there could be IPv4 and IPv6.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
connectd tells master about every disconnection, and master knows
whether it's important to reconnect. Just get the master to invoke a new
connect command if it considers the peer important!
The only twist is timeouts: we don't want to immediately reconnect if
we've failed to connect. To solve this, connectd passes a 'delaytime'
to the master when a connection fails, and the master passes it back
when it asks for a connection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, all opening_read_peer_msg() callers need to know there
was an error (presumably, negotiating) so they can stop, but we should
not exit.
This lets us reenable the final disabled test.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now simply maintain a pubkey set for connected peers (we only care
if there's a reconnect), not the entire peer structure.
lightningd no longer queries us for getpeers: it knows more than we do
already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's now a potential race: the source peer connect returns, but in
destination peer the master hasn't read the connect message from
connectd, so the peer isn't in listpeers yet.
(Previously the connection stayed in connectd, so there was no such
window).
This is an occasional issue in a few places.
Note that we take the opportunity to speed up test_disconnectpeer too
while we're there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Prior to this, lightningd would hand uninteresting peers back to connectd,
which would then return it to lightningd if it sent a non-gossip msg,
or if lightningd asked it to release the peer.
Now connectd hands the peer to lightningd once we've done the init
handshake, which hands it off to openingd.
This is a deep structural change, so we do the minimum here and cleanup
in the following patches.
Lightningd:
1. Remove peer_nongossip handling from connect_control and peer_control.
2. Remove list of outstanding fundchannel command; it was only needed to
find the race between us asking connectd to release the peer and it
reconnecting.
3. We can no longer tell if the remote end has started trying to fund a
channel (until it has succeeded): it's very transitory anyway so not
worth fixing.
4. We now always have a struct peer, and allocate an uncommitted_channel
for it, though it may never be used if neither end funds a channel.
5. We start funding on messages for openingd: we can get a funder_reply
or a fundee, or an error in response to our request to fund a channel.
so we handle all of them.
6. A new peer_start_openingd() is called after connectd hands us a peer.
7. json_fund_channel just looks through local peers; there are none
hidden in connectd any more.
8. We sometimes start a new openingd just to send an error message.
Openingd:
1. We always have information we need to accept them funding a channel (in
the init message).
2. We have to listen for three fds: peer, gossip and master, so we opencode
the poll.
3. We have an explicit message to start trying to fund a channel.
4. We can be told to send a message in our init message.
Testing:
1. We don't handle some things gracefully yet, so two tests are disabled.
2. 'hand_back_peer .*: now local again' from connectd is no longer a message,
openingd says 'Handed peer, entering loop' once its managing it.
3. peer['state'] used to be set to 'GOSSIPING' (otherwise this field doesn't
exist; 'state' is now per-channel. It doesn't exist at all now.
4. Some tests now need to turn on IO logging in openingd, not connectd.
5. There's a gap between connecting on one node and having connectd on
the peer hand over the connection to openingd. Our tests sometimes
checked getpeers() on the peer, and didn't see anything, so line_graph
needed updating.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, I found lightning_openingd processes after running
tests. When we use the dev_disconnect blackhole '0' option, they
stick around until the dev_disconnect file is truncated (there is only
so much you can do with only a file descriptor), so let's do that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The following changes revealed this race, where expecting listchannels()
to contain two channels immediately after fund_channel() was racy.
We also derive the short_channel_id first, so we can search logs for the
exact messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patches get better at reconecting, so if we use dev-allow-localhost
nodes can often find each other and reconnect before shutting down; only
use that option where we actually need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Saw this in Travis: technically we return from the dev_set_max_scids...
cmd after sending it to gossipd, but we should wait for it to log.
Adding an internal reply message for a dev command seems overkill.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. If the IPv6 address was public, that changed the wireaddr and thus the ipv4 bind
would not be to a wildcard and would fail.
2. Binding two fds to the same port on both wildcard IPv4 and IPv6 succeeds; we only
fail when we try to listen, so allow error at this point.
For some reason this triggered on my digital ocean machine.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
openingd calculates our reserve based on the channel amount (even if
we're funding, to keep the calculation in one place), but it wasn't
reporting it back to the master daemon. We initialized it to 0 so that
valgrind wouldn't get upset, as it's part of a structure we send over
the wire.
Have openingd report back, and also initialize it to an impossible value
as extra assurance. And remove a stray (harmless but weird) semicolon.
Reported-by: Gálli Zoltán
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This adds one line with the onion and the channel_update we extract from
it. This in turn allows us to check that the channel_update in the onion is not
type prefixed, and that we patch it correctly before passing it to gossipd.
The easiest way to do this is to play with the 'wallet_tx' semantics
and have 'amount' have meaning even when 'all_funds' is set.
Note that we change the string 'Cannot afford funding transaction' to
'Cannot afford transaction' as this code is also used for withdrawls.
Inspired-by: molz on #c-lightning
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The logs in various Travis failures show that it takes 20 seconds just for
closingd to read the init message. As a result, the close times out (default
is 30 seconds).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were *supposed* to be waiting for the next commitment tx so we
made sure the one we broadcast was old, *but* the 'revoke_and_ack'
we were waiting for could be matched by the completion of the previous
'revoke_and_ack'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This needs to be done separately from the rest of the daemon since we can
otherwise not make sure that it happens before the DB is freed and we might
still need the DN, and be running in a DB transaction, for some destructors to
run.
That was the cause of the bad gossip order failures: gossipd thought our
channel was live, but the other end didn't receive message last time.
Now gossipd doesn't use fd to kill us (connectd tells master to do so), we
can implement read_peer_msg_nogossip().
Fixes: #1706
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch guts gossipd of all peer-related functionality, and hands
all the peer-related requests to channeld instead.
gossipd now gets the final announcable addresses in its init msg, since
it doesn't handle socket binding any more.
lightningd now actually starts connectd, and activates it. The init
messages for both gossipd and connectd still contain redundant fields
which need cleaning up.
There are shims to handle the fact that connectd's wire messages are
still (mostly) gossipd messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd combines the information if it knows it, but that's really the
job of 'listnodes'. More importantly, channeld won't have access to
this information.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I saw an error in test_gossip_weirdalias in Travis, where listnodes(nodeid)
returned *BOTH* nodes; it happened to fail because [0] was the wrong one, but
it would have passed if the order had been different.
This helper asserts that we really do only have one element, and should
catch such bugs faster.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I could not figure out why test_announce_address suddenly stopped working:
I had previously been using DEVELOPER=1 on the cmdline for historical
reasons when testing locally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weren't waiting for gossipd to actually process the
dev_set_max_scids_encode_size message, so under Travis it sometimes
split the reply before processing that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this test we tell l3 to disconnect on sending WIRE_CHANNEL_ANNOUNCEMENT.
This is hit by gossipd (to disconnect from l2) but *also* channeld to
disconnect from l4. That's OK, because normally by this point l4 has
sent its real channel_update.
However, the next patch introduces a delay in sending channel_updates,
meaning l4 hasn't sent it yet. If l3 doesn't reconnect to l4, we
never get the channel_update and the test which expects l1 to eventually
see both sides of the channel fails.
So we manually reconnect then. Note that we remove the redundant
'dev-no-reconnect' option from l2: it's added automatically as it
doesn't set 'may_reconnect'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd will ignore the second one, but doing it in the front end
gives an explicit error message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Python dict can't have duplicate entries, but some options can be specified
multiple times. The easiest way is to put a list in the dict.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
E ConnectionRefusedError: [Errno 111] Connection refused
And in debug.log:
2018-05-17T04:06:35Z Warning: Config setting for -rpcport only applied on regtest network when in [regtest] section.
Unfortunately, current versions including 0.16.1 *ignore* the contents of
a '[regtest]' section, so we need it in *both* places.
Also remove the misleading 'rpcport' initialization which we always
override.
Note that we don't fix this message though:
2018-05-17T04:06:35Z Config options rpcuser and rpcpassword will soon be deprecated. Locally-run instances may remove rpcuser to use cookie-based auth, or may be replaced with rpcauth. Please see share/rpcuser for rpcauth auth generation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For some reason, we created a second bitcoin.conf in the regtest/ directory,
which AFAICT nothing uses.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not optional for our test setup, and this makes it easier to invoke
bitcoin-cli manually, for example.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As of bitcoind 0.16.1, you can't send a single-input OP_RETURN output,
as you get 'tx-too-small'.
sendrawtx exit 26, gave error code: -26?error message:?tx-size-small (code 64)?'
So instead we use the minimum fee we can, but otherwise ignore it and
don't wait for it to be mined.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New codes: FUND_MAX_EXCEEDED, FUND_CANNOT_AFFORD, FUND_DUST_LIMIT_UNMET.
The error message "Cannot afford fee" was not exactly correct because
it would also occur if the amount requested could not be afforded. So
I changed it to the more generic "Cannot afford transaction".
Other things:
* Fixed off-by-one satoshi in fundchannel manpage.
* Changed 'arror' to 'error' because we are not pirates.
I still believe that 2 weeks is way too much, but we were promised that these
defaults would be slowly reduced to saner values as the stability increases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Waiting for 'Disabling channel' is not enough, since it's async to the
actual tx broadcast: I caught a case where the unilateral close wasn't
in the block, and so failed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
72d103d6bb deprecated DEVELOPER env var
in favor of config.vars, but didn't update test_closing.py or test_gossip.py.
5d0a54b7f0 then removed the explicit
DEVELOPER= setting from Travis.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The modern, pytest based, tests now clean up after themselves by removing
directories of successful tests and the base directory if there was no failure.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Previous replies weren't large enough; add another channel and then
use IO tracing to make sure the reply is zlib encoded.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The gossip-query spec enhancements say not to forward an channel_announcement until
you have receive a channel_update. This test fails for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When opening a number channels from a single node we could end up not waiting
for the funding tx to make it into the mempool, instead triggering on a previous
`sendrawtransaction` or `CHANNEL_NORMAL` in the logs. This now checks that the
actual funding transaction makes it into the mempool and that we wait for the
depth change for that specific channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't have any connection yet, so how could they be active? Disable both
sides to avoid trying to route through them or telling others to use them as
`contact_points` in invoices.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Mostly `lightningd` complaining about not being able to estimate fees. Safes us
a lot of log space when some tests time out, and safes us a few context switches
between log appender and log watchers.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We were sort of assuming that one side telling it's ok would automagically make
the other side succeed as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
While 1 second is very generous, it resolving the HTLCs may take longer, so we
just loop over this instead of making it one-shot
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The RPC calls aren't really free, so let's wait for absolute times, computed
from the `start_time` instead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can restore them once we get parallel testing on Travis, but meanwhile
we time out because of the 30 seconds bitcoind poll.
Running on my laptop with --duration=5:
=========================== slowest 5 test durations ===========================
184.07s call tests/test_lightningd.py::LightningDTests::test_multiple_channels
156.66s call tests/test_lightningd.py::LightningDTests::test_forward
155.77s call tests/test_lightningd.py::LightningDTests::test_closing
126.83s call tests/test_lightningd.py::LightningDTests::test_waitinvoice
126.11s call tests/test_lightningd.py::LightningDTests::test_waitanyinvoice
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we have too many which are never used and I don't want to document
them.
1. Remove unused anchor_onchain_wait. When implemented, it should be
hardcoded to 100 or more.
2. Remove anchor_confirms_max. 10 always reasonable, and we can readd
an override option should someone need it.
3. max_htlc_expiry should be the same as locktime_max (which increases
from 3 to 5 days by default): they're both a limit on how long
funds can be locked up.
4. channel_update_interval should always be a dev option.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make --override-fee-rates a dev option. We use default-fee-rate in
its place, which (since bitcoind won't give fee estimates in regtest
mode for short chains) gives an effective feerate of 15000/7500/3750.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@cdecker points out that in test_forward, where we manually create a route,
we get an error back which contains an update for an unknown channel.
We should still note this, but it's not an error for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is something which generally shouldn't happen, but we didn't
notice it previously.
We ignore this warning in the case where a channel was deleted: this
happens because one side can send an update while the other notices
that the channel is closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It gets confused if re-run due to flaky, since:
1. Using the same HSM, it generates the same spend and sees a previous one.
2. The block height numbers are off.
Fixes: #1479
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a 'bitcoind' global: getting it from inside one of the daemons
was a mistake I've copied widely.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It simply wasn't waiting for us to register the peer before attempting to open a
connection. Moved into a separate test to be able rerun multiple times.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Someone could try to announce an internal address, and we might probe
it.
This breaks tests, so we add '--dev-allow-localhost' for our tests, so
we don't eliminate that one. Of course, now we need to skip some more
tests in non-developer mode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're given a wildcard address, we can't announce it like that: we need
to try to turn it into a real address (using guess_address). Then we
use that address. As a side-effect of this cleanup, we only announce
*any* '--addr' if it's routable.
This fix means that our tests have to force '--announce-addr' because
otherwise localhost isn't routable.
This means that gossipd really controls the addresses now, and breaks
them into two arrays: what we bind to, and what we announce. That is
now what we return to the master for json_getinfo(), which prints them
as 'bindings' and 'addresses' respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6'
which means ':1234' can be used to set only the portnum.
2. Only add this protocol wildcard if --autolisten=1 (default)
and no other addresses specified.
3. Pass it down to gossipd, so it can handle errors correctly: in most cases,
it's fatal not to be able to bind to a port, but for this case, it's OK
if we can only bind to one of IPv4/v6 (fatal iff neither).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's become clear that our network options are insufficient, with the coming
addition of Tor and unix domain support.
Currently:
1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline,
or any address is specified explicitly. If they're routable, we announce.
2. --addr is used to announce, but not to control binding.
After this change:
1. --port is deprecated.
2. --addr controls what we bind to and announce.
3. --bind-addr/--announce-addr can be used to control one and not the other.
4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable).
5. --offline still overrides listening (though announcing is still the same).
This means we can bind to as many ports/interfaces as we want, and for
special effects we can announce different things (eg. we're sitting
behind a port forward or a proxy).
What remains to implement is semi-automatic binding: we should be able
to say '--addr=0.0.0.0:9999' and have the address resolve at bind
time, or even '--addr=0.0.0.0:0' and have the port autoresolve too
(you could determine what it was from 'lightning-cli getinfo'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to change the JSONRPC, so let's put an explicit 'port' into
our node class.
We initialize it at startup time: in future I hope to use ephemeral ports
to make our tests more easily parallelizable.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This used to be the port, but since we no longer have fixed ports, and we start
them in random order we can't easily distinguish them by the port anymore. Just
use a numeric ID that matches their lightning-dirs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We add an attempt number to the test directory to improve the test-isolation and
allow for multiple reruns of the same test, without re-using any of the
lightning-dirs or bitcoin-datadirs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the first example of the py.test style fixtures which should allow us to
write much cleaner and nicer tests.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We never really look at the output, and it's rather noisy, so we just stop
writing to the log.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Especially with valgrind this allows us to safe quite some time on multi-core
machines since startup is about the most expensive operation in the tests. In a
simple test spinning up 3 nodes this gave me about 25% - 30% test time
reduction. The effects will be smaller for single core machines, hoping Travis
handles these gracefully.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can create the hsm file from python directly; that works even if we
don't have DEVELOPER set, and is simpler.
We add a test that the aliases are correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>