We are now too quick in disabling the channel for us to attempt a
payment. We need to separate into getroute and sendpay to trigger this
now.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was flaky because we didn't wait for the fee update to complete
and were using the old, way too small, fees, which upset bitcoind.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
delinvoice was orginally documented to only allow deletion of unpaid
invoices, but there might be reasons to delete paid ones or unexpired ones.
But we have to avoid the race where someone pays as it's deleted: the
easiest way is to have the caller tell us the status, and fail if
it's wrong.
Fixes: #477
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to have to support multiple channels per peer, even if only
when some are onchain. This would break the current listpeers, so
change it to an array (single element for now).
Other cleanups:
1. Only set connected true if daemon is not onchaind.
2. Only show netaddr if connected; don't make it an array, call it `address`
in comparison with `addresses` in listnodes.
3. Rename `channel` to `short_channel_id`
4. Add `funding_txid` field for voyeurism.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Individual tests can always re-enable them, though.
[ More test fallout fixes by Christian Decker ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Seems to avoid the nasty python resource warnings, as well as the
fatal 'ValueError: PyMemoryView_FromBuffer(): info->buf must not be NULL'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This, of course, should never be used. But it helps maintain connections
for the moment while we dig deeper into feerates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For now this just tests that we are sending out keepalive
channel_updates for all local channels.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since most callers use positional arguments, we should allow a 'null'
literal where we require no value at all.
Also adds some more value tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paid invoices need to know how much was actually paid: both for the case
where no 'msatoshi' amount was specified, and for the normal case, where
clients are permitted to overpay in order to help them disguise their
payments.
While we migrate the db, we leave this field as 0 for old paid
invoices. This is unhelpful for accounting, but at least clearly
indicates what happened if we find this in the wild.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
'rhash' is the old terminology, but 'payment_preimage' and
'payment_hash' were decided on for the BOLTs, so we should fix that here.
We still use rhash internally, but that's much easier to fix.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> assert [c['active'] for c in l2.rpc.getchannels()['channels']] == [True, True]
E AssertionError: assert [False, True] == [True, True]
E At index 0 diff: False != True
E Full diff:
E - [False, True]
E + [True, True]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to make sure all the updates are known to gossip. Since
one is the local update, we change that message to look the same.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Receiving them in channeld is not enough to avoid the race:
route = l1.rpc.getroute(l3.info['id'], 4999999, 1)["route"]
...
ValueError: RPC call failed: Could not find a route
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wait the the receipt of the CHANNEL_UPDATE message by channeld,
but that doesn't mean it reached gossipd yet, causing spurious test
failure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CI always runs with TEST_DEBUG=1 which prints logs anyway, and testing
locally should also be done this way, combined with pytest which
captures the logs. No need to duplicate the functionality of pytest.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we seem to have some isolation concerns when re-generating the
same HSM secret and re-parsing the blockchain some blocks in the past.
This also alleviates the problem of printing to a logging stream that
has been closed. Previously bitcoind would keep running despite a test
had failed and continue logging to the, now closed, StringIO that
py.test uses when capturing stdout.
The performance impact seems to be 1-3 second per test, not too bad
IMHO for increased test isolation and cleaner logs:
|--------------------+---------------+----------|
| | No_valgrind | Valgrind |
|--------------------+---------------+----------|
| bitcoind per suite | 10 min 24 sec | 46:15.31 |
| bitcoind per test | 11 min 38 sec | 49:21.64 |
|--------------------+---------------+----------|
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I was examining a test_onchain_timeout failure, and realized that we
were forgetting a peer even though we'd just spent the HTLC_TIMEOUT_TX!
This reveals that we weren't resolving an output when we stole the preimage
from it, like we should.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we panic when we see our root reorg out, even if we're not doing
anything yet, restoring the 100 block margin is the simplest fix.
Unfortunately this means adding a 100-block spacer in the tests, so things
don't get confused.
Fixes: #511
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We stopped too early; we should continue and make sure it all goes well.
This means we have to fix them to be deterministic: by generating 2
blocks at once in test_htlc_in_timeout, we raced between fulfill and
timeout on the HTLC. Now it's always fulfilled.
Also, fixed confusing comments: l1 doesn't drop to chain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
OUR_HTLC_TIMEOUT_TO_US = normal tx, used to timeout htlc in their commit tx.
OUR_HTLC_TIMEOUT_TX = dual-sig tx with delay, used to timeout htlc in our commit tx.
Only one test looks at that string, so fix that too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The test_reconnect_normal test is failing rather consistently on 32bit
architectures, disabling to reduce noise. Issue #468 tracks progress.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
With python-bitcoinlib==0.9.0 it appears that the URL based auth
information is no longer used, so we fall back to reading the config
file for the bitcoind daemon instead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Relatively simple: until we reach funding-depth the channels should be
known locally, so we can already route through them, but they should
not be announced to peers to which the connection is non-local.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If send_htlc_out() fails, it doesn't initialize pc->out; that can
make us think it's still in progress.
Reported-by: Jonas Nick
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All peers come from gossipd, and maintain an fd to talk to it. Sometimes
we hand the peer back, but to avoid a race, we always recreated it.
The race was that a daemon closed the gossip_fd, which made gossipd
forget the peer, then master handed the peer back to gossipd. We stop
the race by never closing the gossipfd, but hand it back to gossipd
for closing.
Now gossipd has to accept two fds, but the handling of peers is far
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should not disconnect from a peer just because it fails opening; we
should return it to gossipd, and give a meaningful error.
Closes: #401
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't use it yet, but now we'll decode correctly.
See: https://github.com/lightningnetwork/lightning-rfc/pull/317
lightning-rfc commit: ef053c09431442697ab46e83f9d3f86e3510a18e
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If you run locally, it fails occasionally; presumably because it
sees previous funds. Use a random HSM key for that teste.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I noticed some breakage with git master:
1. getinfo no longer supported (for us, use getblockchaininfo)
2. generate no longer supported (use generatetoaddress)
Both these options are supported at least in 0.15, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This seems to happen when we manage to check between the
channel_announcement and the channel_update being processed.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Add two simple tests: one for a single direct payment and one with
hundreds of parallel payments, reusing the same route.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We are announcing that we are willing to accept incoming payments with
current_height + min_final_cltv_expiry + slack, assuming that the
sender adds some slack. In particular we'd reject the payment if
slack=0 which is allowed by the spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Raising the exception about non-zero exit values into the
teardown. This was previously masking the valgrind errors. Now
valgrind errors > crash errors > non-zero return value.
Still hoping to catch that elusive [7, 0] return value on travis.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
These need to be different for testing the example in BOLT 11.
We also use the cltv_final instead of deadline_blocks in the final hop:
various tests assumed 5 was OK, so we tweak utils.py.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I run with ulimit -c unlimited, and valgrind leaves core files like
valgrind-errors.22114.core.22114 which test_lightning.py tries to
parse as log files.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It crashes under valgrind, causing a valgrind error: valgrind gives us a
backtrace anyway, so we don't need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And we report these through the getpeers JSON RPC again (carefully: in
our reconnect tests we can get duplicates which this patch now filters
out).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out. This means we put #ifdefs in callers
in some places, but at least it's explicit.
We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.
See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Makes it easier to compare before/after failures. Ideally, we should
run under Travis both with this option and with the seed based on the
entire tmp path (which is still reproducible with determination, but
not fixed every run like this is).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones. We can handle many callbacks internally now.
We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the flow is much simpler from a lightningd POV:
1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.
Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
messages, which changes the test_disconnect case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently assume the daemon gives up; gossipd won't, and we want to
test it there too.
This reveals a bug (returning io_close() is bad if the call is to
duplex()), and breaks a test which now continues after dropping a
packet..
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Useful if we want to drop & suppress, for example. We change '=' to mean
do nothing to the packet.
We use this to clean up the test_reconnect_sender_add test.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to make the ip/port optional, so they should go at the end.
In addition, using ip:port is nicer, for gethostbyaddr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have to do a dance when we get a reconnect in openingd, because we
don't normally expect to free both owner and peer. It's a layering
violation: freeing a peer should clean up the owner's pointer to it,
to avoid a double free, and we can eliminate this dance.
The free order is now different, and the test_reconnect_openingd was
overprecise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now that we have HTLC persistence we'd also like to test it. This
kills the second node in the middle of an HTLC, it'll recover and
finish the flow.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This broke somewhere in the recent changes, because we override
TailalbleProc stop(). Break out log extractor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Moved the flagging for allowed failures into the factory getter, and
renamed into `may_fail`. Also stopped the teardown of a node from
throwing an exception if we are allowed to exit non-cleanly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
A failed returncode check could result in the cleanup for other
lightningds to be skipped. Now make sure to cleanup all and then
rethrow an exception that contains all returncodes.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We used to simply kill the daemon, which in some cases could result in
half-written crashlogs and similar artifacts such as half-completed
RPC calls. Now we ask lightningd to stop nicely, give it some time and
only then kill it. We also return the returncode of the daemon.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
peer_fail_permanent() frees peer->owner, but for bad_peer() we're
being called by the sd->badpeercb(), which then goes on to
io_close(conn) which is a child of sd.
We need to detach the two for this case, so neither tries to free the
other.
This leads to a corner case when the subd exits after the peer is gone:
subd->peer is NULL, so we have to handle that too.
Fixes: #282
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that it should really be a flag to daemon on construction, too,
but that may interfere with another concurrent branch so I've deferred.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We re-use the value for reasonable_depth given by the master, and we
tell it when our timeout transactions reach that depth.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>