Now sending a ping makes sense: it should force the other end to send
a reply, unblocking the commitment process.
Note that rather than waiting for a reply, we're actually spinning on
a 100ms loop in this case. But it's simple and it works.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently, if we don't realize a TCP connection is down, we almost
certainly don't find out until *after* we're sent the
commitment_signed message, in which case we cannot fail the incoming
HTLC.
This test demonstrates that. Note the 30 second sleep: we should really
run Travis tests in parallel!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we have an address hint, we start with that, but we'll use
node_announcement information if required.
Note: we (ab)use the address hint when restoring from the database
or reconnecting, even if the connection was *incoming*. That meant
that the recipient of a connection would *never* manage to connect out.
We still don't take multiple addresses from the DNS seeds: I assume we
should, since there could be IPv4 and IPv6.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
connectd tells master about every disconnection, and master knows
whether it's important to reconnect. Just get the master to invoke a new
connect command if it considers the peer important!
The only twist is timeouts: we don't want to immediately reconnect if
we've failed to connect. To solve this, connectd passes a 'delaytime'
to the master when a connection fails, and the master passes it back
when it asks for a connection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, all opening_read_peer_msg() callers need to know there
was an error (presumably, negotiating) so they can stop, but we should
not exit.
This lets us reenable the final disabled test.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now simply maintain a pubkey set for connected peers (we only care
if there's a reconnect), not the entire peer structure.
lightningd no longer queries us for getpeers: it knows more than we do
already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's now a potential race: the source peer connect returns, but in
destination peer the master hasn't read the connect message from
connectd, so the peer isn't in listpeers yet.
(Previously the connection stayed in connectd, so there was no such
window).
This is an occasional issue in a few places.
Note that we take the opportunity to speed up test_disconnectpeer too
while we're there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Prior to this, lightningd would hand uninteresting peers back to connectd,
which would then return it to lightningd if it sent a non-gossip msg,
or if lightningd asked it to release the peer.
Now connectd hands the peer to lightningd once we've done the init
handshake, which hands it off to openingd.
This is a deep structural change, so we do the minimum here and cleanup
in the following patches.
Lightningd:
1. Remove peer_nongossip handling from connect_control and peer_control.
2. Remove list of outstanding fundchannel command; it was only needed to
find the race between us asking connectd to release the peer and it
reconnecting.
3. We can no longer tell if the remote end has started trying to fund a
channel (until it has succeeded): it's very transitory anyway so not
worth fixing.
4. We now always have a struct peer, and allocate an uncommitted_channel
for it, though it may never be used if neither end funds a channel.
5. We start funding on messages for openingd: we can get a funder_reply
or a fundee, or an error in response to our request to fund a channel.
so we handle all of them.
6. A new peer_start_openingd() is called after connectd hands us a peer.
7. json_fund_channel just looks through local peers; there are none
hidden in connectd any more.
8. We sometimes start a new openingd just to send an error message.
Openingd:
1. We always have information we need to accept them funding a channel (in
the init message).
2. We have to listen for three fds: peer, gossip and master, so we opencode
the poll.
3. We have an explicit message to start trying to fund a channel.
4. We can be told to send a message in our init message.
Testing:
1. We don't handle some things gracefully yet, so two tests are disabled.
2. 'hand_back_peer .*: now local again' from connectd is no longer a message,
openingd says 'Handed peer, entering loop' once its managing it.
3. peer['state'] used to be set to 'GOSSIPING' (otherwise this field doesn't
exist; 'state' is now per-channel. It doesn't exist at all now.
4. Some tests now need to turn on IO logging in openingd, not connectd.
5. There's a gap between connecting on one node and having connectd on
the peer hand over the connection to openingd. Our tests sometimes
checked getpeers() on the peer, and didn't see anything, so line_graph
needed updating.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, I found lightning_openingd processes after running
tests. When we use the dev_disconnect blackhole '0' option, they
stick around until the dev_disconnect file is truncated (there is only
so much you can do with only a file descriptor), so let's do that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The following changes revealed this race, where expecting listchannels()
to contain two channels immediately after fund_channel() was racy.
We also derive the short_channel_id first, so we can search logs for the
exact messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patches get better at reconecting, so if we use dev-allow-localhost
nodes can often find each other and reconnect before shutting down; only
use that option where we actually need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Saw this in Travis: technically we return from the dev_set_max_scids...
cmd after sending it to gossipd, but we should wait for it to log.
Adding an internal reply message for a dev command seems overkill.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. If the IPv6 address was public, that changed the wireaddr and thus the ipv4 bind
would not be to a wildcard and would fail.
2. Binding two fds to the same port on both wildcard IPv4 and IPv6 succeeds; we only
fail when we try to listen, so allow error at this point.
For some reason this triggered on my digital ocean machine.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
openingd calculates our reserve based on the channel amount (even if
we're funding, to keep the calculation in one place), but it wasn't
reporting it back to the master daemon. We initialized it to 0 so that
valgrind wouldn't get upset, as it's part of a structure we send over
the wire.
Have openingd report back, and also initialize it to an impossible value
as extra assurance. And remove a stray (harmless but weird) semicolon.
Reported-by: Gálli Zoltán
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This adds one line with the onion and the channel_update we extract from
it. This in turn allows us to check that the channel_update in the onion is not
type prefixed, and that we patch it correctly before passing it to gossipd.
The easiest way to do this is to play with the 'wallet_tx' semantics
and have 'amount' have meaning even when 'all_funds' is set.
Note that we change the string 'Cannot afford funding transaction' to
'Cannot afford transaction' as this code is also used for withdrawls.
Inspired-by: molz on #c-lightning
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The logs in various Travis failures show that it takes 20 seconds just for
closingd to read the init message. As a result, the close times out (default
is 30 seconds).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were *supposed* to be waiting for the next commitment tx so we
made sure the one we broadcast was old, *but* the 'revoke_and_ack'
we were waiting for could be matched by the completion of the previous
'revoke_and_ack'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This needs to be done separately from the rest of the daemon since we can
otherwise not make sure that it happens before the DB is freed and we might
still need the DN, and be running in a DB transaction, for some destructors to
run.
That was the cause of the bad gossip order failures: gossipd thought our
channel was live, but the other end didn't receive message last time.
Now gossipd doesn't use fd to kill us (connectd tells master to do so), we
can implement read_peer_msg_nogossip().
Fixes: #1706
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch guts gossipd of all peer-related functionality, and hands
all the peer-related requests to channeld instead.
gossipd now gets the final announcable addresses in its init msg, since
it doesn't handle socket binding any more.
lightningd now actually starts connectd, and activates it. The init
messages for both gossipd and connectd still contain redundant fields
which need cleaning up.
There are shims to handle the fact that connectd's wire messages are
still (mostly) gossipd messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd combines the information if it knows it, but that's really the
job of 'listnodes'. More importantly, channeld won't have access to
this information.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I saw an error in test_gossip_weirdalias in Travis, where listnodes(nodeid)
returned *BOTH* nodes; it happened to fail because [0] was the wrong one, but
it would have passed if the order had been different.
This helper asserts that we really do only have one element, and should
catch such bugs faster.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I could not figure out why test_announce_address suddenly stopped working:
I had previously been using DEVELOPER=1 on the cmdline for historical
reasons when testing locally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weren't waiting for gossipd to actually process the
dev_set_max_scids_encode_size message, so under Travis it sometimes
split the reply before processing that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this test we tell l3 to disconnect on sending WIRE_CHANNEL_ANNOUNCEMENT.
This is hit by gossipd (to disconnect from l2) but *also* channeld to
disconnect from l4. That's OK, because normally by this point l4 has
sent its real channel_update.
However, the next patch introduces a delay in sending channel_updates,
meaning l4 hasn't sent it yet. If l3 doesn't reconnect to l4, we
never get the channel_update and the test which expects l1 to eventually
see both sides of the channel fails.
So we manually reconnect then. Note that we remove the redundant
'dev-no-reconnect' option from l2: it's added automatically as it
doesn't set 'may_reconnect'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd will ignore the second one, but doing it in the front end
gives an explicit error message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Python dict can't have duplicate entries, but some options can be specified
multiple times. The easiest way is to put a list in the dict.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
E ConnectionRefusedError: [Errno 111] Connection refused
And in debug.log:
2018-05-17T04:06:35Z Warning: Config setting for -rpcport only applied on regtest network when in [regtest] section.
Unfortunately, current versions including 0.16.1 *ignore* the contents of
a '[regtest]' section, so we need it in *both* places.
Also remove the misleading 'rpcport' initialization which we always
override.
Note that we don't fix this message though:
2018-05-17T04:06:35Z Config options rpcuser and rpcpassword will soon be deprecated. Locally-run instances may remove rpcuser to use cookie-based auth, or may be replaced with rpcauth. Please see share/rpcuser for rpcauth auth generation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For some reason, we created a second bitcoin.conf in the regtest/ directory,
which AFAICT nothing uses.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's not optional for our test setup, and this makes it easier to invoke
bitcoin-cli manually, for example.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As of bitcoind 0.16.1, you can't send a single-input OP_RETURN output,
as you get 'tx-too-small'.
sendrawtx exit 26, gave error code: -26?error message:?tx-size-small (code 64)?'
So instead we use the minimum fee we can, but otherwise ignore it and
don't wait for it to be mined.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New codes: FUND_MAX_EXCEEDED, FUND_CANNOT_AFFORD, FUND_DUST_LIMIT_UNMET.
The error message "Cannot afford fee" was not exactly correct because
it would also occur if the amount requested could not be afforded. So
I changed it to the more generic "Cannot afford transaction".
Other things:
* Fixed off-by-one satoshi in fundchannel manpage.
* Changed 'arror' to 'error' because we are not pirates.
I still believe that 2 weeks is way too much, but we were promised that these
defaults would be slowly reduced to saner values as the stability increases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Waiting for 'Disabling channel' is not enough, since it's async to the
actual tx broadcast: I caught a case where the unilateral close wasn't
in the block, and so failed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
72d103d6bb deprecated DEVELOPER env var
in favor of config.vars, but didn't update test_closing.py or test_gossip.py.
5d0a54b7f0 then removed the explicit
DEVELOPER= setting from Travis.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The modern, pytest based, tests now clean up after themselves by removing
directories of successful tests and the base directory if there was no failure.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Previous replies weren't large enough; add another channel and then
use IO tracing to make sure the reply is zlib encoded.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The gossip-query spec enhancements say not to forward an channel_announcement until
you have receive a channel_update. This test fails for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When opening a number channels from a single node we could end up not waiting
for the funding tx to make it into the mempool, instead triggering on a previous
`sendrawtransaction` or `CHANNEL_NORMAL` in the logs. This now checks that the
actual funding transaction makes it into the mempool and that we wait for the
depth change for that specific channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't have any connection yet, so how could they be active? Disable both
sides to avoid trying to route through them or telling others to use them as
`contact_points` in invoices.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Mostly `lightningd` complaining about not being able to estimate fees. Safes us
a lot of log space when some tests time out, and safes us a few context switches
between log appender and log watchers.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We were sort of assuming that one side telling it's ok would automagically make
the other side succeed as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
While 1 second is very generous, it resolving the HTLCs may take longer, so we
just loop over this instead of making it one-shot
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The RPC calls aren't really free, so let's wait for absolute times, computed
from the `start_time` instead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can restore them once we get parallel testing on Travis, but meanwhile
we time out because of the 30 seconds bitcoind poll.
Running on my laptop with --duration=5:
=========================== slowest 5 test durations ===========================
184.07s call tests/test_lightningd.py::LightningDTests::test_multiple_channels
156.66s call tests/test_lightningd.py::LightningDTests::test_forward
155.77s call tests/test_lightningd.py::LightningDTests::test_closing
126.83s call tests/test_lightningd.py::LightningDTests::test_waitinvoice
126.11s call tests/test_lightningd.py::LightningDTests::test_waitanyinvoice
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we have too many which are never used and I don't want to document
them.
1. Remove unused anchor_onchain_wait. When implemented, it should be
hardcoded to 100 or more.
2. Remove anchor_confirms_max. 10 always reasonable, and we can readd
an override option should someone need it.
3. max_htlc_expiry should be the same as locktime_max (which increases
from 3 to 5 days by default): they're both a limit on how long
funds can be locked up.
4. channel_update_interval should always be a dev option.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make --override-fee-rates a dev option. We use default-fee-rate in
its place, which (since bitcoind won't give fee estimates in regtest
mode for short chains) gives an effective feerate of 15000/7500/3750.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@cdecker points out that in test_forward, where we manually create a route,
we get an error back which contains an update for an unknown channel.
We should still note this, but it's not an error for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is something which generally shouldn't happen, but we didn't
notice it previously.
We ignore this warning in the case where a channel was deleted: this
happens because one side can send an update while the other notices
that the channel is closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It gets confused if re-run due to flaky, since:
1. Using the same HSM, it generates the same spend and sees a previous one.
2. The block height numbers are off.
Fixes: #1479
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a 'bitcoind' global: getting it from inside one of the daemons
was a mistake I've copied widely.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It simply wasn't waiting for us to register the peer before attempting to open a
connection. Moved into a separate test to be able rerun multiple times.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Someone could try to announce an internal address, and we might probe
it.
This breaks tests, so we add '--dev-allow-localhost' for our tests, so
we don't eliminate that one. Of course, now we need to skip some more
tests in non-developer mode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're given a wildcard address, we can't announce it like that: we need
to try to turn it into a real address (using guess_address). Then we
use that address. As a side-effect of this cleanup, we only announce
*any* '--addr' if it's routable.
This fix means that our tests have to force '--announce-addr' because
otherwise localhost isn't routable.
This means that gossipd really controls the addresses now, and breaks
them into two arrays: what we bind to, and what we announce. That is
now what we return to the master for json_getinfo(), which prints them
as 'bindings' and 'addresses' respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6'
which means ':1234' can be used to set only the portnum.
2. Only add this protocol wildcard if --autolisten=1 (default)
and no other addresses specified.
3. Pass it down to gossipd, so it can handle errors correctly: in most cases,
it's fatal not to be able to bind to a port, but for this case, it's OK
if we can only bind to one of IPv4/v6 (fatal iff neither).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's become clear that our network options are insufficient, with the coming
addition of Tor and unix domain support.
Currently:
1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline,
or any address is specified explicitly. If they're routable, we announce.
2. --addr is used to announce, but not to control binding.
After this change:
1. --port is deprecated.
2. --addr controls what we bind to and announce.
3. --bind-addr/--announce-addr can be used to control one and not the other.
4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable).
5. --offline still overrides listening (though announcing is still the same).
This means we can bind to as many ports/interfaces as we want, and for
special effects we can announce different things (eg. we're sitting
behind a port forward or a proxy).
What remains to implement is semi-automatic binding: we should be able
to say '--addr=0.0.0.0:9999' and have the address resolve at bind
time, or even '--addr=0.0.0.0:0' and have the port autoresolve too
(you could determine what it was from 'lightning-cli getinfo'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to change the JSONRPC, so let's put an explicit 'port' into
our node class.
We initialize it at startup time: in future I hope to use ephemeral ports
to make our tests more easily parallelizable.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This used to be the port, but since we no longer have fixed ports, and we start
them in random order we can't easily distinguish them by the port anymore. Just
use a numeric ID that matches their lightning-dirs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We add an attempt number to the test directory to improve the test-isolation and
allow for multiple reruns of the same test, without re-using any of the
lightning-dirs or bitcoin-datadirs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the first example of the py.test style fixtures which should allow us to
write much cleaner and nicer tests.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We never really look at the output, and it's rather noisy, so we just stop
writing to the log.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Especially with valgrind this allows us to safe quite some time on multi-core
machines since startup is about the most expensive operation in the tests. In a
simple test spinning up 3 nodes this gave me about 25% - 30% test time
reduction. The effects will be smaller for single core machines, hoping Travis
handles these gracefully.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can create the hsm file from python directly; that works even if we
don't have DEVELOPER set, and is simpler.
We add a test that the aliases are correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're getting spurious closures, even on mainnet. Using --ignore-fee-limits
is dangerous; it's slightly less so to lower the minimum (which is the
usual cause of problems).
So let's halve it, but beware the floor.
This is a workaround, until we get independent feerates in the spec.
Fixes: #613
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
At least say whether we failed to connect at all, or failed cryptographic
handshake, or failed reading/writing init messages.
The errno can be "Operation now in progress" if the other end closes the
socket on us: this happens when we handshake with the wrong key and it
hangs up on us. Fixing this would require work on ccan/io though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we get a reconnection, kill the current remote peer, and wait for the
master to tell us it's dead. Then we hand it the new peer.
Previously, we would end up with gossipd holding multiple peers, and
the logging was really hard to interpret; I'm not completely convinced
that we did the right thing when one terminated, either.
Note that this now means we can have peers with neither ->local nor ->remote
populated, so we check that more carefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Lifetime of 'struct reaching' now only while we're actively doing connect.
2. Always free after a single attempt: if it's an important peer, retry
on a timer.
3. Have a single response message to master, rather than relying on
peer_connected on success and other msgs on failure.
4. If we are actively connecting and we get another command for the same
id, just increment the counter
The result is much simpler in the master daemon, and much nicer for
reconnection: if they say to connect they get an immediate response,
rather than waiting for 10 retries. Even if it's an important peer,
it fires off another reconnect attempt, unless it's actively
connecting now.
This removes exponential backoff: that's restored in next patch. It
also doesn't handle multiple addresses for a single peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Got some intermittant failures, mainly caused by the tests being slow
enough that the peer reconnected. We should always suppress
reconnection if we can, and not stress too much in the !DEVELOPER case
where we can't.
We should turn off dev-no-reconnect *always* unless told we will
reconnect, and since we can't if !DEVELOPER, don't do the connection
check there.
Instead of adding an option to line_graph, we remove it in favor
of connect (since we only use it with n=2 anyway).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This shaves off about 15% of our integration testing suite on my machine. It
assumes we never reorg below the first block the node starts with, which is true
for all tests, so it's safe.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Simplification of the offset calculation to use the rescan parameter, and rename
of `wallet_first_blocknum`. We now use either relative rescan from our last
known location, or absolute if a negative rescan was given. It's all handled in
a single location (except the case in which the blockcount is below our
precomputed offset), so this should reduce surprises.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Careful log examination revealed that we were generating a block before one
of the mutual close txs had entered the mempool. This is rare because it
means that both peers have to be too slow.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I had a weird failure which was caused by an unexpected disconnect and
reconnecct. Since we are prersistend and recover from these, they can
slip through our tests; most tests don't involve reconnection, so we
need to catch this explicitly.
For the connect() helper, we always suppress reconnection; tests which
want it all want other options so don't use this helper anyway. (Actually,
after I said that, test_closing_while_disconnected was added when I
rebased, which did require it, so I had to open-code that one).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also report tx and txid, and whether we closed unilaterally or
bilaterally, if we could close the channel.
Also make a manpage.
Fixes: #1207Fixes: #714Fixes: #622
Mixing the test into the existing test allows us to reduce the execution, but
adds a bit of complexity, so I added two small helpers.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can have more than one; eg we might offer both bech32 and a p2sh
address, and in future we might offer v1 segwit, etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows us to have some default options that can then be overridden easily
on a per-test basis.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
In the next commit we remove channels whose outpoint was spent from our network
view, so checking for it will not work anymore.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This fixes the root cause of https://github.com/ElementsProject/lightning/issues/1212
where we deleted the payment because we wanted to retry, then retry failed
so we had an (old) HTLC without a matching payment. We then fed that
HTLC to onchaind, which tells us it's missing, and we try to fail the
payment and deref a NULL pointer.
Fixes: #1212
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We say "in N blocks" but we actually mean "N blocks after this tx" which is
actually N-1 or less. Change wording and tighten tests which misunderstood
this.
Also, the 'assert not l1.daemon.is_in_log('onchaind complete, forgetting peer')'
are unlikely to work until the daemon has actually seen the block, so add
sync_blockheight before all of those.
These changes reveal some sloppy testing, which we fix.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With the following patch applied, we could clearly see onchaind try to
broadcast the timeout tx one block too early:
sendrawtx exit 26, gave error code: -26?error message:?non-final (code 64)?
This is because of an out-by-one error in calculating the relative
depth required, since the out->tx_blockheight is already 1 before the
current block.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was revealed in #1114; onchaind isn't actually completely idempotent
due to fee changes (and the now-fixed change in keys used).
This triggers the bug by restarting with different fees, resulting in
onchaind not recognizing its own proposal:
2018-03-05T09:38:15.550Z lightningd(23076): lightning_onchaind-022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 chan #1: STATUS_FAIL_INTERNAL_ERROR: THEIR_UNILATERAL/OUR_HTLC spent with weird witness 3
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is twice the 'update_channel_interval' we get handed.
We delete the non-existent channel_add_connection and delete_connection
declarations from the header too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently give them a free pass. The simplest fix is to give them
an old timestamp on initialization.
We still skip unannounced channels, on the assumption that they're
ours. And we set the last_update_timestamp to -1 when we convert to
gossip_getchannels_entry to indicate no update.
This breaks the DEVELOPER=1 pruning test, since we hardcode the 1
week timeout. That's fixed in the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I initially disabled this until 0.16 because the withdraw command
was modified to require 'brct1' addresses for regtest.
But commit bd07a9 allows a regular testnet address to work
just as well. So renable this check.
FWIW, the tests without valgrind take 662 seconds before we reduced
the number of blocks, and only 648 seconds now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
0.16.0 is required since we rely on it for some tests and the block
reduction allows us to waste less time during setup. 121 blocks were
chosen so that we have at least one mature output to spend.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If bitcoind is still fetching blocks, we might accidentally inject
the failure between getblockhash and getblock. That's OK, but
it's not the failure we test for.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Looks like rebasing the flake8 branch caused breakage, as new violations
had occurred since that check was written
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This test generates a pre-bip173 testnet P2PKH address with bitcoin 0.15.1
which fails under the new verification checks for the bip173 network name.
This should be renabled with bitcoin 0.16 which can generate a bip173 address
for regtest with the bcrt prefix.
* Modifies invoice command to have the following format
invoice <msatoshi> <label> <desc> <?expiry> <?fallbackaddr>
* Adds support for Segwit bcrt1 addresses for withdraw
* Add test case for fallback address in invoice creation
* Create a common json_tok_address_scriptpubkey to be used
by invoice and withdraw commands.
We can do similar tricks to test other things, even to run without a
real bitcoind for faster testing, but for now we simply exit if a magic
file says so.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The billboard is now far more useful to tell what's going on, and this
gets us closer to a state == owner mapping.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We also fold opening_got_hsm_funding_sig() into the caller; it was
previously a callback before we decided to always use the HSM
synchronously.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Each peer can have one 'uncommitted' channel, which is in the process
of opening. This is used for openingd, and then on return we convert
it into a full-fledged struct channel and commit it into the database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Fix dev_setfees to set slow and normal fees correctly.
Due to a bug def_setfees(100, slow=100) would instead set immediate and
normal fees to 100. This behavior has been updated to set fees to
correct values, make the values truly optional as per documentation and
unit test this behavior.
* Fix pay() to set msatoshi, description and risk factor properly
Due to a bug pay(invoice, description='1000') resulted in payment of
1000 msatoshi instead. This was fixed and covered with tests.
* Fix named args in listpayments, listpeers and connect
* Do not pass None to methods where it is default value
* Make description on invoice and pay match.
Suggested-by: @ZmnSCPxj
* Fix dev_setfees to set slow and normal fees correctly.
Due to a bug def_setfees(100, slow=100) would instead set immediate and
normal fees to 100. This behavior has been updated to set fees to
correct values, make the values truly optional as per documentation and
unit test this behavior.
* Fix pay() to set msatoshi, description and risk factor properly
Due to a bug pay(invoice, description='1000') resulted in payment of
1000 msatoshi instead. This was fixed and covered with tests.
* Fix named args in listpayments, listpeers and connect
* Do not pass None to methods where it is default value
* Make description on invoice and pay match.
Suggested-by: @ZmnSCPxj
We now keep a list of commands for the jcon instead of a simple
'current' pointer: the assertions become a bit more complex, but
the rest is fairly mechanical.
Fixes: #1007
Reported-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to set expiry, otherwise waitinvoice would take 1 hr, and we
can't read once for every cmd, since each read may consume more than
a single result, and we block.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once we read a command, we are supposed to io_wait until it finishes.
However, we are actually woken in two places: when it's complete
(which is correct), and when it's written out (which is wrong).
We don't care when it's written out, only when it's finished:
refactor to make json_done() free and NULL the old ->current,
rather than have the callers do it. Now it's clear that it's
ready for both new output and new input.
Fixes: #934
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And return the correct error message for the channel they give, if
they try to re-establish on an error channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Much like the database; peer contains id, address, channel contains
per-channel information. Where we create a channel, we always create
the peer too.
For the moment, peer->log and channel->log coexist side-by-side, to
reduce some of the churn.
Note that this changes the API to dev-forget-channel: if we have more
than one channel, we insist they specify the short-channel-id.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We get intermittant failure: WIRE_UNKNOWN_NEXT_PEER (First peer not ready)
because CHANNELD_NORMAL and actually telling gossipd that the channel
is available are distinct things: we need both.
(For test_closing_different_fees, we were testing CHANNELD_NORMAL on
the peer, not on l1, too).
But we may also directly send the announcement sigs if the height is
sufficient, so the simplest is to unify the messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
> assert [c['active'] for c in l2.rpc.listchannels()['channels']] == [True, True]
E AssertionError: assert [True, False] == [True, True]
E At index 1 diff: False != True
E Full diff:
E - [True, False]
E + [True, True]
```
We don't actually wait that l2's gossipd has also processed the message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sometimes the super-low-fee commitment tx succeeds, and we see
that 'sendrawtx exit 0' instead of the one we're expecting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
fundrawtransaction returns before the actual sendrawtx, so we can
end up mining blocks before it's sent, thus not having enough confirms.
We handle this correctly in fund_channel, but this test open-codes it
for speed with multiple peers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was taking over 10 minutes under valgrind, causing Travis to time it out.
This shrinks it to its essential tests, and also batches.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With the new 'human-readable' mode of lightning-cli, this actually produces
a valid config file. It's a bit hacky though...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, decode error messages correctly and do the right thing with
messages about other channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are now too quick in disabling the channel for us to attempt a
payment. We need to separate into getroute and sendpay to trigger this
now.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was flaky because we didn't wait for the fee update to complete
and were using the old, way too small, fees, which upset bitcoind.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
delinvoice was orginally documented to only allow deletion of unpaid
invoices, but there might be reasons to delete paid ones or unexpired ones.
But we have to avoid the race where someone pays as it's deleted: the
easiest way is to have the caller tell us the status, and fail if
it's wrong.
Fixes: #477
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to have to support multiple channels per peer, even if only
when some are onchain. This would break the current listpeers, so
change it to an array (single element for now).
Other cleanups:
1. Only set connected true if daemon is not onchaind.
2. Only show netaddr if connected; don't make it an array, call it `address`
in comparison with `addresses` in listnodes.
3. Rename `channel` to `short_channel_id`
4. Add `funding_txid` field for voyeurism.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Individual tests can always re-enable them, though.
[ More test fallout fixes by Christian Decker ]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Seems to avoid the nasty python resource warnings, as well as the
fatal 'ValueError: PyMemoryView_FromBuffer(): info->buf must not be NULL'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This, of course, should never be used. But it helps maintain connections
for the moment while we dig deeper into feerates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For now this just tests that we are sending out keepalive
channel_updates for all local channels.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since most callers use positional arguments, we should allow a 'null'
literal where we require no value at all.
Also adds some more value tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paid invoices need to know how much was actually paid: both for the case
where no 'msatoshi' amount was specified, and for the normal case, where
clients are permitted to overpay in order to help them disguise their
payments.
While we migrate the db, we leave this field as 0 for old paid
invoices. This is unhelpful for accounting, but at least clearly
indicates what happened if we find this in the wild.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
'rhash' is the old terminology, but 'payment_preimage' and
'payment_hash' were decided on for the BOLTs, so we should fix that here.
We still use rhash internally, but that's much easier to fix.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> assert [c['active'] for c in l2.rpc.getchannels()['channels']] == [True, True]
E AssertionError: assert [False, True] == [True, True]
E At index 0 diff: False != True
E Full diff:
E - [False, True]
E + [True, True]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to make sure all the updates are known to gossip. Since
one is the local update, we change that message to look the same.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Receiving them in channeld is not enough to avoid the race:
route = l1.rpc.getroute(l3.info['id'], 4999999, 1)["route"]
...
ValueError: RPC call failed: Could not find a route
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wait the the receipt of the CHANNEL_UPDATE message by channeld,
but that doesn't mean it reached gossipd yet, causing spurious test
failure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CI always runs with TEST_DEBUG=1 which prints logs anyway, and testing
locally should also be done this way, combined with pytest which
captures the logs. No need to duplicate the functionality of pytest.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we seem to have some isolation concerns when re-generating the
same HSM secret and re-parsing the blockchain some blocks in the past.
This also alleviates the problem of printing to a logging stream that
has been closed. Previously bitcoind would keep running despite a test
had failed and continue logging to the, now closed, StringIO that
py.test uses when capturing stdout.
The performance impact seems to be 1-3 second per test, not too bad
IMHO for increased test isolation and cleaner logs:
|--------------------+---------------+----------|
| | No_valgrind | Valgrind |
|--------------------+---------------+----------|
| bitcoind per suite | 10 min 24 sec | 46:15.31 |
| bitcoind per test | 11 min 38 sec | 49:21.64 |
|--------------------+---------------+----------|
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I was examining a test_onchain_timeout failure, and realized that we
were forgetting a peer even though we'd just spent the HTLC_TIMEOUT_TX!
This reveals that we weren't resolving an output when we stole the preimage
from it, like we should.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>