Moved the flagging for allowed failures into the factory getter, and
renamed it to `may_fail`. Also stopped the teardown of a node from
throwing an exception if we are allowed to exit non-cleanly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We used to simply kill the daemon, which in some cases could result in
half-written crashlogs and similar artifacts such as half-completed
RPC calls. Now we ask lightningd to stop nicely, give it some time and
only then kill it. We also return the returncode of the daemon.
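
A minimal sketch of the stop-then-kill sequence described above, not the
actual test framework code. The names `stop_daemon`, `request_stop` and
`timeout` are hypothetical; `request_stop` stands in for whatever asks
lightningd to shut down (for example an RPC call).

```python
import subprocess
import sys


def stop_daemon(proc, request_stop, timeout=10.0):
    """Ask the daemon to stop, give it `timeout` seconds, then kill it.

    `proc` is a subprocess.Popen; `request_stop` is a hypothetical
    callable that asks the daemon to shut down nicely.
    Returns the daemon's return code.
    """
    try:
        request_stop()
    except Exception:
        # The request may fail if the daemon is already gone or hung;
        # fall through to the timeout and kill below.
        pass
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The daemon did not exit in time, so kill it and reap it.
        proc.kill()
        return proc.wait()
```

A caller could then compare the returned code against zero, unless the
node was created with `may_fail`.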
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Note that it should really be a flag to daemon on construction, too,
but that may interfere with another concurrent branch so I've deferred.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the next test, we wait for multiple 'sendrawtx exit 0' which
doesn't work because we use a set not a list, and the current code
would match multiple patterns against the same log line. As a
consequence we didn't wait for the final sendrawtransaction, and
occasionally had test failures.
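
To illustrate the set-versus-list problem, here is a hedged sketch (the
helper name `match_all` is hypothetical, not the framework's API). Each
log line may satisfy at most one pending pattern, so waiting for the same
string twice needs two distinct lines.

```python
import re


def match_all(log_lines, patterns):
    """Return True if every pattern is matched by its own log line."""
    pending = list(patterns)  # a list keeps duplicates; a set would not
    for line in log_lines:
        for i, pat in enumerate(pending):
            if re.search(pat, line):
                # This line is used up by one pattern only.
                del pending[i]
                break
        if not pending:
            return True
    return not pending
```

The matching is greedy, which is enough for repeated identical patterns
but is not a general assignment of overlapping patterns to lines.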
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we see an offered HTLC onchain, we need to use the preimage if we
know it. So we dump all the known HTLC preimages at startup, and send
new ones as we discover them.
This doesn't cover preimages we know because we're the final
recipient; that can happen if an HTLC hasn't been irrevocably
committed yet. We'll do that in a followup patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We simply kill lightningd; we should stop it properly and have a timeout
to kill it if that fails. However, that's beyond my python skills :(
So we just look for crash.log. Unfortunately, we usually kill
lightningd before it's finished writing it. So we look for it and
don't kill lightningd, just wait in this case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To reproduce the next bug, I had to ensure that one node keeps thinking it's
connected, then the other node reconnects, then the first node realizes
it's disconnected.
This code does that, adding a '0' dev-disconnect modifier. That means
we fork off a process which (due to pipebuf) will accept a little
data, but when the dev_disconnect file is truncated (a hacky, but
effective, signalling mechanism) will exit, as if the socket finally
realized it's not connected any more.
The python tests hang waiting for the daemon to terminate if you leave
the blackhole around; to give a clue as to what's happening in this
case I moved the log dump to before killing the daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I was hoping to defer HTLC updates until we actually store HTLCs, but
we need to flush to DB whenever balances update as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I was hoping to trigger on more things from the bitcoind process, but
stuff like mempool is hard to trigger on. Reducing the log level to info
makes it easier to work with pdb and keeps the log less noisy.
I have a test which waits for multiple occurrences of the same string,
but doesn't want them to overlap. Make wait_for_log() do the right thing,
so that it only looks for log entries since the last wait_for_log.
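
A sketch of the "only look since the last wait" idea, assuming the logs
are kept as a list of lines. The class `LogWatcher` and the method
`find_since_last` are hypothetical names; a real `wait_for_log()` would
also poll until a timeout instead of returning immediately.

```python
import re


class LogWatcher:
    def __init__(self):
        self.logs = []          # log lines collected so far
        self.search_start = 0   # index of the first line not yet consumed

    def find_since_last(self, pattern):
        """Return the first unconsumed line matching `pattern`, or None."""
        for pos in range(self.search_start, len(self.logs)):
            if re.search(pattern, self.logs[pos]):
                # Later searches start after this match, so two waits
                # for the same string never hit the same line.
                self.search_start = pos + 1
                return self.logs[pos]
        return None
```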
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
On my laptop under load, 5 seconds was no longer enough for legacy.
But this breaks async (they all see mempool increase, and fire
prematurely), so stop doing that.
I can't get this test to work at all, in fact, without this patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We explicitly assert what state we're coming from, to make transitions
clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explicitly step through getting the HSM signature on the funding tx
before starting channeld, rather than doing it in parallel: makes
states clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I couldn't actually figure out how to just dump them on error, so I
dump all the time. When running 3 lightningd + bitcoind, this separates
the logs nicely.
TODO: We should delete the directories on success!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This moves all the non-legacy blackbox testing into python.
Before:
real 10m18.385s
After:
real 9m54.877s
Note that this doesn't valgrind the subdaemons: that patch seems to cause
some issues in the python framework which I am still chasing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Running long integration tests could result in `bitcoind` dropping the
connection in between calls, and since python-bitcoinlib does not
reconnect and/or retry, all subsequent tests would fail as well. This
patch switches to throwaway connections, each serving just one
request. That is easier than reaching into bitcoinlib to reconnect
and reauth. It comes with some overhead, but I think that's acceptable
for the few bitcoin calls we actually perform.
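
The throwaway-connection approach could look roughly like this. It is a
sketch, not the patch itself: `ThrowawayProxy` and `connect` are
hypothetical names, and `connect` is assumed to be any zero-argument
callable that returns a fresh RPC connection object (for example one that
constructs a python-bitcoinlib proxy).

```python
class ThrowawayProxy:
    """Forward each RPC call over a freshly created connection."""

    def __init__(self, connect):
        # `connect` is a hypothetical factory returning a new connection.
        self._connect = connect

    def __getattr__(self, name):
        if name.startswith('_'):
            raise AttributeError(name)

        def call(*args):
            # A new connection per request: a connection dropped by
            # bitcoind can never poison later calls.
            conn = self._connect()
            return getattr(conn, name)(*args)

        return call
```

With this wrapper, `rpc.getblockcount()` opens a connection, makes the
one call and lets the connection go out of scope afterwards.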
By looking for 'Done loading' in the log output we should actually be
called after `SetRPCWarmupFinished` in bitcoind. Only then is it safe
to make RPC calls. Calling earlier made the test suite a bit flaky.