Now that we have HTLC persistence, we'd also like to test it. This
kills the second node in the middle of an HTLC; it should then recover
and finish the flow.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This broke somewhere in the recent changes, because we override
TailableProc stop(). Break out the log extractor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Moved the flagging for allowed failures into the factory getter, and
renamed it to `may_fail`. Also stopped the teardown of a node from
throwing an exception if it is allowed to exit non-cleanly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
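
The `may_fail` handling described in the message above can be sketched roughly as follows. This is a minimal illustration, not the actual harness code: `Node`, `NodeFactory`, `exit_code` and `teardown` are hypothetical names, and only the `may_fail` flag comes from the commit message.

```python
class Node:
    """Hypothetical test node; `exit_code` simulates the daemon's exit status."""

    def __init__(self, exit_code=0, may_fail=False):
        self.exit_code = exit_code
        self.may_fail = may_fail

    def teardown(self):
        # Only complain about an unclean exit if the test did not allow it.
        if self.exit_code != 0 and not self.may_fail:
            raise ValueError(
                "node exited with return code {}".format(self.exit_code))
        return self.exit_code


class NodeFactory:
    """Hypothetical factory; the flag is set when the node is requested."""

    def __init__(self):
        self.nodes = []

    def get_node(self, may_fail=False, exit_code=0):
        node = Node(exit_code=exit_code, may_fail=may_fail)
        self.nodes.append(node)
        return node
```

A test that expects its node to crash would then request it with `get_node(may_fail=True)`, and teardown stays quiet for that node only.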
A failed returncode check could cause the cleanup of the other
lightningds to be skipped. Now make sure to clean up all of them and
then raise an exception that contains all returncodes.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
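
A minimal sketch of the collect-then-raise pattern from the message above. `FakeNode` and `killall` are hypothetical stand-ins for the test harness, and the exception type and message are assumptions:

```python
class FakeNode:
    """Hypothetical stand-in for a lightningd test node."""

    def __init__(self, returncode):
        self.returncode = returncode
        self.stopped = False

    def stop(self):
        # Pretend to shut the daemon down and report its exit status.
        self.stopped = True
        return self.returncode


def killall(nodes):
    """Stop every node first, then raise once if any exited uncleanly."""
    returncodes = []
    for node in nodes:
        try:
            rc = node.stop()
        except Exception:
            # Treat a failed stop as an unclean exit, but keep going so the
            # remaining nodes are still cleaned up.
            rc = -1
        returncodes.append(rc)

    if any(rc != 0 for rc in returncodes):
        raise RuntimeError(
            "lightningd exited uncleanly, returncodes: {}".format(returncodes))
    return returncodes
```

The point of the design is that the check happens after the loop, so one bad returncode can no longer leave other daemons running.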
We used to simply kill the daemon, which in some cases could result in
half-written crashlogs and similar artifacts such as half-completed
RPC calls. Now we ask lightningd to stop nicely, give it some time and
only then kill it. We also return the returncode of the daemon.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
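
The stop-then-kill sequence from the message above might look roughly like this. It is a sketch under assumptions: `stop_gracefully`, `request_stop` and the timeout value are hypothetical, and `request_stop` stands in for however the daemon is actually asked to stop.

```python
import subprocess
import sys


def stop_gracefully(proc, request_stop, timeout=10):
    """Ask the daemon to stop, wait up to `timeout` seconds, then kill it.

    Returns the daemon's returncode.
    """
    try:
        request_stop()
    except Exception:
        # The stop request itself may fail if the daemon is already wedged.
        pass
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The daemon ignored the request, so fall back to killing it.
        proc.kill()
        proc.wait()
    return proc.returncode
```

For example, with a placeholder child process, `stop_gracefully(child, child.terminate, timeout=5)` returns the child's exit status once it is gone.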
peer_fail_permanent() frees peer->owner, but for bad_peer() we're
being called by the sd->badpeercb(), which then goes on to
io_close(conn) which is a child of sd.
We need to detach the two for this case, so neither tries to free the
other.
This leads to a corner case when the subd exits after the peer is gone:
subd->peer is NULL, so we have to handle that too.
Fixes: #282
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that it should really be a flag to daemon on construction, too,
but that may interfere with another concurrent branch so I've deferred.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We re-use the value for reasonable_depth given by the master, and we
tell it when our timeout transactions reach that depth.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the next test we wait for multiple 'sendrawtx exit 0' entries, which
doesn't work because we use a set rather than a list, and the current
code would match multiple patterns against the same log line. As a
result we didn't wait for the final sendrawtransaction, and
occasionally had test failures.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
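
To illustrate the set-versus-list problem from the message above, here is a small sketch. `find_all` is a hypothetical helper, not the harness function; it requires each listed pattern to match its own, distinct log line.

```python
import re


def find_all(patterns, log_lines):
    """Return True once every pattern has matched its own, distinct log line.

    Passing `patterns` as a list lets the same regex appear more than once
    to require that many separate occurrences.
    """
    pending = list(patterns)
    for line in log_lines:
        for pattern in pending:
            if re.search(pattern, line):
                # Consume one pending pattern per matching line.
                pending.remove(pattern)
                break
        if not pending:
            return True
    return not pending
```

With a list of two identical patterns, a single matching log line is not enough. Wrapping the same patterns in `set()` collapses them into one, which is how the final wait was being skipped.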
When we send out an HTLC-Timeout or HTLC-Success tx, we need to spend
it after the timeout so it's safely in our wallet.
We generalize the tx_type OUR_UNILATERAL_TO_US_RETURN_TO_WALLET to
OUR_DELAYED_RETURN_TO_WALLET, since we use it for HTLC transactions too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we see an offered HTLC onchain, we need to use the preimage if we
know it. So we dump all the known HTLC preimages at startup, and send
new ones as we discover them.
This doesn't cover preimages we know because we're the final
recipient; that can happen if an HTLC hasn't been irrevocably
committed yet. We'll do that in a followup patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lightningd can crash on shutdown if it's in the middle of getchaintips;
we free the conn, the finished callback is called (process_chaintips),
and it reports that it received an empty result.
The simplest fix is to set a flag in the struct bitcoind destructor,
and avoid the callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either when it exits with a signal, or sends an error status message.
Then we make test_lightningd.py use it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We simply kill lightningd; we should stop it properly and have a timeout
to kill it if that fails. However, that's beyond my python skills :(
So we just look for crash.log. Unfortunately, we usually kill
lightningd before it's finished writing it. So we look for it and
don't kill lightningd, just wait in this case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the step where we broadcast the transaction to the network and
a nice place to extract the change from the transaction.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
For the permfail tests the sendpay call is supposed to fail, so this
was printing stacktraces upon success. Running in futures captures any
thrown exceptions and rethrows them when calling `result()`; in our
case we just ignore them.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
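
The behaviour of futures relied on in the message above can be shown with the standard library's `concurrent.futures`. `failing_sendpay` is a hypothetical stand-in for the real call:

```python
from concurrent.futures import ThreadPoolExecutor


def failing_sendpay():
    """Hypothetical stand-in for a sendpay call that is expected to fail."""
    raise ValueError("payment failed")


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(failing_sendpay)

# The worker thread did not print a traceback; the exception is stored on
# the future and only re-raised if someone asks for the result.
stored = future.exception()
try:
    future.result()
    reraised = False
except ValueError:
    reraised = True
```

A test that expects the failure simply never calls `result()`, so nothing is printed or raised.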
So far we were always using the deadline in the announcements. That's
obviously not good, so this introduces the parameter as per the spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't killing it. Eventually it would die, and peer_owner_finished()
would access subd->peer->owner, but that peer was freed already.
Closes: #261
Reported-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To reproduce the next bug, I had to ensure that one node keeps thinking it's
disconnected, then the other node reconnects, then the first node realizes
it's disconnected.
This code does that, adding a '0' dev-disconnect modifier. That means
we fork off a process which (due to pipebuf) will accept a little
data, but when the dev_disconnect file is truncated (a hacky, but
effective, signalling mechanism) will exit, as if the socket finally
realized it's not connected any more.
The python tests hang waiting for the daemon to terminate if you leave
the blackhole around; to give a clue as to what's happening in this
case I moved the log dump to before killing the daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
jl777 reported a crash when we try to pay past reserve. Fix that (and
a whole class of related bugs) and add tests.
In test_lightningd.py I had to make the non-async path for sendpay()
non-threaded to get the exception passed through for testing.
Closes: #236
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I was hoping to defer HTLC updates until we actually store HTLCs, but
we need to flush to DB whenever balances update as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the big one, and it's completely anticlimactic: it loads all
channels that have reached opening and are not marked as
closingd_complete into memory. That's it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I was hoping to trigger on more things from the bitcoind process, but
stuff like mempool is hard to trigger on. Reducing the log level to
info makes it a bit easier to work with pdb and makes the log less noisy.
We'll need this for testing nodes going down during payment.
However, there's no good way to silence the threads that I can tell,
so we get a nasty backtrace from it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tracked down a bug, and couldn't figure out why valgrind wasn't
finding it.
From man valgrind:
--log-file=<filename>
Specifies that Valgrind should send all of its messages to the
specified file. If the file name is empty, it causes an abort.
There are three special format specifiers that can be used in the
file name.
%p is replaced with the current process ID. This is very useful for
program that invoke multiple processes. WARNING: If you use
--trace-children=yes and your program invokes multiple processes OR
your program forks without calling exec afterwards, and you don't
use this specifier (or the %q specifier below), the Valgrind output
from all those processes will go into one file, possibly jumbled
up, and possibly incomplete.
"possibly incomplete" indeed!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
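
As a sketch of how the `%p` specifier from the quoted man page might be used when launching a daemon under valgrind: `valgrind_cmd`, the log prefix and the daemon arguments below are hypothetical, and only `-q`, `--trace-children=yes` and `--log-file` are real valgrind options.

```python
def valgrind_cmd(daemon_argv, log_prefix="valgrind-errors"):
    """Build a valgrind command line that writes one log file per process."""
    return [
        "valgrind",
        "-q",
        "--trace-children=yes",
        # %p expands to the PID, so each child gets its own log file.
        "--log-file={}.%p".format(log_prefix),
    ] + list(daemon_argv)
```

Each subdaemon then writes to its own `valgrind-errors.<pid>` file instead of sharing one possibly incomplete log.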
We weren't registering reconnecting peers for broadcasts. Just
starting a timer is enough. Also added an integration test to check
that the gossip sync is being resumed.