typesafe_cb isn't suitable here, as it is simply a conditional cast,
and the result is passed through '...' and doesn't matter.
Reported-by: @wythe
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_listpeers returns an array of peers, and an array of nodes: the latter
is a subset of the former, and is used for printing alias/color information.
This changes it so there is a 1:1 correspondance between the peer information
and nodes, meaning no more O(n^2) search.
If there is no node_announce for a peer, we use a negative timestamp
(already used to indicate that the rest of the gossip_getnodes_entry
is not valid).
Other fixes:
1. Use get_node instead of iterating through the node map.
2. A node without addresses is perfectly valid: we have to use the timestamp
to see if the alias/color are set. Previously we wouldn't print that
if it didn't also advertize an address.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There doesn't seeem to be a need for this anymore (unless I'm missing something).
I added the sendpay_nulltok() unit test to confirm.
Signed-off-by: Mark Beckwith <wythe@intrig.com>
@rustyrussell showed we don't need temporary objects for params.
This means params no longer need a tal context.
Signed-off-by: Mark Beckwith <wythe@intrig.com>
@wythe points out we don't need to keep the around now param_is_set()
is removed. We can in fact go further and avoid marshalling them into
temporary objects at the caller altogether.
This means internally we have an array of struct param, rather than an
array of 'struct param *', which causes most of the noise in this
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're using a macro anyway, so appending "" make it a compile-time check.
Complicates testing a bit, since we actually use generated names there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit more natural, IMHO. The only issue is that json_tok_tok is
special, so we end up with param_opt_tok() if you really want an optional
generic token.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is part of #1464 and incorporates Rusty's suggested updates from #1569.
See comment in param.h for description, here's the basics:
unsigned cltv;
const jsmntok_t *note;
u64 msatoshi;
struct param * mp;
if (!param_parse(cmd, buffer, tokens,
param_req("cltv", json_tok_number, &cltv),
param_opt("note", json_tok_tok, ¬e),
mp = param_opt("msatoshi", json_tok_u64, &msatoshi),
NULL))
return;
if (param_is_set(mp))
do_something()
There is a lot of developer mode code to make sure we don't make mistakes,
like trying to unmarshal into the same variable twice or adding a required param
after optional.
During testing, I found a bug (of sorts) in the current system. It allows you
to provide two named parameters with the same name without error; e.g.:
# cli/lightning-cli -k newaddr addresstype=p2sh-segwit addresstype=bech32
{
"address": "2N3r6fT65PhfhE1mcMS6TtcdaEurud6M7pA"
}
It just takes the first and ignores the second. The new system reports this as an
error for now. We can always change this later.
structeq() is too dangerous: if a structure has padding, it can fail
silently.
The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).
Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.
Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`getinfo` has been providing the blockheight for a good while and doesn't
require the `DEVELOPER=1` flag during compilation, so it should be the preferred
method to retrieve the blockchain height.
Implements an EWMA for the fee estimation. Achieves 90% influence of the newer
fee after 5 minutes, and adjusts to the polling rate that is configured.
Gossipd will ignore the second one, but doing it in the front end
gives an explicit error message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New codes: FUND_MAX_EXCEEDED, FUND_CANNOT_AFFORD, FUND_DUST_LIMIT_UNMET.
The error message "Cannot afford fee" was not exactly correct because
it would also occur if the amount requested could not be afforded. So
I changed it to the more generic "Cannot afford transaction".
Other things:
* Fixed off-by-one satoshi in fundchannel manpage.
* Changed 'arror' to 'error' because we are not pirates.
Turn req_running into a pointer to the current bcli structure, which means
the leak detection can find it.
Also suppress leaks in the case where we're only attached to a timer
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During a meeting earlier this week we agreed with Eclair to temporarily
increase the final CLTV delta in our invoices to establish
compatibility with the already deployed Eclair wallets. They in turn
agreed to remove the enforcement of higher final CLTV deltas, or bump
it locally should it not match their expectations as allowed by
BOLT 11. This has since been implemented in ACINQ/eclair#627.
satoshis.place was slowing to a crawl, c-lightning was unresponsive.
Logs revealed charged doing many, many listinvoice <label> RPCs.
We were iterating the entire db every time: stop that!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The fee range can sometimes cause channels to be closed when the estimator
jumps. This has been the case a few times in the last months, and causes a
number of channels to be closed, and issue reports to be filed.
Increasing this from 5x to 10x should get rid of 84%+ of these
closures (measured based on 1h windows over the last 6 months and assuming
worst case situations).
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I still believe that 2 weeks is way too much, but we were promised that these
defaults would be slowly reduced to saner values as the stability increases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Compares the `blocknum` in the `short_channel_id` with the range of blocks we
store in the database and abort if we should have known about it. Avoids
bombarding `bitcoind` with requests for channels that have already been spent or
were invalid in the first place.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're telling gossipd about disconnections anyway, so let's just use that signal
to disable both sides of the channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was failing some of our integration tests, i.e., the ones closing a channel
and not waiting for sigexchange. The remote node would often not be quick enough
to send us its disabling channel_update, and hence we'd still remember the
incoming direction. That could then be sent out as part of an invoice, and fail
subsequently. So just set both directions to be disabled and let the onchain
spend clean up once it happens.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Until now, `command_fail()` reported an error code of -1 for all uses.
This PR adds an `int code` parameter to `command_fail()`, requiring the
caller to explicitly include the error code.
This is part of #1464.
The majority of the calls are used during parameter validation and
their error code is now JSONRPC2_INVALID_PARAMS.
The rest of the calls report an error code of LIGHTNINGD, which I defined to
-1 in `jsonrpc_errors.h`. The intention here is that as we improve our error
reporting, all occurenaces of LIGHTNINGD will go away and we can eventually
remove it.
I also converted calls to `command_fail_detailed()` that took a `NULL` `data`
parameter to use the new `command_fail()`.
The only difference from an end user perspecive is that bad input errors that
used to be -1 will now be -32602 (JSONRPC2_INVALID_PARAMS).
This resolves the problem where both channeld and gossipd can generate
updates, and they can have the same timestamp. gossipd is always able
to generate them, so can ensure timestamp moves forward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we have too many which are never used and I don't want to document
them.
1. Remove unused anchor_onchain_wait. When implemented, it should be
hardcoded to 100 or more.
2. Remove anchor_confirms_max. 10 always reasonable, and we can readd
an override option should someone need it.
3. max_htlc_expiry should be the same as locktime_max (which increases
from 3 to 5 days by default): they're both a limit on how long
funds can be locked up.
4. channel_update_interval should always be a dev option.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make --override-fee-rates a dev option. We use default-fee-rate in
its place, which (since bitcoind won't give fee estimates in regtest
mode for short chains) gives an effective feerate of 15000/7500/3750.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We never hit the guess_feerate() path, because we turned a 0 ("can't
estimate fee") into 253.
This also revealed that we weren't initializing topo->feerate, and
that we were giving spurious updates even if we were using override-fee-rates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Just have a "new depth" callback, and let channeld do the right thing.
This makes the channeld paths a bit more straightforward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tor wasn't actually working for me to connect to anything, but it worked
for 'ssh -D' testing.
Note that the resulting 'netaddr' is a bit weird, but I guess it's honest.
$ ./cli/lightning-cli connect 021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b
{
"id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b"
}
$ ./cli/lightning-cli listpeers
{
"peers": [
{
"state": "GOSSIPING",
"id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b",
"netaddr": [
"ln1qg0je0lugpzu5ttsv78vlrkhteyg9yy8fjw68qr57mfhsfyrxurzkq522ah.lseed.bitcoinstats.com:9735"
],
"connected": true,
"owner": "lightning_gossipd"
}
]
}
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is useful for the next patch, where we want to hand the unresolved
name through to the proxy.
This also addresses @Saibato's worry that we still called getaddrinfo()
(with the AI_NUMERICHOST option) even if we didn't want a lookup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. If we have a channel_announcement, the channel is public, otherwise
it's not. Not all channels are public, as they can be local: those
have a NULL channel_announcement.
2. If we don't have a channel_update, we know nothing about that half
of the channel, and no other fields are valid.
3. We can tell if a half channel is disabled by the flags field directly.
Note that we never send halfchannels without an update over
gossip_getchannels_reply so that marshalling/unmarshalling can be
vastly simplified.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means it will effect connect commands too (though it's too
late to stop DNS lookups caused by commandline options).
We also warn that this is one case where we allow forcing through Tor
without a proxy set: it just means all connections will fail.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This takes the Tor service address in the same option, rather than using
a separate one. Gossipd now digests this like any other type.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For the moment, this is a straight handing of current parameters through
from master to the gossip daemon. Next we'll change that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently it's always for messages to peer: make that status_peer_io and
add a new status_io for other IO.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Risks leakage. We could do lookup via the proxy, but that's a TODO.
There's only one occurance of getaddrinfo (and no gethostbyname), so
we add a flag to the callers.
Note: the use of --always-use-proxy suppresses *all* DNS lookups, even
those from connect commands and the command line.
FIXME: An implicit setting of use_proxy_always is done in gossipd if it
determines that we are announcing nothing but Tor addresses, but that
does *not* suppress 'connect'.
This is fixed in a later patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rename tor_proxyaddrs and tor_serviceaddrs to tor_proxyaddr and tor_serviceaddr:
the 's' at the end suggests that there can be more than one.
Make them NULL or non-NULL, rather than using all-zero if unset.
Hand them the same way to gossipd; it's a bit of a hack since we don't
have optional fields, so we use a counter which is always 0 or 1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's no reason to do this async, and far easier to follow using normal
read/write.
The previous parsing was deeply questionable, using substring searches
only, and relying on the fact that a single non-blocking read would get
the entire response. This is changed to do (somewhat) proper parsing
using ccan/rbuf.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is simply the code to set up the automatic hidden service, so move
it into lightningd.
I removed the undefined parse_tor_wireaddr, and added a parameter name
to the create_tor_hidden_service_conn() declaration for update-mocks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a rebased and combined patch for Tor support. It is extensively
reworked in the following patches, but the basis remains Saibato's work,
so it seemed fairest to begin with this.
Minor changes:
1. Use --announce-addr instead of --tor-external.
2. I also reverted some whitespace and unrelated changes from the patch.
3. Removed unnecessary ';' after } in functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Someone could try to announce an internal address, and we might probe
it.
This breaks tests, so we add '--dev-allow-localhost' for our tests, so
we don't eliminate that one. Of course, now we need to skip some more
tests in non-developer mode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're given a wildcard address, we can't announce it like that: we need
to try to turn it into a real address (using guess_address). Then we
use that address. As a side-effect of this cleanup, we only announce
*any* '--addr' if it's routable.
This fix means that our tests have to force '--announce-addr' because
otherwise localhost isn't routable.
This means that gossipd really controls the addresses now, and breaks
them into two arrays: what we bind to, and what we announce. That is
now what we return to the master for json_getinfo(), which prints them
as 'bindings' and 'addresses' respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6'
which means ':1234' can be used to set only the portnum.
2. Only add this protocol wildcard if --autolisten=1 (default)
and no other addresses specified.
3. Pass it down to gossipd, so it can handle errors correctly: in most cases,
it's fatal not to be able to bind to a port, but for this case, it's OK
if we can only bind to one of IPv4/v6 (fatal iff neither).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This replacement is a little menial, but it explicitly catches all
the places where we allow a local socket. The actual implementation of
opening a AF_UNIX socket is almost hidden in the patch.
The detection of "valid address" is now more complex:
p->addr.itype != ADDR_INTERNAL_WIREADDR || p->addr.u.wireaddr.type != ADDR_TYPE_PADDING
But most places we do this, we should audit: I'm pretty sure we can't
get an invalid address any more from gossipd (they may be in db, but
we should fix that too).
Closes: #1323
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It does all the other address handling, do this too. It also proves useful
as we clean up wildcard address handling.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we only bind to addresses in our wireaddrs array, we would not
autobind to local sockets if they couldn't reach google's nameserver.
That's clearly wrong: we should only not bind if there's a protocol
issue (eg. no IPv6 support).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's become clear that our network options are insufficient, with the coming
addition of Tor and unix domain support.
Currently:
1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline,
or any address is specified explicitly. If they're routable, we announce.
2. --addr is used to announce, but not to control binding.
After this change:
1. --port is deprecated.
2. --addr controls what we bind to and announce.
3. --bind-addr/--announce-addr can be used to control one and not the other.
4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable).
5. --offline still overrides listening (though announcing is still the same).
This means we can bind to as many ports/interfaces as we want, and for
special effects we can announce different things (eg. we're sitting
behind a port forward or a proxy).
What remains to implement is semi-automatic binding: we should be able
to say '--addr=0.0.0.0:9999' and have the address resolve at bind
time, or even '--addr=0.0.0.0:0' and have the port autoresolve too
(you could determine what it was from 'lightning-cli getinfo'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We set no_reconnect with --offline, but that doesn't work if !DEVELOPER.
Make the flag positive, and non-DEVELOPER mode for gossipd.
We also don't override portnum with --offline, but have an explicit
'listen' flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can create the hsm file from python directly; that works even if we
don't have DEVELOPER set, and is simpler.
We add a test that the aliases are correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Originally we were supposed to tell the HSM we had just created the directory,
otherwise it wouldn't create a new seed. But we modified it to check if
there was a seed file anyway: just move that logic into a branch of hsmd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No new functionality, just a continuation of my work toward completing #665.
I removed the common members of `struct withdrawal` and `struct fund_channel`
and placed them in a new `struct wallet_tx`. Then it was fairly straightforward
to reimplement the existing code in terms of `wallet_tx`.
Since I made some structural changes I wanted to get this approved before I
go any farther.
Added 'all' to fundchannel help message.
Fixes: #1445
Hacky fix, possibly. First cut at avoiding starting up onchaind and gossipd (which might make queries of chaintopology, which might start up a bitcoin-cli) before we can daemonize.
We're getting spurious closures, even on mainnet. Using --ignore-fee-limits
is dangerous; it's slightly less so to lower the minimum (which is the
usual cause of problems).
So let's halve it, but beware the floor.
This is a workaround, until we get independent feerates in the spec.
Fixes: #613
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means gossipd is live and we can tell it things, but it won't
receive incoming connections. The split also means that the main daemon
continues (eg. loading peers from db) while gossipd is loading from the store,
potentially speeding startup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we start accepting peer connections before we initialized some of the other
parts (mainly the chaintopology) we could end up asking for stuff that isn't
ready yet (blockchain head for example).
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If channeld dies for some reason (eg, reconnect) and we didn't yet announce
the channel, we can miss doing so. This is unusual, because if lightningd
restarts it rearms the callback which gives us funding_locked, so it only
happens if just channel dies before sending the announcement message.
This problem applies to both temporary announcement (for gossipd) and
the real one. For the temporary one, simply re-send on startup, and
remote the error msg gossipd gives if it sees a second one. For the
real one, we need a flag to tell us the depth is sufficient; the peer
will ignore re-sends anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we get a reconnection, kill the current remote peer, and wait for the
master to tell us it's dead. Then we hand it the new peer.
Previously, we would end up with gossipd holding multiple peers, and
the logging was really hard to interpret; I'm not completely convinced
that we did the right thing when one terminated, either.
Note that this now means we can have peers with neither ->local nor ->remote
populated, so we check that more carefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently we intuit it from the fd being closed, but that may happen out
of order with when the master thinks it's dead.
So now if the gossip fd closes we just ignore it, and we'll get a
notification from the master when the peer is disconnected.
The notification is slightly ugly in that we have to disable it for
a channel when we manually hand the channel back to gossipd.
Note: as stands, this is racy with reconnects. See the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).
In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).
This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.
This turns out to be quite simple, and is probably how I should have
done it originally :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Lifetime of 'struct reaching' now only while we're actively doing connect.
2. Always free after a single attempt: if it's an important peer, retry
on a timer.
3. Have a single response message to master, rather than relying on
peer_connected on success and other msgs on failure.
4. If we are actively connecting and we get another command for the same
id, just increment the counter
The result is much simpler in the master daemon, and much nicer for
reconnection: if they say to connect they get an immediate response,
rather than waiting for 10 retries. Even if it's an important peer,
it fires off another reconnect attempt, unless it's actively
connecting now.
This removes exponential backoff: that's restored in next patch. It
also doesn't handle multiple addresses for a single peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And on channel_fail_permanent and closing (the two places we drop to
chain), we tell gossipd it's no longer important.
Fixes: #1316
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These don't have a maximum number of reconnect attempts, and ensure
that we try to reconnect when the peer dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Simplification of the offset calculation to use the rescan parameter, and rename
of `wallet_first_blocknum`. We now use either relative rescan from our last
known location, or absolute if a negative rescan was given. It's all handled in
a single location (except the case in which the blockcount is below our
precomputed offset), so this should reduce surprises.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is intended to recover from an inconsistent state, involving
`onchaind`. Should we for some reason not restore the `onchaind` process
correctly we can instruct `lightningd` to go back in time and just replay
everything.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we reference the channel ID to allow cascades in the database we also need
the ability to look up a channel by its database ID.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This will allow us in the next commit to store the transactions that triggered
this event in the DB and thus allowing us to replay them later on.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We used to queue the preimages to be sent to onchaind only after receiving the
onchaind_init_reply. Once we start replaying we might end up in a situation in
which we queue the tx that onchaind should react to before providing it with the
preimages. This commit just moves the preimages being sent, making it atomic
with the init, and without changing the order.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
These were so far only used for bolt11 construction, but we'll need them for the
DNS seed as well, so here we just pull them out into their own unit and prefix
them.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
d822ba1ee accidentally removed this case, which is important: if the
other side didn't get our final matching closing_signed, it will
reconnect and try again. We consider the channel no longer "active"
and thus ignore it, and get upset when it send the
`channel_reestablish` message.
We could just consider CLOSINGD_COMPLETE to be active, but then we'd
have to wait for the closing transaction to be mined before we'd allow
another connection.
We can't special case it when the peer reconnects, because there
could be (in theory) multiple channels for that peer in CLOSINGD_COMPLETE,
and we don't know which one to reestablish.
So, we need to catch this when they send the reestablish, and hand
that msg to closingd to do negotiation again. We already have code
to note that we're in CLOSINGD_COMPLETE and thus ignore any result
it gives us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to remove automatic retrying of connect, and that uncovered
that we actually print out our "Server started" message before we create
the listening socket.
Move the init higher (outside the db transaction) and make it a
request/response, the loop until it's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The new connect code revealed an existing race: we tell gossipd to
release the peer, but at the same time it connects in. gossipd fails
the release because the peer is remote, and json_fundchannel fails.
Instead, we catch this race when we get peer_connected() and we were
trying to open a channel. It means keeping a list of fundchannels which
are awaiting a gossipd response though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We missed it in some corner cases where we crashed/were killed between
being told of the lockin and sending the channel_normal_operation message.
When we were restarted, we were told both sides were locked in already,
so we never updated the state.
Pull the entire "tell channeld" logic into channel_control.c, and make
it clear that we need to keep waching if we cant't tell channeld. I think
we did get this correct in practice, since funding_announce_cb has the
same test, but it's better to be clear.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We'd usually commit to the db soon, but there's a window where it
could be missed.
Also moves loc into the block it's used and make it tmpctx to avoid
an explicit free.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Without this, we can get errors on shutdown:
Valgrind error file: valgrind-errors.27444
==27444== Invalid read of size 8
==27444== at 0x1950E2: secp256k1_pubkey_load (secp256k1.c:127)
==27444== by 0x19CF87: secp256k1_ec_pubkey_serialize (secp256k1.c:189)
==27444== by 0x14FED9: towire_pubkey (towire.c:59)
==27444== by 0x15AAFB: towire_gossipctl_peer_disconnected (gen_gossip_wire.c:969)
==27444== by 0x1253EF: opening_channel_errmsg (opening_control.c:526)
==27444== by 0x1386A3: destroy_subd (subd.c:589)
==27444== by 0x18222C: notify (tal.c:240)
==27444== by 0x1826E1: del_tree (tal.c:400)
==27444== by 0x182733: del_tree (tal.c:410)
==27444== by 0x182733: del_tree (tal.c:410)
==27444== by 0x182B1F: tal_free (tal.c:511)
==27444== by 0x11FC53: main (lightningd.c:410)
==27444== Address 0x6c3af98 is 72 bytes inside a block of size 216 free'd
==27444== at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27444== by 0x1827BC: del_tree (tal.c:421)
==27444== by 0x182B1F: tal_free (tal.c:511)
==27444== by 0x11F3C7: shutdown_subdaemons (lightningd.c:211)
==27444== by 0x11FC27: main (lightningd.c:406)
==27444== Block was alloc'd at
==27444== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27444== by 0x182296: allocate (tal.c:250)
==27444== by 0x182863: tal_alloc_ (tal.c:448)
==27444== by 0x12F2DF: new_peer (peer_control.c:74)
==27444== by 0x125600: new_uncommitted_channel (opening_control.c:576)
==27444== by 0x125870: peer_accept_channel (opening_control.c:668)
==27444== by 0x13032A: peer_sent_nongossip (peer_control.c:427)
==27444== by 0x116B9E: peer_nongossip (gossip_control.c:60)
==27444== by 0x116F2B: gossip_msg (gossip_control.c:172)
==27444== by 0x138323: sd_msg_read (subd.c:503)
==27444== by 0x137C02: read_fds (subd.c:330)
==27444== by 0x175550: next_plan (io.c:59)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also report tx and txid, and whether we closed unilaterally or
bilaterally, if we could close the channel.
Also make a manpage.
Fixes: #1207Fixes: #714Fixes: #622
We had an intermittant test failure, where the fee we negotiated was
further from our ideal than the final commitment transaction. It worked
fine if the other side sent the mutual close first, but not if we sent
our unilateral close first.
ERROR: test_closing_different_fees (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_lightningd.py", line 1319, in test_closing_different_fees
wait_for(lambda: p.rpc.listpeers(l1.info['id'])['peers'][0]['channels'][0]['status'][1] == 'ONCHAIN:Tracking mutual close transaction')
File "tests/test_lightningd.py", line 74, in wait_for
raise ValueError("Error waiting for {}", success)
ValueError: ('Error waiting for {}', <function LightningDTests.test_closing_different_fees.<locals>.<lambda> at 0x7f4b43e31a60>)
Really, if we're prepared to negotiate it, we should be prepared to
accept it ourselves. Simply take the cheapest tx which is above our
minimum.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The only use for these was to compute their txids so we could notify depth
in case of reorgs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We are slowly hollowing out the in-memory blockchain representation to make
restarts easier.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
All of the callback functions were only using the tx to generate the txid again,
so we just pass that in directly and save passing the tx itself.
This is a simplification to move to the DB backed depth callbacks. It'd be
rather wasteful to read the rawtx and deserialize just to serialize right away
again to find the txid, when we already searched the DB for exactly that txid.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This will later allow us to determine the transaction confirmation count, and
recover transactions for rebroadcasts.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Internally both payment and routing use 64-bit, but the interface
between them used 32-bit.
Since both components already support 64-bit we should use that.
In the short_channel_id check we were copying the entire result into the next
bitcoin-cli call, including the newline character.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-By: @gdassori
Creating the pid-file before daemonizing results in the pid-file containing the
pid of the process that started the daemon, but is now dead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-By: Torkel Rogstad @torkelrogstad
We can have more than one; eg we might offer both bech32 and a p2sh
address, and in future we might offer v1 segwit, etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are two very hard problems in software engineering:
1. Off-by-one errors
In this case we were rolling back further than needed and we were starting the
catchup one block further than expected.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I saw a failure in test_funding_fail():
assert l2.rpc.listpeers()['peers'][0]['connected']
This can happen if l2 hasn't yet handed back to gossipd. Turns out
we didn't mark uncommitted channels as connected:
[{'id': '03afa3c78bb39217feb8aac308852e6383d59409839c2b91955b2d992421f4a41e', 'connected': False, 'channels': [{'state': 'OPENINGD', 'owner': 'lightning_openingd', 'funder': 'REMOTE', 'status': ['Incoming channel: accepted, now waiting for them to create funding tx']}]}]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is probably covered by our "channel capacity" heuristic which
requires the channel be significant, but best to be explicit and sure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
So we know how much counterparty could theoretically steal from us
(msatoshi_to_us - msatoshi_to_us_min) and how much we could
theoretically steal from counterparty (msatoshi_to_us_max -
msatoshi_to_us).
For more piloting goodness.
In particular, the main daemon and subdaemons share the backtrace code,
with hooks for logging.
The daemon hook inserts the io_poll override, which means we no longer
need io_debug.[ch]. Though most daemons don't need it, they still link
against ccan/io, so it's harmess (suggested by @ZmnSCPxj).
This was tested manually to make sure we get backtraces still.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I didn't convert all tests: they can still use a standalone context.
It's just marginally more efficient to share the libwally one for all
our daemons which link against it anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we only remember the actions that added channels then we'd restore them when
re-reading the gossip_store, so put a tombstone in there to remember to delete
it. These will be cleared upon re-writing the store since the announcements wont
be written anymore.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is necessary since we have onchaind tell us about the
their_unilateral/to_us output, after it is already in a block.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't handle \u, since we assume everyone sane is using UTF-8. We'd
still have to reject '\u0000' and maybe other weird cases if we did.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we may want to extend the on-disk format by adding custom information we
may as well just go the extra mile and reuse the serialization primitives we
already have.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
But only if we're actually going to change the feerate, otherwise we'd
log every time.
Suggested-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Naively, this would be 250 satoshi per sipa, but it's not since bitcoind's
fee calculation was not rewritten to deal with weight, but instead bolted
on using vbytes.
The resulting calculations made me cry; I dried my tears on the thorns
of BUILD_ASSERT (I know that makes no sense, but bear with me here as I'm
trying not to swear at my bitcoind colleagues right now).
Fixes: #1194
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This bug is a classic case of being lazy:
1. peer_accept_channel() allocated its return off the input message,
rather than taking an explicit allocation context. This concealed the
lifetime nature of the return.
2. The context for sanitize_error was the error itself, rather than the
more obvious tmpctx (connect_failed does not take).
The global tmpctx removes the "efficiency" excuse for grabbing a random
object to use as context, and is also nice and explicit.
All-the-hard-work-by: @ZmnSCPxj
This fixes the root cause of https://github.com/ElementsProject/lightning/issues/1212
where we deleted the payment because we wanted to retry, then retry failed
so we had an (old) HTLC without a matching payment. We then fed that
HTLC to onchaind, which tells us it's missing, and we try to fail the
payment and deref a NULL pointer.
Fixes: #1212
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would `block_map_add` inside `add_tip`, but we never
`block_map_del` inside `remove_tip`, which is dangerous as
we actually `tal_free` the block inside `remove_tip`.
Our CI did not reliably trap this problem since block
hashes are random and rerunning the `test_blockchaintrack`
often passed spuriously.
If we're going to simply take() a pointer, don't allocate it off a random
object. Using NULL makes our intent clear, particularly with allocating
packets we're going to take() onto a queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I did a brief audit of tmpctx uses, and we do leak them in various
corner cases. Fortunely, all our daemons are based on some kind of
I/O loop, so it's fairly easy to clean a global tmpctx at that point.
This makes things a bit neater, and slightly more efficient, but also
clearer: I avoided creating a tmpctx in a few places because I didn't
want to add another allocation. With that penalty removed, I can use
it more freely and hopefully write clearer code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Needed for particular race condition: client calls `sendpay` with
intent to call `waitsendpay` later to get information, but the
payment fails after `sendpay` returns but before client can invoke
`waitsendpay`.
This lets client know of information even if it manages to invoke
`waitsendpay` "late".
As we add more features, the current code is insufficient.
1. Keep an array of single feature bits, for easy switching on and off.
2. Create feature_offered() which checks for both compulsory and optional
variants.
3. Invert requires_unsupported_features() and unsupported_features()
which tend to be double-negative, all_supported_features() and
features_supported().
4. Move single feature definition from wire/peer_wire.h to common/features.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Improve usability in these scenarios:
* bitcoin-cli not available in PATH and/or bitcoind not running
* bitcoin-cli available in PATH but bitcoind is not running
This simplifies things, and means it's always in the database. Our
previous approach to creating it on the fly had holes when it was
created for onchaind, causing us to use another every time we
restarted.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Our testing also reveals a bug: we start lightningd and shut it down
before fully processing the blockchain, so we don't set
last_processed_block. Fix that by setting it immediately once we have
a block: worst case it goes backwards a little.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In general, it is true that accessors should take const and discard it,
but chainparams is *always* const.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Transaction filters are strongly related to the wallet, this move just
makes it a bit more explicit.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I leave all the now-unnecessary accessors in place to avoid churn, but
the use of bitfields has been more pain than help.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let's have a simple function that allows us to check whether a channel
still has an HTLC open.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The following command can be used to trigger these messages:
```
$ timeout 0.01 cli/lightning-cli connect [insert-syntactically-valid-peer-id-here] 123.123.123.123 # where 123.123.123.123 is unreachable
```
These error codes will cause `pay` to retry, so `pay` will never
actually report those error codes.
Those error codes will only get reported at the `sendpay` level.
* Modifies invoice command to have the following format
invoice <msatoshi> <label> <desc> <?expiry> <?fallbackaddr>
* Adds support for Segwit bcrt1 addresses for withdraw
* Add test case for fallback address in invoice creation
* Create a common json_tok_address_scriptpubkey to be used
by invoice and withdraw commands.
There are two recurring calls: the estimatefee call and the
getblockcount call. Currently we simply discard them on error, the
timer isn't rearmed.
This should fix a number of cases where bitcoind has an intermittant
failure and lightningd simply stops collecting blocks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, process_getblockhash() exits with status 8 when the block
number is out of range, which is expected. Any other exit status should
be treated as a spurious error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The billboard is now far more useful to tell what's going on, and this
gets us closer to a state == owner mapping.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use NULL on the callback to mean "clear the slot", and call it.
We have do this in two places: the old daemon might die, or the new
daemon might start first.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Each state (effectively, each daemon) has two slots: a permanent slot
if something permanent happens (usually, a failure), and a transient
slot which summarizes what's happening right now.
Uncommitted channels only have a transient slot, by their very nature.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In #1018 we got no information, except "Internal error". At least
if we tell the other side what went wrong, we're more likely to get
an answer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the source channel is onchain, we try to send a message to onchaind
which (1) doesn't care, (2) doesn't take a channel_fail_htlc msg, and
(3) causes us to crash in subd.c:
assert(!strstarts(sd->msgname(fromwire_peektype(msg_out)), "INVALID"));
Fixes: #821
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We always hand in "NULL" (which means use tal_len on the msg), except
for two places which do that manually for no good reason.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We also fold opening_got_hsm_funding_sig() into the caller; it was
previously a callback before we decided to always use the HSM
synchronously.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we clear and recreate ltmp, we attach it to whatever logbook it's on.
This, of course, is fraught, since it may be freed.
We could make it NULL-parented, but that makes YA special-case to free
when we exit (we try to keep valgrind happy by freeing everything). So
since the first log_book is the permanent one attached to lightningd,
just keep that parent when we re-build it after use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because peer_failed would previously drop the connection, we had a
special 'negotiation_failed' message which made the master hand it
back to gossipd. We don't need that any more.
This also meant we no longer need a special hook in read_peer_msg
for openingd to send this message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Several daemons (onchaind, hsm) want to use the status messages, but
don't communicate with peers. The coming changes made them drag in
more code they didn't need, so instead we have a different
non-overlapping type.
We combine the status_received_errmsg and status_sent_errmsg
into a single status_peer_error, with the presence or not of the
'error_for_them' field indicating direction.
We also rename status_fatal_connection_lost() to
peer_failed_connection_lost() to fit in.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And now we can finally do the db upgrade to remove any OPENINGD
channels once, since we never put them back.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's giant, but it's encapsulating at least. It is called from the wallet
code when loading channels, or from the opening code when converting
an uncommitted_channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now any struct channel is a genuine channel, the following fields are
always valid:
1. funding_txid: doesn't need to be a pointer.
2. our_msatoshi: doesn't need to be a pointer.
3. last_sig: doesn't need to be a pointer.
4. channel_info: doesn't need to be a pointer.
In addition, 'last_tx' is always valid.
The main effect is to remove a whole heap of branches from the wallet code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Each peer can have one 'uncommitted' channel, which is in the process
of opening. This is used for openingd, and then on return we convert
it into a full-fledged struct channel and commit it into the database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means the caller needs to supply an explicit log to base the
subd log on, and also a callback for error handling.
The callback is kind of ugly, but it gets reworked towards the end
of this series.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once we rely on the logbook outlasting the peer, we can't refer to the
peer from the logbook function:
Valgrind error file: valgrind-errors.26567
==26567== Invalid read of size 8
==26567== at 0x126297: copy_to_parent_log (peer_control.c:690)
==26567== by 0x11C06B: maybe_print (log.c:253)
==26567== by 0x11C145: logv (log.c:270)
==26567== by 0x11C448: log_ (log.c:319)
==26567== by 0x132951: destroy_subd (subd.c:537)
==26567== by 0x179C19: notify (tal.c:240)
==26567== by 0x17A0CE: del_tree (tal.c:400)
==26567== by 0x17A120: del_tree (tal.c:410)
==26567== by 0x17A4ED: tal_free (tal.c:509)
==26567== by 0x16DEB5: io_close (io.c:443)
==26567== by 0x1328BC: sd_msg_read (subd.c:516)
==26567== by 0x1320AC: read_fds (subd.c:328)
==26567== Address 0x6cf9ca0 is 48 bytes inside a block of size 216 free'd
==26567== at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26567== by 0x17A1A9: del_tree (tal.c:421)
==26567== by 0x17A4ED: tal_free (tal.c:509)
==26567== by 0x124B6C: delete_peer (peer_control.c:180)
==26567== by 0x12B369: destroy_uncommitted_channel (peer_control.c:2505)
==26567== by 0x179C19: notify (tal.c:240)
==26567== by 0x17A0CE: del_tree (tal.c:400)
==26567== by 0x17A4ED: tal_free (tal.c:509)
==26567== by 0x12B31E: opening_channel_errmsg (peer_control.c:2496)
==26567== by 0x13243A: handle_peer_error (subd.c:407)
==26567== by 0x1326E4: sd_msg_read (subd.c:472)
==26567== by 0x1320AC: read_fds (subd.c:328)
==26567== Block was alloc'd at
==26567== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26567== by 0x179C83: allocate (tal.c:250)
==26567== by 0x17A250: tal_alloc_ (tal.c:448)
==26567== by 0x124950: new_peer (peer_control.c:151)
==26567== by 0x12B3EC: new_uncommitted_channel (peer_control.c:2521)
==26567== by 0x12B5C5: peer_accept_channel (peer_control.c:2569)
==26567== by 0x126099: peer_sent_nongossip (peer_control.c:641)
==26567== by 0x113B28: peer_nongossip (gossip_control.c:55)
==26567== by 0x113D9D: gossip_msg (gossip_control.c:144)
==26567== by 0x132783: sd_msg_read (subd.c:487)
==26567== by 0x1320AC: read_fds (subd.c:328)
==26567== by 0x16D1FE: next_plan (io.c:59)
==26567==
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
BackgroundL Each log has a log_book: many logs can share the same one,
as each one can have a separate prefix.
Testing tickled a bug at the end of this series, where subd was
logging to the peer's log_book on shutdown, but the peer was already
freed. We've already had issues with logging while lightningd is
shutting down.
There are times when reference counting really is the right answer,
this seems to be one of them: the 'struct log' share the 'struct
log_book' and the last 'struct log' cleans it up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We derive the seed from this, so it needs to be unique, but using
rowid forced us to put the channel into the db early, before it
was ready.
Instead, use a counter to ensure uniqueness, initialized when we load
existing peers. This doesn't need to touch the database at all.
As we now have only two places where the channel is committed (the
funder and fundee paths), so we create a new explicit
'wallet_channel_insert()' function: 'wallet_channel_save()' now just
updates.
Note that this also fixes some weirdness in
wallet_channels_load_active: we strangely avoided loading channels in
CLOSINGD_COMPLETE (which fortunately was a transient state, so
unlikely anyone hit this). Note that since the lines above already
delete all the OPENINGD channels, we now simply load them all.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Adds a simple check that compares genesis-blockhashes from the
chainparams against the blockhash that the wallet was created
with. The wallet is network specific, so mixing is always a bad idea.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We now keep a list of commands for the jcon instead of a simple
'current' pointer: the assertions become a bit more complex, but
the rest is fairly mechanical.
Fixes: #1007
Reported-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We do a complicated dance because we don't know the current block
height before setting up the topology.
If we're starting at a particular block, we want to go back 100 blocks
before that to cover any reorgs.
If we're not (fresh startup), we still want to go back 100 blocks
because we don't bother handling a reorg which removes all the blocks
we know.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With fallback depending on chainparams: this means the first upgrade
will be slow, but after that it'll be fast.
Fixes: #990
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We error out for all kinds of reasons early on (eg. bitcoind down),
and printing a backtrace for them is pretty confusing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Includes closing off stdout and stderr. We don't do it directly in the
arg parser, as we want to interact normally (eg with other errors) before
we turn off stdout/stderr.
Fixes: #986
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This interacts badly with --daemon (next patch) which then tries to
reap a child it didn't create, which took me a couple of hours to
figure out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once we read a command, we are supposed to io_wait until it finishes.
However, we are actually woken in two places: when it's complete
(which is correct), and when it's written out (which is wrong).
We don't care when it's written out, only when it's finished:
refactor to make json_done() free and NULL the old ->current,
rather than have the callers do it. Now it's clear that it's
ready for both new output and new input.
Fixes: #934
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We will have probably failed the others, but either way, don't try to
fulfill an HTLC we've already failed.
Fixes: #394
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We usually did this, but sometimes they were named after what they did,
rather than what they cleaned up.
There are still a few exceptions:
1. I didn't bother creating destroy_xxx wrappers for htable routines
which already existed.
2. Sometimes destructors really are used for side-effects (eg. to simply
mark that something was freed): these are clearer with boutique names.
3. Generally destructors are static, but they don't need to be: in some
cases we attach a destructor then remove it later, or only attach
to *some* cases. These are best with qualifiers in the destroy_<type>
name.
Suggested-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This provides a sanity check that we are in sync, and also keeps the
logic in the program and out of the SQL.
Since the destructor now doesn't clean up the peer, there are some
wider changes to be made when cleaning up. Most notably we create
lots of channels in run-wallet.c and they previously freed the peer:
now we need free the peer explicitly, so we need to free them first.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And return the correct error message for the channel they give, if
they try to re-establish on an error channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Channels are within the peer structure, but the peer is freed only
when the last channel is freed.
We also implement channel_set_owner() and make peer_set_owner() a temporary
wrapper.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Much like the database; peer contains id, address, channel contains
per-channel information. Where we create a channel, we always create
the peer too.
For the moment, peer->log and channel->log coexist side-by-side, to
reduce some of the churn.
Note that this changes the API to dev-forget-channel: if we have more
than one channel, we insist they specify the short-channel-id.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is not connected yet; during the transition, there will be a 1:1
mapping from channel to peer, so we can use channel2peer and peer2channel
to shim between them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Both when we forget about an opening peer, and at startup. We're
going to be relying on this, and the next patch, as we refactor
peer/channel handling to mirror the db.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Combining the two was just awkward, so it's clearer to have separate
functions. And we make the lower-level functions do the escaping.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The JSON-RPC spec specifies that if the request is unparseable we
should return an error with a NULL id. This is a bit more friendly
than slamming the door in the face.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
As reported by @practicalswift in #945 it is possible to inject
non-printable, or shell escape, characters in a json command, that
will fail to parse and then clear the shell.
Reported-by: @practicalswift
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Now we have wirestring, this is much more natural. And with the
24M length limit, we needn't be so concerned about dumping 64k peer
messages in hex.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are now logically arrays of pointers. This is much more natural,
and gets rid of the horrible utxo array converters.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`activate_peer` does little more than wiring up some txwatches and
asking `gossipd` to reconnect to the peer. If the peer manages to
reconnect before we activate then we would crash.
This just changes the `assert` causing the crash into a conditional
whether we need to reconnect or not.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Due to the broadcast failure quite a few users are reporting channels
stuck in awaiting lockin. This commit adds a `dev-forget-channel`
command that checks whether the funding outpoint is in the UTXO, and
forgets the channel if not. The UTXO check can be overridden with the
`force` parameter, but that is dangerous.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We were sideloading it, which is awkward, now it's a field that we can
actually use in the code.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We currently don't handle LOG_IO properly, and we turn it into a string
before handing it to the ->print function, which makes it ugly for
the case where we're using copy_to_parent_log, and also means in
that case we lose *what peer* the IO is coming from.
Now, we handle the io as a separate arg, which is much neater.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Logging often gets called in error paths, so this is just good hygiene.
Also, log_io does this already.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
libunwind does not accept a NULL parameter for the error callback. It
will simply call into the NULL pointer. So add an error callback.
This makes the crash output somewhat more sensible on FreeBSD, where
there is no libunwind stack trace available:
2018-02-05T20:24:50.598Z lightningd(75556): error getting backtrace: no stack trace because unwind library not available (0)
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Maintaining it was always fraught, since the command could go away
if the JSON RPC died. Most recently, it was broken again on shutdown
(see below).
In future we may allow pay commands to block on previous payments, so
it won't even be a 1:1 mapping. Generalize it: keep commands in a
simple list and do a lookup when a payment fails/succeeds.
Valgrind error file: valgrind-errors.5732
==5732== Invalid read of size 8
==5732== at 0x4149FD: remove_cmd_from_hout (pay.c:292)
==5732== by 0x468BAB: notify (tal.c:237)
==5732== by 0x469077: del_tree (tal.c:400)
==5732== by 0x4690C7: del_tree (tal.c:410)
==5732== by 0x46948A: tal_free (tal.c:509)
==5732== by 0x40F1EA: main (lightningd.c:362)
==5732== Address 0x69df148 is 1,512 bytes inside a block of size 1,544 free'd
==5732== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5732== by 0x469150: del_tree (tal.c:421)
==5732== by 0x46948A: tal_free (tal.c:509)
==5732== by 0x4198F2: free_htlcs (peer_control.c:1281)
==5732== by 0x40EBA9: shutdown_subdaemons (lightningd.c:209)
==5732== by 0x40F1DE: main (lightningd.c:360)
==5732== Block was alloc'd at
==5732== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5732== by 0x468C30: allocate (tal.c:250)
==5732== by 0x4691F7: tal_alloc_ (tal.c:448)
==5732== by 0x40A279: new_htlc_out (htlc_end.c:143)
==5732== by 0x41FD64: send_htlc_out (peer_htlcs.c:397)
==5732== by 0x41511C: send_payment (pay.c:388)
==5732== by 0x41589E: json_sendpay (pay.c:513)
==5732== by 0x40D9B1: parse_request (jsonrpc.c:600)
==5732== by 0x40DCAC: read_json (jsonrpc.c:667)
==5732== by 0x45C706: next_plan (io.c:59)
==5732== by 0x45D1DD: do_plan (io.c:387)
==5732== by 0x45D21B: io_ready (io.c:397)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a transitional patch so we can still close channels cleanly;
for want of a better option, I hooked it into --deprecated-apis.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We shouldn't fail negotiation just because they exceeded what we thought
fair: we're better off as long as it's actually <= final commitment fee.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We move it into jsonrpc where it belongs, and make it fail the command.
This means it can tell us exactly what was wrong.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With the new 'human-readable' mode of lightning-cli, this actually produces
a valid config file. It's a bit hacky though...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We may need to lookup UTXO entries for other reasons, so here we
disentangle it and make it into its own method.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Might help alleviate some of the issues of having to run a full-node
on the same machine as `lightningd`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Exception: Node /tmp/lightning-t5gxc6gs/test_closing_different_fees/lightning-2/ has memory leaks: [{'value': '0x55caa0a0b8d0', 'label': 'ccan/ccan/tal/str/str.c:90:char[]', 'backtrace': ['ccan/ccan/tal/tal.c:467 (tal_alloc_)', 'ccan/ccan/tal/tal.c:496 (tal_alloc_arr_)', 'ccan/ccan/tal/str/str.c:90 (tal_vfmt)', 'lightningd/log.c:131 (new_log)', 'lightningd/subd.c:632 (new_subd)', 'lightningd/subd.c:686 (new_peer_subd)', 'lightningd/peer_control.c:2487 (peer_accept_channel)', 'lightningd/peer_control.c:674 (peer_sent_nongossip)', 'lightningd/gossip_control.c:55 (peer_nongossip)', 'lightningd/gossip_control.c:142 (gossip_msg)', 'lightningd/subd.c:477 (sd_msg_read)', 'lightningd/subd.c:319 (read_fds)', 'ccan/ccan/io/io.c:59 (next_plan)', 'ccan/ccan/io/io.c:387 (do_plan)', 'ccan/ccan/io/io.c:397 (io_ready)', 'ccan/ccan/io/poll.c:305 (io_loop)', 'lightningd/lightningd.c:347 (main)', '(null):0 ((null))', '(null):0 ((null))', '(null):0 ((null))'], 'parents': ['lightningd/log.c:103:struct log_book', 'lightningd/lightningd.c:43:struct lightningd']}]
Technically, true, but we save more memory by sharing the prefix pointer
than we lose by leaking it.
However, we'd ideally refcount so it's freed if the log is freed and
all the entries using it are pruned from the log book.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This makes much more sense when you ask for a specific peer's log.
Also, we put the peerid rather than pid ().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We added code to allow a few spurious failures, but it didn't unmark
the request running.
IRC user 'mlz' (@molxyz) provided logs from his stuck-at-old-block lightningd:
lightningd(31981): Adding block 1261159: 00000000da3890ccd0f313a74fccfd4789654b496836da5c28a8d2ad28852264
lightningd(31981): Adding block 1261160: 00000000f70938a33aecbdd7b047cb5cf5b095ea4770c1335acf1859bad1e767
lightningd(31981): bitcoin-cli -testnet estimatesmartfee 2 CONSERVATIVE exited with status 1
Fixes: #749
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is an interaction between --ipaddr and --port, namely that the
default port is used when parsing --ipaddr if --port comes after the
--ipaddr, and --port is used if it comes before it. Adding a port to
--ipaddr still trumps everything else, but this way we correctly set
port in the address.
Reported-by: Wladimir J. van der Laan @laanwj
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The JSON connect command wouldn't terminate if peer reconnected
in a state CHANNELD_AWAITING_LOCKIN or above.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Such an htlc is invalid, and will be failed cleanly by our channeld
(which also checks that it meets the minimum amount), but it's
not the master's job to check it, and in fact, it asserts if we were
to try to pay or forward such a thing.
Fixes: #686
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The peer shouldn't try, and channeld won't try to add it if it does,
but we shouldn't trust it. And it would make our htlc_in_check() code
assert.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before this patch:
```
$ lightningd/lightningd
lightningd(PID): Creating lightningd dir /root/.lightning (because chdir gave No such file or directory)
lightningd(PID): Creating database
```
After this patch:
```
$ lightningd/lightningd
lightningd(PID): Creating lightningd dir /root/.lightning
lightningd(PID): Creating database
```
delinvoice was orginally documented to only allow deletion of unpaid
invoices, but there might be reasons to delete paid ones or unexpired ones.
But we have to avoid the race where someone pays as it's deleted: the
easiest way is to have the caller tell us the status, and fail if
it's wrong.
Fixes: #477
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Error code is inverted (which makes sense: who returns 'true' on
error?), and anyway there's a leak if we do error.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to have to support multiple channels per peer, even if only
when some are onchain. This would break the current listpeers, so
change it to an array (single element for now).
Other cleanups:
1. Only set connected true if daemon is not onchaind.
2. Only show netaddr if connected; don't make it an array, call it `address`
in comparison with `addresses` in listnodes.
3. Rename `channel` to `short_channel_id`
4. Add `funding_txid` field for voyeurism.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows us to add other fields, such as version information,
warnings or invoiceless payments, later.
(Note: the deprecated listinvoice is unchanged)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matches the other names, and also the return value is about to change.
This will be removed before release!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can be used for upgrades to make sure you're not using deprecated
options, JSON commands, JSON fields, etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For performance, we delay entering the 'wallet_payment' into the db
until we actually commit to the HTLC (when we have to touch the DB
anyway).
This opens a race where we can try to pay twice, and since it's not in
the database yet, we don't notice the duplicate.
So remove the temporary payment field from htlc_out, which was always
an uncomfortable hack, and make the wallet code abstract over the
deferred entry a little by maintaining a 'unstored_payments' list
and incorporating that in results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need these to decode any returned errors.
We remove it from struct pay_command too, and load directly from db
when we need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should be saving this, as it's our proof of payment. Also, we return
it if they try to pay again.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This, of course, should never be used. But it helps maintain connections
for the moment while we dig deeper into feerates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a common occurence on pruned nodes. By calling the callback
upon failures, we communicate that we couldn't verify the txoutput. We
fail safe rejecting any channel we can't verify.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This means we print out the correct path with --debugger, which
can be vital if there are multiple binaries (eg. compiled vs installed).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_get_params does this for us.
Fixes: 78adf0b (pay: allow 'null' msatoshi field.)
Reported-by: ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pulling up the save call from `peer_save_commitsig_received` into its
caller `peer_got_commitsig` and adding a call to
`peer_sending_commitsig`
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Message buffer `why` is allocated in the `peer` context and also freed when peer is freed.
Only explicitly free the buffer when peer itself is not freed yet.
exit status is not enough to detect spent outputs. gettxout will return a
success exit code and 0 bytes.
Signed-off-by: William Casarin <jb55@jb55.com>
We'll pass this down to gossip and make sure to re-announce/update
channels every so often. This is also used as a pruning timer, i.e.,
channels that have not been updated in 2 x channel-update-interval
will be pruned from the local view.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since most callers use positional arguments, we should allow a 'null'
literal where we require no value at all.
Also adds some more value tests.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paid invoices need to know how much was actually paid: both for the case
where no 'msatoshi' amount was specified, and for the normal case, where
clients are permitted to overpay in order to help them disguise their
payments.
While we migrate the db, we leave this field as 0 for old paid
invoices. This is unhelpful for accounting, but at least clearly
indicates what happened if we find this in the wild.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
'rhash' is the old terminology, but 'payment_preimage' and
'payment_hash' were decided on for the BOLTs, so we should fix that here.
We still use rhash internally, but that's much easier to fix.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Different commands (listinvoice, delinvoice, waitinvoice,
waitanyinvoice) returned different fields, as not all were updated.
This makes them uniform.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This reuses the same code internally, and also now means that we deal
correctly with "any" msatoshi invoices: the old code would a return
'msatoshi' of 0 in that case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The manfile and the online help use 'msatoshi', the returned
response uses 'msatoshi', nearly every invoice-related
monetary amount is labelled 'msatoshi' and not 'amount'.
It would be nice if bitcoind had an RPC to do this in one, but that's
a bit much to ask for. We could also hand around proofs, for lite nodes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. htlc->fail has been changed to a u8 *.
2. wallet_get_newindex saves to the db.
3. peer->next_htlc_id is saved to the db in peer_save_commitsig_sent() below.
4. We do store commit in peer_save_commitsig_received(peer, commitnum),
and the fixme below talks about HTLC sigs.
5. We do commit shachain and next_per_commit_point in wallet_shachain_add_hash
and update_per_commit_point respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All other users of json_get_params(...) check the return value:
```
lightningd/chaintopology.c: if (!json_get_params(buffer, params,
lightningd/chaintopology.c: if (!json_get_params(buffer, params,
lightningd/dev_ping.c: if (!json_get_params(buffer, params,
lightningd/gossip_control.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/invoice.c: if (!json_get_params(buffer, params, "label", &labeltok, NULL)) {
lightningd/invoice.c: if (!json_get_params(buffer, params,
lightningd/jsonrpc.c: if (!json_get_params(buffer, params,
lightningd/pay.c: if (!json_get_params(buffer, params,
lightningd/pay.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
lightningd/peer_control.c: if (!json_get_params(buffer, params,
wallet/walletrpc.c: if (!json_get_params(buffer, params,
wallet/walletrpc.c: if (!json_get_params(buffer, params, "tx", &txtok, NULL)) {
```
I've only seen this under travis, so I can't verify that this fixes it,
but it's certainly a bug which could cause that issue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is necessary to grad the their_unilateral/to-us outputs since
they aren't being harvested by `onchaind`
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the scriptpubkey that onchaind spends all funds to, except for
the their_unilateral/to-us case, so we better recognize that address.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the only case in which we don't respend to a simple keyindex'd
pubkey, so we need to handle this for future spends.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We always arm the funding_lockin_cb, even if we don't need to. If we
have an short_channel_id already from the db, this was replacing it
and leaking the old one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we panic when we see our root reorg out, even if we're not doing
anything yet, restoring the 100 block margin is the simplest fix.
Unfortunately this means adding a 100-block spacer in the tests, so things
don't get confused.
Fixes: #511
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is surprisingly simple. We set up the watches for funding tx
depth and the funding output, then if it's not onchain we ask gossipd
to reconnect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Load the first block we're possibly interested in, then load the peers so
we can restore the tx watches, then finally replay to the current tip.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Eventually we want to save blockchain in db to avoid this scan, but
for the moment, we need to reload as far back as we may be interested in.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This gives us a lower bound on where funding tx could be.
In theory, it could be lower than this if we get a reorganization, but
in practice this is already a 1-block buffer (since we can't get into
current block, only the next one).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the normal (peer-to-peer) path, the HTLC state prevents us fulfilling
twice, but this goes out the window with onchain HTLCs.
The actual assert which caught it was lightningd/pay.c:70 (payment_succeeded)
in the test_htlc_in_timeout test, after the next commit.
So add an assert earlier (in fulfill_our_htlc_out) and check in the
one caller where it can be true.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to load the new tip and work backwards until we joined up with
the previous tip. That consumed quite a lot of memory if there were
many blocks.
Instead, just poll on blocknum+1, and grab it once that succeeds. If
prev is different from what we expect (reorg), we free the current tip
and try again.
We could theoretically miss a reorg which is the same length (2 block
reorg with more work due to difficulty adjustment), but even if that
happened we'd catch up on the next block.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It definitely changes when we get a block, but it also changes between
blocks as mempool fills. So put it on its own timer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_blkid_to_hex() so it's reversed as people expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_txid_to_hex() so it's reversed as people expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I prefer the typesafety of specific functions, rather than having the
caller know that txids are traditionally reversed in bitcoin.
And we already have a bitcoin_txid_to_hex() function for this.
Closes: #411
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Add port parsing support to parse_wireaddr. This is in preparation for storing
addresses in the peers table. This also makes parse_wireaddr a proper inverse of
fmt_wireaddr.
* Move parse_wireaddr to common/wireaddr.c this seems like a better place for
it. I bring along parse_ip_port with it for convenience. This also fixes some
issues with the upcoming ip/port parsing tests.
Signed-off-by: William Casarin <jb55@jb55.com>
We set hout->key.id when channeld tells us what it is, but if channeld
dies before that we free the hout, and our destructor logs it:
Valgrind error file: valgrind-errors.20312
==20312== Use of uninitialised value of size 8
==20312== at 0x53ABC9B: _itoa_word (_itoa.c:179)
==20312== by 0x53B041F: vfprintf (vfprintf.c:1642)
==20312== by 0x53B17D5: buffered_vfprintf (vfprintf.c:2330)
==20312== by 0x53AEAA5: vfprintf (vfprintf.c:1301)
==20312== by 0x53B7D63: fprintf (fprintf.c:32)
==20312== by 0x128BAC: hout_subd_died (peer_htlcs.c:316)
==20312== by 0x16D8E0: notify (tal.c:240)
==20312== by 0x16DD95: del_tree (tal.c:400)
==20312== by 0x16DDE7: del_tree (tal.c:410)
==20312== by 0x16DDE7: del_tree (tal.c:410)
==20312== by 0x16E1B4: tal_free (tal.c:509)
==20312== by 0x162B5C: io_close (io.c:443)
==20312== by 0x12D563: sd_msg_read (subd.c:508)
==20312== by 0x161EA5: next_plan (io.c:59)
==20312== by 0x1629A2: do_plan (io.c:387)
==20312== by 0x1629E0: io_ready (io.c:397)
==20312== by 0x164319: io_loop (poll.c:305)
==20312== by 0x118E21: main (lightningd.c:334)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Accuracy improvements:
1. We assumed the output was a p2wpkh, but it can be user-supplied now.
2. We assumed we always had change; remove this for wallet_select_all.
Calculation out-by-one fixes:
1. We need to add 1 byte (4 sipa) for the input count.
2. We need to add 1 byte (4 sipa) for the output count.
3. We need to add 1 byte (4 sipa) for the output script length for each output.
4. We need to add 1 byte (4 sipa) for the input script length for each input.
5. We need to add 1 byte (4 sipa) for the PUSH optcode for each P2SH input.
The results are now a slight overestimate (due to guessing 73 bytes
for signature, whereas they're 71 or 72 in practice).
Fixes: #458
Reported-by: Jonas Nick @jonasnick
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Two changes:
- Fixed the function signature of noleak_ to match in both
configurations
- Added memleak.o to linker for tests
Generating the stubs for the unit tests doesn't really work since the
stubs are checked in an differ between the two configurations, so
adding memleak to the linker fixes that, by not requiring stubs to be
generated in the first place.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We can call this multiple times. The best solution is to add and remove
the signature so it's always unsigned as we expect it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The pay command in particular, attaches a reasonable number of
temporaries to cmd, knowing they'll be freed once cmd is done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is called when we load from database: clearly our tests aren't thorough
enough because we were allocating and initializing `r` in an unused structure.
invs is also the owner already; functions which steal are a bit surprising
to callers, so we either document them, or just don't do it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have things which we don't keep a pointer to, but aren't leaks.
Some are simply eternal (eg. listening sockets), others cases are
io_conn tied to the lifetime of an fd, and timers which expire.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
memleak doesn't detect pointers to within an object, only pointers to their
exact address (it's simpler this way). Moving the linked list to the
top of the structure means it can follow the chain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
memleak doesn't detect pointers to within an object, only pointers to their
exact address (it's simpler this way). Moving the linked list to the
top of the structure means it can follow the chain.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is not a child of cmd, since they have independent lifetimes, but
we don't want to noleak them all, since it's only the one currently in
progress (and its children) that we want to exclude.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the tal notifiers to attach a `backtrace` object on every
allocation.
This also means moving backtrace_state from log.c into lightningd.c, so
we can hand it to memleak_init().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a primitive mark-and-sweep-style garbage detector. The core is
in common/ for later use by subdaemons, but for now it's just lightningd.
We initialize it before most other allocations.
We walk the tal tree to get all the pointers, then search the `ld`
object for those pointers, recursing down. Some specific helpers are
required for hashtables (which stash bits in the unused pointer bits,
so won't be found).
There's `notleak()` for annotating things that aren't leaks: things
like globals and timers, and other semi-transients.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
jsonrpc handlers usually directly call command_success or
command_fail; not doing that implies they're waiting for something
async.
Put an explicit call (currently a noop) there, and add debugging
checks to make sure it's used.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Couldn't find a good place to put these messages, we probably want to
do the same capability based request routing that we did for the HSM,
but for now this just defines the message in the master messages file.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If send_htlc_out() fails, it doesn't initialize pc->out; that can
make us think it's still in progress.
Reported-by: Jonas Nick
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When gossipd sends a message, have a gossip_index. When it gets back a
peer, the current gossip_index is included, so it can know exactly where
it's up to.
Most of this is mechanical plumbing through openingd, channeld and closingd,
even though openingd and closingd don't (currently) read gossip, so their
gossip_index will be unchanged.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All peers come from gossipd, and maintain an fd to talk to it. Sometimes
we hand the peer back, but to avoid a race, we always recreated it.
The race was that a daemon closed the gossip_fd, which made gossipd
forget the peer, then master handed the peer back to gossipd. We stop
the race by never closing the gossipfd, but hand it back to gossipd
for closing.
Now gossipd has to accept two fds, but the handling of peers is far
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As demonstrated in the test at the end of this series, openingd dying
spontaneously causes the conn to be freed which causes the subd to be
destroyed, which fails the peer, which hits the db.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than using the destructor, hook up the cmd so we can close it.
peers are allocated off ld, so they are only destroyed explicitly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are still generating only char* style aliases, but the field is
defined to be unicode, which doesn't mix too well with char.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't use it yet, but now we'll decode correctly.
See: https://github.com/lightningnetwork/lightning-rfc/pull/317
lightning-rfc commit: ef053c09431442697ab46e83f9d3f86e3510a18e
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Change all calls to use the correct serialization and deserialization
functions, include the correct headers and remove the control
messages.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The master now hands channeld either an error code, and channeld
generates the error message, or an error message relayed from another
node to pass through.
This doesn't fill in the channel_update yet: we need to wire up gossipd
to give us that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently lightningd does this, but channeld is perfectly capable of doing it.
channeld is also in a far better position to add channel_updates to it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
estimatesmartfee 4 ECONOMICAL was too high for lnd, so drop it, with some
increased security risk.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The filter is being populated while initializing the daemon and by
adding new keys as they are being generated. The filter is then used
in connect_block to identify transactions of interest.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is mainly used to filter for transactions that may be of interest
to us, i.e., whether one of our keys is the recipient. It currently
does onyl simple scriptpubkey checks, but will eventually be extended
to use bloomfilters and add more sophisticated checks.
For now the goal is to speed up the processing of blocks during startup.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This addresses a performance regression introduced by
6ceb375650. We were storing it in an
otherwise empty DB transaction, which means that DB transaction was no
longer a no-op. Now we defer storing until we need to store the
corresponding HTLC anyway, so we can just piggyback on top of that
transaction.
This is also more consistent since we'd be forgetting the payment
anyway if we restart between adding the HTLC and committing to it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We only send them when we're not awaiting revoke_and_ack: our
simplified handling can't deal with multiple in flights.
Closes: #244
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The wire protocol uses this, in the assumption that we'll never see feerates
in excess of 4294967 satoshi per kiloweight.
So let's use that consistently internally as well.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Depending on what we're doing, we can want different ones. So use
IMMEDIATE (estimatesmartfee 2 CONSERVATIVE), NORMAL (estimatesmartfee
4 ECONOMICAL) and SLOW (estimatesmartfee 100 ECONOMICAL).
If one isn't available, we try making each one half the previous.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means we convert it when retrieving from bitcoind; internally it's
always satoshi-per-1000-weight aka millisatoshi-per-weight.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Test objects must be added to $(ALL_OBJS) so they correctly depend on
CCAN headers etc.
Also, each test in a subdir must depend on headers and src in the parent
directory, as it will often #include them directly.
Reported-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Our testsuite uses --dev-fail-on-subdaemon-fail, so I didn't notice this
until I turned that off to chase a bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All the callers need to pass it in: currently channeld and openingd just
fake it by copying the payment point.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There were two bugs: we weren't returning the next from the given
label but the one matching the label, and we were appending new
invoices to the head instead of the tail, which meant we'd be
traversing in the wrong order.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Thought we don't handle it at the moment, nodes can certainly have multiple
addresses, and we should display them all.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't track them accurately when in onchaind, but we don't want to:
onchaind can be restarted at any time.
Once it's all settled, we're clear to clean them up.
Before this, valgrind could complain about deferncing hout->key.peer:
Valgrind error file: valgrind-errors.10876
==10876== Invalid read of size 4
==10876== at 0x41F8AF: peer_on_chain (peer_control.h:127)
==10876== by 0x42340D: notify_new_block (peer_htlcs.c:1461)
==10876== by 0x40A08D: connect_block (chaintopology.c:96)
==10876== by 0x40A96B: topology_changed (chaintopology.c:313)
==10876== by 0x40AC85: add_block (chaintopology.c:384)
==10876== by 0x40ABF0: gather_previous_blocks (chaintopology.c:363)
==10876== by 0x4051B3: process_rawblock (bitcoind.c:410)
==10876== by 0x4044DD: bcli_finished (bitcoind.c:155)
==10876== by 0x454665: destroy_conn (poll.c:183)
==10876== by 0x454685: destroy_conn_close_fd (poll.c:189)
==10876== by 0x45DF89: notify (tal.c:240)
==10876== by 0x45E43A: del_tree (tal.c:400)
==10876== Address 0x6929208 is 2,120 bytes inside a block of size 2,416 free'd
==10876== at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10876== by 0x45E513: del_tree (tal.c:421)
==10876== by 0x45E849: tal_free (tal.c:509)
==10876== by 0x41A8E9: handle_irrevocably_resolved (peer_control.c:1172)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We are announcing that we are willing to accept incoming payments with
current_height + min_final_cltv_expiry + slack, assuming that the
sender adds some slack. In particular we'd reject the payment if
slack=0 which is allowed by the spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We save location where transaction was started, in case we try to nest.
There's now no error case; db_exec_mayfail() is the only one.
This means the tests need to override fatal() if they want to intercept
these errors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a subset of a "bitcoind: wrap callbacks in transaction." from
the everything-in-transaction branch, but we need the ld pointer now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And nail "make check-source" to that specific version (which is a commit id,
not a branch name, so needs a different syntax for git).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These need to be different for testing the example in BOLT 11.
We also use the cltv_final instead of deadline_blocks in the final hop:
various tests assumed 5 was OK, so we tweak utils.py.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It crashes under valgrind, causing a valgrind error: valgrind gives us a
backtrace anyway, so we don't need it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Normally, we get an error as soon as we send WIRE_REVOKE_AND_ACK. But if the
commit timer goes off, we get some extra cycles, during which the other side
can reconnect. In this case, we simply kill the channeld before it fails,
and never check for the permfail string.
b'lightning_channeld(18613): TRACE: dev_disconnect: -WIRE_REVOKE_AND_ACK'
b'lightning_channeld(18613): TRACE: Trying commit'
b'lightning_channeld(18613): TRACE: htlc 0: SENT_ADD_REVOCATION->SENT_ADD_ACK_COMMIT'
b'lightning_channeld(18613): TRACE: htlc added REMOTE: local +0 remote -200000000'
b'lightning_channeld(18613): TRACE: sending_commit: HTLC REMOTE 0 = SENT_ADD_ACK_COMMIT/RCVD_ADD_ACK_COMMIT'
b'lightning_gossipd(18590): TRACE: Responder: Act 1'
b'lightning_channeld(18613): TRACE: Derived key 034aab0b5cb755de836cffb34c053ba115fba6fe75414e8f56261e23c80eabb1fe from basepoint 03e0a7bb422b254f54bc954be05bd6823a7b7a4b996ff8d3079ca211590fb5df39, point 02f3bf525b6ca595bf85d63e89c95fc59c0fde3ae434b55c8093bbb5c64849da37'
b'lightningd(18465): Connected json input'
b'lightningd(18465):jcon fd 16: Success'
b'lightningd(18465):jcon fd 16: Closing (Bad file descriptor)'
b'lightning_gossipd(18590): TRACE: Responder: Act 2'
b'lightning_gossipd(18590): TRACE: Responder: Act 3'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightning_gossipd(18590): UPDATE WIRE_GOSSIP_PEER_CONNECTED'
b'lightningd(18465): peer 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518: Peer has reconnected, state CHANNELD_NORMAL'
b'lightning_channeld(18613): Status closed, but not exited. Killing'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And we report these through the getpeers JSON RPC again (carefully: in
our reconnect tests we can get duplicates which this patch now filters
out).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In future it will have TOR support, so the name will be awkward.
We collect the to/fromwire functions in common/wireaddr.c, and the
parsing functions in lightningd/netaddress.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They don't currently, since callers check, but be safe. In addition,
handle NULL returns from these in the bitcoind code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are others, but they really are casued by bad failure. We need a
parachute system for these.
Closes: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out. This means we put #ifdefs in callers
in some places, but at least it's explicit.
We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.
See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
test_routing_gossip (__main__.LightningDTests) ... lightningd: Outstanding taken pointers: lightningd/peer_control.c:2352:towire_errorfmt(ld, ((void *)0), "Can't resolve your address")
This caused by the other end closing due to the next bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There is a race we see sometimes under valgrind on Travis which shows
gossipd receiving the node_announce from master before it reads the
channel_announce from channeld, and thus fails. The simplest solution
is to send the channel_announce and channel_update to master as well,
so it can ensure it sends them to gossipd in order
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It makes it impossible to embed an ipaddr in another structure, since we
always try to skip over any zeroes, which may swallow a following field.
Do the skip specially for the case where we're parsing routing messages:
we never use padding for our own internal messages anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Makes it easier to compare before/after failures. Ideally, we should
run under Travis both with this option and with the seed based on the
entire tmp path (which is still reproducible with determination, but
not fixed every run like this is).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones. We can handle many callbacks internally now.
We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently rely on a zero exit status. That's the only difference between
onchain finished handling and other per-peer daemons, so instead we should
have an explicit "done" message. This is both clearer, and allows us to
unify.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the flow is much simpler from a lightningd POV:
1. If we want to connect to a peer, just send gossipd `gossipctl_reach_peer`.
2. Every new peer, gossipd hands up to lightningd, with global/local features
and the peer fd and a gossip fd using `gossip_peer_connected`
3. If lightningd doesn't want it, it just hands the peerfd and global/local
features back to gossipd using `gossipctl_handle_peer`
4. If a peer sends a non-gossip msg (eg `open_channel`) the gossipd sends
it up using `gossip_peer_nongossip`.
5. If lightningd wants to fund a channel, it simply calls `release_channel`.
Notes:
* There's no more "unique_id": we use the peer id.
* For the moment, we don't ask gossipd when we're told to list peers, so
connected peers without a channel don't appear in the JSON getpeers API.
* We add a `gossipctl_peer_addrhint` for the moment, so you can connect to
a specific ip/port, but using other sources is a TODO.
* We now (correctly) only give up on reaching a peer after we exchange init
messages, which changes the test_disconnect case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to make the ip/port optional, so they should go at the end.
In addition, using ip:port is nicer, for gethostbyaddr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have to do a dance when we get a reconnect in openingd, because we
don't normally expect to free both owner and peer. It's a layering
violation: freeing a peer should clean up the owner's pointer to it,
to avoid a double free, and we can eliminate this dance.
The free order is now different, and the test_reconnect_openingd was
overprecise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes the only case where the master currently has to write directly
to the peer: re-sending an error. We make gossipd do it, by adding
a new gossipctl_fail_peer message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, the main daemon needs to pass it about (marshal/unmarshal)
but it won't need to actually use it after the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We pull them from the database on-demand, where we're storing them
anyway. No need to keep them in memory as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
No idea why we were iterating over the list of stubs and then passing
in the index instead of a pointer to the stub directly.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This wires in the loading of `struct htlc_stub`s on-demand when
starting `onchaind` so that we don't need to keep them in memory.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far we were tracking the status by including it either in the paid
or the unpaid list. This refactor makes the state explicit, which
matches the planned DB schema much better.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
While loading HTLCs from the database we might not yet have all the
incoming HTLCs loaded when loading a dependent htlc_out. So we defer
the wiring of the HTLCs until we are sure we have them loaded.
This is also the first step towards keeping that association only in
the database, since otherwise we cannot selectively load channels from
DB.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Especially when testing we might want to disable the automatic
reconnection logic in order not to masquerade bugs that disappear when
reconnecting.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Seems to go out to lunch on reorgs:
+136792.168286138 lightningd(9465):BROKEN: bitcoin-cli getchaintips exited 28: 'error code: -28
error message:
Rewinding blocks...
Closes: #286
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't hit this in testing, since we wait for startup already. Hacking
tests to avoid that, I tested this code by hand.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Using pc after free in the pay_command_destroyed destructor, so
we just steal cmd onto pc so free order is the one we want.
[ Edit: expanded comment, split commit ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
So far only happens during normal shutdown, but it may happen in other
cases as well. We simply define a new destructor that unregisters the
`cmd` from the `jcon`.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
These were fun to hunt down. The jcon and the conn are allocated off
of ld, so the free order is unspecified and if conn is freed before
conn then the finish_jcon destructor uses conn after free.
[ Edit: split commit, modified to use a destructor directly on jcon,
which is more robust than relying on it only being freed via conn --RR ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
peer_fail_permanent() frees peer->owner, but for bad_peer() we're
being called by the sd->badpeercb(), which then goes on to
io_close(conn) which is a child of sd.
We need to detach the two for this case, so neither tries to free the
other.
This leads to a corner case when the subd exits after the peer is gone:
subd->peer is NULL, so we have to handle that too.
Fixes: #282
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a race where we start onchaind, but state is unchanged, so checks
like peer_control.c's:
peer_ready = (peer->owner && peer->state == CHANNELD_AWAITING_LOCKIN);
if (!peer_ready) {
log_unusual(peer->log,
"Funding tx confirmed, but peer state %s %s",
peer_state_name(peer->state),
peer->owner ? peer->owner->name : "unowned");
} else {
subd_send_msg(peer->owner,
take(towire_channel_funding_locked(peer,
peer->scid)));
}
Can send to the wrong daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were sending a channeld message to onchaind, which was v. confusing
due to overlap. We make all the numbers distinct, which means we can
also add an assert() that it's valid for that daemon, which catches
such errors immediately.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We re-use the value for reasonable_depth given by the master, and we
tell it when our timeout transactions reach that depth.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we see an offered HTLC onchain, we need to use the preimage if we
know it. So we dump all the known HTLC preimages at startup, and send
new ones as we discover them.
This doesn't cover preimages we know because we're the final
recipient; that can happen if an HTLC hasn't been irrevocably
committed yet. We'll do that in a followup patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the HSM is slow it might happen that the timestamp has changed the
second time we come around, so we generate the timestamp externally
and pass it in so we're sure it won't change between calls.
Reported-by: Rusty Russell
Signed-off-by: Christian Decker <decker.christian@gmail.com>
lightningd can crash on shutdown if it's in the middle of getchaintips;
we free the conn, the finished callback is called (process_chaintips),
and it reports that it received an empty result.
The simplest fix is to set a flag in the struct bitcoind destructor,
and avoid the callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either when it exits with a signal, or sends an error status message.
Then we make test_lightningd.py use it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.
It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Remove reference to old $(LIGHTNINGD_OLD_LIB_OBJS) var (in handshaked too).
2. Make check depend directly on unit tests, insteadof weird lightningd/tests
variable.
3. check-source-bolt and check-whitespace are automatic for $(ALL_TEST_PROGRAMS)
so we don't need them here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the step where we broadcast the transaction to the network and
a nice place to extract the change from the transaction.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
It's no longer used and we definitely do not want to run with an
outdated or future db, so we'll terminate if we can't upgrade or
the version is newer than what we understand.
Signed-off-by: Christian Decker <decker.christian@gmai.com>
So far we were always using the deadline in the announcements, that's
obviously not good, so this introduces the parameter as per spec.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't killing it. Eventually it would die, and peer_owner_finished()
would access subd->peer->owner, but that peer was freed already.
Closes: #261
Reported-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To reproduce the next bug, I had to ensure that one node keeps thinking it's
disconnected, then the other node reconnects, then the first node realizes
it's disconnected.
This code does that, adding a '0' dev-disconnect modifier. That means
we fork off a process which (due to pipebuf) will accept a little
data, but when the dev_disconnect file is truncated (a hacky, but
effective, signalling mechanism) will exit, as if the socket finally
realized it's not connected any more.
The python tests hang waiting for the daemon to terminate if you leave
the blackhole around; to give a clue as to what's happening in this
case I moved the log dump to before killing the daemon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this case, we unset the old subd->peer, then freed subd.
peer_owner_finished dereferenced subd->peer->owner, and boom:
test_disconnect_funder (__main__.LightningDTests) ... Fatal signal 11. Log dumped in crash.log
------------------------------- Valgrind errors --------------------------------
Valgrind error file: valgrind-errors.2882
==2882== Invalid read of size 8
==2882== at 0x413F74: peer_owner_finished (peer_control.c:679)
==2882== by 0x41EA2C: destroy_subd (subd.c:381)
==2882== by 0x459700: notify (tal.c:240)
==2882== by 0x459BB1: del_tree (tal.c:400)
==2882== by 0x459FC0: tal_free (tal.c:509)
==2882== by 0x413796: peer_reconnected (peer_control.c:493)
==2882== by 0x413A6A: add_peer (peer_control.c:592)
==2882== by 0x40ED1F: handshake_succeeded (new_connection.c:186)
==2882== by 0x41E3DD: sd_msg_reply (subd.c:262)
==2882== by 0x41E6BB: sd_msg_read (subd.c:318)
==2882== by 0x41E4E6: read_fds (subd.c:283)
==2882== by 0x44DEB4: next_plan (io.c:59)
==2882== Address 0x838 is not stack'd, malloc'd or (recently) free'd
==2882==
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The logic of dispatching the announcement_signatures message was
distributed over several places and daemons. This aims to simplify it
by moving it all into `channeld`, making peer_control only report
announcement depth to `channeld`, which then takes care of the
rest. We also do not reuse the funding_locked tx watcher since it is
easier to just fire off a new watcher with the specific purpose of
waiting for the announcement_depth.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
1. The code to skip over padding didn't take into account max.
2. It also didn't use symbolic names.
3. We are not supposed to fail on unknown addresses, just stop parsing.
4. We don't use the read_ip/write_ip code, so get rid of it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use a negative timestamp as the flag for this, making the test simple.
This allows valgrind to detect that we're accessing them prematurely,
including across the wire on gossip_getchannels_entry.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
jl777 reported a crash when we try to pay past reserve. Fix that (and
a whole class of related bugs) and add tests.
In test_lightning.py I had to make non-async path for sendpay() non-threaded
to get the exception passed through for testing.
Closes: #236
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
You will want to 'make distclean' after this.
I also removed libsecp; we use the one in in libwally anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some fields were redundant, some are simply moved into 'struct lightningd'.
All routines updated to hand 'struct lightningd *ld' now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also, we split the more sophisticated json_add helpers to avoid pulling in
everything into lightning-cli, and unify the routines to print struct
short_channel_id (it's ':', not '/' too).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To avoid everything pulling in HTLCs stuff to the opening daemon, we
split the channel and commit_tx routines into initial_channel and
initial_commit_tx (no HTLC support) and move full HTLC supporting versions
into channeld.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Other places require the flags and states, but the structure is
only needed in channeld, and even then we can remove several fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I was hoping to defer HTLC updates until we actually store HTLCs, but
we need to flush to DB whenever balances update as well.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We're very simple about it: if there's a reorganization, we restart. Otherwise
we tell it about everything.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's in the shachain, so storing it is completely redundant. We leave
it in for the moment so we can assert() that nothing has changed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When loading from DB the list of htlcs was not being initialized which
caused a segfault when the first commit came around, this fixes it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
The peer->seed needs to be unique for each channel, since bitcoin
pubkeys and the shachain are generated from it. However we also need
to guarantee that the same seed is generated for a given channel every
time, e.g., upon a restart. The DB channel ID is guaranteed to be
unique, and will not change throughout the lifetime of a channel, so
we simply mix it in, instead of a separate increasing counter.
We also needed to make sure to store in the DB before deriving the
seed, in order to get an ID assigned by the DB.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is the big one, and it's completely anticlimactic: it loads all
channels that have reached opening and are not marked as
closingd_complete into memory, that's it.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was supposed to be a temporary solution anyway, and I had a
rather annoying mixup between peer_id and unique_id, the latter of
which is actually a connection identifier.
Add the channel to the peer on the two open paths (fundee and funder)
and store it into the database. Currently fails when opening a channel
to a known peer after loading from DB because we attempt to insert a
new peer with the same node_id. Will fix later.
As per lightning-rfc change 956e8809d9d1ee87e31b855923579b96943d5e63
"BOLT 7: add chain_hashes values to channel_update and channel_announcment"
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This brings us up to 955e874acc535ab2c74c1cf0eab61896ea4224ff in
https://github.com/lightningnetwork/lightning-rfc
This doesn't actually change anything; the only actual change is held back
for the next commit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is required for onchaind: we want to watch all descendents by default,
as to do otherwise would be racy, which means we need to traverse the outputs
when a tx appears.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And store in peer->last_tx/peer->last_sig like all other places,
that way we broadcast it if we need to.
Note: the removal of tmpctx in funder_channel() is needed because we
use txs[0], which was allocated off tmpctx.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to check if we exit after sending a revoke_and_ack, otherwise
channeld ends up getting the closing_signed packet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tal_strdup() doesn't set tal_count(), so we end up sending an ERROR
packet with an empty message. Wrap this and get it right.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There was a race condition that would cause an assertion to segfault
if a call to release_peer was interleaved with a fail_peer. The
release_peer was making the peer non-local, which was then causing the
assertion in fail_peer to fail. Now we just have 3 cases: not found,
local, and non-local.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
After quite some back and forth we seem to finally agree on the bit
3 (mask 0x08) to signal optional initial_routing_sync.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was causing failures on testnet where confirmations are not
immediate.
Reported-by: Fabrice Drouin @sstone
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We support a number of features already, so failing connections
whenever we see an even bit set is not a good idea. This turned out to
kill our connections to eclair.
Also, the spec says that the LSB / bit 0 is to be counted as index 0, and
therefore even. So we need to check the lower of each 2-bit-tuple not
the higher one.
We were using the bitcoin genesis blockhash for all networks, which is
not correct, and would result in the open being aborted when talking
to other implementations.
Reported-by: @sstone and @pm47
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We weren't registering reconnecting peers for broadcasts. Just
starting a timer is enough. Also added an integration test to check
that the gossip sync is being resumed.
test_closing_negotiation_reconnect (__main__.LightningDTests) ... peer state CLOSINGD_COMPLETE should be CLOSINGD_SIGEXCHANGE
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a transitional state, while we're waiting to see the
closing tx onchain (which is To Be Implemented).
The simplest way to do re-transmission is to re-use closingd, and just
disallow any updates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I made the mistake of thinking it was a [NUM_SIDES] array, but
it's actually our balance, and it's in millisatoshi. Rename
for clarity.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As tracked down by Christian; by setting up the master conn first,
we make the master fd async. This means that the synchronous read
(in init_channel) can fail with -EAGAIN, and indeed, Christian
saw this when not running under valgrind.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We actually don't need to transition if we're reconnecting, and logic
to go to CHANNELD_NORMAL was wrong: we checked that we'd seen funding tx
locked, but not that we'd received a msg from the remote peer.
We need to fix the tests now we no longer double-transition, too.
Fixes: #188
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Important: a non-standard one can make the closing tx not propagate.
Drive-by cut&paste message fix, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is what it actually is, and makes it clearer when we refer to the
spec. It's the commitment we're currently updating, which is the next
commitment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently have the problem that the master can send new HTLCs before
we've processed the incoming reestablish message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patch includes wire/peer_wire.h and causes a compile error
as lightningd/gossip_control.c defined its own gossip_msg function.
New names are clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can happen even without a protocol violation, if the incoming
update_add_htlc crosses over our outgoing shutdown.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We keep the scriptpubkey to send until after a commitment_signed (or,
in the corner case, if there's no pending commitment). When we
receive a shutdown from the peer, we pass it up to the master.
It's up to the master not to add any more HTLCs, which works because
we move from CHANNELD_NORMAL to CHANNELD_SHUTTING_DOWN.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't need to keep this around any more: by handing it to
subdaemons we ensure we'll close it if the peer disconnects, and we
also add code to get a new one on reconnection.
Because getting a gossip_fd is async, we re-check the peer state after
it gets back. This is kind of annoying: perhaps if we were to hand
the reconnected peer through gossipd (with a flag to immediately
return it) we could get the gossip fd that way and unify the paths?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
At the moment, master simply keeps the gossip fd open when peer
disconnects. That's inefficient, and wrong anyway (it may want a
complete new sync, or may not, but we'll currently send all the
messages including stale ones).
This interface will be required for restart anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we're always sync, just use an fd. Put the hsm_sync_read() helper
here, too, and do HSM init sync which makes things much simpler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With no async calls left, we can just use a stack variable for the fd.
And we're now *always* in the hands of some daemon, unless we're
disconnected, so owner is only NULL in that case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had a terrible hack in gossip when a peer didn't exist. Formalize
a pattern when code+200 is a failure (with no fds passed), and use it
here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means there's no GETTING_HSMFD state at all any more. We
temporarily play games with the hsm fd; those will go away once we're
done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means there's no GETTING_SIG_FROM_HSM state at all any more. We
temporarily play games with the hsm fd; those will go away once we're
done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wallet should really be the container for anything bip32 related, so
I'd like to slowly wean off of `ld->bip32_base` in favor of
`ld->wallet->bip32_base`
We'll re-use them a few times so having them at a central location is
nice. We also fix a bug that was unreserving UTXO entries upon free,
instead of promoting them to being spent.
So far we always needed to know the public key, which was not the case
for addresses that we don't own. Moving the hashing outside of the
script construction allows us to send to arbitrary addresses. I also
added the hash computation to the pubkey primitives.
When we get a fail/fulfill on an outgoing HTLC, we tell the correspoding
incoming HTLC about it. But if that peer is disconnected, we don't.
The better solution is to copy the preimage/malformed/failmessage and mark
the incoming HTLC as resolved. This is done most simply by marking it
SENT_REMOVE_HTLC, which will work in the database case as well.
channeld now re-transmits appropriately when it gets started with an HTLC
in that state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matches what the master does: increments commit index when we send
commit_sig. Thus if we restart at that point, we match.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This matters in one case: channeld receiving a bad message is a
permenant failure, whereas losing a connection is transient.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need the old remote per_commitment_point so we can validate the
per_commitment_secret when we get it.
We unify this housekeeping in the master daemon using
update_per_commit_point().
This patch also saves whether remote funding is locked, and disallows
doing that twice (channeld should ignore it).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a bit tricky since we want to hand more verbose errors to the local
case, but the locally-created and forwarded paths had diverged (the local
one missing some things).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are two ways we can do retransmission on reconnect: re-derive
what we would have sent, or remember it and simply re-send. The
rederivation is difficult: unwinding state depends on whether we sent
a revoke_and_ack before or after the commitment_signed, and unwinding
a revoke_and_ack would require us to remember HTLCs we would have
normally forgotten at this point.
So we simply tell the master to remember the old signatures for us,
and hand them back in case we need to re-send.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In the case where we can't decrypt the onion, we can't fail it in the
normal way (which is encrypted using the onion shared secret), we need
to respond with a update_fail_malformed_htlc message.
Moreover, we need to remember this for persistence. This means that
we really have three conclusions for an HTLC: fulfilled, failed,
malformed. Fix up the logic everywhere which assumed failed or
fulfilled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's easiest to have the master keep the last commit we sent, for
re-transmission. We could recalculate it, but it's made more difficult
by the before/after revoke case.
And because revoke_and_ack changes the channel state, we need to
remember which order we sent them in for re-transmission.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need this for reestablishing a channel.
(Note: this patch changes quite a bit in this series, but reshuffling was
tedious).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently it's fairly ad-hoc, but we need to tell it to channeld when
it restarts, so we define it as the non-HTLC balance.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It needs to save them to the db in case of restart; this means we tell
it about funding_locked, as well as the next_per_commit_point given
in revoke_and_ack.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The channel daemon gets the shared secrets from the HSM to save
the master daemon some work. It used to hand these over at
revoke_and_ack receive, which is when the master daemon needs them.
However, it's a bit simpler to hand them over when we first tell
the master about the incoming HTLC (the first commitsig).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They share some fields, but they're basically different, and it's clearest
to treat them differently in most places.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When adding their HTLCs, it needs all the information. When failing,
it needs the id as key and the failure reason. When fulfilling, it
needs the id and payment preimage.
It also needs to know when we have received an revoke_and_ack or a
commitment_signed, to place in the database.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to change to a batch interface, where we tell the master
before we send certain packets (eg. commit, revoke). We need to wait
for it to respond before doing anything else, but it might cross-over
and be sending us commands at the same time.
This queues those requests until we're ready.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This prepares us for handlers turning off peer I/O, rather than assuming
we always want to handle the next incoming message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We still get the shared secret, since that requires a round trip to the HSM
(why waste the master daemon's time?) but it does the processing, which
simplifies the message passing and things like realm handling which
have nothing to do with this particular channeld.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some paths were still sending unencrypted failure messages; unify them
all. We need to keep the fail_msg around for resubmission if the
channeld dies; similarly, we need to keep the htlc_end structure
itself after failure, in case the failed HTLC is committed: we can
move it to a minimal archive once it's flushed from both sides,
however.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Means caller has to do some more work, but this is closer to what we want:
we're going to want to send them to the master daemon for atomic commit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used level -1 to mean "append to log", but that doesn't actually
work, and results in an assert if we try to prune the logs:
lightningd: daemon/pseudorand.c:36: pseudorand: Assertion `max' failed.
Expose logv_add and use that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use --log-level to control this, but we could add another switch.
It makes the test infrastructure simpler, since we can just look in the
main logs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, it reassured me that the ammag obfuscation step occurs
even for the initial failmsg creator.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I actually hit this very hard to reproduce race: if we haven't process
the channeld message when block #6 comes in, we won't send the gossip
message. We wait for logs, but don't generate new blocks, and timeout
on l1.daemon.wait_for_log('peer_out WIRE_ANNOUNCEMENT_SIGNATURES').
The solution, which also tests that we don't send announcement signatures
immediately, is to generate a single block, wait for CHANNELD_NORMAL,
then (in gossip tests), generate 5 more.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we have a simple way to query the database for UTXOs we can
simplify some of the coin selection logic. That gets rid of the
in-memory list of UTXOs.
Not the nicest code, but it allows us to store the bip32_max_index so
that we don't forget our addresses upon restart. We could have done
the same by retrieving the max index from our index, but then we'd
forget addresses that don't have an associated output. Conversion
to/from string is so that we can store arbitrary one off values in the
DB in the future, independent of type.
The format we use to generate marshal/unmarshal code is from
the spec's tools/extract-formats.py which includes the offset:
we don't use it at all, so rather than having manually-calculated
(and thus probably wrong) values, or 0, emit it altogther.
Reported-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use this to make it send the funding_signed message, rather than having
the master daemon do it (which was even more hacky). It also means it
can handle the crypto, so no need for the packet to be handed up encrypted,
and also make --dev-disconnect "just work" for this packet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use a file descriptor, so when we consume an entry, we move past it
(and everyone shares a file offset, so this works).
The file contains packet names prefixed by - (treat fd as closed when
we try to write this packet), + (write the packet then ensure the file
descriptor fails), or @ ("lose" the packet then ensure the file
descriptor fails).
The sync and async peer-write functions hook this in automatically.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Header from folded patch 'test-run-cryptomsg__fix_compilation.patch':
test/run-cryptomsg: fix compilation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Valgrind error file: /tmp/lightning-8k06jbb3/test_disconnect/lightning-7/valgrind-errors
==32307== Uninitialised byte(s) found during client check request
==32307== at 0x11EBAD: memcheck_ (mem.h:247)
==32307== by 0x11EC18: towire (towire.c:14)
==32307== by 0x11EF19: towire_short_channel_id (towire.c:92)
==32307== by 0x12203E: towire_channel_update (gen_peer_wire.c:918)
==32307== by 0x1148D4: send_channel_update (channel.c:185)
==32307== by 0x1175C5: peer_conn_broken (channel.c:1010)
==32307== by 0x13186F: destroy_conn (poll.c:173)
==32307== by 0x13188F: destroy_conn_close_fd (poll.c:179)
==32307== by 0x13B279: notify (tal.c:235)
==32307== by 0x13B721: del_tree (tal.c:395)
==32307== by 0x13BB3A: tal_free (tal.c:504)
==32307== by 0x130522: io_close (io.c:415)
==32307== Address 0xffefff87d is on thread 1's stack
==32307== in frame #2, created by towire_short_channel_id (towire.c:88)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is simpler than passing back and forth, for the moment at least. That
means we don't need to ask for a new one on reconnect.
This partially reverts the gossip handling in openingd, since it no longer
passes the gossip fd back. We also close it when peer is freed, so it
needs initializing to -1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can go to release a gossip peer, and it can fail at the same time.
We work around the problem that the reply must be a gossipctl_release_peer_reply
with two fds, but it's not pretty.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We kill the existing connection if possible; this may mean simply
forgetting the prior peer altogether if it's in an early state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead, send it the funding_signed message; it can watch, save to
database, and send it.
Now the openingd fundee path is a simple request and response, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Simplifies state machine. Master still has to calculate the tx to get
the signature and broadcast, but now the opening daemon funding path
is a simple request/response.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to use it in peer_control to generate the transaction, but we
really only need the funding_pubkey.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Like the fd, it's only useful when the peer is not in a daemon, so we
free & NULL it when that happens.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We explicitly assert what state we're coming from, to make transitions
clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explictly step though getting the HSM signature on the funding tx
before starting channeld, rather than doing it in parallel: makes
states clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.
Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently create a peer struct, then complete handshake to find out
who it is. This means we have a half-formed peer, and worse: if it's
a reconnect we get two peers the same.
Add an explicit 'struct connection' for the handshake phase, and
construct a 'struct peer' once that's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now in sync with 8ee57b97738b1e9467a1342ca8373d40f0c4aca5.
Our tool doesn't need to convert them any more, but we actually had a
mis-typed field in the HSM which needed fixing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisorial announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
We alternated between using a sha256 and using a privkey, but there are
numerous places where we have a random 32 bytes which are neither.
This fixes many of them (plus, struct privkey is now defined in terms of
struct secret).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Under stress, the tests can mine blocks too soon, and the funding never
locks. This gives more of a chance, at least.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were getting an assert "!secp256k1_fe_is_zero(&ge->x)", because
an all-zero pubkey is invalid. We allow marshal/unmarshal of NULL for
now, and clean up the error handling.
1. Use status_failed if master sends a bad message.
2. Similarly, kill the gossip daemon if it gives a bad reply.
3. Use an array for returned pubkeys: 0 or 2.
4. Use type_to_string(trc, struct short_channel_id, &scid) for tracing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I implemented this because a bug causes us to consider the HTLC malformed,
so I can trivially test it for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we now use the short_channel_id to identify the next hop we need
to resolve the channel_id to the pubkey of the next hop. This is done
by calling out to `gossipd` and stuffing the necessary information
into `htlc_end` and recovering it from there once we receive a reply.
This was overly complex since it was off-by-one and we were storing
some information elsewhere. Now this just loads the route as is into
structs, extracts some information for our outgoing HTLC, and then
shifts by the array of structs by one, and finally fills in the last
instruction, which is the terminal.
The new onion uses the `channel_id` instead of the `node_id` of the
next hop to identify where to forward the payment. So we return the
exact channel chosen by the routing algo, to avoid having to look it
up again later.
Mainly switching from the old include to the new include and adjusting
the actual size of the onion packet. It also moves `channel.c` to use
`struct hop_data`.
It introduces a dummy next hop in `channel.c` that will be replaced in
the next commit.
Adds a new command line flag `--dev-broadcast-interval=<ms>` that
allows us to specify how often the staggered broadcast should
trigger. The value is passed down to `gossipd` via an init message.
This is mainly useful for integration tests, since we do not want to
wait forever for gossip to propagate.
We were using an uninitialized `broadcast_index` on the peer which
would occasionally result in no forwardings at all, segmenting the
network. And during the `msg_queue` refactor, some wait targets were
not updated, resulting in the waits never to be woken up.
This moves all the non-legacy blackbox testing into python.
Before:
real 10m18.385s
After:
real 9m54.877s
Note that this doesn't valgrind the subdaemons: that patch seems to cause
some issues in the python framework which I am still chasing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than dumping all gossip messages then handling local ones again.
This should help us give timely ping replies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fails on the old dev-restart tests, so we need to only enable it
for the new tests:
rusty@rusty-XPS-13-9360:~/devel/cvs/lightning (guilt/ping-pong)$ daemon/test/test-basic --restart --verbose
...
{ }
RESTARTING
dev-restart failed!
valgrind: mmap(0x38000000, 2265088) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Only the side *accepting* the connection gives a `minumum_depth`, but both
sides are supposed to wait that long:
BOLT #2:
### The `funding_locked` message
...
#### Requirements
The sender MUST wait until the funding transaction has reached
`minimum-depth` before sending this message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We now have two partially overlapping state-machines: the channel
state and the announcement state. We need to request signatures from
the HSM to exchange them with the peer, and we need to have both sets
of signatures before we can proceed and send the actual announcements.
Instead of reusing HSMFD_ECDH, we have an explicit channeld hsm fd,
which can do ECDH and will soon do channel announce signatures as well.
Based-on: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We *should* split the struct into key and data, rather than only comparing
the key parts in the htlc_end_eq function. But meanwhile, this fixes
the code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us link HTLCs from one peer to another; but for the moment it
simply means we can adjust balance when an HTLC is fulfilled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is an approximate result (it's only our confirmed balance, not showing
outstanding HTLCs), but it gives an easy way to check HTLCs have been
resolved.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a peer dies, and then we get a reply, that can cause access after free.
The usual way to handle this is to make the request a child of the peer,
but in fact we still want to catch (and disard) it, so it's a little
more complex internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We call channel_sent_commit *before* sending (so we know if we need
to), so the name is wrong. Similarly channel_sent_revoke_and_ack.
We can usefully have them tell is if there is outstanding work to do,
too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Passing through 'struct peer *' was a layering violation.
Reported-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The three cases we care about only happen on specific transitions:
1. They can no longer spend our failed HTLC: we can fail the source now.
2. They are fully committed to their new HTLC htlc: we can forward now.
3. They can no longer timeout their fulfilled HTLC: the funds are ours.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The direction bit was computed in several spots and was inconsistent
in some cases. Now we compute it just in routing, and once when
starting up `channeld`, this avoids recomputing it all over the place.
Now we correctly use the remote revocation basepoint, we need to set
it in run-channel (instead of the local revocation basepoint).
We also update all the comments, as per (pending) spec commit:
https://github.com/lightningnetwork/lightning-rfc/pull/137
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before exiting, `channeld` constructs and sends a `channel_update`
marking the channel as disabled. This is the pro-active signalling
that the channel may no longer be used.
Copied the JSON-request parsing from `pay.c`, passing through to
`gossipd`, filling the reply with the `route_hop` serialization, and
serializing as JSON-RPC response.
The `route_hop` struct introduced in the previous refactoring is
reused when returning the reply to a `getroute` request. Since these
are nested messages I added the serialization and deserialization
methods.
This came up while debugging the gossip daemon breaking upon calling
`getroute`. It turns out that log was still writing to stdout, but
stdout had been reused for an inter-daemon socket, which would
break...
The STDOUT fd being reused as communication sockets with other daemons
was causing some unexpected crashes if the sub-daemon wrote something,
e.g., using `log_*`. Not closing it should avoid that conflict.
Some of the struct array helpers need to allocate data when
deserializing their fields. The `getnodes` reply is one such example
that allocates the hostname. Since the change to calling array helpers
the getnodes call was broken because it was attempting to allocate off
of the entry, which did not have a tal header, thus failing.
Use msg_enqueue's wake and msg_queue_wait, and don't clone packets since
msg_enqueue() respects take.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We remove the unused status_send_fd, and rename status_send_sync (it
should only be used for that case now).
We add a status_setup_async(), and wire things internally to use that
if it's set up: status_setup() is renamed status_setup_sync().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a little more awkward, as we used to do some work
synchronously (the init message), but it's still pretty clear.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we use async IO, we can't use status_send. We keep a pointer to the
master daemon_conn, and use that to send.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The gossip subdaemon previously passed the fd after init: this is
unnecessary for peers which simply want to gossip (and not establish
channels).
Now we hand the gossip fd back with the peer fd. This adds another
error message for when we fail to create the gossip fds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of indicating where to place the fd, you say how many: the
fd array gets passed into the callback.
This is also clearer for the users.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This measn that gossip (which also wants to wake it) needs to wake
the queue, not the daemon_conn.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the fourth value (size) to determine the type, unless the fifth
value is suppled. That's silly: allow the fourth value to be a typename,
since that's the only reason we care about the size at all!
Unfortunately there are places in the spec where we use a raw fieldname
without '*1' for a length, so we have to distingish this from the
typename case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Except for the trivial case of u8 arrays, have the generator create
the loop code for the array iteration.
This removes some trivial helpers, and avoids us having to write more.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
subd_req() needs to get the type before it calls subd_send_msg, because
if it's take() then msg_enqueue() may reallocate.
Which also made me realize that subd_send_message() should not try to dup,
since msg_enqueue() handles that itself.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have some duplication in handling queues, so this is an attempt at
deduplicating some of that work. `daemon_conn` now uses the
`msg_queue` and `channeld` was also migrated to `msg_queue`. At the
same time I made `msg_queue` create a copy of the messages or takes
over messages marked with `take()`. This should make cleaning up
messages easier.
This allows us to break out of the normal queue-based write loop and
handle things ourself for a while. Currently this is used to trigger
regular gossip dumps that do not proceed until the buffers have been
cleared in order to avoid memory-explosions.
We were firing off the wakeup timers all over the place, out of fear
that we would be triggering two concurrent broadcasts. This is not
really the case since the wakeup calls are idempotent. This also
allows us not to differentiate between triggering a broadcast on a
local peer or on a proxied peer.
This includes some code duplication, but since the two write targets
are fundamentally different we might need to refactor a bit more to
unify them again.
We will eventually ween off of the logging, or replace it with status
messages that log in `lightningd`, but for now we still have the
routing module that does some logging.
This uses a single fd for both status and control.
To make this work, we enforce the convention that replies are the same
as requests + 100, and that their name ends in "_REPLY".
This also means that various daemons can simply exit when done; there's
no race between reading request and closing status fds.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was included in the lightningnetwork/lightning-rfc#105 update
to the test vectors, and it's a good idea. Takes a bit of work to
calculate (particularly, being aware of rounding issues).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
aka "BOLT 3: Use revocation key hash rather than revocation key",
which builds on top of lightningnetwork/lightning-rfc#105 "BOLT 2,3,5:
Make htlc outputs of the commitment tx spendable with revocation key".
This affects callers, since they now need to hand us the revocation
pubkey, but commit_tx has that already anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All the daemons will use a common seed for point derivation, so drag
it out of lightningd/opening.
This also provide a nice struct wrapper to reduce argument count.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should start watching for the transaction before we send the
signature; we might miss it otherwise. In practice, we only see
transactions as they enter a block, so it won't happen, but be
thorough.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit tricky: for our signing code, we don't want scriptsigs,
but to calculate the txid, we need them. For most transactions in lightning,
they're pure segwit so it doesn't matter, but funding transactions can
have P2SH-wrapped P2WPKH inputs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
built_utxos needs to calculate fees for figuring out how many utxos to
use, so fix that logic and rely on it.
We make build_utxos return a pointer array, so funding_tx can simply hand
that to permute_inputs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The spec 4af8e1841151f0c6e8151979d6c89d11839b2f65 uses a 32-byte 'channel-id'
field, not to be confused with the 8-byte short ID used by gossip. Rename
appropriately, and update to the new handshake protocol.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use a different 'struct peer' in the new daemons, so make sure
the structure isn't assumed in any shared files.
This is a temporary shim.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Only minor changes, but I add some more spec text to
lightningd/test/run-commit_tx.c to be sure to catch if it changes
again.
One reference isn't upstream yet, so had to be commented out.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we've tested it:
1. open_channel needs to write response to REQ_FD not STATUS_FD.
2. recv_channel needs to send our next_per_commit, not echo theirs!
3. print the problematic signature if it's wrong, not our own.
Cleanups:
1. Return the message from open_channel/recv_channel for simplicity.
2. Trace signing information.
3. More tracing messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The signing helper was really just for testing, so remove it. But
turn the funding_tx() function into a useful one by making it take the
utxo array.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I made it privkey to prove we owned one key, but without the HSM checking
we have a valid sig for the first commitment transaction, and that
we haven't revealed the revocation secret key, why bother?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Or for blackbox tests --gdb1=<subdaemon> / --gdb2=<subdaemon>.
This makes the subdaemon wait as soon as it's execed, so we can attach
the debugger.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We should check that the peer it says it's returning is under its control,
we need to take back the peer fd, and use the correct conversion routine
for the packet it sends us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For the moment this is simply handed through to lightningd for
generating the per-peer secrets; eventually the HSM should keep it and
all peer secret key operations would be done via HSM-ops.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Raw crypto_state is what we send across the wire: the peer one is for
use in async crypto io routines (peer_read_message/peer_write_message).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The requirements for accepting the remote config are more complex than
a simple min/max value, as various parameters are related. It turns
out that with a few assumptions, we can boil this down to:
1. The valid feerate range.
2. The minimum effective HTLC throughput we want
3. The a maximum delay we'll accept for us to redeem.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Unless the transaction is confirmed, the UTXOs should be released if
something happens to the peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
wire_sync_write() adds length, but we already have it, so use write_all.
sync_crypto_read() handed an on-stack buffer to cryptomsg_decrypt_header,
which expected a tal() pointer, so use the known length instead.
sync_crypto_read() also failed to read the tag; add that in (no
overflow possible as 16 is an int, len is a u16).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The peer is woken up every 30 seconds to deliver the backlog of
messages. Additionally I added the normal message queue to be able to
send non-gossip message to the peer.
Turns out we want to permute transactions for the wallet too, so we
use void ** rather than assume we're shuffling htlc ** (and do inputs,
too!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This object is basically the embodyment of BOLT #2. Each HTLC already
knows its own state; this moves them between states and keeps them
consistent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's currently written to produce "local" commit-txs, but of course we
need to produce remote ones too, for signing.
Thus instead of using "remote" and "local" we use "other" and "self",
and indicate with a single "side" flag which we're generating (because
that changes how HTLCs are interpreted).
This also adds to the tests: generate the remote view of the commit_tx
and make sure it matches!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were using the remote per_commitment_point instead of the local
per_commitment_point to generate the remotekey for the local transaction.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's awkward to handle them differently. But this change means we
need to expose them to the generated code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to have a permutation map; this reintroduces a variant which
uses the htlc pointers directly.
We need this because we have to send the htlc-tx signatures in output
order as part of the protocol: without two-stage HTLCs we only needed
to wire them up in the unilateral spend case so we simply brute-forced
the ordering.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Moved the broadcast functionality to broadcast.[ch]. So far this
includes only the enqueuing side of broadcasts, the dequeuing and
actual push to the peer is daemon dependent. This also adds the
broadcast_state to the routing_state and the last broadcast index to
the peer for the legacy daemon.
This used to be part of `lightningd_state` which is being split up for
the various subdaemons. The main change is the addition of the `struct
routing_state` in `routing.h` and the addition of `rstate` in `struct
lightningd_state` for backwards compatibility.
We can't run them in parallel, but we can at least have 'make check'
run them all.
Developers should be running "make check-source && make check".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The problem with wire headers not being generated in time before stuff
depended on it turns out to be related with inclusion order of
sub-makefiles. The inclusions must preceed the use of
LIGHTNINGD_HEADERS since they append to that variable.
Now we hand peers off to the gossip daemon, to do the INIT handshake and
re-transmit/receive gossip. They may stay there forever if neither we nor
them wants to open a channel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's a bit messy, since some status messages are accompanied by an FD:
in this case, the handler returns STATUS_NEED_FD and we read that then
re-call the handler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These use the same infrastructure as the daemon/test blackbox tests,
so they're not currently wired into make check; use make
"lightningd-blackbox-tests".