We need this later, to generate its seed. When we switch to lnd's key system,
we'll only need this, and not peerid.
Note also that the peerid is not just for messages any more, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Well, it's generated by shachain, so technically it is a sha256, but
that's an internal detail. It's a secret.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- fixes problem with polling interval > 150 * 0.9
- fixes log message 'feerate hit floor' at every feerate change
- smoothed fee now reaches 90% of (exp weighted) fee estimates polled in last
120s, independent of polling interval
- only apply smoothing when effect > 10 percent so it doesn't correct forever
- fix indentation
This is a cosmetic change only. No functional changes.
I shortened the names of macros and changed param_parse() to param().
Also went through params.h with a fine-toothed comb and updated the comments
to reflect the current API.
I wanted to change the files:
params.c -> param.c
params.h -> param.h
run-params.c -> run->param.c
but that confused `git diff` for params.h so its best left for another PR.
I'm keeping #1682 updated locally with all these changes.
Signed-off-by: Mark Beckwith <wythe@intrig.com>
@wythe points out that many cases want a default value, not NULL.
Nicer to do it in the param_parse() call.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I'm not completely convinced that it's only ever set to a failcode
with the BADONION bit set, especially after the previous patches in
this series. Now that channeld can handle arbitrary failcodes passed
this way, simply rename it.
We add marshalling assertions that only one of failcode and failreason
is set, and we unmarshal an empty 'fail' to NULL (just the the
generated unmarshalling code does).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
==1224== Uninitialised byte(s) found during client check request
==1224== at 0x152CAD: memcheck_ (mem.h:247)
==1224== by 0x152D18: towire (towire.c:17)
==1224== by 0x152DA1: towire_u16 (towire.c:28)
==1224== by 0x142189: towire_failed_htlc (htlc_wire.c:29)
==1224== by 0x16343F: towire_channel_init (gen_channel_wire.c:596)
==1224== by 0x115C2C: peer_start_channeld (channel_control.c:249)
==1224== by 0x131701: peer_connected (peer_control.c:503)
==1224== by 0x117820: gossip_msg (gossip_control.c:182)
==1224== by 0x139D97: sd_msg_read (subd.c:500)
==1224== by 0x139676: read_fds (subd.c:327)
==1224== by 0x179D52: next_plan (io.c:59)
==1224== by 0x17A84F: do_plan (io.c:387)
==1224== Address 0x1ffefffabe is on thread 1's stack
==1224== in frame #2, created by towire_u16 (towire.c:26)
Followed by:
2018-06-18T21:53:04.129Z lightningd(1224): 03933884aaf1d6b108397e5efe5c86bcf2d8ca8d2f700eda99db9214fc2712b134 chan #1: Peer permanent failure in CHANNELD_NORMAL: lightning_channeld: received ERROR channel d0101486543e1a8b6871556a4fe1fba4ad4d83ce7f6f92919fd17bd1545d2fd5: UpdateFailMalformedHtlc message doesn't have BADONION bit set
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
typesafe_cb isn't suitable here, as it is simply a conditional cast,
and the result is passed through '...' and doesn't matter.
Reported-by: @wythe
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_listpeers returns an array of peers, and an array of nodes: the latter
is a subset of the former, and is used for printing alias/color information.
This changes it so there is a 1:1 correspondance between the peer information
and nodes, meaning no more O(n^2) search.
If there is no node_announce for a peer, we use a negative timestamp
(already used to indicate that the rest of the gossip_getnodes_entry
is not valid).
Other fixes:
1. Use get_node instead of iterating through the node map.
2. A node without addresses is perfectly valid: we have to use the timestamp
to see if the alias/color are set. Previously we wouldn't print that
if it didn't also advertize an address.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There doesn't seeem to be a need for this anymore (unless I'm missing something).
I added the sendpay_nulltok() unit test to confirm.
Signed-off-by: Mark Beckwith <wythe@intrig.com>
@rustyrussell showed we don't need temporary objects for params.
This means params no longer need a tal context.
Signed-off-by: Mark Beckwith <wythe@intrig.com>
@wythe points out we don't need to keep the around now param_is_set()
is removed. We can in fact go further and avoid marshalling them into
temporary objects at the caller altogether.
This means internally we have an array of struct param, rather than an
array of 'struct param *', which causes most of the noise in this
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're using a macro anyway, so appending "" make it a compile-time check.
Complicates testing a bit, since we actually use generated names there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit more natural, IMHO. The only issue is that json_tok_tok is
special, so we end up with param_opt_tok() if you really want an optional
generic token.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is part of #1464 and incorporates Rusty's suggested updates from #1569.
See comment in param.h for description, here's the basics:
unsigned cltv;
const jsmntok_t *note;
u64 msatoshi;
struct param * mp;
if (!param_parse(cmd, buffer, tokens,
param_req("cltv", json_tok_number, &cltv),
param_opt("note", json_tok_tok, ¬e),
mp = param_opt("msatoshi", json_tok_u64, &msatoshi),
NULL))
return;
if (param_is_set(mp))
do_something()
There is a lot of developer mode code to make sure we don't make mistakes,
like trying to unmarshal into the same variable twice or adding a required param
after optional.
During testing, I found a bug (of sorts) in the current system. It allows you
to provide two named parameters with the same name without error; e.g.:
# cli/lightning-cli -k newaddr addresstype=p2sh-segwit addresstype=bech32
{
"address": "2N3r6fT65PhfhE1mcMS6TtcdaEurud6M7pA"
}
It just takes the first and ignores the second. The new system reports this as an
error for now. We can always change this later.
structeq() is too dangerous: if a structure has padding, it can fail
silently.
The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).
Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.
Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`getinfo` has been providing the blockheight for a good while and doesn't
require the `DEVELOPER=1` flag during compilation, so it should be the preferred
method to retrieve the blockchain height.
Implements an EWMA for the fee estimation. Achieves 90% influence of the newer
fee after 5 minutes, and adjusts to the polling rate that is configured.
Gossipd will ignore the second one, but doing it in the front end
gives an explicit error message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
New codes: FUND_MAX_EXCEEDED, FUND_CANNOT_AFFORD, FUND_DUST_LIMIT_UNMET.
The error message "Cannot afford fee" was not exactly correct because
it would also occur if the amount requested could not be afforded. So
I changed it to the more generic "Cannot afford transaction".
Other things:
* Fixed off-by-one satoshi in fundchannel manpage.
* Changed 'arror' to 'error' because we are not pirates.
Turn req_running into a pointer to the current bcli structure, which means
the leak detection can find it.
Also suppress leaks in the case where we're only attached to a timer
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
During a meeting earlier this week we agreed with Eclair to temporarily
increase the final CLTV delta in our invoices to establish
compatibility with the already deployed Eclair wallets. They in turn
agreed to remove the enforcement of higher final CLTV deltas, or bump
it locally should it not match their expectations as allowed by
BOLT 11. This has since been implemented in ACINQ/eclair#627.
satoshis.place was slowing to a crawl, c-lightning was unresponsive.
Logs revealed charged doing many, many listinvoice <label> RPCs.
We were iterating the entire db every time: stop that!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The fee range can sometimes cause channels to be closed when the estimator
jumps. This has been the case a few times in the last months, and causes a
number of channels to be closed, and issue reports to be filed.
Increasing this from 5x to 10x should get rid of 84%+ of these
closures (measured based on 1h windows over the last 6 months and assuming
worst case situations).
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I still believe that 2 weeks is way too much, but we were promised that these
defaults would be slowly reduced to saner values as the stability increases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Compares the `blocknum` in the `short_channel_id` with the range of blocks we
store in the database and abort if we should have known about it. Avoids
bombarding `bitcoind` with requests for channels that have already been spent or
were invalid in the first place.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're telling gossipd about disconnections anyway, so let's just use that signal
to disable both sides of the channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was failing some of our integration tests, i.e., the ones closing a channel
and not waiting for sigexchange. The remote node would often not be quick enough
to send us its disabling channel_update, and hence we'd still remember the
incoming direction. That could then be sent out as part of an invoice, and fail
subsequently. So just set both directions to be disabled and let the onchain
spend clean up once it happens.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Until now, `command_fail()` reported an error code of -1 for all uses.
This PR adds an `int code` parameter to `command_fail()`, requiring the
caller to explicitly include the error code.
This is part of #1464.
The majority of the calls are used during parameter validation and
their error code is now JSONRPC2_INVALID_PARAMS.
The rest of the calls report an error code of LIGHTNINGD, which I defined to
-1 in `jsonrpc_errors.h`. The intention here is that as we improve our error
reporting, all occurenaces of LIGHTNINGD will go away and we can eventually
remove it.
I also converted calls to `command_fail_detailed()` that took a `NULL` `data`
parameter to use the new `command_fail()`.
The only difference from an end user perspecive is that bad input errors that
used to be -1 will now be -32602 (JSONRPC2_INVALID_PARAMS).
This resolves the problem where both channeld and gossipd can generate
updates, and they can have the same timestamp. gossipd is always able
to generate them, so can ensure timestamp moves forward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we have too many which are never used and I don't want to document
them.
1. Remove unused anchor_onchain_wait. When implemented, it should be
hardcoded to 100 or more.
2. Remove anchor_confirms_max. 10 always reasonable, and we can readd
an override option should someone need it.
3. max_htlc_expiry should be the same as locktime_max (which increases
from 3 to 5 days by default): they're both a limit on how long
funds can be locked up.
4. channel_update_interval should always be a dev option.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Make --override-fee-rates a dev option. We use default-fee-rate in
its place, which (since bitcoind won't give fee estimates in regtest
mode for short chains) gives an effective feerate of 15000/7500/3750.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We never hit the guess_feerate() path, because we turned a 0 ("can't
estimate fee") into 253.
This also revealed that we weren't initializing topo->feerate, and
that we were giving spurious updates even if we were using override-fee-rates.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Just have a "new depth" callback, and let channeld do the right thing.
This makes the channeld paths a bit more straightforward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tor wasn't actually working for me to connect to anything, but it worked
for 'ssh -D' testing.
Note that the resulting 'netaddr' is a bit weird, but I guess it's honest.
$ ./cli/lightning-cli connect 021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b
{
"id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b"
}
$ ./cli/lightning-cli listpeers
{
"peers": [
{
"state": "GOSSIPING",
"id": "021f2cbffc4045ca2d70678ecf8ed75e488290874c9da38074f6d378248337062b",
"netaddr": [
"ln1qg0je0lugpzu5ttsv78vlrkhteyg9yy8fjw68qr57mfhsfyrxurzkq522ah.lseed.bitcoinstats.com:9735"
],
"connected": true,
"owner": "lightning_gossipd"
}
]
}
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is useful for the next patch, where we want to hand the unresolved
name through to the proxy.
This also addresses @Saibato's worry that we still called getaddrinfo()
(with the AI_NUMERICHOST option) even if we didn't want a lookup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. If we have a channel_announcement, the channel is public, otherwise
it's not. Not all channels are public, as they can be local: those
have a NULL channel_announcement.
2. If we don't have a channel_update, we know nothing about that half
of the channel, and no other fields are valid.
3. We can tell if a half channel is disabled by the flags field directly.
Note that we never send halfchannels without an update over
gossip_getchannels_reply so that marshalling/unmarshalling can be
vastly simplified.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means it will effect connect commands too (though it's too
late to stop DNS lookups caused by commandline options).
We also warn that this is one case where we allow forcing through Tor
without a proxy set: it just means all connections will fail.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This takes the Tor service address in the same option, rather than using
a separate one. Gossipd now digests this like any other type.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For the moment, this is a straight handing of current parameters through
from master to the gossip daemon. Next we'll change that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently it's always for messages to peer: make that status_peer_io and
add a new status_io for other IO.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Risks leakage. We could do lookup via the proxy, but that's a TODO.
There's only one occurance of getaddrinfo (and no gethostbyname), so
we add a flag to the callers.
Note: the use of --always-use-proxy suppresses *all* DNS lookups, even
those from connect commands and the command line.
FIXME: An implicit setting of use_proxy_always is done in gossipd if it
determines that we are announcing nothing but Tor addresses, but that
does *not* suppress 'connect'.
This is fixed in a later patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rename tor_proxyaddrs and tor_serviceaddrs to tor_proxyaddr and tor_serviceaddr:
the 's' at the end suggests that there can be more than one.
Make them NULL or non-NULL, rather than using all-zero if unset.
Hand them the same way to gossipd; it's a bit of a hack since we don't
have optional fields, so we use a counter which is always 0 or 1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's no reason to do this async, and far easier to follow using normal
read/write.
The previous parsing was deeply questionable, using substring searches
only, and relying on the fact that a single non-blocking read would get
the entire response. This is changed to do (somewhat) proper parsing
using ccan/rbuf.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is simply the code to set up the automatic hidden service, so move
it into lightningd.
I removed the undefined parse_tor_wireaddr, and added a parameter name
to the create_tor_hidden_service_conn() declaration for update-mocks.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a rebased and combined patch for Tor support. It is extensively
reworked in the following patches, but the basis remains Saibato's work,
so it seemed fairest to begin with this.
Minor changes:
1. Use --announce-addr instead of --tor-external.
2. I also reverted some whitespace and unrelated changes from the patch.
3. Removed unnecessary ';' after } in functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Someone could try to announce an internal address, and we might probe
it.
This breaks tests, so we add '--dev-allow-localhost' for our tests, so
we don't eliminate that one. Of course, now we need to skip some more
tests in non-developer mode.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're given a wildcard address, we can't announce it like that: we need
to try to turn it into a real address (using guess_address). Then we
use that address. As a side-effect of this cleanup, we only announce
*any* '--addr' if it's routable.
This fix means that our tests have to force '--announce-addr' because
otherwise localhost isn't routable.
This means that gossipd really controls the addresses now, and breaks
them into two arrays: what we bind to, and what we announce. That is
now what we return to the master for json_getinfo(), which prints them
as 'bindings' and 'addresses' respectively.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Add special option where an empty host means 'wildcard for IPv4 and/or IPv6'
which means ':1234' can be used to set only the portnum.
2. Only add this protocol wildcard if --autolisten=1 (default)
and no other addresses specified.
3. Pass it down to gossipd, so it can handle errors correctly: in most cases,
it's fatal not to be able to bind to a port, but for this case, it's OK
if we can only bind to one of IPv4/v6 (fatal iff neither).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This replacement is a little menial, but it explicitly catches all
the places where we allow a local socket. The actual implementation of
opening a AF_UNIX socket is almost hidden in the patch.
The detection of "valid address" is now more complex:
p->addr.itype != ADDR_INTERNAL_WIREADDR || p->addr.u.wireaddr.type != ADDR_TYPE_PADDING
But most places we do this, we should audit: I'm pretty sure we can't
get an invalid address any more from gossipd (they may be in db, but
we should fix that too).
Closes: #1323
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It does all the other address handling, do this too. It also proves useful
as we clean up wildcard address handling.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we only bind to addresses in our wireaddrs array, we would not
autobind to local sockets if they couldn't reach google's nameserver.
That's clearly wrong: we should only not bind if there's a protocol
issue (eg. no IPv6 support).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's become clear that our network options are insufficient, with the coming
addition of Tor and unix domain support.
Currently:
1. We always bind to local IPv4 and IPv6 sockets, unless --port=0, --offline,
or any address is specified explicitly. If they're routable, we announce.
2. --addr is used to announce, but not to control binding.
After this change:
1. --port is deprecated.
2. --addr controls what we bind to and announce.
3. --bind-addr/--announce-addr can be used to control one and not the other.
4. Unless --autolisten=0, we add local IPv4 & IPv6 port 9735 (and announce if they are routable).
5. --offline still overrides listening (though announcing is still the same).
This means we can bind to as many ports/interfaces as we want, and for
special effects we can announce different things (eg. we're sitting
behind a port forward or a proxy).
What remains to implement is semi-automatic binding: we should be able
to say '--addr=0.0.0.0:9999' and have the address resolve at bind
time, or even '--addr=0.0.0.0:0' and have the port autoresolve too
(you could determine what it was from 'lightning-cli getinfo'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We set no_reconnect with --offline, but that doesn't work if !DEVELOPER.
Make the flag positive, and non-DEVELOPER mode for gossipd.
We also don't override portnum with --offline, but have an explicit
'listen' flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can create the hsm file from python directly; that works even if we
don't have DEVELOPER set, and is simpler.
We add a test that the aliases are correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Originally we were supposed to tell the HSM we had just created the directory,
otherwise it wouldn't create a new seed. But we modified it to check if
there was a seed file anyway: just move that logic into a branch of hsmd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No new functionality, just a continuation of my work toward completing #665.
I removed the common members of `struct withdrawal` and `struct fund_channel`
and placed them in a new `struct wallet_tx`. Then it was fairly straightforward
to reimplement the existing code in terms of `wallet_tx`.
Since I made some structural changes I wanted to get this approved before I
go any farther.
Added 'all' to fundchannel help message.
Fixes: #1445
Hacky fix, possibly. First cut at avoiding starting up onchaind and gossipd (which might make queries of chaintopology, which might start up a bitcoin-cli) before we can daemonize.
We're getting spurious closures, even on mainnet. Using --ignore-fee-limits
is dangerous; it's slightly less so to lower the minimum (which is the
usual cause of problems).
So let's halve it, but beware the floor.
This is a workaround, until we get independent feerates in the spec.
Fixes: #613
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means gossipd is live and we can tell it things, but it won't
receive incoming connections. The split also means that the main daemon
continues (eg. loading peers from db) while gossipd is loading from the store,
potentially speeding startup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we start accepting peer connections before we initialized some of the other
parts (mainly the chaintopology) we could end up asking for stuff that isn't
ready yet (blockchain head for example).
Signed-off-by: Christian Decker <decker.christian@gmail.com>
If channeld dies for some reason (eg, reconnect) and we didn't yet announce
the channel, we can miss doing so. This is unusual, because if lightningd
restarts it rearms the callback which gives us funding_locked, so it only
happens if just channel dies before sending the announcement message.
This problem applies to both temporary announcement (for gossipd) and
the real one. For the temporary one, simply re-send on startup, and
remote the error msg gossipd gives if it sees a second one. For the
real one, we need a flag to tell us the depth is sufficient; the peer
will ignore re-sends anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we get a reconnection, kill the current remote peer, and wait for the
master to tell us it's dead. Then we hand it the new peer.
Previously, we would end up with gossipd holding multiple peers, and
the logging was really hard to interpret; I'm not completely convinced
that we did the right thing when one terminated, either.
Note that this now means we can have peers with neither ->local nor ->remote
populated, so we check that more carefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently we intuit it from the fd being closed, but that may happen out
of order with when the master thinks it's dead.
So now if the gossip fd closes we just ignore it, and we'll get a
notification from the master when the peer is disconnected.
The notification is slightly ugly in that we have to disable it for
a channel when we manually hand the channel back to gossipd.
Note: as stands, this is racy with reconnects. See the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).
In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).
This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.
This turns out to be quite simple, and is probably how I should have
done it originally :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Lifetime of 'struct reaching' now only while we're actively doing connect.
2. Always free after a single attempt: if it's an important peer, retry
on a timer.
3. Have a single response message to master, rather than relying on
peer_connected on success and other msgs on failure.
4. If we are actively connecting and we get another command for the same
id, just increment the counter
The result is much simpler in the master daemon, and much nicer for
reconnection: if they say to connect they get an immediate response,
rather than waiting for 10 retries. Even if it's an important peer,
it fires off another reconnect attempt, unless it's actively
connecting now.
This removes exponential backoff: that's restored in next patch. It
also doesn't handle multiple addresses for a single peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And on channel_fail_permanent and closing (the two places we drop to
chain), we tell gossipd it's no longer important.
Fixes: #1316
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These don't have a maximum number of reconnect attempts, and ensure
that we try to reconnect when the peer dies.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Simplification of the offset calculation to use the rescan parameter, and rename
of `wallet_first_blocknum`. We now use either relative rescan from our last
known location, or absolute if a negative rescan was given. It's all handled in
a single location (except the case in which the blockcount is below our
precomputed offset), so this should reduce surprises.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is intended to recover from an inconsistent state, involving
`onchaind`. Should we for some reason not restore the `onchaind` process
correctly we can instruct `lightningd` to go back in time and just replay
everything.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Since we reference the channel ID to allow cascades in the database we also need
the ability to look up a channel by its database ID.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This will allow us in the next commit to store the transactions that triggered
this event in the DB and thus allowing us to replay them later on.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We used to queue the preimages to be sent to onchaind only after receiving the
onchaind_init_reply. Once we start replaying we might end up in a situation in
which we queue the tx that onchaind should react to before providing it with the
preimages. This commit just moves the preimages being sent, making it atomic
with the init, and without changing the order.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
These were so far only used for bolt11 construction, but we'll need them for the
DNS seed as well, so here we just pull them out into their own unit and prefix
them.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
d822ba1ee accidentally removed this case, which is important: if the
other side didn't get our final matching closing_signed, it will
reconnect and try again. We consider the channel no longer "active"
and thus ignore it, and get upset when it send the
`channel_reestablish` message.
We could just consider CLOSINGD_COMPLETE to be active, but then we'd
have to wait for the closing transaction to be mined before we'd allow
another connection.
We can't special case it when the peer reconnects, because there
could be (in theory) multiple channels for that peer in CLOSINGD_COMPLETE,
and we don't know which one to reestablish.
So, we need to catch this when they send the reestablish, and hand
that msg to closingd to do negotiation again. We already have code
to note that we're in CLOSINGD_COMPLETE and thus ignore any result
it gives us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're about to remove automatic retrying of connect, and that uncovered
that we actually print out our "Server started" message before we create
the listening socket.
Move the init higher (outside the db transaction) and make it a
request/response, the loop until it's done.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The new connect code revealed an existing race: we tell gossipd to
release the peer, but at the same time it connects in. gossipd fails
the release because the peer is remote, and json_fundchannel fails.
Instead, we catch this race when we get peer_connected() and we were
trying to open a channel. It means keeping a list of fundchannels which
are awaiting a gossipd response though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We missed it in some corner cases where we crashed/were killed between
being told of the lockin and sending the channel_normal_operation message.
When we were restarted, we were told both sides were locked in already,
so we never updated the state.
Pull the entire "tell channeld" logic into channel_control.c, and make
it clear that we need to keep waching if we cant't tell channeld. I think
we did get this correct in practice, since funding_announce_cb has the
same test, but it's better to be clear.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We'd usually commit to the db soon, but there's a window where it
could be missed.
Also moves loc into the block it's used and make it tmpctx to avoid
an explicit free.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Without this, we can get errors on shutdown:
Valgrind error file: valgrind-errors.27444
==27444== Invalid read of size 8
==27444== at 0x1950E2: secp256k1_pubkey_load (secp256k1.c:127)
==27444== by 0x19CF87: secp256k1_ec_pubkey_serialize (secp256k1.c:189)
==27444== by 0x14FED9: towire_pubkey (towire.c:59)
==27444== by 0x15AAFB: towire_gossipctl_peer_disconnected (gen_gossip_wire.c:969)
==27444== by 0x1253EF: opening_channel_errmsg (opening_control.c:526)
==27444== by 0x1386A3: destroy_subd (subd.c:589)
==27444== by 0x18222C: notify (tal.c:240)
==27444== by 0x1826E1: del_tree (tal.c:400)
==27444== by 0x182733: del_tree (tal.c:410)
==27444== by 0x182733: del_tree (tal.c:410)
==27444== by 0x182B1F: tal_free (tal.c:511)
==27444== by 0x11FC53: main (lightningd.c:410)
==27444== Address 0x6c3af98 is 72 bytes inside a block of size 216 free'd
==27444== at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27444== by 0x1827BC: del_tree (tal.c:421)
==27444== by 0x182B1F: tal_free (tal.c:511)
==27444== by 0x11F3C7: shutdown_subdaemons (lightningd.c:211)
==27444== by 0x11FC27: main (lightningd.c:406)
==27444== Block was alloc'd at
==27444== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27444== by 0x182296: allocate (tal.c:250)
==27444== by 0x182863: tal_alloc_ (tal.c:448)
==27444== by 0x12F2DF: new_peer (peer_control.c:74)
==27444== by 0x125600: new_uncommitted_channel (opening_control.c:576)
==27444== by 0x125870: peer_accept_channel (opening_control.c:668)
==27444== by 0x13032A: peer_sent_nongossip (peer_control.c:427)
==27444== by 0x116B9E: peer_nongossip (gossip_control.c:60)
==27444== by 0x116F2B: gossip_msg (gossip_control.c:172)
==27444== by 0x138323: sd_msg_read (subd.c:503)
==27444== by 0x137C02: read_fds (subd.c:330)
==27444== by 0x175550: next_plan (io.c:59)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also report tx and txid, and whether we closed unilaterally or
bilaterally, if we could close the channel.
Also make a manpage.
Fixes: #1207Fixes: #714Fixes: #622
We had an intermittant test failure, where the fee we negotiated was
further from our ideal than the final commitment transaction. It worked
fine if the other side sent the mutual close first, but not if we sent
our unilateral close first.
ERROR: test_closing_different_fees (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_lightningd.py", line 1319, in test_closing_different_fees
wait_for(lambda: p.rpc.listpeers(l1.info['id'])['peers'][0]['channels'][0]['status'][1] == 'ONCHAIN:Tracking mutual close transaction')
File "tests/test_lightningd.py", line 74, in wait_for
raise ValueError("Error waiting for {}", success)
ValueError: ('Error waiting for {}', <function LightningDTests.test_closing_different_fees.<locals>.<lambda> at 0x7f4b43e31a60>)
Really, if we're prepared to negotiate it, we should be prepared to
accept it ourselves. Simply take the cheapest tx which is above our
minimum.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The only use for these was to compute their txids so we could notify depth
in case of reorgs.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We are slowly hollowing out the in-memory blockchain representation to make
restarts easier.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
All of the callback functions were only using the tx to generate the txid again,
so we just pass that in directly and save passing the tx itself.
This is a simplification to move to the DB backed depth callbacks. It'd be
rather wasteful to read the rawtx and deserialize just to serialize right away
again to find the txid, when we already searched the DB for exactly that txid.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This will later allow us to determine the transaction confirmation count, and
recover transactions for rebroadcasts.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Internally both payment and routing use 64-bit, but the interface
between them used 32-bit.
Since both components already support 64-bit we should use that.
In the short_channel_id check we were copying the entire result into the next
bitcoin-cli call, including the newline character.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-By: @gdassori
Creating the pid-file before daemonizing results in the pid-file containing the
pid of the process that started the daemon, but is now dead.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-By: Torkel Rogstad @torkelrogstad
We can have more than one; eg we might offer both bech32 and a p2sh
address, and in future we might offer v1 segwit, etc.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are two very hard problems in software engineering:
1. Off-by-one errors
In this case we were rolling back further than needed and we were starting the
catchup one block further than expected.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
I saw a failure in test_funding_fail():
assert l2.rpc.listpeers()['peers'][0]['connected']
This can happen if l2 hasn't yet handed back to gossipd. Turns out
we didn't mark uncommitted channels as connected:
[{'id': '03afa3c78bb39217feb8aac308852e6383d59409839c2b91955b2d992421f4a41e', 'connected': False, 'channels': [{'state': 'OPENINGD', 'owner': 'lightning_openingd', 'funder': 'REMOTE', 'status': ['Incoming channel: accepted, now waiting for them to create funding tx']}]}]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is probably covered by our "channel capacity" heuristic which
requires the channel be significant, but best to be explicit and sure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
So we know how much counterparty could theoretically steal from us
(msatoshi_to_us - msatoshi_to_us_min) and how much we could
theoretically steal from counterparty (msatoshi_to_us_max -
msatoshi_to_us).
For more piloting goodness.
In particular, the main daemon and subdaemons share the backtrace code,
with hooks for logging.
The daemon hook inserts the io_poll override, which means we no longer
need io_debug.[ch]. Though most daemons don't need it, they still link
against ccan/io, so it's harmess (suggested by @ZmnSCPxj).
This was tested manually to make sure we get backtraces still.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I didn't convert all tests: they can still use a standalone context.
It's just marginally more efficient to share the libwally one for all
our daemons which link against it anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we only remember the actions that added channels then we'd restore them when
re-reading the gossip_store, so put a tombstone in there to remember to delete
it. These will be cleared upon re-writing the store since the announcements wont
be written anymore.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This is necessary since we have onchaind tell us about the
their_unilateral/to_us output, after it is already in a block.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We don't handle \u, since we assume everyone sane is using UTF-8. We'd
still have to reject '\u0000' and maybe other weird cases if we did.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we may want to extend the on-disk format by adding custom information we
may as well just go the extra mile and reuse the serialization primitives we
already have.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
But only if we're actually going to change the feerate, otherwise we'd
log every time.
Suggested-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Naively, this would be 250 satoshi per sipa, but it's not since bitcoind's
fee calculation was not rewritten to deal with weight, but instead bolted
on using vbytes.
The resulting calculations made me cry; I dried my tears on the thorns
of BUILD_ASSERT (I know that makes no sense, but bear with me here as I'm
trying not to swear at my bitcoind colleagues right now).
Fixes: #1194
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This bug is a classic case of being lazy:
1. peer_accept_channel() allocated its return off the input message,
rather than taking an explicit allocation context. This concealed the
lifetime nature of the return.
2. The context for sanitize_error was the error itself, rather than the
more obvious tmpctx (connect_failed does not take).
The global tmpctx removes the "efficiency" excuse for grabbing a random
object to use as context, and is also nice and explicit.
All-the-hard-work-by: @ZmnSCPxj
This fixes the root cause of https://github.com/ElementsProject/lightning/issues/1212
where we deleted the payment because we wanted to retry, then retry failed
so we had an (old) HTLC without a matching payment. We then fed that
HTLC to onchaind, which tells us it's missing, and we try to fail the
payment and deref a NULL pointer.
Fixes: #1212
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would `block_map_add` inside `add_tip`, but we never
`block_map_del` inside `remove_tip`, which is dangerous as
we actually `tal_free` the block inside `remove_tip`.
Our CI did not reliably trap this problem since block
hashes are random and rerunning the `test_blockchaintrack`
often passed spuriously.
If we're going to simply take() a pointer, don't allocate it off a random
object. Using NULL makes our intent clear, particularly with allocating
packets we're going to take() onto a queue.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I did a brief audit of tmpctx uses, and we do leak them in various
corner cases. Fortunely, all our daemons are based on some kind of
I/O loop, so it's fairly easy to clean a global tmpctx at that point.
This makes things a bit neater, and slightly more efficient, but also
clearer: I avoided creating a tmpctx in a few places because I didn't
want to add another allocation. With that penalty removed, I can use
it more freely and hopefully write clearer code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Needed for particular race condition: client calls `sendpay` with
intent to call `waitsendpay` later to get information, but the
payment fails after `sendpay` returns but before client can invoke
`waitsendpay`.
This lets client know of information even if it manages to invoke
`waitsendpay` "late".