Encapsulating the peer state was a win for lightningd; not surprisingly,
it's even more of a win for the other daemons, especially as we want
to add a little gossip information.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We ask gossipd for the channel_update for the outgoing channel; any other
messages it sends us get queued for later processing.
But this is overzealous: we can shunt those msgs to the peer while
we're waiting. This fixes a nasty case where we have to handle
WIRE_GOSSIPD_NEW_STORE_FD messages by queuing the fd for later.
This then means that WIRE_GOSSIPD_NEW_STORE_FD can be handled
internally inside handle_gossip_msg(), since it's always dealt with
the same, simplifying all callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of lightningd telling us when it's ready, we ask it.
This also provides an opportunity to have a plugin hook at this point.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
My raspberry pi node hung up on my other node:
lightning_openingd-... chan #1: Got bad message from gossipd: 0db1
This is because we didn't handle that message in one path.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of reading the store ourselves, we can just send them an
offset. This saves gossipd a lot of work, putting it where it belongs
(in the daemon responsible for the specific peer).
MCP bench results:
store_load_msec:28509-31001(29206.6+/-9.4e+02)
vsz_kb:580004-580016(580006+/-4.8)
store_rewrite_sec:11.640000-12.730000(11.908+/-0.41)
listnodes_sec:1.790000-1.880000(1.83+/-0.032)
listchannels_sec:21.180000-21.950000(21.476+/-0.27)
routing_sec:2.210000-11.160000(7.126+/-3.1)
peer_write_all_sec:36.270000-41.200000(38.168+/-1.9)
Signficant savings in streaming gossip:
-peer_write_all_sec:48.160000-51.480000(49.608+/-1.1)
+peer_write_all_sec:35.780000-37.980000(36.43+/-0.81)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Rename channel_funding_locked to channel_funding_depth in
channeld/channel_wire.csv.
2. Add minimum_depth in struct channel in common/initial_channel.h and
change corresponding init function: new_initial_channel().
3. Add confirmation_needed in struct peer in channeld/channeld.c.
4. Rename channel_tell_funding_locked to channel_tell_depth.
5. Call channel_tell_depth even if depth < minimum, and still call
lockin_complete in channel_tell_depth, iff depth > minimum_depth.
6. channeld ignore the channel_funding_depth unless its >
minimum_depth(except to update billboard, and set
peer->confirmation_needed = minimum_depth - depth).
We need to do it in various places, but we shouldn't do it lightly:
the primitives are there to help us get overflow handling correct.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Basically we tell it that every field ending in '_msat' is a struct
amount_msat, and 'satoshis' is an amount_sat. The exceptions are
channel_update's fee_base_msat which is a u32, and
final_incorrect_htlc_amount's incoming_htlc_amt which is also a
'struct amount_msat'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a side-effect of using amount_msat in gossipd/routing.c, we explicitly
handle overflows and don't need to pre-prune ridiculous-fee channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They're generally used pass-by-copy (unusual for C structs, but
convenient they're basically u64) and all possibly problematic
operations return WARN_UNUSED_RESULT bool to make you handle the
over/underflow cases.
The new #include in json.h means we bolt11.c sees the amount.h definition
of MSAT_PER_BTC, so delete its local version.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Otherwise recent additional checks in tal() complain that we're freeing a
take() pointer. In this case, we're exiting so it's harmless, but it's
still a latent bug.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This existed previously, but code perturbations seem to have revealed it
now: test_bad_opening reports a leak.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is prep work for when we sign htlc txs with
SIGHASH_SINGLE|SIGHASH_ANYONECANPAY.
We still deal with raw signatures for the htlc txs at the moment, since
we send them like that across the wire, and changing that was simply too
painful (for the moment?).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit different from the other cases: we need to iterate through
the peers and ask all the ones in openingd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids some very ugly switch() statements which mixed the two,
but we also take the chance to rename 'towire_gossip_' to
'towire_gossipd_' for those inter-daemon messages; they're messages to
gossipd, not gossip messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That matches the other CSV names (HSM was the first, so it was written
before the pattern emerged).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
- reduces probability for a deadlock where we block on sending data because
the other peer cannot receive because it blocks on sending data etc.
- when either side sends so much data that it fills up the kernel/network buffer
- however sending out gossip can still block when (malicious) peer never receives
@renepickhardt: why is it actually lightningd.c with a d but hsm.c without d ?
And delete unused gossipd/gossip.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, all opening_read_peer_msg() callers need to know there
was an error (presumably, negotiating) so they can stop, but we should
not exit.
This lets us reenable the final disabled test.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't want to exit just because channel parameter negotiation
failed, but we do want to tell the master if it was a channel we were
trying to fund.
Note that lightningd still needs to fail the funding cmd if it gets a
fromwire_opening_fundee (they raced us and won), or an outright
failure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previously master would fail once the channel has been negotiated,
which is terrible, since the funder will have already broadcast tx.
Now we tell them if we have an active channel, and update if it goes away.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Prior to this, lightningd would hand uninteresting peers back to connectd,
which would then return it to lightningd if it sent a non-gossip msg,
or if lightningd asked it to release the peer.
Now connectd hands the peer to lightningd once we've done the init
handshake, which hands it off to openingd.
This is a deep structural change, so we do the minimum here and cleanup
in the following patches.
Lightningd:
1. Remove peer_nongossip handling from connect_control and peer_control.
2. Remove list of outstanding fundchannel command; it was only needed to
find the race between us asking connectd to release the peer and it
reconnecting.
3. We can no longer tell if the remote end has started trying to fund a
channel (until it has succeeded): it's very transitory anyway so not
worth fixing.
4. We now always have a struct peer, and allocate an uncommitted_channel
for it, though it may never be used if neither end funds a channel.
5. We start funding on messages for openingd: we can get a funder_reply
or a fundee, or an error in response to our request to fund a channel.
so we handle all of them.
6. A new peer_start_openingd() is called after connectd hands us a peer.
7. json_fund_channel just looks through local peers; there are none
hidden in connectd any more.
8. We sometimes start a new openingd just to send an error message.
Openingd:
1. We always have information we need to accept them funding a channel (in
the init message).
2. We have to listen for three fds: peer, gossip and master, so we opencode
the poll.
3. We have an explicit message to start trying to fund a channel.
4. We can be told to send a message in our init message.
Testing:
1. We don't handle some things gracefully yet, so two tests are disabled.
2. 'hand_back_peer .*: now local again' from connectd is no longer a message,
openingd says 'Handed peer, entering loop' once its managing it.
3. peer['state'] used to be set to 'GOSSIPING' (otherwise this field doesn't
exist; 'state' is now per-channel. It doesn't exist at all now.
4. Some tests now need to turn on IO logging in openingd, not connectd.
5. There's a gap between connecting on one node and having connectd on
the peer hand over the connection to openingd. Our tests sometimes
checked getpeers() on the peer, and didn't see anything, so line_graph
needed updating.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
openingd calculates our reserve based on the channel amount (even if
we're funding, to keep the calculation in one place), but it wasn't
reporting it back to the master daemon. We initialized it to 0 so that
valgrind wouldn't get upset, as it's part of a structure we send over
the wire.
Have openingd report back, and also initialize it to an impossible value
as extra assurance. And remove a stray (harmless but weird) semicolon.
Reported-by: Gálli Zoltán
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also means we simplify the handle_gossip_msg() since everyone wants it to
use sync_crypto_write().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use 'our_commit' for the commit we sign (ie. the remote commitment tx),
and vice versa. Use local/remote nomenclature.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
structeq() is too dangerous: if a structure has padding, it can fail
silently.
The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).
Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.
Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Some of these are from the master branch, and were not when the query-gossip
extensions were made, so I've had to mark some with FIXME.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we have too many which are never used and I don't want to document
them.
1. Remove unused anchor_onchain_wait. When implemented, it should be
hardcoded to 100 or more.
2. Remove anchor_confirms_max. 10 always reasonable, and we can readd
an override option should someone need it.
3. max_htlc_expiry should be the same as locktime_max (which increases
from 3 to 5 days by default): they're both a limit on how long
funds can be locked up.
4. channel_update_interval should always be a dev option.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a rebased and combined patch for Tor support. It is extensively
reworked in the following patches, but the basis remains Saibato's work,
so it seemed fairest to begin with this.
Minor changes:
1. Use --announce-addr instead of --tor-external.
2. I also reverted some whitespace and unrelated changes from the patch.
3. Removed unnecessary ';' after } in functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This means that openingd and closingd now forward our gossip. But the real
reason we want to do this is that it gives an easy way for gossipd to kill
any active daemon, by closing its fd: previously closingd and openingd didn't
read the fd, so tended not to notice.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).
In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).
This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.
This turns out to be quite simple, and is probably how I should have
done it originally :(
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
'negotiation_failed' is currently just a useless wrapper around
peer_failed (a vestige from when peer_failed would close the
connection). Change it to send different local and remote messages,
and use it wherever we dislike their parameters: stick with
peer_failed if we dislike our own parameters.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is probably covered by our "channel capacity" heuristic which
requires the channel be significant, but best to be explicit and sure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This quotes from the BOLT proposal at https://github.com/lightningnetwork/lightning-rfc/pull/389
Don't try to fund channels which would do this, and don't allow others
to fund channels which would do this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This quotes from the BOLT proposal at https://github.com/lightningnetwork/lightning-rfc/pull/389
Don't try to fund channels with reserve less than dust, nor allow them
to fund channels with reserve less than dust.
Fixes: #632
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, the main daemon and subdaemons share the backtrace code,
with hooks for logging.
The daemon hook inserts the io_poll override, which means we no longer
need io_debug.[ch]. Though most daemons don't need it, they still link
against ccan/io, so it's harmess (suggested by @ZmnSCPxj).
This was tested manually to make sure we get backtraces still.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I did a brief audit of tmpctx uses, and we do leak them in various
corner cases. Fortunely, all our daemons are based on some kind of
I/O loop, so it's fairly easy to clean a global tmpctx at that point.
This makes things a bit neater, and slightly more efficient, but also
clearer: I avoided creating a tmpctx in a few places because I didn't
want to add another allocation. With that penalty removed, I can use
it more freely and hopefully write clearer code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We always hand in "NULL" (which means use tal_len on the msg), except
for two places which do that manually for no good reason.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because peer_failed would previously drop the connection, we had a
special 'negotiation_failed' message which made the master hand it
back to gossipd. We don't need that any more.
This also meant we no longer need a special hook in read_peer_msg
for openingd to send this message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Several daemons (onchaind, hsm) want to use the status messages, but
don't communicate with peers. The coming changes made them drag in
more code they didn't need, so instead we have a different
non-overlapping type.
We combine the status_received_errmsg and status_sent_errmsg
into a single status_peer_error, with the presence or not of the
'error_for_them' field indicating direction.
We also rename status_fatal_connection_lost() to
peer_failed_connection_lost() to fit in.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We make it a macro, since everyone uses PEER_FD and GOSSIP_FD constants
(they're actually always the same, but this is slightly safer), and
add a gossip_index arg: this is groundwork for when we want to hand
the peer back to master for gossipd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we have wirestring, this is much more natural. And with the
24M length limit, we needn't be so concerned about dumping 64k peer
messages in hex.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are now logically arrays of pointers. This is much more natural,
and gets rid of the horrible utxo array converters.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to override the err_pkt handler, so we can tell the master
that it's just a current-channel negotiation failure.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, this one didn't handle them trying to open a different
channel at the same time. Again, deliberately very similar, but
unfortunately different enough that sharing is awkward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Our handling of SIGPIPE was incoherent and inconsistent, and we had much
cut & paste between the daemons. They should *ALL* ignore SIGPIPE, and
much of the rest of the boilerplate can be shared, so should be.
Reported-by: @ZmnSCPxj
Fixes: #528
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_blkid_to_hex() so it's reversed as people expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's just a sha256_double, but importantly when we convert it to a
string (in type_to_string, which is used in logging) we use
bitcoin_txid_to_hex() so it's reversed as people expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We create a temporary tx which is a child of the real tx, for simplicity of
marshalling. That's OK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When gossipd sends a message, have a gossip_index. When it gets back a
peer, the current gossip_index is included, so it can know exactly where
it's up to.
Most of this is mechanical plumbing through openingd, channeld and closingd,
even though openingd and closingd don't (currently) read gossip, so their
gossip_index will be unchanged.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All peers come from gossipd, and maintain an fd to talk to it. Sometimes
we hand the peer back, but to avoid a race, we always recreated it.
The race was that a daemon closed the gossip_fd, which made gossipd
forget the peer, then master handed the peer back to gossipd. We stop
the race by never closing the gossipfd, but hand it back to gossipd
for closing.
Now gossipd has to accept two fds, but the handling of peers is far
clearer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
All the callers need to pass it in: currently channeld and openingd just
fake it by copying the payment point.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In particular, the main daemon needs to pass it about (marshal/unmarshal)
but it won't need to actually use it after the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were sending a channeld message to onchaind, which was v. confusing
due to overlap. We make all the numbers distinct, which means we can
also add an assert() that it's valid for that daemon, which catches
such errors immediately.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.
It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>