They're generally used pass-by-copy (unusual for C structs, but
convenient since they're basically u64) and all possibly problematic
operations return WARN_UNUSED_RESULT bool to make you handle the
over/underflow cases.
The new #include in json.h means bolt11.c sees the amount.h definition
of MSAT_PER_BTC, so delete its local version.
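
A minimal sketch of the pattern described above, not the actual amount.h
code: the struct layout, the helper bodies and the gcc/clang attribute
spelling are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: gcc/clang spelling of the attribute behind WARN_UNUSED_RESULT. */
#define WARN_UNUSED_RESULT __attribute__((warn_unused_result))

/* Basically a u64, but a distinct type so units can't be mixed by accident. */
struct amount_msat {
	uint64_t millisatoshis;
};

/* Returns false on overflow, leaving *val untouched. */
WARN_UNUSED_RESULT bool amount_msat_add(struct amount_msat *val,
					struct amount_msat a,
					struct amount_msat b)
{
	if (a.millisatoshis > UINT64_MAX - b.millisatoshis)
		return false;
	val->millisatoshis = a.millisatoshis + b.millisatoshis;
	return true;
}

/* Returns false on underflow, leaving *val untouched. */
WARN_UNUSED_RESULT bool amount_msat_sub(struct amount_msat *val,
					struct amount_msat a,
					struct amount_msat b)
{
	if (a.millisatoshis < b.millisatoshis)
		return false;
	val->millisatoshis = a.millisatoshis - b.millisatoshis;
	return true;
}
```

Ignoring the return value of either helper draws a compiler warning, which
is the point: the caller has to decide what over/underflow means.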
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We need to still accept it when parsing the database, but this flag
should allow upgrade testing for devs building on top
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Up until now, riskfactor was useless due to implementation bugs, and
also the default setting is wrong (too low to have an effect on
reasonable payment scenarios).
Let's simplify the definition (by assuming that P(failure) of a node
is 1), to make it a simple percentage. I examined the current network
fees to see what would work, and under this definition, a default of
10 seems reasonable (equivalent to 1000 under the old definition).
It is *this* change which finally fixes our test case! The riskfactor
is now 40msat (1500000 * 14 * 10 / 5259600 = 39.9), comparable with
the worst-case fuzz of 50msat (1001 * 0.05 = 50).
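
The arithmetic above as a sketch. Reading the divisor 5259600 as
blocks-per-year (52596) times 100 for the percentage is my assumption, and
risk_msat() is a hypothetical helper, not the routing code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: roughly 52596 blocks per year at one block per ten minutes. */
#define BLOCKS_PER_YEAR 52596

/* Cost in msat of locking amount_msat for delay_blocks, with the
 * riskfactor taken as a simple percentage. */
double risk_msat(uint64_t amount_msat, uint32_t delay_blocks,
		 double riskfactor_percent)
{
	return (double)amount_msat * delay_blocks * riskfactor_percent
		/ (BLOCKS_PER_YEAR * 100.0);
}
```

With the test case's numbers, risk_msat(1500000, 14, 10) comes out just
under 40.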
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were only comparing by total msatoshis.
Note, this *still* isn't sufficient to fix our indirect problem, as
our risk values are all 1 (the minimum):
lightning_gossipd(25480): 2 hop solution: 1501990 + 2
lightning_gossipd(25480): 3 hop solution: 1501971 + 3
...
lightning_gossipd(25480): => chose 3 hop solution
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used a u16, and a 1000 multiplier, which meant we wrapped at
riskfactor 66. We also never undid the multiplier, so we ended up
applying 1000x the riskfactor they specified.
This changes us to pass the riskfactor with a 1M multiplier. The next
patch changes the definition of riskfactor to be more useful.
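
A sketch of the wrap, assuming the riskfactor arrives as a double and was
scaled into a u16 on the wire; the function names and the u64 width of the
replacement are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour (sketch): scale by 1000, then squeeze into 16 bits.
 * Going via uint32_t keeps the narrowing well-defined (modulo 65536). */
uint16_t old_wire_riskfactor(double riskfactor)
{
	return (uint16_t)(uint32_t)(riskfactor * 1000);
}

/* New behaviour (sketch): a 1M multiplier in a type wide enough to hold it. */
uint64_t new_wire_riskfactor(double riskfactor)
{
	return (uint64_t)(riskfactor * 1000000);
}
```

65 * 1000 still fits in a u16, but 66 * 1000 = 66000 wraps to 464.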
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a seed, which is for (future!) unit testing consistency. This
makes it change every time, so our pay_direct_test is more useful.
I tried restarting the node around the loop, but it tended to fail
rebinding to the same port for some reason?
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a general rule, lightningd shouldn't parse user packets. We move the
parsing into gossipd, and have it respond only to permanent failures.
Note that we should *not* unconditionally remove a channel on
WIRE_INVALID_ONION_HMAC, as this can be triggered (and we do!) by
feeding sendpay a route with an incorrect pubkey.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had a bug 0ba547ee10 caused by
short_channel_id overflow. If we'd caught this, we'd have terminated
the peer instead of crashing, so add appropriate checks.
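
The kind of check meant here, as a sketch: scid_from_parts() is a
hypothetical name, and the 24/24/16-bit split is the BOLT 7
short_channel_id layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns false instead of silently truncating a field which doesn't fit:
 * 3 bytes of block height, 3 bytes of tx index, 2 bytes of output index. */
bool scid_from_parts(uint64_t *scid,
		     uint64_t blocknum, uint64_t txnum, uint64_t outnum)
{
	if ((blocknum >> 24) != 0 || (txnum >> 24) != 0 || (outnum >> 16) != 0)
		return false;
	*scid = (blocknum << 40) | (txnum << 16) | outnum;
	return true;
}
```

A false return is the caller's cue to treat the peer's message as invalid.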
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Don't do this:
(gdb) bt
#0 0x00007f37ae667c40 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
#1 0x00007f37ae668b38 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
#2 0x00007f37ae669907 in deflate () from /lib/x86_64-linux-gnu/libz.so.1
#3 0x00007f37ae674c65 in compress2 () from /lib/x86_64-linux-gnu/libz.so.1
#4 0x000000000040cfe3 in zencode_scids (ctx=0xc1f118, scids=0x2599bc49 "\a\325{", len=176320) at gossipd/gossipd.c:218
#5 0x000000000040d0b3 in encode_short_channel_ids_end (encoded=0x7fff8f98d9f0, max_bytes=65490) at gossipd/gossipd.c:236
#6 0x000000000040dd28 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=8) at gossipd/gossipd.c:576
#7 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290511, number_of_blocks=16) at gossipd/gossipd.c:595
#8 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=32) at gossipd/gossipd.c:596
#9 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290495, number_of_blocks=64) at gossipd/gossipd.c:595
#10 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=128) at gossipd/gossipd.c:596
#11 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=256) at gossipd/gossipd.c:595
#12 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=512) at gossipd/gossipd.c:595
#13 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17290431, number_of_blocks=1024) at gossipd/gossipd.c:595
#14 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2047) at gossipd/gossipd.c:596
#15 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4095) at gossipd/gossipd.c:595
#16 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8191) at gossipd/gossipd.c:595
#17 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16382) at gossipd/gossipd.c:595
#18 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=32764) at gossipd/gossipd.c:595
#19 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=65528) at gossipd/gossipd.c:595
#20 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=131056) at gossipd/gossipd.c:595
#21 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=262112) at gossipd/gossipd.c:595
#22 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=524225) at gossipd/gossipd.c:595
#23 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=1048450) at gossipd/gossipd.c:595
#24 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=2096900) at gossipd/gossipd.c:595
#25 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=4193801) at gossipd/gossipd.c:595
#26 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=8387603) at gossipd/gossipd.c:595
#27 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=17289408, number_of_blocks=16775207) at gossipd/gossipd.c:595
#28 0x000000000040ddee in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=33550414) at gossipd/gossipd.c:596
#29 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=67100829) at gossipd/gossipd.c:595
#30 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=134201659) at gossipd/gossipd.c:595
#31 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=268403318) at gossipd/gossipd.c:595
#32 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=536806636) at gossipd/gossipd.c:595
#33 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=1073613273) at gossipd/gossipd.c:595
#34 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=2147226547) at gossipd/gossipd.c:595
#35 0x000000000040ddc6 in queue_channel_ranges (peer=0x3868fc8, first_blocknum=514201, number_of_blocks=4294453094) at gossipd/gossipd.c:595
#36 0x000000000040df26 in handle_query_channel_range (peer=0x3868fc8, msg=0x37e0678 "\001\ao\342\214\n\266\361\263r\301\246\242F\256c\367O\223\036\203e\341Z\b\234h\326\031") at gossipd/gossipd.c:625
The cause was that converting a block number to an scid truncates it
at 24 bits. When we look through the index from (truncated number) to
(real end number) we get every channel, which is too large to encode,
so we iterate again.
This fixes both that problem, and also the issue that we'd end up
dividing into many empty sections until we get to the highest block
number. Instead, we just tack the empty blocks on to the end of the
final query.
(My initial version requested 0xFFFFFFFE blocks, but the dev code
which records what blocks were returned can't make a bitmap that big
on 32 bit).
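
A sketch of the truncation, using block numbers taken from the backtrace
above. Both helpers are hypothetical stand-ins for the real conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Packs a block number into the top 3 bytes of an scid, dropping anything
 * above bit 23 (the truncation described above). */
uint64_t scid_from_block_truncating(uint32_t blocknum)
{
	return (uint64_t)(blocknum & 0xFFFFFF) << 40;
}

/* Recovers the (possibly truncated) block number from an scid. */
uint32_t scid_block(uint64_t scid)
{
	return (uint32_t)(scid >> 40);
}
```

514201 round-trips, but 17290511 is above 2^24 and comes back as 513295, so
a lookup keyed on the converted value starts far earlier than intended.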
Reported-by: George Vaccaro
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently only used by gossipd for channel elimination.
Also print them in canonical form (/[01]), so tests need to be
changed.
Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We didn't populate the channels properly so it always failed.
Additionally, somewhere along the line we kept using the single scid
so we only created one channel.
Also, the next patch will start comparing the pubkeys, so make valid
ones: use an array so we don't affect the benchmark too much.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
seed isn't very useful at this level: I've left it in routing.c
because it might be useful for detailed testing. Pretty sure it's unused,
so I simply removed it.
The fuzzpercent is documented to default at 5%, but actually was 75%.
Fix that too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Christian and I both unwittingly used it in form:
*tal_arr_expand(&x) = tal(x, ...)
Since '=' isn't a sequence point, the compiler can (and does!) cache
the value of x, handing it to tal *after* tal_arr_expand() moves it
due to tal_resize().
The new version is somewhat less convenient to use, but doesn't have
this problem, since the assignment is always evaluated after the
resize.
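
A sketch of why the statement-based form is safe, using realloc() rather
than tal so it stands alone. ARR_EXPAND, struct child and make_child() are
hypothetical; the element records the array pointer it was handed, the way
tal(x, ...) records x as parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct child {
	struct child *owner;
};

struct child make_child(struct child *owner)
{
	struct child c = { owner };
	return c;
}

/* Resize in one statement, then evaluate `value` and store it in the next:
 * any use of the array pointer inside `value` sees the post-resize address. */
#define ARR_EXPAND(arrp, countp, value)					\
	do {								\
		size_t n_ = *(countp);					\
		void *new_ = realloc(*(arrp),				\
				     (n_ + 1) * sizeof(**(arrp)));	\
		if (!new_)						\
			abort();					\
		*(arrp) = new_;						\
		*(countp) = n_ + 1;					\
		(*(arrp))[n_] = (value);				\
	} while (0)
```

With ARR_EXPAND(&x, &n, make_child(x)), the x passed to make_child() is
read only after the resize has completed, whereas in the old
*expand(&x) = make(x) form the two operands could be evaluated in either
order.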
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is mainly just copying over the copy-editing from the
lightning-rfc repository.
[ Split to just perform changes after the UNKNOWN_PAYMENT_HASH change --RR ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
This is mainly just copying over the copy-editing from the
lightning-rfc repository.
[ Split to just perform changes prior to the UNKNOWN_PAYMENT_HASH change --RR ]
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Rusty Russell <@rustyrussell>
We keep a chain_hash in struct daemon, because otherwise we end up with
`&peer->daemon->rstate->chainparams->genesis_blockhash` which is a bit
ridiculous.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids some very ugly switch() statements which mixed the two,
but we also take the chance to rename 'towire_gossip_' to
'towire_gossipd_' for those inter-daemon messages; they're messages to
gossipd, not gossip messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We had at least one bug caused by it not returning true when it had
queued something. Instead, just re-check the queue after it's called.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We shouldn't insist on an exact response match: they can batch it and send
a whole batch, as long as it overlaps what we ask.
We also change to a bitmap to save some memory.
This isn't noted in the CHANGELOG since we don't actually send gossip
range queries except for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Messages from a peer may be invalid in many ways: we send an error
packet in that case. Rather than internally calling peer_error,
however, we make it explicit by having the handle_ functions return
NULL or an error packet.
Messages from the daemon itself should not be invalid: we log an error
and close the fd to them if it is. Previously we logged an error but
didn't kill them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The idea is that `plugin` is an early arg that is parsed (from command
line or the config file). We can then start the plugins and have them
tell us about the options they'd like to add to the mix, before we
actually parse them.
Signed-off-by: Christian Decker <@cdecker>
It means an extra allocation at startup, but it means we can hide the definition,
and use standard patterns (new_daemon_conn and typesafe callbacks).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We spend quite a bit of time in libsecp256k1 moving them to and from
DER encoding. With a bit of care, we can transfer the raw bytes from
gossipd and manually decode them so a malformed one can't make us
abort().
Before:
real 0m0.629000-0.695000(0.64985+/-0.019)s
After:
real 0m0.359000-0.433000(0.37645+/-0.023)s
At this point, the main issues are 11% of time spent in ccan/io's
backend_wake (I tried using a hash table there, but that actually makes
the small-number-of-fds case slower), and 65% of gossipd's time is
in marshalling the response (all those tal_resize add up!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Usually Travis triggers corner cases because it's so slow, but this
time the moons aligned, and it managed to fail test_node_reannounce
because it generated the updated node_announcement with the same
timestamp as the old one.
This is because we only updated "last_announce_timestamp" when
we generated the announcement, not when we got it off the wire or
loaded it from the gossip store.
The fix is to ask the routing code what the latest timestamp is;
we could still generate a clashing timestamp if (1) the gossip store
is lost, and (2) we restart within one second. Hard to care.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Have c-lightning nodes send out the largest value for
`htlc_maximum_msat` that makes sense, ie the lesser of
the peer's max_inflight_htlc value or the total channel
capacity minus the total channel reserve.
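
The rule above as a sketch; the function and parameter names are
hypothetical, and everything is assumed to be in millisatoshis.

```c
#include <assert.h>
#include <stdint.h>

/* Lesser of the peer's in-flight limit and what the channel can carry once
 * the total reserve is set aside. */
uint64_t advertised_htlc_max_msat(uint64_t peer_max_inflight_msat,
				  uint64_t capacity_msat,
				  uint64_t total_reserve_msat)
{
	uint64_t usable = 0;

	/* Guard against underflow if the reserve exceeds the capacity. */
	if (capacity_msat > total_reserve_msat)
		usable = capacity_msat - total_reserve_msat;

	return peer_max_inflight_msat < usable ? peer_max_inflight_msat : usable;
}
```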
We initialize it to 30 seconds, but it's *always* overridden by the
gossip_init message (and usually to 60 seconds, so it's doubly
misleading).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Gossipd provided a generic "get endpoints of this scid" and we only
use it in one place: to look up htlc forwards. But lightningd just
assumed that one would be us.
Instead, provide a simpler API which only returns the peer node
if any, and now we handle it much more gracefully.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't create unannounceable channels, but other implementations can.
Not only is it rude to expose these via invoices, it's probably not
useable anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If another channel has set the optional `htlc_maximum_msat` field,
we should correctly parse that field and respect it when drawing up
routes for payments.
globalfeatures should not be accessed if we haven't received a
channel_update. Treat it like the other fields which are only
initialized and marshalled/unmarshalled if the timestamp is positive.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And use ARRAY_SIZE() everywhere which will break compile if it's not a
literal array, plus assertions that it's the same length.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For routeboost, we want to select from all our enabled channels with
sufficient incoming capacity. Gossipd knows which are enabled (ie. we
have received a `channel_update` from the peer), but doesn't know the
current incoming capacity.
So we get gossipd to give us all the candidates, and lightningd
selects from those.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Otherwise, if we don't announce the last node, we'll not flush this
out; it will be delayed until the next time we send gossip!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is consistent: we don't broadcast a channel_announce until we've seen
a channel_update, so we probably shouldn't advertise it here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We do this a lot, and had boutique helpers in various places. So add
a more generic one; for convenience it returns a pointer to the new
end element.
I prefer the name tal_arr_expand to tal_arr_append, since it's up to
the caller to populate the new array entry.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a lot of infrastructure to delay local channel_updates to
avoid spamming on each peer reconnect; we had to keep track of
pending ones though, in case we needed the very latest for sending an
error when failing an HTLC.
Instead, it's far simpler to set the local_disabled flag on a channel
when we disconnect, but only send a disabling channel_update if we
actually fail an HTLC.
Note: handle_channel_update() TAKES update (due to tal_arr_dup), but we
didn't use that before. Now we do, add annotation.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We trade channel_update before channel_announce makes the channel
public, and currently forget them when we finally get the
channel_announce. We should instead apply them, and not rely on
retransmission (which we remove in the next patch!).
This earlier channel_update means test_gossip_jsonrpc triggers too
early, so have that wait for node_announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's simpler and more robust to just check that it's not yet announced
(the broadcast index will be 0).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Incrementing version number means stores which were prior to the previous
commit will be removed, and refreshed. The simplest fix, if not the most
efficient.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
BOLT 7's been updated to split the flags field in `channel_update`
into two: `channel_flags` and `message_flags`. This changeset does the
minimal necessary to get to building with the new flags.
That matches the other CSV names (HSM was the first, so it was written
before the pattern emerged).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would never complete further ping commands if we had fewer responses
than pings. Oops.
Fixes: #1928
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we receive a channel_announce but not a channel_update, we store the announce
but don't put it in the broadcast map.
When we delete a channel, we check if the node_announcement broadcast
now precedes all channel_announcements, and if so, we move it to the
end of the map. However, with a channel_announcement at index '0',
this test fails.
This is at least one potential cause of the node map getting out of order.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These happen after we compact the store; every log I've seen of a
restart on a real node has a message about truncating the store,
because node_announcements predate channel_announcements.
I extracted one such case from testnet, and reduced it to a test here.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@renepickhardt: why is it actually lightningd.c with a d but hsm.c without d ?
And delete unused gossipd/gossip.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Move the list to the start of `struct peer`: memleak walks the
list correctly this way.
2. Don't create tal parent loop daemon->conn->daemon.
The second one is silly anyway: we exit via master_gone when the master
conn is closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Lightning charge tests stopped working without a timeout, being unable
to find a route. The 15 second delay doesn't matter in real life, but
in these scenarios it does. This fixes it by making sure the channel
is usable immediately.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As pointed out by @rustyrussell the capacity is now always defined, so we can
fold that into the construction of the channel itself.
Reported-by: Rusty Russell <@rustyrussell>
Signed-off-by: Christian Decker <@cdecker>
The `htlc_minimum_msat` parameter was ignored so far, and we'd be attempting to
pay and hitting a brick wall by doing so. This patch just skips channels that
are not eligible anyway.
We know the total channel capacity after checking for its existence on-chain, so
we can actually make use of that information to discard channels that don't have
a sufficient capacity anyway, reducing the number of failed attempts.
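
A sketch of the filter these two changes add up to. struct chan_limits and
chan_can_carry() are hypothetical names, not the routing code's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct chan_limits {
	uint64_t htlc_minimum_msat;	/* from the channel_update */
	uint64_t capacity_msat;		/* from the on-chain funding output */
};

/* Skip channels which could never forward this amount. */
bool chan_can_carry(const struct chan_limits *c, uint64_t amount_msat)
{
	return amount_msat >= c->htlc_minimum_msat
		&& amount_msat <= c->capacity_msat;
}
```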
We were adding channels without their capacity, and eventually annotated them
when we exchanged `channel_update`s. This worked as long as we weren't
considering the channel capacity, but would result in local-only channels being
unusable once we start checking.
'cursor < ser + max' isn't valid because we reduce 'max' as we go! Effectively
we'll stop once we're past halfway, which can only happen with ipv6 + a torv2
address.
The fix is one-line, but we rename 'max' to 'len' which makes its purpose
clearer.
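
A sketch of the bug pattern with a byte-at-a-time reader. The helper names
are hypothetical, and the loop condition in the fixed version is my guess
at the shape of the fix rather than a copy of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* fromwire-style pull: advances the cursor AND shrinks the remaining length. */
static uint8_t pull_u8(const uint8_t **cursor, size_t *len)
{
	uint8_t v = **cursor;
	(*cursor)++;
	(*len)--;
	return v;
}

/* Buggy: 'ser + max' is a moving bound, since every pull reduces 'max'. */
size_t count_bytes_buggy(const uint8_t *ser, size_t max)
{
	const uint8_t *cursor = ser;
	size_t n = 0;

	while (cursor < ser + max) {
		(void)pull_u8(&cursor, &max);
		n++;
	}
	return n;
}

/* Fixed: 'len' is the remaining length, so simply loop until it hits zero. */
size_t count_bytes_fixed(const uint8_t *ser, size_t len)
{
	const uint8_t *cursor = ser;
	size_t n = 0;

	while (len != 0) {
		(void)pull_u8(&cursor, &len);
		n++;
	}
	return n;
}
```

On a 10-byte buffer the buggy loop stops after 5 bytes, which is the
"past halfway" behaviour described above.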
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tal_count() is used where there's a type, even if it's char or u8, and
tal_bytelen() is going to replace tal_len() for clarity: it's only needed
where a pointer is void.
We shim tal_bytelen() for now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We delay internally to reduce broadcasting route flap, but errors are
a special case: we want to send the latest, otherwise we might send an
old (non-disabled) update.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to just manually set ROUTING_FLAGS_DISABLED, but that means we
then suppressed the real channel_update because we thought it was a
duplicate!
So use a local flag: set it for the channel when the peer disconnects,
and clear it when channeld sends a local update.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
gossip_getnodes_entry was used by gossipd for reporting nodes, and for
reporting peers. But the local_features field is only available for peers,
and most other fields are only available from node_announcement.
Note that the connectd change actually means we get less information
about peers: gossipd used to do the node lookup for peers and include the
node_announcement information if it had it.
Since generate_wire.py can't create arrays-of-arrays, we add a 'struct
peer_features' to encapsulate the two feature arrays for each peer, and
for convenience we add it to lightningd/gossip_msg.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch guts gossipd of all peer-related functionality, and hands
all the peer-related requests to connectd instead.
gossipd now gets the final announceable addresses in its init msg, since
it doesn't handle socket binding any more.
lightningd now actually starts connectd, and activates it. The init
messages for both gossipd and connectd still contain redundant fields
which need cleaning up.
There are shims to handle the fact that connectd's wire messages are
still (mostly) gossipd messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
connectd has a dedicated fd to gossipd, so it can ask for a new gossip_fd
for a peer.
gossipd has a standalone routine to create a remote peer (this will
eventually be the only way gossipd creates a new peer).
For now lightningd creates a socketpair but doesn't run connectd, so
gossipd never sees any requests on this fd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Store the two we care about as booleans. Once connectd is complete we won't
even have the feature bitmaps for peers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We weren't waiting for gossipd to actually process the
dev_set_max_scids_encode_size message, so under Travis it sometimes
split the reply before processing that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Note that we mark both directions of the channel disabled immediately,
it's just the broadcast of the update which is delayed, just like the
ones generated when channeld tells us to.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We disable the channel every time the peer disconnects; if it reconnects
we get two updates.
The simplest solution: delay all updates by 15 seconds. Replace any
pending delayed update. If update is redundant after 15 seconds,
discard.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This doesn't do anything for us now, since we actually tend to produce
DISABLE/ENABLE update pairs. But the infrastructure is useful for the
next patch.
We also add more details to the trace message in the core update code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
json_listpeers returns an array of peers, and an array of nodes: the latter
is a subset of the former, and is used for printing alias/color information.
This changes it so there is a 1:1 correspondence between the peer information
and nodes, meaning no more O(n^2) search.
If there is no node_announce for a peer, we use a negative timestamp
(already used to indicate that the rest of the gossip_getnodes_entry
is not valid).
Other fixes:
1. Use get_node instead of iterating through the node map.
2. A node without addresses is perfectly valid: we have to use the timestamp
to see if the alias/color are set. Previously we wouldn't print that
if it didn't also advertize an address.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
structeq() is too dangerous: if a structure has padding, it can fail
silently.
The new ccan/structeq instead provides a macro to define foo_eq(),
which does the right thing in case of padding (which none of our
structures currently have anyway).
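
A sketch of the hazard. This is a hand-written foo_eq(), not the
ccan/structeq macro, and struct padded is a made-up example; whether the
compiler really inserts padding after 'flag' depends on the ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct padded {
	uint8_t flag;	/* typically followed by 3 bytes of padding */
	uint32_t value;
};

/* Compares members only, so stray bytes in any padding cannot matter.
 * A raw memcmp() over sizeof(struct padded) would compare them too. */
bool padded_eq(const struct padded *a, const struct padded *b)
{
	return a->flag == b->flag && a->value == b->value;
}
```

Two structs filled with different junk and then given identical members
still compare equal here; a memcmp()-based comparison gives no such
guarantee.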
Upgrade ccan, and use it everywhere. Except run-peer-wire.c, which
is only testing code and can use raw memcmp(): valgrind will tell us
if padding exists.
Interestingly, we still declared short_channel_id_eq, even though
we didn't define it any more!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wrap it in 'struct pubkey' for typesafety and consistency, and the
next patch takes advantage of that when we move to pubkey_eq.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Only --addr implies announce-if-public: --bind-addr does not.
It's also possible to have --bind-addr to an automatic Tor address:
you'd have to dig the onion address out of the logs or getinfo to use
it, but it's possible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a best effort attempt to skip connection attempts if we detect a broken
ISP resolver. A broken ISP resolver is a resolver that will replace NXDOMAIN
replies with a dummy response. This is best effort in that it'll only detect a
single fixed dummy reply, it'll check only on startup, and will not detect if we
switched networks. It should be good enough for most cases, and in the worst
case it will result in a connection attempt that does not complete.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Glenn Willen <@gwillen>
Cut & paste means we sometimes sent NULL:
```
2018-06-15T00:13:51.908Z lightningd(23653): lightning_closingd-03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f chan #436: Gossipd gave us bad send_gossip message 0bc80000
```
Fixes: #1581
Reported-by: @Xian001
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In this case, local and remote are *both* NULL; so if someone tries to
send a packet with take(), we need to free it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I think this is what is causing #1536: getting disconnected causes gossipd to
attempt to reach the peer again, unconditionally setting the flag to tell the
master. At the same time the master also issues a reaching command (which is
allowed since it is its first), but then it clashes on the already set
flag. Setting this flag only when the master actually needs to be told should
fix this.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
A failed compaction shouldn't be deadly, but we should also not attempt to do
one on every gossip message after the first one fails.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
`gossip_store_add` is the entry point for messages from the network, so it
should do the bookkeeping and disable on failures. `gossip_store_append` is the
shared function that wraps messages and writes them to the given file. This is
shared between the from-network path and the compaction path, so we don't
directly use the `gossip_store` instance, but `fd`s.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We write both when coming from outside, as well as when compacting, so we
extract the write functionality to use it in both cases.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This makes the exposed interface much smaller, cleaner and will allow us to just
replay gossip messages from the broadcast.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
Two cases:
1. Node no longer has any public channels: remove node_announcement.
2. Node's node_announcement now precedes all the channel_announcements:
move node_announcement to the end.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us detect if a node announce precedes a channel announce once we
delete the node announcement.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We *accept* a node_announce if we have a channel_announce, but we
can't queue it until we queue the channel_announce, which we only do
once we have received a channel_update.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we currently only (ab)use it to send everything, we need a way to
generate boutique queries for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have a function called 'wake_pkt_out' which is really 'start
gossiping', so rename it to 'wake_gossip_out'.
In addition, it's fired both on a timer, and in response to our first
gossip_timestamp_filter, which leads to very confusing (though,
technically, not incorrect) behavior.
Keep a single timer at all times, which now doubles as the flag
indicating we're syncing right now. Set it once we're done syncing
gossip.
Technically this means we go from once-every-60-seconds to
quiet-for-60-seconds-between-gossip, but that's OK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And initialize filter (to "never") when we negotiated LOCAL_GOSSIP_QUERIES,
and send initial filter message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is kind of orthogonal to the other changes, but makes sense: if we
would instantly or never prune the message, don't accept it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.
As we trickle out replies to query_short_channel_ids, we remember the
pubkeys of nodes we mention. At the end, we sort and uniquify, and
then send any node_announcements we have for those.
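
The sort-and-uniquify step as a sketch, with a hypothetical fixed-size
struct node_key standing in for the pubkey type and plain qsort() doing the
sorting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a 33-byte compressed key; an array-only struct has no padding. */
struct node_key {
	unsigned char k[33];
};

static int node_key_cmp(const void *a, const void *b)
{
	return memcmp(a, b, sizeof(struct node_key));
}

/* Sorts keys in place, squeezes out duplicates, returns the new count. */
size_t sort_and_uniquify(struct node_key *keys, size_t n)
{
	size_t last = 0;

	if (n == 0)
		return 0;

	qsort(keys, n, sizeof(*keys), node_key_cmp);
	for (size_t i = 1; i < n; i++) {
		/* Keep keys[i] only if it differs from the last one kept. */
		if (node_key_cmp(&keys[last], &keys[i]) != 0)
			keys[++last] = keys[i];
	}
	return last + 1;
}
```

Each node mentioned by several channels then gets its node_announcement
sent once rather than once per channel.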
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the same system as for gossip: we trickle out replies when we're
otherwise idle.
This is minimal infrastructure: we don't actually process the
query_short_channel_ids message yet, nor do we append node
announcements.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In general, we need to only publish node announcements after
publishing channel announcements, though we can accept node
announcements as soon as we see channel announcements. So we keep a
flag for those node_announcement which haven't been broadcast yet.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
handle_pending_cannouncement might not actually add the announcement,
as it could be waiting for a channel_update. We need to wait for
the actual announcement before considering announcing our node.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We generate new ones anyway; removing this code changes fixes coming
up which now only need to change one place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't have any connection yet, so how could they be active? Disable both
sides to avoid trying to route through them or telling others to use them as
`contact_points` in invoices.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
We're telling gossipd about disconnections anyway, so let's just use that signal
to disable both sides of the channel.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This was failing some of our integration tests, i.e., the ones closing a channel
and not waiting for sigexchange. The remote node would often not be quick enough
to send us its disabling channel_update, and hence we'd still remember the
incoming direction. That could then be sent out as part of an invoice, and fail
subsequently. So just set both directions to be disabled and let the onchain
spend clean up once it happens.
Signed-off-by: Christian Decker <decker.christian@gmail.com>
This resolves the problem where both channeld and gossipd can generate
updates, and they can have the same timestamp. gossipd is always able
to generate them, so can ensure timestamp moves forward.
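
One way a single generator can keep timestamps moving forward, as a sketch.
next_update_timestamp() is a hypothetical helper, and bumping by one second
is an assumption about the approach, not a quote of the code.

```c
#include <assert.h>
#include <stdint.h>

/* Use the current time unless that would not be newer than the previous
 * update, in which case step one second past the previous timestamp. */
uint32_t next_update_timestamp(uint32_t now, uint32_t last_timestamp)
{
	if (now > last_timestamp)
		return now;
	return last_timestamp + 1;
}
```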
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We erroneously create updates with the same timestamps when tests run
quickly, and the second one is ignored.
We've already noted that this should be fixed: gossipd should generate
all the updates, as it already has to do the case where channeld
crashed, for example. But that's a bigger change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@cdecker points out that in test_forward, where we manually create a route,
we get an error back which contains an update for an unknown channel.
We should still note this, but it's not an error for testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>