Merge bitcoin/bitcoin#30246: contrib: asmap-tool - Compare ASMaps with respect to specific addresses

5215c925d1 Compare ASMaps with respect to specific addresses (virtu)

Pull request description:

  Right now, we have no way to quantify the "degradation" of an ASMap over time in the context of Bitcoin's P2P network in a meaningful way. However, such data would be useful for:
  1. Making sure the minimum shelf life of ASMaps is compatible with the release cycle (we wouldn't want to start shipping ASMaps with releases before making sure ASMaps typically do not become obsolete before the time of the next release)
  2. Node operators eager to keep their ASMaps up-to-date between releases.

  While `asmap-tool.py` has a `diff` command to perform a prefix-based comparison of two ASMaps, it is hard to reason about whether an old ASMap still is "good enough" or should be replaced with a newer one based on a prefix-based diff such as the following:

  ```shell
  $ ./asmap-tool.py diff 1704463200_asmap.dat 1710770400_asmap.dat
  [...]
  # 2c0f:fc98::/32 was AS37282
  # 2c0f:fcb8::/32 was AS37323
  2c0f:ff18::/32 AS37044 # was unassigned
  2c0f:ff98::/32 AS37113 # was unassigned
  2c0f:ffa0::/32 AS37273 # was unassigned
  # 76082350 (2^26.18) IPv4 addresses changed; 834271985742505274886878979424260 (2^109.36) IPv6 addresses changed
  ```

  One option for a more Bitcoin-centric ASMap comparison comprises comparing ASNs for the addresses of Bitcoin nodes and reporting on the number/share of addresses of nodes with disagreeing ASNs. By applying this approach to a node's set of known peers, a node operator can estimate how many of the node's peers are mapped to out-of-date AS when using the currently deployed and an up-to-date ASMap as input.

  This PR adds this functionality to `asmap-tool.py` by introducing a `diff_addrs` subcommand. In addition to two ASMaps, the subcommand reads addresses from a (`getnodeaddresses`-compatible) file, and computes statistics for those addresses:

  ```bash
  $ ./asmap-tool.py diff_addrs 1704463200_asmap.dat 1710770400_asmap.dat <(bitcoin-cli getnodeaddresses 0)
  275 address(es) reassigned from unassigned to AS51167
  84 address(es) reassigned from AS198949 to AS15557
  66 address(es) reassigned from AS45758 to AS45629
  33 address(es) reassigned from AS174 to AS212238
  [...]
  1 address(es) reassigned from unassigned to AS399619
  Summary: 919 (1.67%) of 54,901 addresses were reassigned.
  ```

  When the `-s / --show-addresses` flag is used, addresses subject to reassignment are included in the output.

ACKs for top commit:
  fjahr:
    tACK 5215c925d1
  achow101:
    ACK 5215c925d1
  brunoerg:
    reACK 5215c925d1

Tree-SHA512: ebcf47754bce92794fad9f4c3bfc1c5e9daf077db5975f444c5135092eb6a26ecaa1eca6748a03ae0c87d9e45532426966fe8f3c17249b17f9dcad490d6dd3bf
This commit is contained in:
Ava Chow 2024-08-12 16:17:42 -04:00
commit 5fdbc8b4ee
No known key found for this signature in database
GPG Key ID: 17565732E08E5E41

View File

@ -6,7 +6,9 @@
import argparse
import sys
import ipaddress
import json
import math
from collections import defaultdict
import asmap
@ -113,6 +115,18 @@ def main():
parser_diff.add_argument('infile2', type=argparse.FileType('rb'),
help="second file to compare (text or binary)")
parser_diff_addrs = subparsers.add_parser("diff_addrs",
help="compute difference between two asmap files for a set of addresses")
parser_diff_addrs.add_argument('-s', '--show-addresses', dest="show_addresses", default=False, action="store_true",
help="include reassigned addresses in the output")
parser_diff_addrs.add_argument("infile1", type=argparse.FileType("rb"),
help="first file to compare (text or binary)")
parser_diff_addrs.add_argument("infile2", type=argparse.FileType("rb"),
help="second file to compare (text or binary)")
parser_diff_addrs.add_argument("addrs_file", type=argparse.FileType("r"),
help="address file containing getnodeaddresses output to use in the comparison "
"(make sure to set the count parameter to zero to get all node addresses, "
"e.g. 'bitcoin-cli getnodeaddresses 0 > addrs.json')")
args = parser.parse_args()
if args.subcommand is None:
parser.print_help()
@ -148,6 +162,33 @@ def main():
f"# {ipv4_changed}{ipv4_change_str} IPv4 addresses changed; "
f"{ipv6_changed}{ipv6_change_str} IPv6 addresses changed"
)
elif args.subcommand == "diff_addrs":
state1 = load_file(args.infile1)
state2 = load_file(args.infile2)
address_info = json.load(args.addrs_file)
addrs = {a["address"] for a in address_info if a["network"] in ["ipv4", "ipv6"]}
reassignments = defaultdict(list)
for addr in addrs:
net = ipaddress.ip_network(addr)
prefix = asmap.net_to_prefix(net)
old_asn = state1.lookup(prefix)
new_asn = state2.lookup(prefix)
if new_asn != old_asn:
reassignments[(old_asn, new_asn)].append(addr)
reassignments = sorted(reassignments.items(), key=lambda item: len(item[1]), reverse=True)
num_reassignment_type = defaultdict(int)
for (old_asn, new_asn), reassigned_addrs in reassignments:
num_reassigned = len(reassigned_addrs)
num_reassignment_type[(bool(old_asn), bool(new_asn))] += num_reassigned
old_asn_str = f"AS{old_asn}" if old_asn else "unassigned"
new_asn_str = f"AS{new_asn}" if new_asn else "unassigned"
opt = ": " + ", ".join(reassigned_addrs) if args.show_addresses else ""
print(f"{num_reassigned} address(es) reassigned from {old_asn_str} to {new_asn_str}{opt}")
num_reassignments = sum(len(addrs) for _, addrs in reassignments)
share = num_reassignments / len(addrs) if len(addrs) > 0 else 0
print(f"Summary: {num_reassignments:,} ({share:.2%}) of {len(addrs):,} addresses were reassigned "
f"(migrations={num_reassignment_type[True, True]}, assignments={num_reassignment_type[False, True]}, "
f"unassignments={num_reassignment_type[True, False]})")
else:
parser.print_help()
sys.exit("No command provided.")