mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-19 09:50:29 +01:00
Remove specs from 0.2.1 branch: they have moved to a new repository.
parent 28de4d83fd
commit 7bdb7d4811
@@ -2,11 +2,11 @@
 EXTRA_DIST = HACKING \
 	tor-resolve.1 tor-gencert.1 \
 	tor-osx-dmg-creation.txt tor-rpm-creation.txt \
-	tor-win32-mingw-creation.txt
+	tor-win32-mingw-creation.txt spec/README
 
 man_MANS = tor.1 tor-resolve.1 tor-gencert.1
 
-SUBDIRS = design-paper spec
+SUBDIRS = design-paper
 
-DIST_SUBDIRS = design-paper spec
+DIST_SUBDIRS = design-paper
 
@@ -1,5 +0,0 @@
-
-EXTRA_DIST = tor-spec.txt rend-spec.txt control-spec.txt \
-	dir-spec.txt socks-extensions.txt path-spec.txt \
-	version-spec.txt address-spec.txt
-
10
doc/spec/README
Normal file
@@ -0,0 +1,10 @@
+The Tor specifications and proposals have moved to a new repository.
+
+To browse the specifications, go to
+    https://gitweb.torproject.org/torspec.git/tree
+
+To check out the specification repository, run
+    git clone git://git.torproject.org/torspec.git
+
+For other information on the repository, see
+    http://gitweb.torproject.org/torspec.git
@ -1,68 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Special Hostnames in Tor
|
||||
Nick Mathewson
|
||||
|
||||
1. Overview
|
||||
|
||||
Most of the time, Tor treats user-specified hostnames as opaque: When
|
||||
the user connects to www.torproject.org, Tor picks an exit node and uses
|
||||
that node to connect to "www.torproject.org". Some hostnames, however,
|
||||
can be used to override Tor's default behavior and circuit-building
|
||||
rules.
|
||||
|
||||
These hostnames can be passed to Tor as the address part of a SOCKS4a or
|
||||
SOCKS5 request. If the application is connected to Tor using an IP-only
|
||||
method (such as SOCKS4, TransPort, or NatdPort), these hostnames can be
|
||||
substituted for certain IP addresses using the MapAddress configuration
|
||||
option or the MAPADDRESS control command.
|
||||
|
||||
2. .exit
|
||||
|
||||
SYNTAX: [hostname].[name-or-digest].exit
|
||||
[name-or-digest].exit
|
||||
|
||||
Hostname is a valid hostname; [name-or-digest] is either the nickname of a
|
||||
Tor node or the hex-encoded digest of that node's public key.
|
||||
|
||||
When Tor sees an address in this format, it uses the node named by
[name-or-digest] as the exit node and connects to the given hostname from
there. If no "hostname" component is given, Tor defaults to the
published IPv4 address of the exit node.
|
||||
|
||||
It is valid to try to resolve hostnames, and in fact upon success Tor
|
||||
will cache an internal mapaddress of the form
|
||||
"www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
|
||||
lookups.
|
||||
|
||||
EXAMPLES:
|
||||
www.example.com.exampletornode.exit
|
||||
|
||||
Connect to www.example.com from the node called "exampletornode."
|
||||
|
||||
exampletornode.exit
|
||||
|
||||
Connect to the published IP address of "exampletornode" using
|
||||
"exampletornode" as the exit.
|
||||
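The .exit syntax above can be illustrated with a small parser. This is only a sketch, not Tor's own parsing code: the function name parse_exit_address is hypothetical, and it assumes the last label before ".exit" is always the name-or-digest.

```python
def parse_exit_address(address):
    """Split '[hostname.]name-or-digest.exit' into (hostname, exit_node).

    hostname is None when only the exit node is given.
    """
    labels = address.rstrip(".").split(".")
    if len(labels) < 2 or labels[-1].lower() != "exit":
        raise ValueError("not a .exit address: %r" % address)
    # Assumption: the label immediately before "exit" names the exit node.
    exit_node = labels[-2]
    if not exit_node:
        raise ValueError("missing exit node name or digest")
    # Everything before the node name is the destination hostname, if any.
    hostname = ".".join(labels[:-2]) or None
    return hostname, exit_node
```

With the two examples from this section, the sketch yields ("www.example.com", "exampletornode") and (None, "exampletornode").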
|
||||
3. .onion
|
||||
|
||||
SYNTAX: [digest].onion
|
||||
|
||||
The digest is the first eighty bits of a SHA1 hash of the identity key for
|
||||
a hidden service, encoded in base32.
|
||||
|
||||
When Tor sees an address in this format, it tries to look up and connect to
|
||||
the specified hidden service. See rend-spec.txt for full details.
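As a rough sketch of the digest rule above, the following derives a .onion name from key bytes. It makes two assumptions: the caller already has the identity key in whatever byte encoding rend-spec.txt hashes, and the base32 text is lowercased (the case is not stated here). The function name onion_address is hypothetical.

```python
import base64
import hashlib


def onion_address(identity_key_bytes):
    """Return '<base32 of first 80 bits of SHA1(key)>.onion'."""
    digest = hashlib.sha1(identity_key_bytes).digest()
    first_80_bits = digest[:10]  # 80 bits = 10 octets
    # 10 octets encode to exactly 16 base32 characters, with no padding.
    label = base64.b32encode(first_80_bits).decode("ascii").lower()
    return label + ".onion"
```

Because 80 bits divide evenly into 5-bit base32 symbols, the label is always 16 characters long.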
|
||||
|
||||
4. .noconnect
|
||||
|
||||
SYNTAX: [string].noconnect
|
||||
|
||||
When Tor sees an address in this format, it immediately closes the
|
||||
connection without attaching it to any circuit. This is useful for
|
||||
controllers that want to test whether a given application is indeed using
|
||||
the same instance of Tor that they're controlling.
|
||||
|
||||
5. [XXX Is there a ".virtual" address that we expose too, or is that
|
||||
just intended to be internal? -RD]
|
||||
|
@ -1,250 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor bridges specification
|
||||
|
||||
0. Preface
|
||||
|
||||
This document describes the design decisions around support for bridge
|
||||
users, bridge relays, and bridge authorities. It acts as an overview
|
||||
of the bridge design and deployment for developers, and it also tries
|
||||
to point out limitations in the current design and implementation.
|
||||
|
||||
For more details on what all of these mean, look at blocking.tex in
|
||||
/doc/design-paper/
|
||||
|
||||
1. Bridge relays
|
||||
|
||||
Bridge relays are just like normal Tor relays except they don't publish
|
||||
their server descriptors to the main directory authorities.
|
||||
|
||||
1.1. PublishServerDescriptor
|
||||
|
||||
To configure your relay to be a bridge relay, just add
|
||||
BridgeRelay 1
|
||||
PublishServerDescriptor bridge
|
||||
to your torrc. This will cause your relay to publish its descriptor
|
||||
to the bridge authorities rather than to the default authorities.
|
||||
|
||||
Alternatively, you can say
|
||||
BridgeRelay 1
|
||||
PublishServerDescriptor 0
|
||||
which will cause your relay to not publish anywhere. This could be
|
||||
useful for private bridges.
|
||||
|
||||
1.2. Recommendations.
|
||||
|
||||
Bridge relays should use an exit policy of "reject *:*". This is
|
||||
because they only need to relay traffic between the bridge users
|
||||
and the rest of the Tor network, so there's no need to let people
|
||||
exit directly from them.
|
||||
|
||||
We invented the RelayBandwidth* options for this situation: Tor clients
who also want to relay traffic. See proposal 111 for details. Relay
operators should feel free to rate-limit their relayed traffic.
|
||||
|
||||
1.3. Implementation note.
|
||||
|
||||
Vidalia 0.0.15 has turned its "Relay" settings page into a tri-state
|
||||
"Don't relay" / "Relay for the Tor network" / "Help censored users".
|
||||
|
||||
If you click the third choice, it forces your exit policy to reject *:*.
|
||||
|
||||
If all the bridges end up on port 9001, that's not so good. On the
|
||||
other hand, putting the bridges on a low-numbered port in the Unix
|
||||
world requires jumping through extra hoops. The current compromise is
|
||||
that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
|
||||
other platforms.
|
||||
|
||||
At the bottom of the relay config settings window, Vidalia displays
|
||||
the bridge identifier to the operator (see Section 3.1) so he can pass
|
||||
it on to bridge users.
|
||||
|
||||
2. Bridge authorities.
|
||||
|
||||
Bridge authorities are like normal v3 directory authorities, except
|
||||
they don't create their own network-status documents or votes. So if
|
||||
you ask a bridge authority for a network-status document or consensus,
|
||||
they behave like a directory mirror: they give you one from one of
|
||||
the main authorities. But if you ask the bridge authority for the
|
||||
descriptor corresponding to a particular identity fingerprint, it will
|
||||
happily give you the latest descriptor for that fingerprint.
|
||||
|
||||
To become a bridge authority, add these lines to your torrc:
|
||||
AuthoritativeDirectory 1
|
||||
BridgeAuthoritativeDir 1
|
||||
|
||||
Right now there's one bridge authority, running on the Tonga relay.
|
||||
|
||||
2.1. Exporting bridge-purpose descriptors
|
||||
|
||||
We've added a new purpose for server descriptors: the "bridge"
|
||||
purpose. With the new router-descriptors file format that includes
|
||||
annotations, it's easy to look through it and find the bridge-purpose
|
||||
descriptors.
|
||||
|
||||
Currently we export the bridge descriptors from Tonga to the
|
||||
BridgeDB server, so it can give them out according to the policies
|
||||
in blocking.pdf.
|
||||
|
||||
2.2. Reachability/uptime testing
|
||||
|
||||
Right now the bridge authorities do active reachability testing of
|
||||
bridges, so we know which ones to recommend for users.
|
||||
|
||||
But in the design document, we suggested that bridges should publish
|
||||
anonymously (i.e. via Tor) to the bridge authority, so somebody watching
|
||||
the bridge authority can't just enumerate all the bridges. But if we're
|
||||
doing active measurement, the game is up. Perhaps we should back off on
|
||||
this goal, or perhaps we should do our active measurement anonymously?
|
||||
|
||||
Answering this issue is scheduled for 0.2.1.x.
|
||||
|
||||
2.3. Future work: migrating to multiple bridge authorities
|
||||
|
||||
Having only one bridge authority is both a trust bottleneck (if you
|
||||
break into one place you learn about every single bridge we've got)
|
||||
and a robustness bottleneck (when it's down, bridge users become sad).
|
||||
|
||||
Right now if we put up a second bridge authority, all the bridges would
|
||||
publish to it, and (assuming the code works) bridge users would query
|
||||
a random bridge authority. This resolves the robustness bottleneck,
|
||||
but makes the trust bottleneck even worse.
|
||||
|
||||
In 0.2.2.x and later we should think about better ways to have multiple
|
||||
bridge authorities.
|
||||
|
||||
3. Bridge users.
|
||||
|
||||
Bridge users are like ordinary Tor users except they use encrypted
|
||||
directory connections by default, and they use bridge relays as both
|
||||
entry guards (their first hop) and directory guards (the source of
|
||||
all their directory information).
|
||||
|
||||
To become a bridge user, add the following line to your torrc:
|
||||
UseBridges 1
|
||||
|
||||
and then add at least one "Bridge" line to your torrc based on the
|
||||
format below.
|
||||
|
||||
3.1. Format of the bridge identifier.
|
||||
|
||||
The canonical format for a bridge identifier contains an IP address,
|
||||
an ORPort, and an identity fingerprint:
|
||||
bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
|
||||
|
||||
However, the identity fingerprint can be left out, in which case the
|
||||
bridge user will connect to that relay and use it as a bridge regardless
|
||||
of what identity key it presents:
|
||||
bridge 128.31.0.34:9009
|
||||
This might be useful for cases where only short bridge identifiers
|
||||
can be communicated to bridge users.
|
||||
|
||||
In a future version we may also support bridge identifiers that are
|
||||
only a key fingerprint:
|
||||
bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
|
||||
and the bridge user can fetch the latest descriptor from the bridge
|
||||
authority (see Section 3.4).
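The two currently supported identifier forms can be parsed as in the sketch below. It is illustrative only and is not how Tor reads its torrc: parse_bridge_line is a hypothetical name, and the fingerprint-only form is rejected because the text describes it as future work.

```python
import re


def parse_bridge_line(line):
    """Parse 'bridge IP:ORPort [fingerprint]' into (ip, port, fingerprint).

    fingerprint is returned without spaces, or None when it is omitted.
    """
    parts = line.split()
    if len(parts) < 2 or parts[0].lower() != "bridge":
        raise ValueError("not a bridge line")
    ip, sep, port = parts[1].rpartition(":")
    if not sep or not ip or not port.isdigit():
        raise ValueError("expected IP:ORPort after 'bridge'")
    # The fingerprint may be written in space-separated groups of four.
    fingerprint = "".join(parts[2:]).upper() or None
    if fingerprint is not None and not re.fullmatch(r"[0-9A-F]{40}", fingerprint):
        raise ValueError("fingerprint must be 40 hex digits")
    return ip, int(port), fingerprint
```

For the canonical example above, the sketch returns the address 128.31.0.34, the port 9009, and the fingerprint joined into one 40-digit string.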
|
||||
|
||||
3.2. Bridges as entry guards
|
||||
|
||||
For now, bridge users add their bridge relays to their list of "entry
|
||||
guards" (see path-spec.txt for background on entry guards). They are
|
||||
managed by the entry guard algorithms exactly as if they were a normal
|
||||
entry guard -- their keys and timing get cached in the "state" file,
|
||||
etc. This means that when the Tor user starts up with "UseBridges"
|
||||
disabled, he will skip past the bridge entries since they won't be
|
||||
listed as up and usable in his networkstatus consensus. But to be clear,
|
||||
the "entry_guards" list doesn't currently distinguish guards by purpose.
|
||||
|
||||
Internally, each bridge user keeps a smartlist of "bridge_info_t"
|
||||
that reflects the "bridge" lines from his torrc along with a download
|
||||
schedule (see Section 3.5 below). When he starts Tor, he attempts
|
||||
to fetch a descriptor for each configured bridge (see Section 3.4
|
||||
below). When he succeeds at getting a descriptor for one of the bridges
|
||||
in his list, he adds it directly to the entry guard list using the
|
||||
normal add_an_entry_guard() interface. Once a bridge descriptor has
|
||||
been added, should_delay_dir_fetches() will stop delaying further
|
||||
directory fetches, and the user begins to bootstrap his directory
|
||||
information from that bridge (see Section 3.3).
|
||||
|
||||
Currently bridge users cache their bridge descriptors to the
|
||||
"cached-descriptors" file (annotated with purpose "bridge"), but
|
||||
they don't make any attempt to reuse descriptors they find in this
|
||||
file. The theory is that either the bridge is available now, in which
|
||||
case you can get a fresh descriptor, or it's not, in which case an
|
||||
old descriptor won't do you much good.
|
||||
|
||||
We could disable writing out the bridge lines to the state file, if
|
||||
we think this is a problem.
|
||||
|
||||
As an exception, if we get an application request when we have one
|
||||
or more bridge descriptors but we believe none of them are running,
|
||||
we mark them all as running again. This is similar to the exception
|
||||
already in place to help long-idle Tor clients realize they should
|
||||
fetch fresh directory information rather than just refuse requests.
|
||||
|
||||
3.3. Bridges as directory guards
|
||||
|
||||
In addition to using bridges as the first hop in their circuits, bridge
|
||||
users also use them to fetch directory updates. Other than initial
|
||||
bootstrapping to find a working bridge descriptor (see Section 3.4
|
||||
below), all further non-anonymized directory fetches will be redirected
|
||||
to the bridge.
|
||||
|
||||
This means that bridge relays need to have cached answers for all
|
||||
questions the bridge user might ask. This makes the upgrade path
|
||||
tricky --- for example, if we migrate to a v4 directory design, the
|
||||
bridge user would need to keep using v3 so long as his bridge relays
|
||||
only knew how to answer v3 queries.
|
||||
|
||||
In a future design, for cases where the user has enough information
|
||||
to build circuits yet the chosen bridge doesn't know how to answer a
|
||||
given query, we might teach bridge users to make an anonymized request
|
||||
to a more suitable directory server.
|
||||
|
||||
3.4. How bridge users get their bridge descriptor
|
||||
|
||||
Bridge users can fetch bridge descriptors in two ways: by going directly
|
||||
to the bridge and asking for "/tor/server/authority", or by going to
|
||||
the bridge authority and asking for "/tor/server/fp/ID". By default,
|
||||
they will only try the direct queries. If the user sets
|
||||
UpdateBridgesFromAuthority 1
|
||||
in his config file, then he will try querying the bridge authority
|
||||
first for bridges where he knows a digest (if he only knows an IP
|
||||
address and ORPort, then his only option is a direct query).
|
||||
|
||||
If the user has at least one working bridge, then he will do further
|
||||
queries to the bridge authority through a full three-hop Tor circuit.
|
||||
But when bootstrapping, he will make a direct begin_dir-style connection
|
||||
to the bridge authority.
|
||||
|
||||
As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
|
||||
from the bridge authority and it returns a 404 not found, the user
|
||||
will automatically fall back to trying a direct query. Therefore it is
|
||||
recommended that bridge users always set UpdateBridgesFromAuthority,
|
||||
since at worst it will delay their fetches a little bit and notify
|
||||
the bridge authority of the identity fingerprint (but not location)
|
||||
of their intended bridges.
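The order in which a bridge user tries descriptor sources, as described in this section, can be summarized in a few lines. This is a sketch of the decision only, under the assumption that the 404 fallback simply means "try the next entry"; descriptor_sources and the "bridge" / "bridge-authority" labels are hypothetical names.

```python
def descriptor_sources(update_bridges_from_authority, fingerprint):
    """Return (where, path) pairs in the order they would be tried."""
    direct = ("bridge", "/tor/server/authority")
    if update_bridges_from_authority and fingerprint is not None:
        # Ask the bridge authority first; fall back to the bridge itself
        # if the authority answers 404.
        return [("bridge-authority", "/tor/server/fp/" + fingerprint), direct]
    # Default, or no known digest: only a direct query is possible.
    return [direct]
```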
|
||||
|
||||
3.5. Bridge descriptor retry schedule
|
||||
|
||||
Bridge users try to fetch a descriptor for each bridge (using the
|
||||
steps in Section 3.4 above) on startup. Whenever they receive a
|
||||
bridge descriptor, they reschedule a new descriptor download for 1
|
||||
hour from then.
|
||||
|
||||
If on the other hand it fails, they try again after 15 minutes for the
|
||||
first attempt, after 15 minutes for the second attempt, and after 60
|
||||
minutes for subsequent attempts.
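One reading of this schedule, written out as a function: the delay depends on how many fetches in a row have failed. The name next_fetch_delay_minutes and the idea of counting consecutive failures are assumptions made for illustration.

```python
def next_fetch_delay_minutes(consecutive_failures):
    """Minutes to wait before the next descriptor fetch for one bridge."""
    if consecutive_failures == 0:
        return 60  # last fetch succeeded: refresh in one hour
    if consecutive_failures <= 2:
        return 15  # first and second retries
    return 60      # all later retries
```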
|
||||
|
||||
In 0.2.2.x we should come up with some smarter retry schedules.
|
||||
|
||||
3.6. Implementation note.
|
||||
|
||||
Vidalia 0.1.0 has a new checkbox in its Network config window called
|
||||
"My ISP blocks connections to the Tor network." Users who click that
|
||||
box change their configuration to:
|
||||
UseBridges 1
|
||||
UpdateBridgesFromAuthority 1
|
||||
and should add at least one bridge identifier.
|
||||
|
@ -1,499 +0,0 @@
|
||||
$Id$
|
||||
|
||||
TC: A Tor control protocol (Version 0)
|
||||
|
||||
-1. Deprecation
|
||||
|
||||
THIS PROTOCOL IS DEPRECATED. It is still documented here because Tor
|
||||
0.1.1.x happens to support much of it; but the support for v0 is not
|
||||
maintained, so you should expect it to rot in unpredictable ways. Support
|
||||
for v0 will be removed some time after Tor 0.1.2.
|
||||
|
||||
0. Scope
|
||||
|
||||
This document describes an implementation-specific protocol that is used
|
||||
for other programs (such as frontend user-interfaces) to communicate
|
||||
with a locally running Tor process. It is not part of the Tor onion
|
||||
routing protocol.
|
||||
|
||||
We're trying to be pretty extensible here, but not infinitely
|
||||
forward-compatible.
|
||||
|
||||
1. Protocol outline
|
||||
|
||||
TC is a bidirectional message-based protocol. It assumes an underlying
|
||||
stream for communication between a controlling process (the "client") and
|
||||
a Tor process (the "server"). The stream may be implemented via TCP,
|
||||
TLS-over-TCP, a Unix-domain socket, or so on, but it must provide
|
||||
reliable in-order delivery. For security, the stream should not be
|
||||
accessible by untrusted parties.
|
||||
|
||||
In TC, the client and server send typed variable-length messages to each
|
||||
other over the underlying stream. By default, all messages from the server
|
||||
are in response to messages from the client. Some client requests, however,
|
||||
will cause the server to send messages to the client indefinitely far into
|
||||
the future.
|
||||
|
||||
Servers respond to messages in the order they're received.
|
||||
|
||||
2. Message format
|
||||
|
||||
The messages take the following format:
|
||||
|
||||
Length [2 octets; big-endian]
|
||||
Type [2 octets; big-endian]
|
||||
Body [Length octets]
|
||||
|
||||
Upon encountering a recognized Type, implementations behave as described in
|
||||
section 3 below. If the type is not recognized, servers respond with an
|
||||
"ERROR" message (code UNRECOGNIZED; see 3.1 below), and clients simply ignore
|
||||
the message.
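The framing above (2-octet length, 2-octet type, then the body) maps directly onto Python's struct module. The sketch below is an illustration, not code from any Tor controller; encode_message and decode_message are hypothetical helper names.

```python
import struct

HEADER = struct.Struct("!HH")  # Length, then Type: 2 octets each, big-endian


def encode_message(msg_type, body=b""):
    """Frame one TC v0 message."""
    if len(body) > 0xFFFF:
        raise ValueError("body too long for a 2-octet length field")
    return HEADER.pack(len(body), msg_type) + body


def decode_message(data):
    """Return (msg_type, body, remaining_bytes), or None if incomplete."""
    if len(data) < HEADER.size:
        return None
    length, msg_type = HEADER.unpack_from(data)
    end = HEADER.size + length
    if len(data) < end:
        return None  # wait for more bytes from the stream
    return msg_type, data[HEADER.size:end], data[end:]
```

Returning the unread remainder lets a caller keep decoding from a stream buffer that holds several messages back to back.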
|
||||
|
||||
2.1. Types and encodings
|
||||
|
||||
All numbers are given in big-endian (network) order.
|
||||
|
||||
OR identities are given in hexadecimal, in the same format as identity key
|
||||
fingerprints, but without spaces; see tor-spec.txt for more information.
|
||||
|
||||
3. Message types
|
||||
|
||||
Message types are drawn from the following ranges:
|
||||
|
||||
0x0000-0xEFFF : Reserved for use by official versions of this spec.
|
||||
0xF000-0xFFFF : Unallocated; usable by unofficial extensions.
|
||||
|
||||
3.1. ERROR (Type 0x0000)
|
||||
|
||||
Sent in response to a message that could not be processed as requested.
|
||||
|
||||
The body of the message begins with a 2-byte error code. The following
|
||||
values are defined:
|
||||
|
||||
0x0000 Unspecified error
|
||||
[]
|
||||
|
||||
0x0001 Internal error
|
||||
[Something went wrong inside Tor, so that the client's
|
||||
request couldn't be fulfilled.]
|
||||
|
||||
0x0002 Unrecognized message type
|
||||
[The client sent a message type we don't understand.]
|
||||
|
||||
0x0003 Syntax error
|
||||
[The client sent a message body in a format we can't parse.]
|
||||
|
||||
0x0004 Unrecognized configuration key
|
||||
[The client tried to get or set a configuration option we don't
|
||||
recognize.]
|
||||
|
||||
0x0005 Invalid configuration value
|
||||
[The client tried to set a configuration option to an
|
||||
incorrect, ill-formed, or impossible value.]
|
||||
|
||||
0x0006 Unrecognized byte code
|
||||
[The client tried to set a byte code (in the body) that
|
||||
we don't recognize.]
|
||||
|
||||
0x0007 Unauthorized.
|
||||
[The client tried to send a command that requires
|
||||
authorization, but it hasn't sent a valid AUTHENTICATE
|
||||
message.]
|
||||
|
||||
0x0008 Failed authentication attempt
|
||||
[The client sent a well-formed authorization message, but the server rejected it.]
|
||||
|
||||
0x0009 Resource exhausted
|
||||
[The server didn't have enough of a given resource to
|
||||
fulfill a given request.]
|
||||
|
||||
0x000A No such stream
|
||||
|
||||
0x000B No such circuit
|
||||
|
||||
0x000C No such OR
|
||||
|
||||
The rest of the body should be a human-readable description of the error.
|
||||
|
||||
In general, new error codes should only be added when they don't fall under
|
||||
one of the existing error codes.
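A client could decode an ERROR body along these lines. This is a sketch: the table copies the codes listed above, while parse_error_body and the choice to decode the description as UTF-8 with replacement are assumptions.

```python
import struct

ERROR_CODES = {
    0x0000: "Unspecified error",
    0x0001: "Internal error",
    0x0002: "Unrecognized message type",
    0x0003: "Syntax error",
    0x0004: "Unrecognized configuration key",
    0x0005: "Invalid configuration value",
    0x0006: "Unrecognized byte code",
    0x0007: "Unauthorized",
    0x0008: "Failed authentication attempt",
    0x0009: "Resource exhausted",
    0x000A: "No such stream",
    0x000B: "No such circuit",
    0x000C: "No such OR",
}


def parse_error_body(body):
    """Return (code, code_name, human_readable_description)."""
    if len(body) < 2:
        raise ValueError("ERROR body must start with a 2-byte code")
    (code,) = struct.unpack_from("!H", body)
    # Assumption: the description is text; undecodable bytes are replaced.
    description = body[2:].decode("utf-8", errors="replace")
    return code, ERROR_CODES.get(code, "Unknown error code"), description
```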
|
||||
|
||||
3.2. DONE (Type 0x0001)
|
||||
|
||||
Sent from server to client in response to a request that was successfully
|
||||
completed, with no more information needed. The body is usually empty but
|
||||
may contain a message.
|
||||
|
||||
3.3. SETCONF (Type 0x0002)
|
||||
|
||||
Change the value of a configuration variable. The body contains a list of
|
||||
newline-terminated key-value configuration lines. An individual key-value
|
||||
configuration line consists of the key, followed by a space, followed by
|
||||
the value. The server behaves as though it had just read the key-value pair
|
||||
in its configuration file.
|
||||
|
||||
The server responds with a DONE message on success, or an ERROR message on
|
||||
failure.
|
||||
|
||||
When a configuration option takes multiple values, or when multiple
|
||||
configuration keys form a context-sensitive group (see below), then
|
||||
setting _any_ of the options in a SETCONF command is taken to reset all of
|
||||
the others. For example, if two ORBindAddress values are configured,
|
||||
and a SETCONF command arrives containing a single ORBindAddress value, the
|
||||
new command's value replaces the two old values.
|
||||
|
||||
To _remove_ all settings for a given option entirely (and go back to its
|
||||
default value), send a single line containing the key and no value.
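A SETCONF body can be assembled as in the sketch below, where a value of None stands for "key with no value" (reset to default). The helper name setconf_body and the use of None as the reset marker are assumptions for illustration.

```python
def setconf_body(settings):
    """Build a SETCONF body from (key, value) pairs.

    A value of None emits the key alone, which asks the server to
    remove all settings for that option.
    """
    lines = []
    for key, value in settings:
        if not key or " " in key or "\n" in key:
            raise ValueError("invalid configuration key: %r" % key)
        if value is None:
            lines.append(key + "\n")
        else:
            if "\n" in value:
                raise ValueError("values may not contain newlines")
            lines.append("%s %s\n" % (key, value))
    return "".join(lines).encode("ascii")
```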
|
||||
|
||||
3.4. GETCONF (Type 0x0003)
|
||||
|
||||
Request the value of a configuration variable. The body contains one or
|
||||
more NL-terminated strings for configuration keys. The server replies
|
||||
with a CONFVALUE message.
|
||||
|
||||
If an option appears multiple times in the configuration, all of its
|
||||
key-value pairs are returned in order.
|
||||
|
||||
Some options are context-sensitive, and depend on other options with
|
||||
different keywords. These cannot be fetched directly. Currently there
|
||||
is only one such option: clients should use the "HiddenServiceOptions"
|
||||
virtual keyword to get all HiddenServiceDir, HiddenServicePort,
|
||||
HiddenServiceNodes, and HiddenServiceExcludeNodes option settings.
|
||||
|
||||
3.5. CONFVALUE (Type 0x0004)
|
||||
|
||||
Sent in response to a GETCONF message; contains a list of "Key Value\n"
|
||||
(A non-whitespace keyword, a single space, a non-NL value, a NL)
|
||||
strings.
|
||||
|
||||
3.6. SETEVENTS (Type 0x0005)
|
||||
|
||||
Request the server to inform the client about interesting events.
|
||||
The body contains a list of 2-byte event codes (see "event" below).
|
||||
Any events *not* listed in the SETEVENTS body are turned off; thus, sending
|
||||
SETEVENTS with an empty body turns off all event reporting.
|
||||
|
||||
The server responds with a DONE message on success, and an ERROR message
|
||||
if one of the event codes isn't recognized. (On error, the list of active
|
||||
event codes isn't changed.)
|
||||
|
||||
3.7. EVENT (Type 0x0006)
|
||||
|
||||
Sent from the server to the client when an event has occurred and the
|
||||
client has requested that kind of event. The body contains a 2-byte
|
||||
event code followed by additional event-dependent information. Event
|
||||
codes are:
|
||||
0x0001 -- Circuit status changed
|
||||
|
||||
Status [1 octet]
|
||||
0x00 Launched - circuit ID assigned to new circuit
|
||||
0x01 Built - all hops finished, can now accept streams
|
||||
0x02 Extended - one more hop has been completed
|
||||
0x03 Failed - circuit closed (was not built)
|
||||
0x04 Closed - circuit closed (was built)
|
||||
Circuit ID [4 octets]
|
||||
(Must be unique to Tor process/time)
|
||||
Path [NUL-terminated comma-separated string]
|
||||
(For extended/failed, is the portion of the path that is
|
||||
built)
|
||||
|
||||
0x0002 -- Stream status changed
|
||||
|
||||
Status [1 octet]
|
||||
(Sent connect=0,sent resolve=1,succeeded=2,failed=3,
|
||||
closed=4, new connection=5, new resolve request=6,
|
||||
stream detached from circuit and still retriable=7)
|
||||
Stream ID [4 octets]
|
||||
(Must be unique to Tor process/time)
|
||||
Target [NUL-terminated address-port string]
|
||||
|
||||
0x0003 -- OR Connection status changed
|
||||
|
||||
Status [1 octet]
|
||||
(Launched=0,connected=1,failed=2,closed=3)
|
||||
OR nickname/identity [NUL-terminated]
|
||||
|
||||
0x0004 -- Bandwidth used in the last second
|
||||
|
||||
Bytes read [4 octets]
|
||||
Bytes written [4 octets]
|
||||
|
||||
0x0005 -- Notice/warning/error occurred
|
||||
|
||||
Message [NUL-terminated]
|
||||
|
||||
<obsolete: use 0x0007-0x000B instead.>
|
||||
|
||||
0x0006 -- New descriptors available
|
||||
|
||||
OR List [NUL-terminated, comma-delimited list of
|
||||
OR identity]
|
||||
|
||||
0x0007 -- Debug message occurred
|
||||
0x0008 -- Info message occurred
|
||||
0x0009 -- Notice message occurred
|
||||
0x000A -- Warning message occurred
|
||||
0x000B -- Error message occurred
|
||||
|
||||
Message [NUL-terminated]
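As an example of reading an EVENT body, the sketch below decodes the "circuit status changed" event (code 0x0001) using the field layout listed above. The name parse_circuit_event is hypothetical, and the path is assumed to be ASCII.

```python
import struct

CIRCUIT_STATUS = {
    0x00: "Launched",
    0x01: "Built",
    0x02: "Extended",
    0x03: "Failed",
    0x04: "Closed",
}


def parse_circuit_event(body):
    """Return (status_name, circuit_id, path_list) for event code 0x0001."""
    # Event code [2 octets], Status [1 octet], Circuit ID [4 octets].
    code, status, circuit_id = struct.unpack_from("!HBI", body)
    if code != 0x0001:
        raise ValueError("not a circuit status event")
    # The path is a NUL-terminated, comma-separated string.
    path_bytes = body[7:].split(b"\x00", 1)[0]
    path = path_bytes.decode("ascii").split(",") if path_bytes else []
    return CIRCUIT_STATUS.get(status, "Unknown"), circuit_id, path
```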
|
||||
|
||||
3.8. AUTHENTICATE (Type 0x0007)
|
||||
|
||||
Sent from the client to the server. Contains a 'magic cookie' to prove
|
||||
that client is really allowed to control this Tor process. The server
|
||||
responds with DONE or ERROR.
|
||||
|
||||
The format of the 'cookie' is implementation-dependent; see 4.1 below for
|
||||
information on how the standard Tor implementation handles it.
|
||||
|
||||
3.9. SAVECONF (Type 0x0008)
|
||||
|
||||
Sent from the client to the server. Instructs the server to write out
|
||||
its config options into its torrc. Server returns DONE if successful, or
|
||||
ERROR if it can't write the file or some other error occurs.
|
||||
|
||||
3.10. SIGNAL (Type 0x0009)
|
||||
|
||||
Sent from the client to the server. The body contains one byte that
|
||||
indicates the action the client wishes the server to take.
|
||||
|
||||
1 (0x01) -- Reload: reload config items, refetch directory.
|
||||
2 (0x02) -- Controlled shutdown: if server is an OP, exit immediately.
|
||||
If it's an OR, close listeners and exit after 30 seconds.
|
||||
10 (0x0A) -- Dump stats: log information about open connections and
|
||||
circuits.
|
||||
12 (0x0C) -- Debug: switch all open logs to loglevel debug.
|
||||
15 (0x0F) -- Immediate shutdown: clean up and exit now.
|
||||
|
||||
The server responds with DONE if the signal is recognized (or simply
|
||||
closes the socket if it was asked to close immediately), else ERROR.
|
||||
|
||||
3.11. MAPADDRESS (Type 0x000A)
|
||||
|
||||
Sent from the client to the server. The body contains a sequence of
|
||||
address mappings, each consisting of the address to be mapped, a single
|
||||
space, the replacement address, and a NL character.
|
||||
|
||||
Addresses may be IPv4 addresses, IPv6 addresses, or hostnames.
|
||||
|
||||
The client sends this message to the server in order to tell it that future
|
||||
SOCKS requests for connections to the original address should be replaced
|
||||
with connections to the specified replacement address. If the addresses
|
||||
are well-formed, and the server is able to fulfill the request, the server
|
||||
replies with a single DONE message containing the source and destination
|
||||
addresses. If the request is malformed, the server replies with a syntax
error message. If the server can't fulfill the request, it replies with an
internal ERROR message.
|
||||
|
||||
The client may decline to provide a body for the original address, and
|
||||
instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or
|
||||
"." for hostname), signifying that the server should choose the original
|
||||
address itself, and return that address in the DONE message. The server
|
||||
should ensure that it returns an element of address space that is unlikely
|
||||
to be in actual use. If there is already an address mapped to the
|
||||
destination address, the server may reuse that mapping.
|
||||
|
||||
If the original address is already mapped to a different address, the old
|
||||
mapping is removed. If the original address and the destination address
|
||||
are the same, the server removes any mapping in place for the original
|
||||
address.
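A MAPADDRESS body is one "original replacement" line per mapping. The sketch below builds such a body. mapaddress_body is a hypothetical helper, and the example hostnames in the usage note are placeholders.

```python
def mapaddress_body(mappings):
    """Build a MAPADDRESS body from (original, replacement) pairs."""
    lines = []
    for original, replacement in mappings:
        for addr in (original, replacement):
            if not addr or " " in addr or "\n" in addr:
                raise ValueError("invalid address: %r" % addr)
        # Address to be mapped, a single space, the replacement, then NL.
        lines.append("%s %s\n" % (original, replacement))
    return "".join(lines).encode("ascii")
```

Passing "0.0.0.0" as the original address, for example mapaddress_body([("0.0.0.0", "www.example.com")]), asks the server to pick an unused IPv4 address itself and report it in the DONE message.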
|
||||
|
||||
{Note: This feature is designed to be used to help Tor-ify applications
|
||||
that need to use SOCKS4 or hostname-less SOCKS5. There are three
|
||||
approaches to doing this:
|
||||
1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead.
|
||||
2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS
|
||||
feature) to resolve the hostname remotely. This doesn't work
|
||||
with special addresses like x.onion or x.y.exit.
|
||||
3. Use MAPADDRESS to map an IP address to the desired hostname, and then
|
||||
arrange to fool the application into thinking that the hostname
|
||||
has resolved to that IP.
|
||||
This functionality is designed to help implement the 3rd approach.}
|
||||
|
||||
[XXXX When, if ever, can mappings expire? Should they expire?]
|
||||
[XXXX What addresses, if any, are safe to use?]
|
||||
|
||||
3.12. GETINFO (Type 0x000B)
|
||||
|
||||
Sent from the client to the server. The message body is as for GETCONF:
|
||||
one or more NL-terminated strings. The server replies with an INFOVALUE
|
||||
message.
|
||||
|
||||
Unlike GETCONF, this message is used for data that are not stored in the
Tor configuration file.
|
||||
|
||||
Recognized keys and their values include:
|
||||
|
||||
"version" -- The version of the server's software, including the name
|
||||
of the software. (example: "Tor 0.0.9.4")
|
||||
|
||||
"desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest server
|
||||
descriptor for a given OR, NUL-terminated. If no such OR is known, the
|
||||
corresponding value is an empty string.
|
||||
|
||||
"network-status" -- a space-separated list of all known OR identities.
|
||||
This is in the same format as the router-status line in directories;
|
||||
see tor-spec.txt for details.
|
||||
|
||||
"addr-mappings/all"
|
||||
"addr-mappings/config"
|
||||
"addr-mappings/cache"
|
||||
"addr-mappings/control" -- a NL-terminated list of address mappings, each
|
||||
in the form of "from-address" SP "to-address". The 'config' key
|
||||
returns those address mappings set in the configuration; the 'cache'
|
||||
key returns the mappings in the client-side DNS cache; the 'control'
|
||||
key returns the mappings set via the control interface; the 'all'
|
||||
target returns the mappings set through any mechanism.
|
||||
|
||||
3.13. INFOVALUE (Type 0x000C)
|
||||
|
||||
Sent from the server to the client in response to a GETINFO message.
|
||||
Contains one or more items of the format:
|
||||
|
||||
Key [(NUL-terminated string)]
|
||||
Value [(NUL-terminated string)]
|
||||
|
||||
The keys match those given in the GETINFO message.
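Since keys and values are both NUL-terminated strings, an INFOVALUE body can be split on NUL bytes and paired up. The following is a sketch under that reading. parse_infovalue_body is a hypothetical name, and values are decoded as ASCII with replacement as an assumption.

```python
def parse_infovalue_body(body):
    """Return a list of (key, value) pairs from an INFOVALUE body."""
    fields = body.split(b"\x00")
    # A well-formed body ends with a NUL, so the last piece is empty.
    if fields[-1] != b"":
        raise ValueError("body does not end with a NUL terminator")
    fields.pop()
    if len(fields) % 2 != 0:
        raise ValueError("key without a matching value")
    return [
        (fields[i].decode("ascii"), fields[i + 1].decode("ascii", "replace"))
        for i in range(0, len(fields), 2)
    ]
```

An unknown OR shows up as a key paired with an empty string, matching the "desc/..." rule in the GETINFO section.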
|
||||
|
||||
3.14. EXTENDCIRCUIT (Type 0x000D)
|
||||
|
||||
Sent from the client to the server. The message body contains two fields:
|
||||
Circuit ID [4 octets]
|
||||
Path [NUL-terminated, comma-delimited string of OR nickname/identity]
|
||||
|
||||
This request takes one of two forms: either the Circuit ID is zero, in
|
||||
which case it is a request for the server to build a new circuit according
|
||||
to the specified path, or the Circuit ID is nonzero, in which case it is a
|
||||
request for the server to extend an existing circuit with that ID according
|
||||
to the specified path.
|
||||
|
||||
If the request is successful, the server sends a DONE message containing
|
||||
a message body consisting of the four-octet Circuit ID of the newly created
|
||||
circuit.
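The two body layouts described in this section can be sketched as follows. The helper names are hypothetical, and network (big-endian) byte order for the Circuit ID is an assumption of this sketch, since the section above does not restate the byte order.

```python
import struct


def build_extendcircuit_body(circuit_id: int, path: list) -> bytes:
    """Build an EXTENDCIRCUIT body: 4-octet Circuit ID, then the NUL-terminated path.

    circuit_id == 0 asks for a new circuit; a nonzero ID extends that circuit.
    Big-endian packing ("!I") is an assumption of this sketch.
    """
    if not path:
        raise ValueError("path must name at least one OR")
    for hop in path:
        if "," in hop or "\x00" in hop:
            raise ValueError("hop names may not contain ',' or NUL")
    return struct.pack("!I", circuit_id) + ",".join(path).encode("ascii") + b"\x00"


def parse_done_circuit_id(body: bytes) -> int:
    """Read the four-octet Circuit ID from the DONE reply body."""
    (circuit_id,) = struct.unpack("!I", body[:4])
    return circuit_id
```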
|
||||
|
||||
3.15 ATTACHSTREAM (Type 0x000E)
|
||||
|
||||
Sent from the client to the server. The message body contains two fields:
|
||||
Stream ID [4 octets]
|
||||
Circuit ID [4 octets]
|
||||
|
||||
This message informs the server that the specified stream should be
|
||||
associated with the specified circuit. Each stream may be associated with
|
||||
at most one circuit, and multiple streams may share the same circuit.
|
||||
Streams can only be attached to completed circuits (that is, circuits that
|
||||
have sent a circuit status 'built' event).
|
||||
|
||||
If the circuit ID is 0, responsibility for attaching the given stream is
|
||||
returned to Tor.
|
||||
|
||||
{Implementation note: By default, Tor automatically attaches streams to
|
||||
circuits itself, unless the configuration variable
|
||||
"__LeaveStreamsUnattached" is set to "1". Attempting to attach streams
|
||||
via TC when "__LeaveStreamsUnattached" is false may cause a race between
|
||||
Tor and the controller, as both attempt to attach streams to circuits.}
|
||||
|
||||
3.16 POSTDESCRIPTOR (Type 0x000F)
|
||||
|
||||
Sent from the client to the server. The message body contains one field:
|
||||
Descriptor [NUL-terminated string]
|
||||
|
||||
This message informs the server about a new descriptor.
|
||||
|
||||
The descriptor, when parsed, must contain a number of well-specified
|
||||
fields, including fields for its nickname and identity.
|
||||
|
||||
If there is an error in parsing the descriptor, the server must send an
|
||||
appropriate error message. If the descriptor is well-formed but the server
|
||||
chooses not to add it, it must reply with a DONE message whose body
|
||||
explains why the descriptor was not added.
|
||||
|
||||
3.17 FRAGMENTHEADER (Type 0x0010)
|
||||
|
||||
Sent in either direction. Used to encapsulate messages longer than 65535
|
||||
bytes in length.
|
||||
|
||||
Underlying type [2 bytes]
|
||||
Total Length [4 bytes]
|
||||
Data [Rest of message]
|
||||
|
||||
A FRAGMENTHEADER message MUST be followed immediately by a number of
|
||||
FRAGMENT messages, such that lengths of the "Data" fields of the
|
||||
FRAGMENTHEADER and FRAGMENT messages add to the "Total Length" field of the
|
||||
FRAGMENTHEADER message.
|
||||
|
||||
Implementations MUST NOT fragment messages of length less than 65536 bytes.
|
||||
Implementations MUST be able to process fragmented messages that are not
|
||||
optimally packed.
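The fragmentation rules above can be sketched as a pair of helpers. This assumes that a single message body can hold at most 65535 bytes and that the two header fields are big-endian; both are assumptions of this sketch, and the function names are hypothetical. Messages are modelled as (type, body) tuples rather than wire bytes.

```python
import struct

MAX_BODY = 65535            # assumed largest body of a single message
TYPE_FRAGMENTHEADER = 0x0010
TYPE_FRAGMENT = 0x0011


def fragment(msg_type: int, body: bytes) -> list:
    """Return the (type, body) messages needed to carry `body`."""
    if len(body) <= MAX_BODY:
        # Messages shorter than 65536 bytes MUST NOT be fragmented.
        return [(msg_type, body)]
    # Underlying type [2 bytes], Total Length [4 bytes]; "!" means no padding.
    header = struct.pack("!HI", msg_type, len(body))
    first = MAX_BODY - len(header)
    out = [(TYPE_FRAGMENTHEADER, header + body[:first])]
    for start in range(first, len(body), MAX_BODY):
        out.append((TYPE_FRAGMENT, body[start:start + MAX_BODY]))
    return out


def reassemble(messages: list) -> tuple:
    """Rebuild (underlying type, body); accepts non-optimally packed fragments."""
    head_type, head_body = messages[0]
    if head_type != TYPE_FRAGMENTHEADER:
        raise ValueError("first message is not a FRAGMENTHEADER")
    msg_type, total = struct.unpack("!HI", head_body[:6])
    data = head_body[6:]
    for frag_type, frag_body in messages[1:]:
        if frag_type != TYPE_FRAGMENT:
            raise ValueError("expected a FRAGMENT message")
        data += frag_body
    if len(data) != total:
        raise ValueError("Data fields do not add up to Total Length")
    return msg_type, data
```

The receiver only checks that the Data fields sum to Total Length, so it does not care how the sender chose to split the data.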
|
||||
|
||||
3.18 FRAGMENT (Type 0x0011)
|
||||
|
||||
Data [Entire message]
|
||||
|
||||
See FRAGMENTHEADER for more information.
|
||||
|
||||
3.19 REDIRECTSTREAM (Type 0x0012)
|
||||
|
||||
Sent from the client to the server. The message body contains two fields:
|
||||
Stream ID [4 octets]
|
||||
Address [variable-length, NUL-terminated.]
|
||||
|
||||
Tells the server to change the exit address on the specified stream. No
|
||||
remapping is performed on the new provided address.
|
||||
|
||||
To be sure that the modified address will be used, this message must be sent
|
||||
after a new stream event is received, and before attaching this stream to
|
||||
a circuit.
|
||||
|
||||
3.20 CLOSESTREAM (Type 0x0013)
|
||||
|
||||
Sent from the client to the server. The message body contains three
|
||||
fields:
|
||||
Stream ID [4 octets]
|
||||
Reason [1 octet]
|
||||
Flags [1 octet]
|
||||
|
||||
Tells the server to close the specified stream. The reason should be
|
||||
one of the Tor RELAY_END reasons given in tor-spec.txt. Flags is not
|
||||
used currently. Tor may hold the stream open for a while to flush
|
||||
any data that is pending.
|
||||
|
||||
3.21 CLOSECIRCUIT (Type 0x0014)
|
||||
|
||||
Sent from the client to the server. The message body contains two
|
||||
fields:
|
||||
Circuit ID [4 octets]
|
||||
Flags [1 octet]
|
||||
|
||||
Tells the server to close the specified circuit. If the LSB of the flags
|
||||
field is nonzero, do not close the circuit unless it is unused.
|
||||
|
||||
4. Implementation notes
|
||||
|
||||
4.1. Authentication
|
||||
|
||||
By default, the current Tor implementation trusts all local users.
|
||||
|
||||
If the 'CookieAuthentication' option is true, Tor writes a "magic cookie"
|
||||
file named "control_auth_cookie" into its data directory. To authenticate,
|
||||
the controller must send the contents of this file.
|
||||
|
||||
If the 'HashedControlPassword' option is set, it must contain the salted
|
||||
hash of a secret password. The salted hash is computed according to the
|
||||
S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier.
|
||||
This is then encoded in hexadecimal, prefixed by the indicator sequence
|
||||
"16:". Thus, for example, the password 'foo' could encode to:
|
||||
16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2
   ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        salt                       hashed value
               indicator
|
||||
You can generate the salted hash of a password by calling
|
||||
'tor --hash-password <password>'
|
||||
or by using the example code in the Python and Java controller libraries.
|
||||
To authenticate under this scheme, the controller sends Tor the original
|
||||
secret that was used to generate the password.
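For illustration, the encoding described above can be sketched in Python. This assumes the iterated-and-salted S2K variant of RFC 2440 with SHA-1, an 8-byte salt, and a one-octet count indicator (0x60 in the example above, which decodes to 65536 octets). The function names are hypothetical, and this sketch is not the code shipped in the controller libraries.

```python
import hashlib
import os


def hash_password(password, salt=None, indicator=0x60):
    """Return '16:' + hex(salt + indicator + SHA-1 digest).

    Assumption: RFC 2440 iterated+salted S2K, with the hashed stream cut
    off at exactly `count` octets.
    """
    if salt is None:
        salt = os.urandom(8)
    if len(salt) != 8:
        raise ValueError("salt must be 8 bytes")
    # RFC 2440 coded count: (16 + low nibble) << (high nibble + 6).
    count = (16 + (indicator & 15)) << ((indicator >> 4) + 6)
    block = salt + password.encode("utf-8")
    # Hash salt+password repeated, truncated to `count` octets in total.
    full, rest = divmod(count, len(block))
    digest = hashlib.sha1(block * full + block[:rest]).digest()
    specifier = salt + bytes([indicator])
    return "16:" + (specifier + digest).hex().upper()


def check_password(stored, candidate):
    """Recompute the hash using the salt and indicator embedded in `stored`."""
    raw = bytes.fromhex(stored[3:])
    salt, indicator = raw[:8], raw[8]
    return hash_password(candidate, salt, indicator) == stored.upper()
```

Because the salt and indicator travel inside the stored string, checking a candidate password needs nothing beyond that string.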
|
||||
|
||||
4.2. Don't let the buffer get too big.
|
||||
|
||||
If you ask for lots of events, and 16MB of them queue up on the buffer,
|
||||
the Tor process will close the socket.
|
||||
|
File diff suppressed because it is too large
@ -1,315 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor Protocol Specification
|
||||
|
||||
Roger Dingledine
|
||||
Nick Mathewson
|
||||
|
||||
0. Preliminaries
|
||||
|
||||
THIS SPECIFICATION IS OBSOLETE.
|
||||
|
||||
This document specifies the Tor directory protocol as used in version
|
||||
0.1.0.x and earlier. See dir-spec.txt for a current version.
|
||||
|
||||
1. Basic operation
|
||||
|
||||
There is a small number of directory authorities, and a larger number of
|
||||
caches. Clients and servers know public keys for the directory authorities.
|
||||
Tor servers periodically upload self-signed "router descriptors" to the
|
||||
directory authorities. Each authority publishes a self-signed "directory"
|
||||
(containing all the router descriptors it knows, and a statement on which
|
||||
are running) and a self-signed "running routers" document containing only
|
||||
the statement on which routers are running.
|
||||
|
||||
All Tors periodically download these documents, downloading the directory
|
||||
less frequently than they do the "running routers" document. Clients
|
||||
preferentially download from caches rather than authorities.
|
||||
|
||||
1.1. Document format
|
||||
|
||||
Router descriptors, directories, and running-routers documents all obey the
|
||||
following lightweight extensible information format.
|
||||
|
||||
The highest level object is a Document, which consists of one or more
|
||||
Items. Every Item begins with a KeywordLine, followed by zero or more
|
||||
Objects. A KeywordLine begins with a Keyword, optionally followed by
|
||||
whitespace and more non-newline characters, and ends with a newline. A
|
||||
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
|
||||
An Object is a block of encoded data in pseudo-Open-PGP-style
|
||||
armor. (cf. RFC 2440)
|
||||
|
||||
More formally:
|
||||
|
||||
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
|
||||
|
||||
The BeginLine and EndLine of an Object must use the same keyword.
|
||||
|
||||
When interpreting a Document, software MUST reject any document containing a
|
||||
KeywordLine that starts with a keyword it doesn't recognize.
|
||||
|
||||
The "opt" keyword is reserved for non-critical future extensions. All
|
||||
implementations MUST ignore any item of the form "opt keyword ....." when
|
||||
they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
|
||||
as synonymous with "keyword ......" when keyword is recognized.
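A minimal sketch of a parser for this meta-format follows, including the "opt" rule and the requirement to reject unrecognized keywords. The name parse_document and the tuple layout of the result are hypothetical. The sketch tolerates spaces in BEGIN/END labels (such as "RSA PUBLIC KEY"), which goes slightly beyond the grammar above, and it does not validate the base-64 data.

```python
import re

KEYWORD_LINE = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.*))?$")
# Assumption: labels may contain spaces, e.g. "RSA PUBLIC KEY".
BEGIN_LINE = re.compile(r"^-----BEGIN ([A-Za-z0-9 -]+)-----$")


def parse_document(text, known_keywords):
    """Return a list of (keyword, arguments, [(label, base64 data), ...])."""
    items = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # Document ::= (Item | NL)+
        match = KEYWORD_LINE.match(line)
        if match is None:
            raise ValueError("malformed keyword line: %r" % line)
        keyword, args = match.group(1), match.group(2) or ""
        optional = False
        if keyword == "opt":
            # "opt keyword ..." means "keyword ..." when the keyword is known.
            inner = KEYWORD_LINE.match(args)
            if inner is None:
                raise ValueError("'opt' without a keyword")
            keyword, args = inner.group(1), inner.group(2) or ""
            optional = True
        objects = []
        while i < len(lines):
            begin = BEGIN_LINE.match(lines[i])
            if begin is None:
                break
            # BeginLine and EndLine must use the same keyword.
            end_line = "-----END %s-----" % begin.group(1)
            try:
                end = lines.index(end_line, i + 1)
            except ValueError:
                raise ValueError("object without a matching END line")
            objects.append((begin.group(1), "".join(lines[i + 1:end])))
            i = end + 1
        if keyword not in known_keywords:
            if optional:
                continue  # unknown "opt" items are ignored
            raise ValueError("unrecognized keyword: " + keyword)
        items.append((keyword, args, objects))
    return items
```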
|
||||
|
||||
2. Router descriptor format.
|
||||
|
||||
Every router descriptor MUST start with a "router" Item; MUST end with a
|
||||
"router-signature" Item and an extra NL; and MUST contain exactly one
|
||||
instance of each of the following Items: "published" "onion-key" "link-key"
|
||||
"signing-key" "bandwidth". Additionally, a router descriptor MAY contain
|
||||
any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
|
||||
Other than "router" and "router-signature", the items may appear in any
|
||||
order.
|
||||
|
||||
The items' formats are as follows:
|
||||
"router" nickname address ORPort SocksPort DirPort
|
||||
|
||||
Indicates the beginning of a router descriptor. "address"
|
||||
must be an IPv4 address in dotted-quad format. The last
|
||||
three numbers indicate the TCP ports at which this OR exposes
|
||||
functionality. ORPort is a port at which this OR accepts TLS
|
||||
connections for the main OR protocol; SocksPort is deprecated and
|
||||
should always be 0; and DirPort is the port at which this OR accepts
|
||||
directory-related HTTP connections. If any port is not supported,
|
||||
the value 0 is given instead of a port number.
|
||||
|
||||
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
|
||||
|
||||
Estimated bandwidth for this router, in bytes per second. The
|
||||
"average" bandwidth is the volume per second that the OR is willing
|
||||
to sustain over long periods; the "burst" bandwidth is the volume
|
||||
that the OR is willing to sustain in very short intervals. The
|
||||
"observed" value is an estimate of the capacity this server can
|
||||
handle. The server remembers the maximum output bandwidth it sustained
over any ten-second period in the past day, and likewise the maximum
sustained input bandwidth. The "observed" value is the lesser of these two numbers.
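As a worked illustration of the "observed" value, the sketch below takes one byte count per second for each direction, finds the best ten-second window in each, and reports the lesser rate. The per-second sample lists and the function names are assumptions of this example; how a server actually records its traffic is not described here.

```python
def peak_rate(samples, window=10):
    """Highest average rate (bytes/s) over any `window` consecutive seconds.

    `samples` holds one byte count per second (hypothetical input format).
    """
    if len(samples) < window:
        return 0
    running = sum(samples[:window])
    best = running
    for i in range(window, len(samples)):
        # Slide the window forward by one second.
        running += samples[i] - samples[i - window]
        best = max(best, running)
    return best // window


def observed_bandwidth(read_samples, write_samples):
    """The lesser of the peak sustained input and output rates."""
    return min(peak_rate(read_samples), peak_rate(write_samples))
```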
|
||||
|
||||
"platform" string
|
||||
|
||||
A human-readable string describing the system on which this OR is
|
||||
running. This MAY include the operating system, and SHOULD include
|
||||
the name and version of the software implementing the Tor protocol.
|
||||
|
||||
"published" YYYY-MM-DD HH:MM:SS
|
||||
|
||||
The time, in GMT, when this descriptor was generated.
|
||||
|
||||
"fingerprint"
|
||||
|
||||
A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, encoded
|
||||
in hex, with a single space after every 4 characters) for this router's
|
||||
identity key. A descriptor is considered invalid (and MUST be
|
||||
rejected) if the fingerprint line does not match the public key.
|
||||
|
||||
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
|
||||
be marked with "opt" until earlier versions of Tor are obsolete.]
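The fingerprint layout can be sketched as follows, assuming that the hash is SHA-1 (so HASH_LEN is 20) and that the caller already holds the ASN.1 (DER) encoding of the identity key. Both points are assumptions of this sketch, and format_fingerprint is a hypothetical name.

```python
import hashlib


def format_fingerprint(der_public_key: bytes) -> str:
    """Hex digest of the encoded key, with a space after every 4 characters.

    Assumption: the hash is SHA-1, giving 40 hex characters in 10 groups.
    """
    digest = hashlib.sha1(der_public_key).hexdigest().upper()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))
```

A parser would compare this string against the argument of the "fingerprint" line and reject the descriptor on a mismatch.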
|
||||
|
||||
"hibernating" 0|1
|
||||
|
||||
If the value is 1, then the Tor server was hibernating when the
|
||||
descriptor was published, and shouldn't be used to build circuits.
|
||||
|
||||
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
|
||||
be marked with "opt" until earlier versions of Tor are obsolete.]
|
||||
|
||||
"uptime"
|
||||
|
||||
The number of seconds that this OR process has been running.
|
||||
|
||||
"onion-key" NL a public key in PEM format
|
||||
|
||||
This key is used to encrypt EXTEND cells for this OR. The key MUST
|
||||
be accepted for at least XXXX hours after any new key is published in
|
||||
a subsequent descriptor.
|
||||
|
||||
"signing-key" NL a public key in PEM format
|
||||
|
||||
The OR's long-term identity key.
|
||||
|
||||
"accept" exitpattern
|
||||
"reject" exitpattern
|
||||
|
||||
These lines, in order, describe the rules that an OR follows when
|
||||
deciding whether to allow a new stream to a given address. The
|
||||
'exitpattern' syntax is described below.
|
||||
|
||||
"router-signature" NL Signature NL
|
||||
|
||||
The "SIGNATURE" object contains a signature of the PKCS1-padded
|
||||
hash of the entire router descriptor, taken from the beginning of the
|
||||
"router" line, through the newline after the "router-signature" line.
|
||||
The router descriptor is invalid unless the signature is performed
|
||||
with the router's identity key.
|
||||
|
||||
"contact" info NL
|
||||
|
||||
Describes a way to contact the server's administrator, preferably
|
||||
including an email address and a PGP key fingerprint.
|
||||
|
||||
"family" names NL
|
||||
|
||||
'Names' is a whitespace-separated list of server nicknames. If two ORs
|
||||
list one another in their "family" entries, then OPs should treat them
|
||||
as a single OR for the purpose of path selection.
|
||||
|
||||
For example, if node A's descriptor contains "family B", and node B's
|
||||
descriptor contains "family A", then node A and node B should never
|
||||
be used on the same circuit.
|
||||
|
||||
"read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
||||
"write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
||||
|
||||
Declare how much bandwidth the OR has used recently. Usage is divided
|
||||
into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
|
||||
the end of the most recent interval. The numbers are the number of
|
||||
bytes used in the most recent intervals, ordered from oldest to newest.
|
||||
|
||||
[We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
|
||||
be marked with "opt" until earlier versions of Tor are obsolete.]
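To make the interval arithmetic concrete, here is a small sketch that parses one history line and assigns a start time to each byte count. Since the timestamp marks the end of the most recent interval and the values run from oldest to newest, the first value's interval begins NSEC times the number of values before that timestamp. The function name and the returned tuple layout are hypothetical.

```python
import re
from datetime import datetime, timedelta

HISTORY_LINE = re.compile(
    r"^(read|write)-history (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) \((\d+) s\) ([\d,]*)$"
)


def parse_history(line):
    """Return [(interval start, bytes used), ...], ordered oldest first."""
    match = HISTORY_LINE.match(line)
    if match is None:
        raise ValueError("not a read-history or write-history line")
    end = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
    nsec = int(match.group(3))
    counts = [int(n) for n in match.group(4).split(",") if n]
    # The last value's interval ends at `end`; walk back to the first one.
    first_start = end - timedelta(seconds=nsec * len(counts))
    return [
        (first_start + timedelta(seconds=nsec * i), count)
        for i, count in enumerate(counts)
    ]
```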
|
||||
|
||||
2.1. Nonterminals in router descriptors
|
||||
|
||||
nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
|
||||
|
||||
exitpattern ::= addrspec ":" portspec
|
||||
portspec ::= "*" | port | port "-" port
|
||||
port ::= an integer between 1 and 65535, inclusive.
|
||||
addrspec ::= "*" | ip4spec | ip6spec
|
||||
ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
|
||||
ip4 ::= an IPv4 address in dotted-quad format
|
||||
ip4mask ::= an IPv4 mask in dotted-quad format
|
||||
num_ip4_bits ::= an integer between 0 and 32
|
||||
ip6spec ::= ip6 | ip6 "/" num_ip6_bits
|
||||
ip6 ::= an IPv6 address, surrounded by square brackets.
|
||||
num_ip6_bits ::= an integer between 0 and 128
|
||||
|
||||
Ports are required; if they are not included in the router
|
||||
line, they must appear in the "ports" lines.
|
||||
|
||||
3. Directory format
|
||||
|
||||
A Directory begins with a "signed-directory" item, followed by one each of
|
||||
the following, in any order: "recommended-software", "published",
|
||||
"router-status", "dir-signing-key". It may include any number of "opt"
|
||||
items. After these items, a directory includes any number of router
|
||||
descriptors, and a single "directory-signature" item.
|
||||
|
||||
"signed-directory"
|
||||
|
||||
Indicates the start of a directory.
|
||||
|
||||
"published" YYYY-MM-DD HH:MM:SS
|
||||
|
||||
The time at which this directory was generated and signed, in GMT.
|
||||
|
||||
"dir-signing-key"
|
||||
|
||||
The key used to sign this directory; see "signing-key" for format.
|
||||
|
||||
"recommended-software" comma-separated-version-list
|
||||
|
||||
A list of which versions of which implementations are currently
|
||||
believed to be secure and compatible with the network.
|
||||
|
||||
"running-routers" whitespace-separated-list
|
||||
|
||||
A description of which routers are currently believed to be up or
|
||||
down. Every entry consists of an optional "!", followed by either an
|
||||
OR's nickname, or "$" followed by a hexadecimal encoding of the hash
|
||||
of an OR's identity key. If the "!" is included, the router is
|
||||
believed not to be running; otherwise, it is believed to be running.
|
||||
If a router's nickname is given, exactly one router of that nickname
|
||||
will appear in the directory, and that router is "approved" by the
|
||||
directory server. If a hashed identity key is given, that OR is not
|
||||
"approved". [XXXX The 'running-routers' line is only provided for
|
||||
backward compatibility. New code should parse 'router-status'
|
||||
instead.]
|
||||
|
||||
"router-status" whitespace-separated-list
|
||||
|
||||
A description of which routers are currently believed to be up or
|
||||
down, and which are verified or unverified. Contains one entry for
|
||||
every router that the directory server knows. Each entry is of the
|
||||
format:
|
||||
|
||||
!name=$digest [Verified router, currently not live.]
|
||||
name=$digest [Verified router, currently live.]
|
||||
!$digest [Unverified router, currently not live.]
|
||||
or $digest [Unverified router, currently live.]
|
||||
|
||||
(where 'name' is the router's nickname and 'digest' is a hexadecimal
|
||||
encoding of the hash of the router's identity key).
|
||||
|
||||
When parsing this line, clients should only mark a router as
|
||||
'verified' if its nickname AND digest match the one provided.
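The four entry forms can be decoded with a short helper. This is a sketch only: parse_router_status and the dictionary keys are hypothetical names, and the helper does not check that the digest is valid hexadecimal.

```python
def parse_router_status(args: str) -> list:
    """Decode the whitespace-separated entries of a "router-status" line."""
    entries = []
    for token in args.split():
        live = not token.startswith("!")
        if not live:
            token = token[1:]
        name, sep, digest = token.partition("=")
        if sep:
            verified = True      # name=$digest
        else:
            name, digest, verified = None, token, False  # bare $digest
        if not digest.startswith("$"):
            raise ValueError("entry lacks a $digest: %r" % token)
        entries.append({
            "name": name,
            "digest": digest[1:].upper(),
            "live": live,
            "verified": verified,
        })
    return entries
```

A client would then treat a router as verified only when both the nickname and the digest of the descriptor it holds match an entry with "verified" set.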
|
||||
|
||||
"directory-signature" nickname-of-dirserver NL Signature
|
||||
|
||||
The signature is computed by computing the digest of the
|
||||
directory, from the characters "signed-directory", through the newline
|
||||
after "directory-signature". This digest is then padded with PKCS.1,
|
||||
and signed with the directory server's signing key.
|
||||
|
||||
If software encounters an unrecognized keyword in a single router descriptor,
|
||||
it MUST reject only that router descriptor, and continue using the
|
||||
others. Because this mechanism is used to add 'critical' extensions to
|
||||
future versions of the router descriptor format, implementations should treat
|
||||
it as a normal occurrence and not, for example, report it to the user as an
|
||||
error. [Versions of Tor prior to 0.1.1 did this.]
|
||||
|
||||
If software encounters an unrecognized keyword in the directory header,
|
||||
it SHOULD reject the entire directory.
|
||||
|
||||
4. Network-status descriptor
|
||||
|
||||
A "network-status" (a.k.a "running-routers") document is a truncated
|
||||
directory that contains only the current status of a list of nodes, not
|
||||
their actual descriptors. It contains exactly one of each of the following
|
||||
entries.
|
||||
|
||||
"network-status"
|
||||
|
||||
Must appear first.
|
||||
|
||||
"published" YYYY-MM-DD HH:MM:SS
|
||||
|
||||
(see section 3 above)
|
||||
|
||||
"router-status" list
|
||||
|
||||
(see section 3 above)
|
||||
|
||||
"directory-signature" NL signature
|
||||
|
||||
(see section 3 above)
|
||||
|
||||
5. Behavior of a directory server
|
||||
|
||||
lists nodes that are connected currently
|
||||
speaks HTTP on a socket, spits out directory on request
|
||||
|
||||
Directory servers listen on a certain port (the DirPort), and speak a
|
||||
limited version of HTTP 1.0. Clients send either GET or POST commands.
|
||||
The basic interactions are:
|
||||
"%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
|
||||
command, url, content-length, host.
|
||||
Get "/tor/" to fetch a full directory.
|
||||
Get "/tor/dir.z" to fetch a compressed full directory.
|
||||
Get "/tor/running-routers" to fetch a network-status descriptor.
|
||||
Post "/tor/" to post a server descriptor, with the body of the
|
||||
request containing the descriptor.
|
||||
|
||||
"host" is used to specify the address:port of the dirserver, so
|
||||
the request can survive going through HTTP proxies.
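The request template above translates directly into a small builder. The function name build_request is hypothetical, and the address used in the usage note is a documentation placeholder rather than a real directory server.

```python
def build_request(command: str, url: str, host: str, body: bytes = b"") -> bytes:
    """Format a directory request following the template given above.

    `host` is the address:port of the dirserver.
    """
    head = "%s %s HTTP/1.0\r\nContent-Length: %d\r\nHost: %s\r\n\r\n" % (
        command, url, len(body), host)
    return head.encode("ascii") + body
```

For example, build_request("GET", "/tor/running-routers", "192.0.2.1:9030") asks for a network-status descriptor, while build_request("POST", "/tor/", "192.0.2.1:9030", descriptor_bytes) uploads a server descriptor.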
|
||||
|
@ -1,897 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor directory protocol, version 2
|
||||
|
||||
0. Scope and preliminaries
|
||||
|
||||
This directory protocol is used by Tor versions 0.1.1.x and 0.1.2.x. See
|
||||
dir-spec-v1.txt for information on earlier versions, and dir-spec.txt
|
||||
for information on later versions.
|
||||
|
||||
0.1. Goals and motivation
|
||||
|
||||
There were several problems with the way Tor handles directory information
|
||||
in version 0.1.0.x and earlier. Here are the problems we try to fix with
|
||||
this new design, already implemented in 0.1.1.x:
|
||||
1. Directories were very large and used up a lot of bandwidth: clients
downloaded descriptors for all routers several times an hour.
|
||||
2. Every directory authority was a trust bottleneck: if a single
|
||||
directory authority lied, it could make clients believe for a time an
|
||||
arbitrarily distorted view of the Tor network.
|
||||
3. Our current "verified server" system is kind of nonsensical.
|
||||
|
||||
4. Getting more directory authorities would add more points of failure
|
||||
and worsen possible partitioning attacks.
|
||||
|
||||
There are two problems that remain unaddressed by this design.
|
||||
5. Requiring every client to know about every router won't scale.
|
||||
6. Requiring every directory cache to know every router won't scale.
|
||||
|
||||
We attempt to fix 1-4 here, and to build a solution that will work when we
|
||||
figure out an answer for 5. We haven't thought at all about what to do
|
||||
about 6.
|
||||
|
||||
1. Outline
|
||||
|
||||
There is a small set (say, around 10) of semi-trusted directory
|
||||
authorities. A default list of authorities is shipped with the Tor
|
||||
software. Users can change this list, but are encouraged not to do so, in
|
||||
order to avoid partitioning attacks.
|
||||
|
||||
Routers periodically upload signed "descriptors" to the directory
|
||||
authorities describing their keys, capabilities, and other information.
|
||||
Routers may act as directory mirrors (also called "caches"), to reduce
|
||||
load on the directory authorities. They announce this in their
|
||||
descriptors.
|
||||
|
||||
Each directory authority periodically generates and signs a compact
|
||||
"network status" document that lists that authority's view of the current
|
||||
descriptors and status for known routers, but which does not include the
|
||||
descriptors themselves.
|
||||
|
||||
Directory mirrors download, cache, and re-serve network-status documents
|
||||
to clients.
|
||||
|
||||
Clients, directory mirrors, and directory authorities all use
|
||||
network-status documents to find out when their list of routers is
|
||||
out-of-date. If it is, they download any missing router descriptors.
|
||||
Clients download missing descriptors from mirrors; mirrors and authorities
|
||||
download from authorities. Descriptors are downloaded by the hash of the
|
||||
descriptor, not by the server's identity key: this prevents servers from
|
||||
attacking clients by giving them descriptors nobody else uses.
|
||||
|
||||
All directory information is uploaded and downloaded with HTTP.
|
||||
|
||||
Coordination among directory authorities is done client-side: clients
|
||||
compute a vote-like algorithm among the network-status documents they
|
||||
have, and base their decisions on the result.
|
||||
|
||||
1.1. What's different from 0.1.0.x?
|
||||
|
||||
Clients used to download a signed concatenated set of router descriptors
|
||||
(called a "directory") from directory mirrors, regardless of which
|
||||
descriptors had changed.
|
||||
|
||||
Between downloading directories, clients would download "network-status"
|
||||
documents that would list which servers were supposed to be running.
|
||||
|
||||
Clients would always believe the most recently published network-status
|
||||
document they were served.
|
||||
|
||||
Routers used to upload fresh descriptors all the time, whether their keys
|
||||
and other information had changed or not.
|
||||
|
||||
1.2. Document meta-format
|
||||
|
||||
Router descriptors, directories, and running-routers documents all obey the
|
||||
following lightweight extensible information format.
|
||||
|
||||
The highest level object is a Document, which consists of one or more
|
||||
Items. Every Item begins with a KeywordLine, followed by zero or more
|
||||
Objects. A KeywordLine begins with a Keyword, optionally followed by
|
||||
whitespace and more non-newline characters, and ends with a newline. A
|
||||
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
|
||||
An Object is a block of encoded data in pseudo-Open-PGP-style
|
||||
armor. (cf. RFC 2440)
|
||||
|
||||
More formally:
|
||||
|
||||
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
|
||||
|
||||
The BeginLine and EndLine of an Object must use the same keyword.
|
||||
|
||||
When interpreting a Document, software MUST ignore any KeywordLine that
|
||||
starts with a keyword it doesn't recognize; future implementations MUST NOT
|
||||
require current clients to understand any KeywordLine not currently
|
||||
described.
|
||||
|
||||
The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future
|
||||
extensions. All implementations MUST ignore any item of the form "opt
|
||||
keyword ....." when they would not recognize "keyword ....."; and MUST
|
||||
treat "opt keyword ....." as synonymous with "keyword ......" when keyword
|
||||
is recognized.
|
||||
|
||||
Implementations before 0.1.2.5-alpha rejected any document with a
|
||||
KeywordLine that started with a keyword that they didn't recognize.
|
||||
Implementations MUST prefix items not recognized by older versions of Tor
|
||||
with an "opt" until those versions of Tor are obsolete.
|
||||
|
||||
Other implementations that want to extend Tor's directory format MAY
|
||||
introduce their own items. The keywords for extension items SHOULD start
|
||||
with the characters "x-" or "X-", to guarantee that they will not conflict
|
||||
with keywords used by future versions of Tor.
|
||||
|
||||
2. Router operation
|
||||
|
||||
ORs SHOULD generate a new router descriptor whenever any of the
|
||||
following events have occurred:
|
||||
|
||||
- A period of time (18 hrs by default) has passed since the last
|
||||
time a descriptor was generated.
|
||||
|
||||
- A descriptor field other than bandwidth or uptime has changed.
|
||||
|
||||
- Bandwidth has changed by at least a factor of 2 from the last time a
|
||||
descriptor was generated, and at least a given interval of time
|
||||
(20 mins by default) has passed since then.
|
||||
|
||||
- Its uptime has been reset (by restarting).
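The four conditions above can be collected into one predicate. This is a sketch under stated assumptions: times are plain seconds, the bandwidth figures are whatever single number the OR compares between descriptors, and the function and parameter names are hypothetical.

```python
def should_regenerate(now, last_generated, fields_changed,
                      old_bandwidth, new_bandwidth, restarted,
                      max_age=18 * 3600, min_bandwidth_interval=20 * 60):
    """Return True if any of the listed conditions calls for a new descriptor."""
    age = now - last_generated
    if age >= max_age:
        return True   # 18 hours (by default) since the last descriptor
    if fields_changed:
        return True   # a field other than bandwidth or uptime changed
    if restarted:
        return True   # uptime was reset
    low, high = sorted((old_bandwidth, new_bandwidth))
    # "Factor of 2" in either direction; equal values never count as a change.
    changed_by_two = high > low and high >= 2 * low
    return changed_by_two and age >= min_bandwidth_interval
```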
|
||||
|
||||
After generating a descriptor, ORs upload it to every directory
|
||||
authority they know, by posting it to the URL
|
||||
|
||||
http://<hostname:port>/tor/
|
||||
|
||||
2.1. Router descriptor format
|
||||
|
||||
Every router descriptor MUST start with a "router" Item; MUST end with a
|
||||
"router-signature" Item and an extra NL; and MUST contain exactly one
|
||||
instance of each of the following Items: "published" "onion-key"
|
||||
"signing-key" "bandwidth".
|
||||
|
||||
A router descriptor MAY have zero or one of each of the following Items,
|
||||
but MUST NOT have more than one: "contact", "uptime", "fingerprint",
|
||||
"hibernating", "read-history", "write-history", "eventdns", "platform",
|
||||
"family".
|
||||
|
||||
Additionally, a router descriptor MAY contain any number of "accept",
|
||||
"reject", and "opt" Items. Other than "router" and "router-signature",
|
||||
the items may appear in any order.
|
||||
|
||||
The items' formats are as follows:
|
||||
"router" nickname address ORPort SocksPort DirPort
|
||||
|
||||
Indicates the beginning of a router descriptor. "address" must be an
|
||||
IPv4 address in dotted-quad format. The last three numbers indicate
|
||||
the TCP ports at which this OR exposes functionality. ORPort is a port
|
||||
at which this OR accepts TLS connections for the main OR protocol;
|
||||
SocksPort is deprecated and should always be 0; and DirPort is the
|
||||
port at which this OR accepts directory-related HTTP connections. If
|
||||
any port is not supported, the value 0 is given instead of a port
|
||||
number.
|
||||
|
||||
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
|
||||
|
||||
Estimated bandwidth for this router, in bytes per second. The
|
||||
"average" bandwidth is the volume per second that the OR is willing to
|
||||
sustain over long periods; the "burst" bandwidth is the volume that
|
||||
the OR is willing to sustain in very short intervals. The "observed"
|
||||
value is an estimate of the capacity this server can handle. The
|
||||
server remembers the maximum output bandwidth it sustained over any
ten-second period in the past day, and likewise the maximum sustained
input bandwidth. The
|
||||
"observed" value is the lesser of these two numbers.
|
||||
|
||||
"platform" string
|
||||
|
||||
A human-readable string describing the system on which this OR is
|
||||
running. This MAY include the operating system, and SHOULD include
|
||||
the name and version of the software implementing the Tor protocol.
|
||||
|
||||
"published" YYYY-MM-DD HH:MM:SS
|
||||
|
||||
The time, in GMT, when this descriptor was generated.
|
||||
|
||||
"fingerprint"
|
||||
|
||||
A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, encoded in
|
||||
hex, with a single space after every 4 characters) for this router's
|
||||
identity key. A descriptor is considered invalid (and MUST be
|
||||
rejected) if the fingerprint line does not match the public key.
|
||||
|
||||
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
|
||||
be marked with "opt" until earlier versions of Tor are obsolete.]
|
||||
|
||||
"hibernating" 0|1
|
||||
|
||||
If the value is 1, then the Tor server was hibernating when the
|
||||
descriptor was published, and shouldn't be used to build circuits.
|
||||
|
||||
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should be
|
||||
marked with "opt" until earlier versions of Tor are obsolete.]
|
||||
|
||||
"uptime"
|
||||
|
||||
The number of seconds that this OR process has been running.
|
||||
|
||||
"onion-key" NL a public key in PEM format
|
||||
|
||||
This key is used to encrypt EXTEND cells for this OR. The key MUST be
|
||||
accepted for at least 1 week after any new key is published in a
|
||||
subsequent descriptor.
|
||||
|
||||
"signing-key" NL a public key in PEM format
|
||||
|
||||
The OR's long-term identity key.
|
||||
|
||||
"accept" exitpattern
|
||||
"reject" exitpattern
|
||||
|
||||
These lines describe the rules that an OR follows when
|
||||
deciding whether to allow a new stream to a given address. The
|
||||
'exitpattern' syntax is described below. The rules are considered in
|
||||
order; if no rule matches, the address will be accepted. For clarity,
|
||||
the last such entry SHOULD be accept *:* or reject *:*.
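The first-match evaluation described above can be sketched with Python's standard ipaddress module. To keep the example short it handles only "*" and IPv4 patterns (a bare address, address/bits, or address/mask); the bracketed IPv6 form is left out. The function names are hypothetical.

```python
import ipaddress


def parse_rule(line):
    """Split 'accept|reject addrspec:portspec' into (accept?, network, ports)."""
    action, pattern = line.split()
    if action not in ("accept", "reject"):
        raise ValueError("not an exit policy line: %r" % line)
    addrspec, _, portspec = pattern.rpartition(":")
    # ip_network() accepts both 10.0.0.0/8 and 10.0.0.0/255.0.0.0.
    network = None if addrspec == "*" else ipaddress.ip_network(addrspec, strict=False)
    if portspec == "*":
        ports = (1, 65535)
    elif "-" in portspec:
        low, high = portspec.split("-")
        ports = (int(low), int(high))
    else:
        ports = (int(portspec), int(portspec))
    return action == "accept", network, ports


def exit_allowed(policy_lines, address, port):
    """Apply the rules in order; the first matching rule decides."""
    addr = ipaddress.ip_address(address)
    for line in policy_lines:
        accept, network, (low, high) = parse_rule(line)
        if (network is None or addr in network) and low <= port <= high:
            return accept
    return True  # no rule matched, so the address is accepted
```

Ending a policy with an explicit "reject *:*" or "accept *:*" line makes the final fall-through case unreachable, which is why the text above recommends it.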
|
||||
|
||||
"router-signature" NL Signature NL
|
||||
|
||||
The "SIGNATURE" object contains a signature of the PKCS1-padded
|
||||
hash of the entire router descriptor, taken from the beginning of the
|
||||
"router" line, through the newline after the "router-signature" line.
|
||||
The router descriptor is invalid unless the signature is performed
|
||||
with the router's identity key.
|
||||
|
||||
"contact" info NL
|
||||
|
||||
Describes a way to contact the server's administrator, preferably
|
||||
including an email address and a PGP key fingerprint.
|
||||
|
||||
"family" names NL
|
||||
|
||||
'Names' is a space-separated list of server nicknames or
|
||||
hexdigests. If two ORs list one another in their "family" entries,
|
||||
then OPs should treat them as a single OR for the purpose of path
|
||||
selection.
|
||||
|
||||
For example, if node A's descriptor contains "family B", and node B's
|
||||
descriptor contains "family A", then node A and node B should never
|
||||
be used on the same circuit.
|
||||
|
||||
"read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
||||
"write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
||||
|
||||
Declare how much bandwidth the OR has used recently. Usage is divided
|
||||
into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
|
||||
defines the end of the most recent interval. The numbers are the
|
||||
number of bytes used in the most recent intervals, ordered from
|
||||
oldest to newest.
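As an illustration of the history format above, the following Python sketch parses one "read-history" or "write-history" line and attaches an end time to each interval. The function name and the returned structure are assumptions made for this example, and the sketch does not handle a leading "opt" keyword.

```python
import re
from datetime import datetime, timedelta

# keyword, end of most recent interval, interval length, byte counts
HISTORY_RE = re.compile(
    r"^(read-history|write-history) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\) "
    r"(\d+(?:,\d+)*)$"
)

def parse_history(line):
    """Return (keyword, [(interval_end, byte_count), ...]), oldest first."""
    m = HISTORY_RE.match(line.strip())
    if m is None:
        raise ValueError("not a bandwidth history line")
    keyword, end_text, nsec_text, nums_text = m.groups()
    end = datetime.strptime(end_text, "%Y-%m-%d %H:%M:%S")
    nsec = int(nsec_text)
    values = [int(n) for n in nums_text.split(",")]
    count = len(values)
    # The last value covers the interval ending at `end`; each earlier
    # value covers an interval ending NSEC seconds before the next one.
    intervals = [
        (end - timedelta(seconds=nsec * (count - 1 - i)), value)
        for i, value in enumerate(values)
    ]
    return keyword, intervals
```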
|
||||
|
||||
[We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
|
||||
be marked with "opt" until earlier versions of Tor are obsolete.]
|
||||
|
||||
"eventdns" bool NL
|
||||
|
||||
Declare whether this version of Tor is using the newer enhanced
|
||||
DNS logic. Versions of Tor without eventdns SHOULD NOT be used for
|
||||
reverse hostname lookups.
|
||||
|
||||
[All versions of Tor before 0.1.2.2-alpha should be assumed to have
|
||||
this option set to 0 if it is not present. All Tor versions at
|
||||
0.1.2.2-alpha or later should be assumed to have this option set to
|
||||
1 if it is not present. Until 0.1.2.1-alpha-dev, this option was
|
||||
not generated, even when eventdns was in use. Versions of Tor
|
||||
before 0.1.2.1-alpha-dev did not parse this option, so it should be
|
||||
marked "opt". With 0.2.0.1-alpha, the old 'dnsworker' logic has
|
||||
been removed, rendering this option of historical interest only.]
|
||||
|
||||
2.2. Nonterminals in router descriptors
|
||||
|
||||
nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
|
||||
hexdigest ::= a '$', followed by 40 hexadecimal characters.
|
||||
[Represents a server by the digest of its identity key.]
|
||||
|
||||
exitpattern ::= addrspec ":" portspec
|
||||
portspec ::= "*" | port | port "-" port
|
||||
port ::= an integer between 1 and 65535, inclusive.
|
||||
[Some implementations incorrectly generate ports with value 0.
|
||||
Implementations SHOULD accept this, and SHOULD NOT generate it.]
|
||||
|
||||
addrspec ::= "*" | ip4spec | ip6spec
|
||||
ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
|
||||
ip4 ::= an IPv4 address in dotted-quad format
|
||||
ip4mask ::= an IPv4 mask in dotted-quad format
|
||||
num_ip4_bits ::= an integer between 0 and 32
|
||||
ip6spec ::= ip6 | ip6 "/" num_ip6_bits
|
||||
ip6 ::= an IPv6 address, surrounded by square brackets.
|
||||
num_ip6_bits ::= an integer between 0 and 128
|
||||
|
||||
bool ::= "0" | "1"
|
||||
|
||||
Ports are required; if they are not included in the router
|
||||
line, they must appear in the "ports" lines.
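The exitpattern grammar and the first-match rule for "accept" and "reject" lines can be sketched in Python as follows. This is a minimal sketch, not Tor's implementation: the function names are invented for the example, the policy is assumed to be a list of (action, exitpattern) pairs in descriptor order, and only the "*" and IPv4 forms of addrspec are handled (bracketed IPv6 addresses are not).

```python
import ipaddress

def parse_exitpattern(pattern):
    """Split 'addrspec:portspec' into (network or None, low_port, high_port)."""
    addr, _, ports = pattern.rpartition(":")
    if addr == "*":
        network = None  # matches every address
    else:
        # ip_network accepts 'a.b.c.d', 'a.b.c.d/bits' and 'a.b.c.d/mask'.
        network = ipaddress.ip_network(addr, strict=False)
    if ports == "*":
        low, high = 1, 65535
    elif "-" in ports:
        low_text, high_text = ports.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        low = high = int(ports)
    return network, low, high

def exit_allowed(policy, address, port):
    """Apply the rules in order; the first matching rule decides."""
    ip = ipaddress.ip_address(address)
    for action, pattern in policy:
        network, low, high = parse_exitpattern(pattern)
        if (network is None or ip in network) and low <= port <= high:
            return action == "accept"
    return True  # no rule matched, so the address is accepted
```

Ending a policy with an explicit "reject *:*" or "accept *:*" entry, as recommended above, means the final fall-through is never reached.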
|
||||
|
||||
3. Network status format
|
||||
|
||||
Directory authorities generate, sign, and compress network-status
|
||||
documents. Directory servers SHOULD generate a fresh network-status
|
||||
document when the contents of such a document would be different from the
|
||||
last one generated, and some time (at least one second, possibly longer)
|
||||
has passed since the last one was generated.
|
||||
|
||||
The network status document contains a preamble, a set of router status
|
||||
entries, and a signature, in that order.
|
||||
|
||||
We use the same meta-format as used for directories and router descriptors
|
||||
in "tor-spec.txt". Implementations MAY insert blank lines
|
||||
for clarity between sections; these blank lines are ignored.
|
||||
Implementations MUST NOT depend on blank lines in any particular location.
|
||||
|
||||
As used here, "whitespace" is a sequence of 1 or more tab or space
|
||||
characters.
|
||||
|
||||
The preamble contains:
|
||||
|
||||
"network-status-version" -- A document format version. For this
|
||||
specification, the version is "2".
|
||||
"dir-source" -- The authority's hostname, current IP address, and
|
||||
directory port, all separated by whitespace.
|
||||
"fingerprint" -- A base16-encoded hash of the signing key's
|
||||
fingerprint, with no additional spaces added.
|
||||
"contact" -- An arbitrary string describing how to contact the
|
||||
directory server's administrator. Administrators should include at
|
||||
least an email address and a PGP fingerprint.
|
||||
"dir-signing-key" -- The directory server's public signing key.
|
||||
"client-versions" -- A comma-separated list of recommended client
|
||||
versions.
|
||||
"server-versions" -- A comma-separated list of recommended server
|
||||
versions.
|
||||
"published" -- The publication time for this network-status object.
|
||||
"dir-options" -- A set of flags, in any order, separated by whitespace:
|
||||
"Names" if this directory authority performs name bindings.
|
||||
"Versions" if this directory authority recommends software versions.
|
||||
"BadExits" if the directory authority flags nodes that it believes
|
||||
are performing incorrectly as exit nodes.
|
||||
"BadDirectories" if the directory authority flags nodes that it
|
||||
believes are performing incorrectly as directory caches.
|
||||
|
||||
The dir-options entry is optional. The "-versions" entries are required if
|
||||
the "Versions" flag is present. The other entries are required and must
|
||||
appear exactly once. The "network-status-version" entry must appear first;
|
||||
the others may appear in any order. Implementations MUST ignore
|
||||
additional arguments to the items above, and MUST ignore unrecognized
|
||||
flags.
|
||||
|
||||
For each router, the router entry contains: (This format is designed for
|
||||
conciseness.)
|
||||
|
||||
"r" -- followed by the following elements, in order, separated by
|
||||
whitespace:
|
||||
- The OR's nickname,
|
||||
- A hash of its identity key, encoded in base64, with trailing =
|
||||
signs removed.
|
||||
- A hash of its most recent descriptor, encoded in base64, with
|
||||
trailing = signs removed. (The hash is calculated as for
|
||||
computing the signature of a descriptor.)
|
||||
- The publication time of its most recent descriptor, in the form
|
||||
YYYY-MM-DD HH:MM:SS, in GMT.
|
||||
- An IP address
|
||||
- An OR port
|
||||
- A directory port (or "0" for none)
|
||||
"s" -- A series of whitespace-separated status flags, in any order:
|
||||
"Authority" if the router is a directory authority.
|
||||
"BadExit" if the router is believed to be useless as an exit node
|
||||
(because its ISP censors it, because it is behind a restrictive
|
||||
proxy, or for some similar reason).
|
||||
"BadDirectory" if the router is believed to be useless as a
|
||||
directory cache (because its directory port isn't working,
|
||||
its bandwidth is always throttled, or for some similar
|
||||
reason).
|
||||
"Exit" if the router is useful for building general-purpose exit
|
||||
circuits.
|
||||
"Fast" if the router is suitable for high-bandwidth circuits.
|
||||
"Guard" if the router is suitable for use as an entry guard.
|
||||
"Named" if the router's identity-nickname mapping is canonical,
|
||||
and this authority binds names.
|
||||
"Stable" if the router is suitable for long-lived circuits.
|
||||
"Running" if the router is currently usable.
|
||||
"Valid" if the router has been 'validated'.
|
||||
"V2Dir" if the router implements this protocol.
|
||||
"v" -- The version of the Tor protocol that this server is running. If
|
||||
the value begins with "Tor" SP, the rest of the string is a Tor
|
||||
version number, and the protocol is "The Tor protocol as supported
|
||||
by the given version of Tor." Otherwise, if the value begins with
|
||||
some other string, Tor has upgraded to a more sophisticated
|
||||
protocol versioning system, and the protocol is "a version of the
|
||||
Tor protocol more recent than any we recognize."
|
||||
|
||||
The "r" entry for each router must appear first and is required. The
|
||||
"s" entry is optional (see Section 3.1 below for how the flags are
|
||||
decided). Unrecognized flags on the "s" line and extra elements
|
||||
on the "r" line must be ignored. The "v" line is optional; it was not
|
||||
supported until 0.1.2.5-alpha, and it must be preceded with an "opt"
|
||||
until all earlier versions of Tor are obsolete.
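A sketch of reading an "r" line, written in Python under stated assumptions: the function names and the dictionary keys are invented for this example, and validation is kept to a minimum. The one non-obvious step is restoring the trailing "=" padding before base64-decoding the two hashes.

```python
import base64
from datetime import datetime

def unpadded_b64decode(text):
    """Decode base64 whose trailing '=' signs were removed."""
    return base64.b64decode(text + "=" * (-len(text) % 4))

def parse_r_line(line):
    parts = line.split()  # elements are separated by spaces or tabs
    if len(parts) < 9 or parts[0] != "r":
        raise ValueError("not a complete 'r' line")
    # Elements after the directory port are ignored, as required above.
    nickname, identity, digest, date, time, address, or_port, dir_port = parts[1:9]
    return {
        "nickname": nickname,
        "identity": unpadded_b64decode(identity),
        "digest": unpadded_b64decode(digest),
        "published": datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S"),
        "address": address,
        "or_port": int(or_port),
        "dir_port": int(dir_port),  # 0 means no directory port
    }
```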
|
||||
|
||||
The signature section contains:
|
||||
|
||||
"directory-signature" nickname-of-dirserver NL Signature
|
||||
|
||||
Signature is a signature of this network-status document
|
||||
(the document up until the signature, including the line
|
||||
"directory-signature <nick>\n"), using the directory authority's
|
||||
signing key.
|
||||
|
||||
We compress the network status list with zlib before transmitting it.
|
||||
|
||||
3.1. Establishing server status
|
||||
|
||||
(This section describes how directory authorities choose which status
|
||||
flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory
|
||||
authorities MAY do things differently, so long as clients keep working
|
||||
well. Clients MUST NOT depend on the exact behaviors in this section.)
|
||||
|
||||
In the definitions below, a router is considered "active" if it is
|
||||
running, valid, and not hibernating.
|
||||
|
||||
"Valid" -- a router is 'Valid' if it is running a version of Tor not
|
||||
known to be broken, and the directory authority has not blacklisted
|
||||
it as suspicious.
|
||||
|
||||
"Named" -- Directory authority administrators may decide to support name
|
||||
binding. If they do, then they must maintain a file of
|
||||
nickname-to-identity-key mappings, and try to keep this file consistent
|
||||
with other directory authorities. If they don't, they act as clients, and
|
||||
report bindings made by other directory authorities (name X is bound to
|
||||
identity Y if at least one binding directory lists it, and no directory
|
||||
binds X to some other Y'.) A router is called 'Named' if the authority
|
||||
believes the given name should be bound to the given key.
|
||||
|
||||
"Running" -- A router is 'Running' if the authority managed to connect to
|
||||
it successfully within the last 30 minutes.
|
||||
|
||||
"Stable" -- A router is 'Stable' if it is active, and either its
|
||||
uptime is at least the median uptime for known active routers, or
|
||||
its uptime is at least 30 days. Routers are never called stable if
|
||||
they are running a version of Tor known to drop circuits stupidly.
|
||||
(0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.)
|
||||
|
||||
"Fast" -- A router is 'Fast' if it is active, and its bandwidth is
|
||||
in the top 7/8ths for known active routers.
|
||||
|
||||
"Guard" -- A router is a possible 'Guard' if it is 'Stable' and its
|
||||
bandwidth is above median for known active routers. If the total
|
||||
bandwidth of active non-BadExit Exit servers is less than one third
|
||||
of the total bandwidth of all active servers, no Exit is listed as
|
||||
a Guard.
|
||||
|
||||
"Authority" -- A router is called an 'Authority' if the authority
|
||||
generating the network-status document believes it is an authority.
|
||||
|
||||
"V2Dir" -- A router supports the v2 directory protocol if it has an open
|
||||
directory port, and it is running a version of the directory protocol that
|
||||
supports the functionality clients need. (Currently, this is
|
||||
0.1.1.9-alpha or later.)
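The "Stable", "Fast", and "Guard" rules above can be sketched in Python as follows. This is one reading of the text, not the authority code: the input layout (a dict of active routers with "uptime" in seconds and "bandwidth" as a number) is an assumption, "top 7/8ths" is read as "at or above the value one eighth of the way up the sorted bandwidth list", and the version-based exclusion for Stable and the Exit bandwidth rule for Guard are left out.

```python
from statistics import median

THIRTY_DAYS = 30 * 24 * 60 * 60  # seconds

def assign_flags(routers):
    """routers: {name: {"uptime": seconds, "bandwidth": number}}, all active."""
    if not routers:
        return {}
    bandwidths = sorted(r["bandwidth"] for r in routers.values())
    median_uptime = median(r["uptime"] for r in routers.values())
    median_bw = median(bandwidths)
    # Routers at or above this value are treated as the top 7/8ths.
    fast_cutoff = bandwidths[len(bandwidths) // 8]
    flags = {}
    for name, r in routers.items():
        assigned = set()
        if r["uptime"] >= median_uptime or r["uptime"] >= THIRTY_DAYS:
            assigned.add("Stable")
        if r["bandwidth"] >= fast_cutoff:
            assigned.add("Fast")
        if "Stable" in assigned and r["bandwidth"] > median_bw:
            assigned.add("Guard")
        flags[name] = assigned
    return flags
```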
|
||||
|
||||
Directory server administrators may label some servers or IPs as
|
||||
blacklisted, and elect not to include them in their network-status lists.
|
||||
|
||||
Authorities SHOULD 'disable' any servers in excess of 3 on any single IP.
|
||||
When there are more than 3 to choose from, authorities should first prefer
|
||||
authorities to non-authorities, then prefer Running to non-Running, and
|
||||
then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the
|
||||
authority *should* advertise it without the Running or Valid flag.
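The per-IP limit can be sketched as a sort followed by a cut. In this Python sketch the record layout (dicts with "nickname", "ip", "authority", "running", and "bandwidth" keys) and the function name are assumptions; the preference order is the one given above.

```python
def servers_to_disable(servers, limit=3):
    """Return nicknames of servers beyond `limit` on any single IP."""
    by_ip = {}
    for server in servers:
        by_ip.setdefault(server["ip"], []).append(server)
    disabled = []
    for group in by_ip.values():
        # Preferred servers sort first: authorities, then Running servers,
        # then servers with higher bandwidth.
        group.sort(key=lambda s: (not s["authority"], not s["running"], -s["bandwidth"]))
        disabled.extend(s["nickname"] for s in group[limit:])
    return disabled
```

The servers returned here would still be listed, but without the Running or Valid flag.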
|
||||
|
||||
Thus, the network-status list includes all non-blacklisted,
|
||||
non-expired, non-superseded descriptors.
|
||||
|
||||
4. Directory server operation
|
||||
|
||||
All directory authorities and directory mirrors ("directory servers")
|
||||
implement this section, except as noted.
|
||||
|
||||
4.1. Accepting uploads (authorities only)
|
||||
|
||||
When a router posts a signed descriptor to a directory authority, the
|
||||
authority first checks whether it is well-formed and correctly
|
||||
self-signed. If it is, the authority next verifies that the nickname
|
||||
in question is not already assigned to a router with a different
|
||||
public key.
|
||||
Finally, the authority MAY check that the router is not blacklisted
|
||||
because of its key, IP, or another reason.
|
||||
|
||||
If the descriptor passes these tests, and the authority does not already
|
||||
have a descriptor for a router with this public key, it accepts the
|
||||
descriptor and remembers it.
|
||||
|
||||
If the authority _does_ have a descriptor with the same public key, the
|
||||
newly uploaded descriptor is remembered if its publication time is more
|
||||
recent than the most recent old descriptor for that router, and either:
|
||||
- There are non-cosmetic differences between the old descriptor and the
|
||||
new one.
|
||||
- Enough time has passed between the descriptors' publication times.
|
||||
(Currently, 12 hours.)
|
||||
|
||||
Differences between router descriptors are "non-cosmetic" if they would be
|
||||
sufficient to force an upload as described in section 2 above.
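The replacement rule for a router that already has a stored descriptor reduces to a small predicate. The Python sketch below assumes the well-formedness, nickname, and blacklist checks have already passed; the function and parameter names are invented for the example, and treating exactly 12 hours as "enough time" is an assumption.

```python
from datetime import datetime, timedelta

MIN_REPLACE_INTERVAL = timedelta(hours=12)  # "currently, 12 hours"

def should_remember(old_published, new_published, non_cosmetic_change):
    """Decide whether a newly uploaded descriptor replaces the stored one.

    old_published is None when no descriptor is stored for this key.
    """
    if old_published is None:
        return True
    if new_published <= old_published:
        return False  # not more recent than what we already have
    enough_time = new_published - old_published >= MIN_REPLACE_INTERVAL
    return non_cosmetic_change or enough_time
```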
|
||||
|
||||
Note that the "cosmetic difference" test only applies to uploaded
|
||||
descriptors, not to descriptors that the authority downloads from other
|
||||
authorities.
|
||||
|
||||
4.2. Downloading network-status documents (authorities and caches)
|
||||
|
||||
All directory servers (authorities and mirrors) try to keep a fresh
|
||||
set of network-status documents from every authority. To do so,
|
||||
every 5 minutes, each authority asks every other authority for its
|
||||
most recent network-status document. Every 15 minutes, each mirror
|
||||
picks a random authority and asks it for the most recent network-status
|
||||
documents for all the authorities the authority knows about (including
|
||||
the chosen authority itself).
|
||||
|
||||
Directory servers and mirrors remember and serve the most recent
|
||||
network-status document they have from each authority. Other
|
||||
network-status documents don't need to be stored. If the most recent
|
||||
network-status document is over 10 days old, it is discarded anyway.
|
||||
Mirrors SHOULD store and serve network-status documents from authorities
|
||||
they don't recognize, but SHOULD NOT use such documents for any other
|
||||
purpose. Mirrors SHOULD discard network-status documents older than 48
|
||||
hours.
|
||||
|
||||
4.3. Downloading and storing router descriptors (authorities and caches)
|
||||
|
||||
Periodically (currently, every 10 seconds), directory servers check
|
||||
whether there are any specific descriptors (as identified by descriptor
|
||||
hash in a network-status document) that they do not have and that they
|
||||
are not currently trying to download.
|
||||
|
||||
If so, the directory server launches requests to the authorities for these
|
||||
descriptors, such that each authority is only asked for descriptors listed
|
||||
in its most recent network-status. When more than one authority lists the
|
||||
descriptor, we choose which to ask at random.
|
||||
|
||||
If one of these downloads fails, we do not try to download that descriptor
|
||||
from the authority that failed to serve it again unless we receive a newer
|
||||
network-status from that authority that lists the same descriptor.
|
||||
|
||||
Directory servers must potentially cache multiple descriptors for each
|
||||
router. Servers must not discard any descriptor listed by any current
|
||||
network-status document from any authority. If there is enough space to
|
||||
store additional descriptors, servers SHOULD try to hold those which
|
||||
clients are likely to download the most. (Currently, this is judged
|
||||
based on the interval for which each descriptor seemed newest.)
|
||||
|
||||
Authorities SHOULD NOT download descriptors for routers that they would
|
||||
immediately reject for reasons listed in 3.1.
|
||||
|
||||
4.4. HTTP URLs
|
||||
|
||||
"Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
|
||||
|
||||
The authoritative network-status published by a host should be available at:
|
||||
http://<hostname>/tor/status/authority.z
|
||||
|
||||
The network-status published by a host with fingerprint
|
||||
<F> should be available at:
|
||||
http://<hostname>/tor/status/fp/<F>.z
|
||||
|
||||
The network-status documents published by hosts with fingerprints
|
||||
<F1>,<F2>,<F3> should be available at:
|
||||
http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z
|
||||
|
||||
The most recent network-status documents from all known authorities,
|
||||
concatenated, should be available at:
|
||||
http://<hostname>/tor/status/all.z
|
||||
|
||||
The most recent descriptor for a server whose identity key has a
|
||||
fingerprint of <F> should be available at:
|
||||
http://<hostname>/tor/server/fp/<F>.z
|
||||
|
||||
The most recent descriptors for servers with identity fingerprints
|
||||
<F1>,<F2>,<F3> should be available at:
|
||||
http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
|
||||
|
||||
(NOTE: Implementations SHOULD NOT download descriptors by identity key
|
||||
fingerprint. This allows a corrupted server (in collusion with a cache) to
|
||||
provide a unique descriptor to a client, and thereby partition that client
|
||||
from the rest of the network.)
|
||||
|
||||
The server descriptor with (descriptor) digest <D> (in hex) should be
|
||||
available at:
|
||||
http://<hostname>/tor/server/d/<D>.z
|
||||
|
||||
The most recent descriptors with digests <D1>,<D2>,<D3> should be
|
||||
available at:
|
||||
http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
|
||||
|
||||
The most recent descriptor for this server should be at:
|
||||
http://<hostname>/tor/server/authority.z
|
||||
[Nothing in the Tor protocol uses this resource yet, but it is useful
|
||||
for debugging purposes. Also, the official Tor implementations
|
||||
(starting at 0.1.1.x) use this resource to test whether a server's
|
||||
own DirPort is reachable.]
|
||||
|
||||
A concatenated set of the most recent descriptors for all known servers
|
||||
should be available at:
|
||||
http://<hostname>/tor/server/all.z
|
||||
|
||||
For debugging, directories SHOULD expose non-compressed objects at URLs like
|
||||
the above, but without the final ".z".
|
||||
Clients MUST handle compressed concatenated information in two forms:
|
||||
- A concatenated list of zlib-compressed objects.
|
||||
- A zlib-compressed concatenated list of objects.
|
||||
Directory servers MAY generate either format: the former requires less
|
||||
CPU, but the latter requires less bandwidth.
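Both compressed forms can be handled by one loop that keeps decompressing until no input is left. The Python sketch below uses the standard zlib module; the function name is an assumption for this example.

```python
import zlib

def decompress_all(data):
    """Decompress one or more back-to-back zlib streams and join the output."""
    pieces = []
    while data:
        stream = zlib.decompressobj()
        pieces.append(stream.decompress(data))
        pieces.append(stream.flush())
        if not stream.eof:
            raise ValueError("truncated zlib stream")
        # Bytes after the end of this stream belong to the next object.
        data = stream.unused_data
    return b"".join(pieces)
```

A single compressed list is simply the case where the loop runs once.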
|
||||
|
||||
Clients SHOULD use upper case letters (A-F) when base16-encoding
|
||||
fingerprints. Servers MUST accept both upper and lower case fingerprints
|
||||
in requests.
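As an illustration, the Python sketch below builds two of the URLs listed above, joining multiple values with "+" and upper-casing the hex digits. The function names and the hostname used in the usage note are hypothetical; the length check relies only on the fact that a SHA1 hash is 20 bytes.

```python
def _join_hex(values):
    cleaned = []
    for value in values:
        if len(bytes.fromhex(value)) != 20:  # SHA1 output is 20 bytes
            raise ValueError("expected a 40-character hex SHA1 value")
        cleaned.append(value.upper())
    return "+".join(cleaned)

def status_url(hostname, fingerprints, compressed=True):
    """URL for the network-status documents of the given authorities."""
    suffix = ".z" if compressed else ""
    return "http://%s/tor/status/fp/%s%s" % (hostname, _join_hex(fingerprints), suffix)

def descriptor_url(hostname, digests, compressed=True):
    """URL for the server descriptors with the given descriptor digests."""
    suffix = ".z" if compressed else ""
    return "http://%s/tor/server/d/%s%s" % (hostname, _join_hex(digests), suffix)
```

For example, descriptor_url("dir.example.org", [d1, d2]) yields a single request for both descriptors; passing compressed=False gives the uncompressed debugging form.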
|
||||
|
||||
5. Client operation: downloading information
|
||||
|
||||
Every Tor that is not a directory server (that is, those that do
|
||||
not have a DirPort set) implements this section.
|
||||
|
||||
5.1. Downloading network-status documents
|
||||
|
||||
Each client maintains an ordered list of directory authorities.
|
||||
Insofar as possible, clients SHOULD all use the same ordered list.
|
||||
|
||||
For each network-status document a client has, it keeps track of its
|
||||
publication time *and* the time when the client retrieved it. Clients
|
||||
consider a network-status document "live" if it was published within the
|
||||
last 24 hours.
|
||||
|
||||
Clients try to have a live network-status document from *every*
|
||||
authority, and try to periodically get new network-status documents from
|
||||
each authority in rotation as follows:
|
||||
|
||||
If a client is missing a live network-status document for any
|
||||
authority, it tries to fetch it from a directory cache. On failure,
|
||||
the client waits briefly, then tries that network-status document
|
||||
again from another cache. The client does not build circuits until it
|
||||
has live network-status documents from more than half the authorities
|
||||
it trusts, and it has descriptors for more than 1/4 of the routers
|
||||
that it believes are running.
|
||||
|
||||
If the most recently _retrieved_ network-status document is over 30
|
||||
minutes old, the client attempts to download a network-status document.
|
||||
When choosing which documents to download, clients treat their list of
|
||||
directory authorities as a circular ring, and begin with the authority
|
||||
appearing immediately after the authority for their most recently
|
||||
retrieved network-status document. If this attempt fails (either it
|
||||
fails to download at all, or the one it gets is not as good as the
|
||||
one it has), the client retries at other caches several times, before
|
||||
moving on to the next network-status document in sequence.
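The "circular ring" ordering can be sketched in a few lines of Python. The function name is an assumption; the list holds the client's ordered directory authorities, and the second argument is the authority whose network-status document was retrieved most recently (or None if there is none).

```python
def authorities_in_ring_order(authorities, last_retrieved=None):
    """Return authorities starting just after `last_retrieved`, wrapping around."""
    if not authorities:
        return []
    if last_retrieved in authorities:
        start = authorities.index(last_retrieved) + 1
    else:
        start = 0
    count = len(authorities)
    return [authorities[(start + i) % count] for i in range(count)]
```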
|
||||
|
||||
Clients discard all network-status documents over 24 hours old.
|
||||
|
||||
If enough mirrors (currently 4) claim not to have a given network status,
|
||||
we stop trying to download that authority's network-status, until we
|
||||
download a new network-status that makes us believe that the authority in
|
||||
question is running. Clients should wait a little longer after each
|
||||
failure.
|
||||
|
||||
Clients SHOULD try to batch as many network-status requests as possible
|
||||
into each HTTP GET.
|
||||
|
||||
(Note: clients can and should pick caches based on the network-status
|
||||
information they have: once they have first fetched network-status info
|
||||
from an authority, they should not need to go to the authority directly
|
||||
again.)
|
||||
|
||||
5.2. Downloading and storing router descriptors
|
||||
|
||||
Clients try to have the best descriptor for each router. A descriptor is
|
||||
"best" if:
|
||||
* It is the most recently published descriptor listed for that router
|
||||
by at least two network-status documents.
|
||||
OR,
|
||||
* No descriptor for that router is listed by two or more
|
||||
network-status documents, and it is the most recently published
|
||||
descriptor listed by any network-status document.
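The two-case definition of "best" can be sketched as follows. In this Python sketch the input is assumed to be one (digest, published) pair per network-status document that lists the router, with publication times in any mutually comparable form (for example "YYYY-MM-DD HH:MM:SS" strings); the function name is invented for the example.

```python
from collections import Counter

def best_descriptor(listings):
    """Return the digest of the best descriptor, or None if none is listed."""
    if not listings:
        return None
    counts = Counter(digest for digest, _ in listings)
    # Prefer descriptors listed by at least two network-status documents.
    confirmed = [entry for entry in listings if counts[entry[0]] >= 2]
    pool = confirmed or listings
    return max(pool, key=lambda entry: entry[1])[0]
```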
|
||||
|
||||
Periodically (currently every 10 seconds) clients check whether there are
|
||||
any "downloadable" descriptors. A descriptor is downloadable if:
|
||||
- It is the "best" descriptor for some router.
|
||||
- The descriptor was published at least 10 minutes in the past.
|
||||
(This prevents clients from trying to fetch descriptors that the
|
||||
mirrors have probably not yet retrieved and cached.)
|
||||
- The client does not currently have it.
|
||||
- The client is not currently trying to download it.
|
||||
- The client would not discard it immediately upon receiving it.
|
||||
- The client thinks it is running and valid (see 6.1 below).
|
||||
|
||||
If at least 16 known routers have downloadable descriptors, or if
|
||||
enough time (currently 10 minutes) has passed since the last time the
|
||||
client tried to download descriptors, it launches requests for all
|
||||
downloadable descriptors, as described in 5.3 below.
|
||||
|
||||
When a descriptor download fails, the client notes it, and does not
|
||||
consider the descriptor downloadable again until a certain amount of time
|
||||
has passed. (Currently 0 seconds for the first failure, 60 seconds for the
|
||||
second, 5 minutes for the third, 10 minutes for the fourth, and 1 day
|
||||
thereafter.) Periodically (currently once an hour) clients reset the
|
||||
failure count.
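The retry schedule above maps a failure count to a waiting time. A minimal Python sketch, with names chosen for this example:

```python
RETRY_DELAYS = [0, 60, 5 * 60, 10 * 60]  # seconds, for failures 1 through 4
ONE_DAY = 24 * 60 * 60

def retry_delay(failure_count):
    """Seconds to wait before a descriptor is considered downloadable again."""
    if failure_count <= 0:
        return 0
    if failure_count <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failure_count - 1]
    return ONE_DAY
```

Resetting the failure count (currently once an hour) returns a descriptor to the start of this schedule.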
|
||||
|
||||
No descriptors are downloaded until the client has downloaded more than
|
||||
half of the network-status documents.
|
||||
|
||||
Clients retain the most recent descriptor they have downloaded for each
|
||||
router so long as it is not too old (currently, 48 hours), OR so long as
|
||||
it is recommended by at least one networkstatus AND no "better"
|
||||
descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha
|
||||
would discard descriptors simply for being published too far in the past.]
|
||||
[The code seems to discard descriptors in all cases after they're 5
|
||||
days old. True? -RD]
|
||||
|
||||
5.3. Managing downloads
|
||||
|
||||
When a client has no live network-status documents, it downloads
|
||||
network-status documents from a randomly chosen authority. In all other
|
||||
cases, the client downloads from mirrors randomly chosen from among those
|
||||
believed to be V2 directory servers. (This information comes from the
|
||||
network-status documents; see 6 below.)
|
||||
|
||||
When downloading multiple router descriptors, the client chooses multiple
|
||||
mirrors so that:
|
||||
- At least 3 different mirrors are used, except when this would result
|
||||
in more than one request for under 4 descriptors.
|
||||
- No more than 128 descriptors are requested from a single mirror.
|
||||
- Otherwise, as few mirrors as possible are used.
|
||||
After choosing mirrors, the client divides the descriptors among them
|
||||
randomly.
|
||||
|
||||
After receiving any response, the client MUST discard any network-status
|
||||
documents and descriptors that it did not request.
|
||||
|
||||
6. Using directory information
|
||||
|
||||
Everyone besides directory authorities uses the approaches in this section
|
||||
to decide which servers to use and what their keys are likely to be.
|
||||
(Directory authorities just believe their own opinions, as in 3.1 above.)
|
||||
|
||||
6.1. Choosing routers for circuits.
|
||||
|
||||
Tor implementations only pay attention to "live" network-status documents.
|
||||
A network status is "live" if it is the most recently downloaded network
|
||||
status document for a given directory server, and the server is a
|
||||
directory server trusted by the client, and the network-status document is
|
||||
no more than 1 day old.
|
||||
|
||||
For time-sensitive information, Tor implementations focus on "recent"
|
||||
network-status documents. A network status is "recent" if it is live, and
|
||||
if it was published in the last 60 minutes. If there are fewer
|
||||
than 3 such documents, the most recently published 3 are "recent." If
|
||||
there are fewer than 3 in all, all are "recent."
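The definition of "recent" can be sketched in Python as follows. The input is assumed to be a dict mapping each authority to the publication time of its live network-status document; the function and parameter names are invented for the example.

```python
from datetime import datetime, timedelta

def recent_statuses(live_published, now, window=timedelta(minutes=60), minimum=3):
    """Return the set of authorities whose live documents count as recent."""
    newest_first = sorted(live_published, key=live_published.get, reverse=True)
    recent = [a for a in newest_first if now - live_published[a] <= window]
    if len(recent) < minimum:
        # Fall back to the most recently published documents; if fewer
        # than `minimum` exist in all, this slice returns all of them.
        recent = newest_first[:minimum]
    return set(recent)
```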
|
||||
|
||||
Circuits SHOULD NOT be built until the client has enough directory
|
||||
information: network-statuses (or failed attempts to download
|
||||
network-statuses) for all authorities, network-statuses for more than
|
||||
half of the authorities, and descriptors for at least 1/4 of the servers
|
||||
believed to be running.
|
||||
|
||||
A server is "listed" if it is included by more than half of the live
|
||||
network status documents. Clients SHOULD NOT use unlisted servers.
|
||||
|
||||
Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and
|
||||
"V2Dir" about a given router when they are asserted by more than half of
|
||||
the live network-status documents. Clients believe the flag "Running" if
|
||||
it is listed by more than half of the recent network-status documents.
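The majority rule for flags can be sketched as a vote count. In this Python sketch each argument is assumed to be a list with one set of flags per network-status document (an empty set when a document does not list the router), so the denominators are the numbers of live and recent documents; the names are invented for the example.

```python
MAJORITY_FLAGS = ("Valid", "Exit", "Fast", "Guard", "Stable", "V2Dir")

def believed_flags(live_flag_sets, recent_flag_sets):
    """Return the flags a client believes about one router."""
    believed = set()
    for flag in MAJORITY_FLAGS:
        votes = sum(1 for flags in live_flag_sets if flag in flags)
        if votes * 2 > len(live_flag_sets):  # strictly more than half
            believed.add(flag)
    running = sum(1 for flags in recent_flag_sets if "Running" in flags)
    if running * 2 > len(recent_flag_sets):
        believed.add("Running")
    return believed
```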
|
||||
|
||||
These flags are used as follows:
|
||||
|
||||
- Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless
|
||||
requested to do so.
|
||||
|
||||
- Clients SHOULD NOT use non-'Fast' routers for any purpose other than
|
||||
very-low-bandwidth circuits (such as introduction circuits).
|
||||
|
||||
- Clients SHOULD NOT use non-'Stable' routers for circuits that are
|
||||
likely to need to be open for a very long time (such as those used for
|
||||
IRC or SSH connections).
|
||||
|
||||
- Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard
|
||||
nodes.
|
||||
|
||||
- Clients SHOULD NOT download directory information from non-'V2Dir'
|
||||
caches.
|
||||
|
||||
6.2. Managing naming
|
||||
|
||||
In order to provide human-memorable names for individual server
|
||||
identities, some directory servers bind names to IDs. Clients handle
|
||||
names in two ways:
|
||||
|
||||
When a client encounters a name it has not mapped before:
|
||||
|
||||
If all the live "Naming" network-status documents the client has
|
||||
claim that the name binds to some identity ID, and the client has at
|
||||
least three live network-status documents, the client maps the name to
|
||||
ID.
|
||||
|
||||
When a user tries to refer to a router with a name that does not have a
|
||||
mapping under the above rules, the implementation SHOULD warn the user.
|
||||
After giving the warning, the implementation MAY use a router that at
|
||||
least one Naming authority maps the name to, so long as no other naming
|
||||
authority maps that name to a different router. If no Naming authority
|
||||
maps the name to a router, the implementation MAY use any router that
|
||||
advertises the name.
|
||||
|
||||
Not every router needs a nickname. When a router doesn't configure a
|
||||
nickname, it publishes with the default nickname "Unnamed". Authorities
|
||||
SHOULD NOT ever mark a router with this nickname as Named; client software
|
||||
SHOULD NOT ever use a router in response to a user request for a router
|
||||
called "Unnamed".
|
||||
|
||||
6.3. Software versions
|
||||
|
||||
An implementation of Tor SHOULD warn when it has fetched (or has
|
||||
attempted to fetch and failed four consecutive times) a network-status
|
||||
for each authority, and it is running a software version
|
||||
not listed on more than half of the live "Versioning" network-status
|
||||
documents.
|
||||
|
||||
6.4. Warning about a router's status.
|
||||
|
||||
If a router tries to publish its descriptor to a Naming authority
|
||||
that has its nickname mapped to another key, the router SHOULD
|
||||
warn the operator that it is either using the wrong key or is using
|
||||
an already claimed nickname.
|
||||
|
||||
If a router has fetched (or attempted to fetch and failed four
|
||||
consecutive times) a network-status for every authority, and at
|
||||
least one of the authorities is "Naming", and no live "Naming"
|
||||
authorities publish a binding for the router's nickname, the
|
||||
router MAY remind the operator that the chosen nickname is not
|
||||
bound to this key at the authorities, and suggest contacting the
|
||||
authority operators.
|
||||
|
||||
...
|
||||
|
||||
6.5. Router protocol versions
|
||||
|
||||
A client should believe that a router supports a given feature if that
|
||||
feature is supported by the router or protocol versions in more than half
|
||||
of the live networkstatus's "v" entries for that router. In other words,
|
||||
if the "v" entries for some router are:
|
||||
v Tor 0.0.8pre1 (from authority 1)
|
||||
v Tor 0.1.2.11 (from authority 2)
|
||||
v FutureProtocolDescription 99 (from authority 3)
|
||||
then the client should believe that the router supports any feature
|
||||
supported by 0.1.2.11.
|
||||
|
||||
This is currently equivalent to believing the median declared version for
|
||||
a router in all live networkstatuses.
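The majority rule for "v" entries can be sketched as below. This Python sketch does not parse or compare Tor version numbers; instead the caller supplies a predicate saying whether a single "v" entry implies support for the feature. The predicate shown, and its set of supporting versions, are hypothetical and only mirror the three-authority example above.

```python
def router_supports(v_entries, entry_supports):
    """True if more than half of the "v" entries imply support for a feature."""
    votes = sum(1 for entry in v_entries if entry_supports(entry))
    return votes * 2 > len(v_entries)

# Hypothetical: versions known to support a feature that 0.1.2.11 supports.
SUPPORTING_VERSIONS = {"Tor 0.1.2.11"}

def supports_feature(entry):
    # An entry that does not begin with "Tor " describes a protocol more
    # recent than any we recognize; this sketch assumes it has the feature.
    return not entry.startswith("Tor ") or entry in SUPPORTING_VERSIONS
```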
|
||||
|
||||
7. Standards compliance
|
||||
|
||||
All clients and servers MUST support HTTP 1.0.
|
||||
|
||||
7.1. HTTP headers
|
||||
|
||||
Servers MAY set the Content-Length: header. Servers SHOULD set
|
||||
Content-Encoding to "deflate" or "identity".
|
||||
|
||||
Servers MAY include an X-Your-Address-Is: header, whose value is the
|
||||
apparent IP address of the client connecting to them (as a dotted quad).
|
||||
For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD
|
||||
report the IP from which the circuit carrying the BEGIN_DIR stream reached
|
||||
them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all
|
||||
BEGIN_DIR-tunneled connections.]
|
||||
|
||||
Servers SHOULD disable caching of multiple network statuses or multiple
|
||||
router descriptors. Servers MAY enable caching of single descriptors,
|
||||
single network statuses, the list of all router descriptors, a v1
|
||||
directory, or a v1 running routers document. XXX mention times.
|
||||
|
||||
7.2. HTTP status codes
|
||||
|
||||
XXX We should write down what return codes dirservers send in what situations.
|
||||
|
File diff suppressed because it is too large
@ -1,423 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor Path Specification
|
||||
|
||||
Roger Dingledine
|
||||
Nick Mathewson
|
||||
|
||||
Note: This is an attempt to specify Tor as currently implemented. Future
|
||||
versions of Tor will implement improved algorithms.
|
||||
|
||||
This document tries to cover how Tor chooses to build circuits and assign
|
||||
streams to circuits. Other implementations MAY take other approaches, but
|
||||
implementors should be aware of the anonymity and load-balancing implications
|
||||
of their choices.
|
||||
|
||||
THIS SPEC ISN'T DONE YET.
|
||||
|
||||
1. General operation
|
||||
|
||||
Tor begins building circuits as soon as it has enough directory
|
||||
information to do so (see section 5 of dir-spec.txt). Some circuits are
|
||||
built preemptively because we expect to need them later (for user
|
||||
traffic), and some are built because of immediate need (for user traffic
|
||||
that no current circuit can handle, for testing the network or our
|
||||
reachability, and so on).
|
||||
|
||||
When a client application creates a new stream (by opening a SOCKS
|
||||
connection or launching a resolve request), we attach it to an appropriate
|
||||
open circuit if one exists, or wait if an appropriate circuit is
|
||||
in-progress. We launch a new circuit only
|
||||
if no current circuit can handle the request. We rotate circuits over
|
||||
time to avoid some profiling attacks.
|
||||
|
||||
To build a circuit, we choose all the nodes we want to use, and then
|
||||
construct the circuit. Sometimes, when we want a circuit that ends at a
|
||||
given hop, and we have an appropriate unused circuit, we "cannibalize" the
|
||||
existing circuit and extend it to the new terminus.
|
||||
|
||||
These processes are described in more detail below.
|
||||
|
||||
This document describes Tor's automatic path selection logic only; path
|
||||
selection can be overridden by a controller (with the EXTENDCIRCUIT and
|
||||
ATTACHSTREAM commands). Paths constructed through these means may
|
||||
violate some constraints given below.
|
||||
|
||||
1.1. Terminology
|
||||
|
||||
A "path" is an ordered sequence of nodes, not yet built as a circuit.
|
||||
|
||||
A "clean" circuit is one that has not yet been used for any traffic.
|
||||
|
||||
A "fast" or "stable" or "valid" node is one that has the 'Fast' or
|
||||
'Stable' or 'Valid' flag
|
||||
set respectively, based on our current directory information. A "fast"
|
||||
or "stable" circuit is one consisting only of "fast" or "stable" nodes.
|
||||
|
||||
In an "exit" circuit, the final node is chosen based on waiting stream
|
||||
requests if any, and in any case it avoids nodes with exit policy of
|
||||
"reject *:*". An "internal" circuit, on the other hand, is one where
|
||||
the final node is chosen just like a middle node (ignoring its exit
|
||||
policy).
|
||||
|
||||
A "request" is a client-side stream or DNS resolve that needs to be
|
||||
served by a circuit.
|
||||
|
||||
A "pending" circuit is one that we have started to build, but which has
|
||||
not yet completed.
|
||||
|
||||
A circuit or path "supports" a request if it is okay to use the
|
||||
circuit/path to fulfill the request, according to the rules given below.
|
||||
A circuit or path "might support" a request if some aspect of the request
|
||||
is unknown (usually its target IP), but we believe the path probably
|
||||
supports the request according to the rules given below.
|
||||
|
||||
2. Building circuits
|
||||
|
||||
2.1. When we build
|
||||
|
||||
2.1.1. Clients build circuits preemptively
|
||||
|
||||
When running as a client, Tor tries to maintain at least a certain
|
||||
number of clean circuits, so that new streams can be handled
|
||||
quickly. To increase the likelihood of success, Tor tries to
|
||||
predict what circuits will be useful by choosing from among nodes
|
||||
that support the ports we have used in the recent past (by default
|
||||
one hour). Specifically, on startup Tor tries to maintain one clean
|
||||
fast exit circuit that allows connections to port 80, and at least
|
||||
two fast clean stable internal circuits in case we get a resolve
|
||||
request or hidden service request (at least three if we _run_ a
|
||||
hidden service).
|
||||
|
||||
After that, Tor will adapt the circuits that it preemptively builds
|
||||
based on the requests it sees from the user: it tries to have two fast
|
||||
clean exit circuits available for every port seen within the past hour
|
||||
(each circuit can be adequate for many predicted ports -- it doesn't
|
||||
need two separate circuits for each port), and it tries to have the
|
||||
above internal circuits available if we've seen resolves or hidden
|
||||
service activity within the past hour. If there are 12 or more clean
|
||||
circuits open, it doesn't open more even if it has more predictions.
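
As a rough illustration of the port-prediction rule above, the following Python sketch tracks recently used ports and decides whether another clean exit circuit is wanted. The class and function names are hypothetical, each clean circuit is modelled simply as the set of ports its exit allows, and the internal-circuit and LongLivedPorts rules are not modelled.

```python
PREDICTION_WINDOW = 60 * 60  # "within the past hour", in seconds
MAX_CLEAN_CIRCUITS = 12      # no more preemptive circuits beyond this
WANTED_PER_PORT = 2          # two clean exit circuits per predicted port


class PortPredictor:
    """Remembers when each destination port was last requested."""

    def __init__(self):
        self._last_seen = {}

    def note_port_used(self, port, now):
        self._last_seen[port] = now

    def predicted_ports(self, now):
        return sorted(
            port
            for port, seen in self._last_seen.items()
            if now - seen <= PREDICTION_WINDOW
        )


def should_launch_exit_circuit(predictor, clean_circuits, now):
    """clean_circuits is a list of sets of ports; one set per clean circuit."""
    if len(clean_circuits) >= MAX_CLEAN_CIRCUITS:
        return False
    for port in predictor.predicted_ports(now):
        covering = sum(1 for ports in clean_circuits if port in ports)
        if covering < WANTED_PER_PORT:
            return True
    return False
```

Because one circuit may cover many predicted ports, the check counts covering circuits per port rather than allocating circuits to ports.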
|
||||
|
||||
Only stable circuits can "cover" a port that is listed in the
|
||||
LongLivedPorts config option. Similarly, hidden service requests
|
||||
to ports listed in LongLivedPorts make us create stable internal
|
||||
circuits.
|
||||
|
||||
Note that if there are no requests from the user for an hour, Tor
|
||||
will predict no use and build no preemptive circuits.
|
||||
|
||||
The Tor client SHOULD NOT store its list of predicted requests to a
|
||||
persistent medium.
|
||||
|
||||
2.1.2. Clients build circuits on demand
|
||||
|
||||
Additionally, when a client request exists that no circuit (built or
|
||||
pending) might support, we create a new circuit to support the request.
|
||||
For exit connections, we pick an exit node that will handle the
|
||||
most pending requests (choosing arbitrarily among ties), launch a
|
||||
circuit to end there, and repeat until every unattached request
|
||||
might be supported by a pending or built circuit. For internal
|
||||
circuits, we pick an arbitrary acceptable path, repeating as needed.
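
The exit-selection loop above is essentially a greedy covering procedure. A minimal Python sketch, assuming a hypothetical mapping from each candidate exit to the set of pending requests it might support:

```python
def plan_exit_circuits(requests, exits):
    """Greedily pick exits until every request that can be covered is covered.

    requests: iterable of request identifiers.
    exits: dict mapping exit name -> set of request identifiers that the
           exit might support (hypothetical input for this sketch).
    Returns (plan, unsupported), where plan is a list of
    (exit name, requests covered by that circuit) pairs.
    """
    unattached = set(requests)
    plan = []
    while unattached:
        # Ties are broken by dictionary order, which is arbitrary here.
        best = max(exits, key=lambda name: len(exits[name] & unattached),
                   default=None)
        if best is None or not (exits[best] & unattached):
            break  # no candidate exit supports the remaining requests
        covered = exits[best] & unattached
        plan.append((best, covered))
        unattached -= covered
    return plan, unattached
```

Requests left in the second return value are those that no known exit might support.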
|
||||
|
||||
In some cases we can reuse an already established circuit if it's
|
||||
clean; see Section 2.3 (cannibalizing circuits) for details.
|
||||
|
||||
2.1.3. Servers build circuits for testing reachability and bandwidth
|
||||
|
||||
Tor servers test reachability of their ORPort once they have
|
||||
successfully built a circuit (on start and whenever their IP address
|
||||
changes). They build an ordinary fast internal circuit with themselves
|
||||
as the last hop. As soon as any testing circuit succeeds, the Tor
|
||||
server decides it's reachable and is willing to publish a descriptor.
|
||||
|
||||
We launch multiple testing circuits (one at a time), until we
|
||||
have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we
|
||||
do a "bandwidth test" by sending a certain number of relay drop
|
||||
cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
|
||||
total cells divided across the four circuits, but never more than
|
||||
CIRCWINDOW_START (1000) cells total. This exercises both outgoing and
|
||||
incoming bandwidth, and helps to jumpstart the observed bandwidth
|
||||
(see dir-spec.txt).
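
The cell count for the bandwidth test can be worked out as below. This is a sketch only: CELL_NETWORK_SIZE is assumed to be 512 bytes (it is not stated in this section), BandwidthRate is taken to be in bytes per second, and the even split across circuits uses integer division.

```python
CELL_NETWORK_SIZE = 512        # bytes per cell on the wire (assumed value)
CIRCWINDOW_START = 1000        # upper bound on total cells sent
NUM_PARALLEL_TESTING_CIRC = 4  # number of testing circuits


def bandwidth_test_cells(bandwidth_rate):
    """Return (total cells, cells per circuit) for the bandwidth test."""
    total = min(bandwidth_rate * 10 // CELL_NETWORK_SIZE, CIRCWINDOW_START)
    per_circuit = total // NUM_PARALLEL_TESTING_CIRC
    return total, per_circuit


# A 20 KB/s server: 20480 * 10 / 512 = 400 cells, 100 per circuit.
print(bandwidth_test_cells(20480))
```

Any server with a BandwidthRate above roughly 51 KB/s hits the CIRCWINDOW_START cap under these assumptions.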
|
||||
|
||||
Tor servers also test reachability of their DirPort once they have
|
||||
established a circuit, but they use an ordinary exit circuit for
|
||||
this purpose.
|
||||
|
||||
2.1.4. Hidden-service circuits
|
||||
|
||||
See section 4 below.
|
||||
|
||||
2.1.5. Rate limiting of failed circuits
|
||||
|
||||
If we fail to build a circuit N times in an X-second period (see Section
|
||||
2.3 for how this works), we stop building circuits until the X seconds
|
||||
have elapsed.
|
||||
XXXX
|
||||
|
||||
2.1.6. When to tear down circuits
|
||||
|
||||
XXXX
|
||||
|
||||
2.2. Path selection and constraints
|
||||
|
||||
We choose the path for each new circuit before we build it. We choose the
|
||||
exit node first, followed by the other nodes in the circuit. All paths
|
||||
we generate obey the following constraints:
|
||||
- We do not choose the same router twice for the same path.
|
||||
- We do not choose any router in the same family as another in the same
|
||||
path.
|
||||
- We do not choose more than one router in a given /16 subnet
|
||||
(unless EnforceDistinctSubnets is 0).
|
||||
- We don't choose any non-running or non-valid router unless we have
|
||||
been configured to do so. By default, we are configured to allow
|
||||
non-valid routers in "middle" and "rendezvous" positions.
|
||||
- If we're using Guard nodes, the first node must be a Guard (see section 5
|
||||
below)
|
||||
- XXXX Choosing the length
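
The first three constraints in the list above can be sketched as a single check. This is illustrative Python, not Tor's code: routers are plain dictionaries with hypothetical "id", "addr" and "family" keys, and family membership is reduced to a shared label rather than the declarations routers actually make.

```python
import ipaddress


def same_slash16(addr_a, addr_b):
    """True if two IPv4 addresses fall in the same /16 subnet."""
    network = ipaddress.ip_network(f"{addr_a}/16", strict=False)
    return ipaddress.ip_address(addr_b) in network


def can_add_to_path(path, candidate, enforce_distinct_subnets=True):
    """Check a candidate router against the routers already in the path."""
    for node in path:
        if node["id"] == candidate["id"]:
            return False  # same router twice
        if node["family"] and node["family"] == candidate["family"]:
            return False  # same family
        if enforce_distinct_subnets and same_slash16(node["addr"],
                                                     candidate["addr"]):
            return False  # same /16 subnet
    return True
```

Setting enforce_distinct_subnets to False corresponds to EnforceDistinctSubnets being 0.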
|
||||
|
||||
For circuits that do not need to be "fast", when choosing among
|
||||
multiple candidates for a path element, we choose randomly.
|
||||
|
||||
For "fast" circuits, we pick a given router as an exit with probability
|
||||
proportional to its advertised bandwidth [the smaller of the 'rate' and
|
||||
'observed' arguments to the "bandwidth" element in its descriptor]. If a
|
||||
router's advertised bandwidth is greater than MAX_BELIEVABLE_BANDWIDTH
|
||||
(currently 10 MB/s), we clip to that value.
|
||||
|
||||
For non-exit positions on "fast" circuits, we pick routers as above, but
|
||||
we weight the clipped advertised bandwidth of Exit-flagged nodes depending
|
||||
on the fraction of bandwidth available from non-Exit nodes. Call the
|
||||
total clipped advertised bandwidth for Exit nodes under consideration E,
|
||||
and the total clipped advertised bandwidth for all nodes under
|
||||
consideration T. If E<T/3, we do not consider Exit-flagged nodes.
|
||||
Otherwise, we weight their bandwidth with the factor (E-T/3)/E. This
|
||||
ensures that bandwidth is evenly distributed over nodes in 3-hop paths.
|
||||
|
||||
Similarly, Guard-flagged nodes are weighted by the factor (G-T/3)/G, where
G is their total clipped advertised bandwidth, and are not
|
||||
considered for non-guard positions if this value is less than 0.
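
The Exit weighting for non-exit positions can be worked through numerically. The sketch below is a hedged Python illustration: the function names are hypothetical, MAX_BELIEVABLE_BANDWIDTH is taken as 10,000,000 bytes per second, and only the Exit factor is modelled (the Guard factor would follow the same pattern).

```python
import random

MAX_BELIEVABLE_BANDWIDTH = 10 * 1000 * 1000  # "10 MB/s"; byte units assumed


def clipped(bandwidth):
    """Clip an advertised bandwidth to the believable maximum."""
    return min(bandwidth, MAX_BELIEVABLE_BANDWIDTH)


def exit_weight(exit_total, all_total):
    """Factor (E - T/3) / E, or 0 when E < T/3."""
    if exit_total <= 0 or exit_total < all_total / 3:
        return 0.0
    return (exit_total - all_total / 3) / exit_total


def middle_weights(routers):
    """routers: list of (name, advertised bandwidth, is_exit) tuples."""
    e = sum(clipped(bw) for _, bw, is_exit in routers if is_exit)
    t = sum(clipped(bw) for _, bw, _ in routers)
    factor = exit_weight(e, t)
    return {
        name: clipped(bw) * (factor if is_exit else 1.0)
        for name, bw, is_exit in routers
    }


def pick(weights, rng=random):
    """Choose a router with probability proportional to its weight."""
    names = [name for name, weight in weights.items() if weight > 0]
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

For example, with E = 600 and T = 1200, each of the three positions needs T/3 = 400 units, so Exit nodes keep 400 for the exit position and contribute only the remaining 200 (a factor of 1/3) elsewhere.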
|
||||
|
||||
Additionally, we may be building circuits with one or more requests in
|
||||
mind. Each kind of request puts certain constraints on paths:
|
||||
|
||||
- All service-side introduction circuits and all rendezvous paths
|
||||
should be Stable.
|
||||
- All connection requests for connections that we think will need to
|
||||
stay open a long time require Stable circuits. Currently, Tor decides
|
||||
this by examining the request's target port, and comparing it to a
|
||||
list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050,
|
||||
5190, 5222, 5223, 6667, 6697, 8300.)
|
||||
- DNS resolves require an exit node whose exit policy is not equivalent
|
||||
to "reject *:*".
|
||||
- Reverse DNS resolves require a version of Tor with advertised eventdns
|
||||
support (available in Tor 0.1.2.1-alpha-dev and later).
|
||||
- All connection requests require an exit node whose exit policy
|
||||
supports their target address and port (if known), or which "might
|
||||
support it" (if the address isn't known). See 2.2.1.
|
||||
- Rules for Fast? XXXXX
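
The long-lived-port rule in the list above amounts to a set lookup. A minimal Python sketch, using the default port list quoted above and a hypothetical function name:

```python
DEFAULT_LONG_LIVED_PORTS = frozenset(
    {21, 22, 706, 1863, 5050, 5190, 5222, 5223, 6667, 6697, 8300}
)


def needs_stable_circuit(target_port,
                         long_lived_ports=DEFAULT_LONG_LIVED_PORTS):
    """True if a connection to target_port should use a Stable circuit."""
    return target_port in long_lived_ports
```

A user-supplied LongLivedPorts value would replace the default set passed as the second argument.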
|
||||
|
||||
2.2.1. Choosing an exit
|
||||
|
||||
If we know what IP address we want to connect to or resolve, we can
|
||||
trivially tell whether a given router will support it by simulating
|
||||
its declared exit policy.
|
||||
|
||||
Because we often connect to addresses of the form hostname:port, we do not
|
||||
always know the target IP address when we select an exit node. In these
|
||||
cases, we need to pick an exit node that "might support" connections to a
|
||||
given port at an unknown address. An exit node "might support"
|
||||
such a connection if any clause that accepts any connections to that port
|
||||
precedes all clauses (if any) that reject all connections to that port.
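
The "might support" test can be sketched by walking the policy in order. This Python illustration follows the sentence above literally and makes several assumptions: a policy is a hypothetical ordered list of (action, address, low port, high port) tuples, "*" stands for all addresses, and a policy with no accepting clause for the port is treated as not supporting it.

```python
def might_support_port(policy, port):
    """True if an accept clause for the port comes before any clause that
    rejects every address on that port."""
    for action, address, low, high in policy:
        if not (low <= port <= high):
            continue  # clause does not mention this port
        if action == "accept":
            return True  # accepts at least some addresses on this port
        if action == "reject" and address == "*":
            return False  # rejects all addresses on this port
        # A reject for a specific address range does not settle the matter.
    return False


policy = [
    ("reject", "18.0.0.0/8", 1, 65535),
    ("accept", "*", 80, 80),
    ("reject", "*", 1, 65535),
]
print(might_support_port(policy, 80), might_support_port(policy, 25))
```

In the example policy, port 80 might be supported even though one address range is rejected first, while port 25 reaches the blanket reject.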
|
||||
|
||||
Unless requested to do so by the user, we never choose an exit server
|
||||
flagged as "BadExit" by more than half of the authorities who advertise
|
||||
themselves as listing bad exits.
|
||||
|
||||
2.2.2. User configuration
|
||||
|
||||
Users can alter the default behavior for path selection with configuration
|
||||
options.
|
||||
|
||||
- If "ExitNodes" is provided, then every request requires an exit node on
|
||||
the ExitNodes list. (If a request is supported by no nodes on that list,
|
||||
and StrictExitNodes is false, then Tor treats that request as if
|
||||
ExitNodes were not provided.)
|
||||
|
||||
- "EntryNodes" and "StrictEntryNodes" behave analogously.
|
||||
|
||||
- If a user tries to connect to or resolve a hostname of the form
|
||||
<target>.<servername>.exit, the request is rewritten to a request for
|
||||
<target>, and the request is only supported by the exit whose nickname
|
||||
or fingerprint is <servername>.
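
The ".exit" rewriting rule can be sketched as a small parsing helper. This is illustrative Python with a hypothetical function name; it handles only the <target>.<servername>.exit form described above and leaves every other hostname untouched.

```python
def split_exit_hostname(hostname):
    """Return (target, servername) for '<target>.<servername>.exit',
    or (hostname, None) if the name is not of that form."""
    labels = hostname.split(".")
    if len(labels) >= 3 and labels[-1].lower() == "exit":
        return ".".join(labels[:-2]), labels[-2]
    return hostname, None


print(split_exit_hostname("www.example.com.myrelay.exit"))
```

The returned servername would then be matched against router nicknames or fingerprints to find the only exit allowed to serve the request.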
|
||||
|
||||
2.3. Cannibalizing circuits
|
||||
|
||||
If we need a circuit and have a clean one already established, in
|
||||
some cases we can adapt the clean circuit for our new
|
||||
purpose. Specifically:
|
||||
|
||||
For hidden service interactions, we can "cannibalize" a clean internal
|
||||
circuit if one is available, so we don't need to build those circuits
|
||||
from scratch on demand.
|
||||
|
||||
We can also cannibalize clean circuits when the client asks to exit
|
||||
at a given node -- either via the ".exit" notation or because the
|
||||
destination is running at the same location as an exit node.
|
||||
|
||||
|
||||
2.4. Handling failure
|
||||
|
||||
If an attempt to extend a circuit fails (either because the first create
|
||||
failed or a subsequent extend failed) then the circuit is torn down and is
|
||||
no longer pending. (XXXX really?) Requests that might have been
|
||||
supported by the pending circuit thus become unsupported, and a new
|
||||
circuit needs to be constructed.
|
||||
|
||||
If a stream "begin" attempt fails with an EXITPOLICY error, we
|
||||
decide that the exit node's exit policy is not correctly advertised,
|
||||
so we treat the exit node as if it were a non-exit until we retrieve
|
||||
a fresh descriptor for it.
|
||||
|
||||
XXXX
|
||||
|
||||
3. Attaching streams to circuits
|
||||
|
||||
When a circuit that might support a request is built, Tor tries to attach
|
||||
the request's stream to the circuit and sends a BEGIN, BEGIN_DIR,
|
||||
or RESOLVE relay
|
||||
cell as appropriate. If the request completes unsuccessfully, Tor
|
||||
considers the reason given in the END relay cell. [XXX yes, and?]
|
||||
|
||||
|
||||
After a request has remained unattached for SocksTimeout (2 minutes
|
||||
by default), Tor abandons the attempt and signals an error to the
|
||||
client as appropriate (e.g., by closing the SOCKS connection).
|
||||
|
||||
XXX Timeouts and when Tor auto-retries.
|
||||
* What stream-end-reasons are appropriate for retrying.
|
||||
|
||||
If there is no reply to a BEGIN or RESOLVE cell, the stream will time out and fail.
|
||||
|
||||
4. Hidden-service related circuits
|
||||
|
||||
XXX Tracking expected hidden service use (client-side and hidserv-side)
|
||||
|
||||
5. Guard nodes
|
||||
|
||||
We use Guard nodes (also called "helper nodes" in the literature) to
|
||||
prevent certain profiling attacks. Here's the risk: if we choose entry and
|
||||
exit nodes at random, and an attacker controls C out of N servers
|
||||
(ignoring advertised bandwidth), then the
|
||||
attacker will control the entry and exit node of any given circuit with
|
||||
probability (C/N)^2. But as we make many different circuits over time,
|
||||
the probability that the attacker will see a sample of about (C/N)^2
|
||||
of our traffic goes to 1. Since statistical sampling works, the attacker
|
||||
can be sure of learning a profile of our behavior.
|
||||
|
||||
If, on the other hand, we picked an entry node and held it fixed, we would
|
||||
have probability C/N of choosing a bad entry and being profiled, and
|
||||
probability (N-C)/N of choosing a good entry and not being profiled.
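
The two cases above can be compared numerically. The Python sketch below uses the same simplification as the text (uniform node choice, ignoring advertised bandwidth) and additionally assumes that circuits are chosen independently; the function names are hypothetical.

```python
def p_circuit_compromised(c, n):
    """Chance that one random circuit has a bad entry and a bad exit."""
    return (c / n) ** 2


def p_any_compromised(c, n, circuits):
    """Chance that at least one of many independent circuits is compromised."""
    return 1 - (1 - p_circuit_compromised(c, n)) ** circuits


def p_bad_fixed_entry(c, n):
    """Chance that a single, fixed entry node is controlled by the attacker."""
    return c / n


# Attacker controls 10 of 100 servers.
print(p_circuit_compromised(10, 100))    # per-circuit risk without guards
print(p_any_compromised(10, 100, 1000))  # risk over 1000 circuits
print(p_bad_fixed_entry(10, 100))        # risk with one fixed entry node
```

Without a fixed entry, the chance of at least one compromised circuit approaches 1 as circuits accumulate; with a fixed entry it stays at C/N regardless of how many circuits are built.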
|
||||
|
||||
When guard nodes are enabled, Tor maintains an ordered list of entry nodes
|
||||
as our chosen guards, and stores this list persistently to disk. If a Guard
|
||||
node becomes unusable, rather than replacing it, Tor adds new guards to the
|
||||
end of the list. When choosing the first hop of a circuit, Tor
|
||||
chooses at
|
||||
random from among the first NumEntryGuards (default 3) usable guards on the
|
||||
list. If there are not at least 2 usable guards on the list, Tor adds
|
||||
routers until there are, or until there are no more usable routers to add.
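
First-hop selection from the ordered guard list can be sketched as below. This is a hedged Python illustration: the function name and the is_usable callback are hypothetical, and the step that adds new routers when too few guards are usable is not modelled.

```python
import random

NUM_ENTRY_GUARDS = 3  # default value of NumEntryGuards


def choose_first_hop(guard_list, is_usable, rng=random):
    """Pick at random among the first NUM_ENTRY_GUARDS usable guards.

    guard_list: guards in list order (oldest first).
    is_usable: callback returning True for guards that are currently usable.
    """
    usable = [guard for guard in guard_list if is_usable(guard)]
    candidates = usable[:NUM_ENTRY_GUARDS]
    if not candidates:
        return None
    return rng.choice(candidates)
```

Because unusable guards stay on the list rather than being replaced, a guard that becomes usable again resumes its earlier position among the candidates.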
|
||||
|
||||
A guard is unusable if any of the following hold:
|
||||
- it is not marked as a Guard by the networkstatuses,
|
||||
- it is not marked Valid (and the user hasn't set AllowInvalidNodes entry)
|
||||
- it is not marked Running
|
||||
- Tor couldn't reach it the last time it tried to connect
|
||||
|
||||
A guard is unusable for a particular circuit if any of the rules for path
|
||||
selection in 2.2 are not met. In particular, if the circuit is "fast"
|
||||
and the guard is not Fast, or if the circuit is "stable" and the guard is
|
||||
not Stable, or if the guard has already been chosen as the exit node in
|
||||
that circuit, Tor can't use it as a guard node for that circuit.
|
||||
|
||||
If the guard is excluded because of its status in the networkstatuses for
|
||||
over 30 days, Tor removes it from the list entirely, preserving order.
|
||||
|
||||
If Tor fails to connect to an otherwise usable guard, it retries
|
||||
periodically: every hour for six hours, every 4 hours for 3 days, every
|
||||
18 hours for a week, and every 36 hours thereafter. Additionally, Tor
|
||||
retries unreachable guards the first time it adds a new guard to the list,
|
||||
since it is possible that the old guards were only marked as unreachable
|
||||
because the network was unreachable or down.
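
The retry schedule above can be written as a step function. The Python sketch below assumes that each period is measured from the moment the guard first became unreachable; the text does not say so explicitly, and the function name is hypothetical.

```python
HOUR = 60 * 60
DAY = 24 * HOUR


def guard_retry_interval(unreachable_for):
    """Seconds to wait between retries, given how long (in seconds) the
    guard has been unreachable."""
    if unreachable_for < 6 * HOUR:
        return HOUR
    if unreachable_for < 3 * DAY:
        return 4 * HOUR
    if unreachable_for < 7 * DAY:
        return 18 * HOUR
    return 36 * HOUR
```

The extra retry triggered by adding a new guard to the list is separate from this schedule and is not modelled here.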
|
||||
|
||||
Tor does not add a guard persistently to the list until the first time we
|
||||
have connected to it successfully.
|
||||
|
||||
6. Router descriptor purposes
|
||||
|
||||
There are currently three "purposes" supported for router descriptors:
|
||||
general, controller, and bridge. Most descriptors are of type general
|
||||
-- these are the ones listed in the consensus, and the ones fetched
|
||||
and used in normal cases.
|
||||
|
||||
Controller-purpose descriptors are those delivered by the controller
|
||||
and labelled as such: they will be kept around (and expire like
|
||||
normal descriptors), and they can be used by the controller in its
|
||||
EXTENDCIRCUIT commands. Otherwise they are ignored by Tor when it
|
||||
chooses paths.
|
||||
|
||||
Bridge-purpose descriptors are for routers that are used as bridges. See
|
||||
doc/design-paper/blocking.pdf for more design explanation, or proposal
|
||||
125 for specific details. Currently bridge descriptors are used in place
|
||||
of normal entry guards, for Tor clients that have UseBridges enabled.
|
||||
|
||||
|
||||
X. Old notes
|
||||
|
||||
X.1. Do we actually do this?
|
||||
|
||||
How to deal with network down.
|
||||
- While all helpers are down/unreachable and there are no established
|
||||
or on-the-way testing circuits, launch a testing circuit. (Do this
|
||||
periodically in the same way we try to establish normal circuits
|
||||
when things are working normally.)
|
||||
(Testing circuits are a special type of circuit, that streams won't
|
||||
attach to by accident.)
|
||||
- When a testing circuit succeeds, mark all helpers up and hold
|
||||
the testing circuit open.
|
||||
- If a connection to a helper succeeds, close all testing circuits.
|
||||
Else mark that helper down and try another.
|
||||
- If the last helper is marked down and we already have a testing
|
||||
circuit established, then add the first hop of that testing circuit
|
||||
to the end of our helper node list, close that testing circuit,
|
||||
and go back to square one. (Actually, rather than closing the
|
||||
testing circuit, can we get away with converting it to a normal
|
||||
circuit and beginning to use it immediately?)
|
||||
|
||||
[Do we actually do any of the above? If so, let's spec it. If not, let's
|
||||
remove it. -NM]
|
||||
|
||||
X.2. A thing we could do to deal with reachability.
|
||||
|
||||
And as a bonus, it leads to an answer to Nick's attack ("If I pick
|
||||
my helper nodes all on 18.0.0.0:*, then I move, you'll know where I
|
||||
bootstrapped") -- the answer is to pick your original three helper nodes
|
||||
without regard for reachability. Then the above algorithm will add some
|
||||
more that are reachable for you, and if you move somewhere, it's more
|
||||
likely (though not certain) that some of the originals will become useful.
|
||||
Is that smart or just complex?
|
||||
|
||||
X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm.
|
||||
|
||||
It is unlikely for two users to have the same set of entry guards.
|
||||
Observing a user is sufficient to learn its entry guards. So, as we move
|
||||
around, entry guards make us linkable. If we want to change guards when
|
||||
our location (IP? subnet?) changes, we have two bad options. We could
|
||||
- Drop the old guards. But if we go back to our old location,
|
||||
we'll not use our old guards. For a laptop that sometimes gets used
|
||||
from work and sometimes from home, this is pretty fatal.
|
||||
- Remember the old guards as associated with the old location, and use
|
||||
them again if we ever go back to the old location. This would be
|
||||
nasty, since it would force us to record where we've been.
|
||||
|
||||
[Do we do any of this now? If not, this should move into 099-misc or
|
||||
098-todo. -NM]
|
||||
|
@ -1,161 +0,0 @@
|
||||
Filename: 000-index.txt
|
||||
Title: Index of Tor Proposals
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 26-Jan-2007
|
||||
Status: Meta
|
||||
|
||||
Overview:
|
||||
|
||||
This document provides an index to Tor proposals.
|
||||
|
||||
This is an informational document.
|
||||
|
||||
Everything in this document below the line of '=' signs is automatically
|
||||
generated by reindex.py; do not edit by hand.
|
||||
|
||||
============================================================
|
||||
Proposals by number:
|
||||
|
||||
000 Index of Tor Proposals [META]
|
||||
001 The Tor Proposal Process [META]
|
||||
098 Proposals that should be written [META]
|
||||
099 Miscellaneous proposals [META]
|
||||
100 Tor Unreliable Datagram Extension Proposal [DEAD]
|
||||
101 Voting on the Tor Directory System [CLOSED]
|
||||
102 Dropping "opt" from the directory format [CLOSED]
|
||||
103 Splitting identity key from regularly used signing key [CLOSED]
|
||||
104 Long and Short Router Descriptors [CLOSED]
|
||||
105 Version negotiation for the Tor protocol [CLOSED]
|
||||
106 Checking fewer things during TLS handshakes [CLOSED]
|
||||
107 Uptime Sanity Checking [CLOSED]
|
||||
108 Base "Stable" Flag on Mean Time Between Failures [CLOSED]
|
||||
109 No more than one server per IP address [CLOSED]
|
||||
110 Avoiding infinite length circuits [ACCEPTED]
|
||||
111 Prioritizing local traffic over relayed traffic [CLOSED]
|
||||
112 Bring Back Pathlen Coin Weight [SUPERSEDED]
|
||||
113 Simplifying directory authority administration [SUPERSEDED]
|
||||
114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
|
||||
115 Two Hop Paths [DEAD]
|
||||
116 Two hop paths from entry guards [DEAD]
|
||||
117 IPv6 exits [ACCEPTED]
|
||||
118 Advertising multiple ORPorts at once [ACCEPTED]
|
||||
119 New PROTOCOLINFO command for controllers [CLOSED]
|
||||
120 Shutdown descriptors when Tor servers stop [DEAD]
|
||||
121 Hidden Service Authentication [FINISHED]
|
||||
122 Network status entries need a new Unnamed flag [CLOSED]
|
||||
123 Naming authorities automatically create bindings [CLOSED]
|
||||
124 Blocking resistant TLS certificate usage [SUPERSEDED]
|
||||
125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
|
||||
126 Getting GeoIP data and publishing usage summaries [CLOSED]
|
||||
127 Relaying dirport requests to Tor download site / website [DRAFT]
|
||||
128 Families of private bridges [DEAD]
|
||||
129 Block Insecure Protocols by Default [CLOSED]
|
||||
130 Version 2 Tor connection protocol [CLOSED]
|
||||
131 Help users to verify they are using Tor [NEEDS-REVISION]
|
||||
132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT]
|
||||
133 Incorporate Unreachable ORs into the Tor Network [DRAFT]
|
||||
134 More robust consensus voting with diverse authority sets [ACCEPTED]
|
||||
135 Simplify Configuration of Private Tor Networks [CLOSED]
|
||||
136 Mass authority migration with legacy keys [CLOSED]
|
||||
137 Keep controllers informed as Tor bootstraps [CLOSED]
|
||||
138 Remove routers that are not Running from consensus documents [CLOSED]
|
||||
139 Download consensus documents only when it will be trusted [CLOSED]
|
||||
140 Provide diffs between consensuses [ACCEPTED]
|
||||
141 Download server descriptors on demand [DRAFT]
|
||||
142 Combine Introduction and Rendezvous Points [DEAD]
|
||||
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN]
|
||||
144 Increase the diversity of circuits by detecting nodes belonging the same provider [DRAFT]
|
||||
145 Separate "suitable as a guard" from "suitable as a new guard" [OPEN]
|
||||
146 Add new flag to reflect long-term stability [OPEN]
|
||||
147 Eliminate the need for v2 directories in generating v3 directories [ACCEPTED]
|
||||
148 Stream end reasons from the client side should be uniform [CLOSED]
|
||||
149 Using data from NETINFO cells [OPEN]
|
||||
150 Exclude Exit Nodes from a circuit [CLOSED]
|
||||
151 Improving Tor Path Selection [DRAFT]
|
||||
152 Optionally allow exit from single-hop circuits [CLOSED]
|
||||
153 Automatic software update protocol [SUPERSEDED]
|
||||
154 Automatic Software Update Protocol [SUPERSEDED]
|
||||
155 Four Improvements of Hidden Service Performance [FINISHED]
|
||||
156 Tracking blocked ports on the client side [OPEN]
|
||||
157 Make certificate downloads specific [ACCEPTED]
|
||||
158 Clients download consensus + microdescriptors [OPEN]
|
||||
159 Exit Scanning [OPEN]
|
||||
|
||||
|
||||
Proposals by status:
|
||||
|
||||
DRAFT:
|
||||
127 Relaying dirport requests to Tor download site / website
|
||||
132 A Tor Web Service For Verifying Correct Browser Configuration
|
||||
133 Incorporate Unreachable ORs into the Tor Network
|
||||
141 Download server descriptors on demand
|
||||
144 Increase the diversity of circuits by detecting nodes belonging the same provider
|
||||
151 Improving Tor Path Selection
|
||||
NEEDS-REVISION:
|
||||
131 Help users to verify they are using Tor
|
||||
OPEN:
|
||||
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [for 0.2.1.x]
|
||||
145 Separate "suitable as a guard" from "suitable as a new guard" [for 0.2.1.x]
|
||||
146 Add new flag to reflect long-term stability [for 0.2.1.x]
|
||||
149 Using data from NETINFO cells [for 0.2.1.x]
|
||||
156 Tracking blocked ports on the client side [for 0.2.?]
|
||||
158 Clients download consensus + microdescriptors
|
||||
159 Exit Scanning
|
||||
ACCEPTED:
|
||||
110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
|
||||
117 IPv6 exits [for 0.2.1.x]
|
||||
118 Advertising multiple ORPorts at once [for 0.2.1.x]
|
||||
134 More robust consensus voting with diverse authority sets [for 0.2.2.x]
|
||||
140 Provide diffs between consensuses [for 0.2.2.x]
|
||||
147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
|
||||
157 Make certificate downloads specific [for 0.2.1.x]
|
||||
META:
|
||||
000 Index of Tor Proposals
|
||||
001 The Tor Proposal Process
|
||||
098 Proposals that should be written
|
||||
099 Miscellaneous proposals
|
||||
FINISHED:
|
||||
121 Hidden Service Authentication [in 0.2.1.x]
|
||||
155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
|
||||
CLOSED:
|
||||
101 Voting on the Tor Directory System [in 0.2.0.x]
|
||||
102 Dropping "opt" from the directory format [in 0.2.0.x]
|
||||
103 Splitting identity key from regularly used signing key [in 0.2.0.x]
|
||||
104 Long and Short Router Descriptors [in 0.2.0.x]
|
||||
105 Version negotiation for the Tor protocol [in 0.2.0.x]
|
||||
106 Checking fewer things during TLS handshakes [in 0.2.0.x]
|
||||
107 Uptime Sanity Checking [in 0.2.0.x]
|
||||
108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
|
||||
109 No more than one server per IP address [in 0.2.0.x]
|
||||
111 Prioritizing local traffic over relayed traffic [in 0.2.0.x]
|
||||
114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
|
||||
119 New PROTOCOLINFO command for controllers [in 0.2.0.x]
|
||||
122 Network status entries need a new Unnamed flag [in 0.2.0.x]
|
||||
123 Naming authorities automatically create bindings [in 0.2.0.x]
|
||||
125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
|
||||
126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
|
||||
129 Block Insecure Protocols by Default [in 0.2.0.x]
|
||||
130 Version 2 Tor connection protocol [in 0.2.0.x]
|
||||
135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
|
||||
136 Mass authority migration with legacy keys [in 0.2.0.x]
|
||||
137 Keep controllers informed as Tor bootstraps [in 0.2.1.x]
|
||||
138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
|
||||
139 Download consensus documents only when it will be trusted [in 0.2.1.x]
|
||||
148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
|
||||
150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
|
||||
152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
|
||||
SUPERSEDED:
|
||||
112 Bring Back Pathlen Coin Weight
|
||||
113 Simplifying directory authority administration
|
||||
124 Blocking resistant TLS certificate usage
|
||||
153 Automatic software update protocol
|
||||
154 Automatic Software Update Protocol
|
||||
DEAD:
|
||||
100 Tor Unreliable Datagram Extension Proposal
|
||||
115 Two Hop Paths
|
||||
116 Two hop paths from entry guards
|
||||
120 Shutdown descriptors when Tor servers stop
|
||||
128 Families of private bridges
|
||||
142 Combine Introduction and Rendezvous Points
|
@ -1,187 +0,0 @@
|
||||
Filename: 001-process.txt
|
||||
Title: The Tor Proposal Process
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 30-Jan-2007
|
||||
Status: Meta
|
||||
|
||||
Overview:
|
||||
|
||||
This document describes how to change the Tor specifications, how Tor
|
||||
proposals work, and the relationship between Tor proposals and the
|
||||
specifications.
|
||||
|
||||
This is an informational document.
|
||||
|
||||
Motivation:
|
||||
|
||||
Previously, our process for updating the Tor specifications was maximally
|
||||
informal: we'd patch the specification (sometimes forking first, and
|
||||
sometimes not), then discuss the patches, reach consensus, and implement
|
||||
the changes.
|
||||
|
||||
This had a few problems.
|
||||
|
||||
First, even at its most efficient, the old process would often have the
|
||||
spec out of sync with the code. The worst cases were those where
|
||||
implementation was deferred: the spec and code could stay out of sync for
|
||||
versions at a time.
|
||||
|
||||
Second, it was hard to participate in discussion, since you had to know
|
||||
which portions of the spec were a proposal, and which were already
|
||||
implemented.
|
||||
|
||||
Third, it littered the specifications with too many inline comments.
|
||||
[This was a real problem -NM]
|
||||
[Especially when it went to multiple levels! -NM]
|
||||
[XXXX especially when they weren't signed and talked about that
|
||||
thing that you can't remember after a year]
|
||||
|
||||
How to change the specs now:
|
||||
|
||||
First, somebody writes a proposal document. It should describe the change
|
||||
that should be made in detail, and give some idea of how to implement it.
|
||||
Once it's fleshed out enough, it becomes a proposal.
|
||||
|
||||
Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
|
||||
change over time and keep the same number, until they are finally
|
||||
accepted or rejected. The history for each proposal
|
||||
will be stored in the Tor Subversion repository.
|
||||
|
||||
Once a proposal is in the repository, we should discuss and improve it
|
||||
until we've reached consensus that it's a good idea, and that it's
|
||||
detailed enough to implement. When this happens, we implement the
|
||||
proposal and incorporate it into the specifications. Thus, the specs
|
||||
remain the canonical documentation for the Tor protocol: no proposal is
|
||||
ever the canonical documentation for an implemented feature.
|
||||
|
||||
(This process is pretty similar to the Python Enhancement Process, with
|
||||
the major exception that Tor proposals get re-integrated into the specs
|
||||
after implementation, whereas PEPs _become_ the new spec.)
|
||||
|
||||
{It's still okay to make small changes directly to the spec if the code can be
|
||||
written more or less immediately, or cosmetic changes if no code change is
|
||||
required. This document reflects the current developers' _intent_, not
|
||||
a permanent promise to always use this process in the future: we reserve
|
||||
the right to get really excited and run off and implement something in a
|
||||
caffeine-or-m&m-fueled all-night hacking session.}
|
||||
|
||||
How new proposals get added:
|
||||
|
||||
Once an idea has been proposed on the development list, a properly formatted
|
||||
(see below) draft exists, and rough consensus within the active development
|
||||
community exists that this idea warrants consideration, the proposal editor
|
||||
will officially add the proposal.
|
||||
|
||||
To get your proposal in, send it to or-dev.
|
||||
|
||||
The current proposal editor is Nick Mathewson.
|
||||
|
||||
What should go in a proposal:
|
||||
|
||||
Every proposal should have a header containing these fields:
|
||||
Filename, Title, Version, Last-Modified, Author, Created, Status.
|
||||
The Version and Last-Modified fields should use the SVN Revision and Date
|
||||
tags respectively.
|
||||
|
||||
These fields are optional but recommended:
|
||||
Target, Implemented-In.
|
||||
The Target field should describe which version the proposal is hoped to be
|
||||
implemented in (if it's Open or Accepted). The Implemented-In field
|
||||
should describe which version the proposal was implemented in (if it's
|
||||
Finished or Closed).
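The header rules above can be checked mechanically. The following is a minimal sketch, assuming that header lines have the form "Name: value" and that the header ends at the first blank line; the function names are hypothetical and not part of any Tor tooling.

```python
# Sketch only: the field lists come from the text above; the parsing rules
# ("Name: value" lines, header ends at the first blank line) are assumptions.
REQUIRED_FIELDS = ("Filename", "Title", "Version", "Last-Modified",
                   "Author", "Created", "Status")
OPTIONAL_FIELDS = ("Target", "Implemented-In")


def parse_proposal_header(text):
    """Return a dict of header fields, stopping at the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a header line: %r" % line)
        fields[name.strip()] = value.strip()
    return fields


def missing_required(fields):
    """List the required header fields that are absent from `fields`."""
    return [name for name in REQUIRED_FIELDS if name not in fields]


EXAMPLE = """Filename: 001-process.txt
Title: The Tor Proposal Process
Version: $Revision$
Last-Modified: $Date$
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta

Overview:
"""
```

Calling missing_required(parse_proposal_header(EXAMPLE)) returns an empty list for the header of this document.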
|
||||
|
||||
The body of the proposal should start with an Overview section explaining
|
||||
what the proposal's about, what it does, and about what state it's in.
|
||||
|
||||
After the Overview, the proposal becomes more free-form. Depending on its
|
||||
length and complexity, the proposal can break into sections as
|
||||
appropriate, or follow a short discursive format. Every proposal should
|
||||
contain at least the following information before it is "ACCEPTED",
|
||||
though the information does not need to be in sections with these names.
|
||||
|
||||
Motivation: What problem is the proposal trying to solve? Why does
|
||||
this problem matter? If several approaches are possible, why take this
|
||||
one?
|
||||
|
||||
Design: A high-level view of what the new or modified features are, how
|
||||
the new or modified features work, how they interoperate with each
|
||||
other, and how they interact with the rest of Tor. This is the main
|
||||
body of the proposal. Some proposals will start out with only a
|
||||
Motivation and a Design, and wait for a specification until the
|
||||
Design seems approximately right.
|
||||
|
||||
Security implications: What effects the proposed changes might have on
|
||||
anonymity, how well understood these effects are, and so on.
|
||||
|
||||
Specification: A detailed description of what needs to be added to the
|
||||
Tor specifications in order to implement the proposal. This should
|
||||
be in about as much detail as the specifications will eventually
|
||||
contain: it should be possible for independent programmers to write
|
||||
mutually compatible implementations of the proposal based on its
|
||||
specifications.
|
||||
|
||||
Compatibility: Will versions of Tor that follow the proposal be
|
||||
compatible with versions that do not? If so, how will compatibility
|
||||
be achieved? Generally, we try to not drop compatibility if at
|
||||
all possible; we haven't made a "flag day" change since May 2004,
|
||||
and we don't want to do another one.
|
||||
|
||||
Implementation: If the proposal will be tricky to implement in Tor's
|
||||
current architecture, the document can contain some discussion of how
|
||||
to go about making it work.
|
||||
|
||||
Performance and scalability notes: If the feature will have an effect
|
||||
on performance (in RAM, CPU, bandwidth) or scalability, there should
|
||||
be some analysis on how significant this effect will be, so that we
|
||||
can avoid really expensive performance regressions, and so we can
|
||||
avoid wasting time on insignificant gains.
|
||||
|
||||
Proposal status:
|
||||
|
||||
Open: A proposal under discussion.
|
||||
|
||||
Accepted: The proposal is complete, and we intend to implement it.
|
||||
After this point, substantive changes to the proposal should be
|
||||
avoided, and regarded as a sign of the process having failed
|
||||
somewhere.
|
||||
|
||||
Finished: The proposal has been accepted and implemented. After this
|
||||
point, the proposal should not be changed.
|
||||
|
||||
Closed: The proposal has been accepted, implemented, and merged into the
|
||||
main specification documents. The proposal should not be changed after
|
||||
this point.
|
||||
|
||||
Rejected: We're not going to implement the feature as described here,
|
||||
though we might do some other version. See comments in the document
|
||||
for details. The proposal should not be changed after this point;
|
||||
to bring up some other version of the idea, write a new proposal.
|
||||
|
||||
Draft: This isn't a complete proposal yet; there are definite missing
|
||||
pieces. Please don't add any new proposals with this status; put them
|
||||
in the "ideas" sub-directory instead.
|
||||
|
||||
Needs-Revision: The idea for the proposal is a good one, but the proposal
|
||||
as it stands has serious problems that keep it from being accepted.
|
||||
See comments in the document for details.
|
||||
|
||||
Dead: The proposal hasn't been touched in a long time, and it doesn't look
|
||||
like anybody is going to complete it soon. It can become "Open" again
|
||||
if it gets a new proponent.
|
||||
|
||||
Needs-Research: There are research problems that need to be solved before
|
||||
it's clear whether the proposal is a good idea.
|
||||
|
||||
Meta: This is not a proposal, but a document about proposals.
|
||||
|
||||
|
||||
The editor maintains the correct status of proposals, based on rough
|
||||
consensus and his own discretion.
|
||||
|
||||
Proposal numbering:
|
||||
|
||||
Numbers 000-099 are reserved for special and meta-proposals. 100 and up
|
||||
are used for actual proposals. Numbers aren't recycled.
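As an illustration of the status list and the numbering rule above, here is a small sketch. It assumes that the proposal number is the leading digits of the filename (as in "001-process.txt"); the helper names are hypothetical.

```python
# Sketch only: statuses are the ones defined in this document; taking the
# proposal number from the filename prefix is an assumption.
VALID_STATUSES = frozenset([
    "Open", "Accepted", "Finished", "Closed", "Rejected", "Draft",
    "Needs-Revision", "Dead", "Needs-Research", "Meta",
])


def proposal_number(filename):
    """Extract the number from a name such as '098-todo.txt'."""
    return int(filename.split("-", 1)[0])


def is_meta_number(number):
    """Numbers 000-099 are reserved for special and meta-proposals."""
    if number < 0:
        raise ValueError("proposal numbers are non-negative")
    return number <= 99
```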
|
@ -1,109 +0,0 @@
|
||||
Filename: 098-todo.txt
|
||||
Title: Proposals that should be written
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson, Roger Dingledine
|
||||
Created: 26-Jan-2007
|
||||
Status: Meta
|
||||
|
||||
Overview:
|
||||
|
||||
This document lists ideas that various people have had for improving the
|
||||
Tor protocol. These should be implemented and specified if they're
|
||||
trivial, or written up as proposals if they're not.
|
||||
|
||||
This is an active document, to be edited as proposals are written and as
|
||||
we come up with new ideas for proposals. We should take stuff out as it
|
||||
seems irrelevant.
|
||||
|
||||
|
||||
For some later protocol version.
|
||||
|
||||
- It would be great to get smarter about identity and linkability.
|
||||
It's not crazy to say, "Never use the same circuit for my SSH
|
||||
connections and my web browsing." How far can/should we take this?
|
||||
See ideas/xxx-separate-streams-by-port.txt for a start.
|
||||
|
||||
- Fix onionskin handshake scheme to be more mainstream, less nutty.
|
||||
Can we just do
|
||||
E(HMAC(g^x), g^x) rather than just E(g^x) ?
|
||||
No, that has the same flaws as before. We should send
|
||||
E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
|
||||
Better ask Ian; probably Stephen too.
|
||||
|
||||
- Length on CREATE and friends
|
||||
|
||||
- Versioning on circuits and create cells, so we have a clear path
|
||||
to improve the circuit protocol.
|
||||
|
||||
- SHA1 is showing its age. We should get a design for upgrading our
|
||||
hash once the AHS competition is done, or even sooner.
|
||||
|
||||
- Not being able to upgrade ciphersuites or increase key lengths is
|
||||
lame.
|
||||
- Paul has some ideas about circuit creation; read his PET paper once it's
|
||||
out.
|
||||
|
||||
Any time:
|
||||
|
||||
- Some ideas for revising the directory protocol:
|
||||
- Extend the "r" line in network-status to give a set of buckets (say,
|
||||
comma-separated) for that router.
|
||||
- Buckets are deterministic based on IP address.
|
||||
- Then clients can choose a bucket (or set of buckets) to
|
||||
download and use.
|
||||
- We need a way for the authorities to declare that nodes are in a
|
||||
family. Also, it kinda sucks that family declarations use O(N^2) space
|
||||
in the descriptors.
|
||||
- REASON_CONNECTFAILED should include an IP.
|
||||
- Spec should incorporate some prose from tor-design to be more readable.
|
||||
- Spec when we should rotate which keys
|
||||
- Spec how to publish descriptors less often
|
||||
- Describe pros and cons of non-deterministic path lengths
|
||||
|
||||
- We should use a variable-length path length by default -- 3 +/- some
|
||||
distribution. Need to think harder about allowing values less than 3,
|
||||
and there's a tradeoff between having a wide variance and performance.
|
||||
|
||||
- Clients currently use certs during TLS. Is this wise? It does make it
|
||||
easier for servers to tell which NATted client is which. We could use a
|
||||
separate set of certs for each guard, I suppose, but generating so many
|
||||
certs could get expensive. Omitting them entirely would make OP->OR
|
||||
easier to tell from OR->OR.
|
||||
|
||||
Things that should change...
|
||||
|
||||
B.1. ... but which will require backward-incompatible change
|
||||
|
||||
- Circuit IDs should be longer.
|
||||
. IPv6 everywhere.
|
||||
- Maybe, keys should be longer.
|
||||
- Maybe, key-length should be adjustable. How to do this without
|
||||
making anonymity suck?
|
||||
- Drop backward compatibility.
|
||||
- We should use a 128-bit subgroup of our DH prime.
|
||||
- Handshake should use HMAC.
|
||||
- Multiple cell lengths.
|
||||
- Ability to split circuits across paths (If this is useful.)
|
||||
- SENDME windows should be dynamic.
|
||||
|
||||
- Directory
|
||||
- Stop ever mentioning socks ports
|
||||
|
||||
B.1. ... and that will require no changes
|
||||
|
||||
- Advertised outbound IP?
|
||||
- Migrate streams across circuits.
|
||||
- Fix bug 469 by limiting the number of simultaneous connections per IP.
|
||||
|
||||
B.2. ... and that we have no idea how to do.
|
||||
|
||||
- UDP (as transport)
|
||||
- UDP (as content)
|
||||
- Use a better AES mode that has built-in integrity checking,
|
||||
doesn't grow with the number of hops, is not patented, and
|
||||
is implemented and maintained by smart people.
|
||||
|
||||
Let onion keys be not just RSA but maybe DH too, for Paul's reply onion
|
||||
design.
|
||||
|
@ -1,30 +0,0 @@
|
||||
Filename: 099-misc.txt
|
||||
Title: Miscellaneous proposals
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Various
|
||||
Created: 26-Jan-2007
|
||||
Status: Meta
|
||||
|
||||
Overview:
|
||||
|
||||
This document is for small proposal ideas that are about one paragraph in
|
||||
length. From here, ideas can be rejected outright, expanded into full
|
||||
proposals, or specified and implemented as-is.
|
||||
|
||||
Proposals
|
||||
|
||||
1. Directory compression.
|
||||
|
||||
Gzip would be easier to work with than zlib; bzip2 would result in smaller
|
||||
data lengths. [Concretely, we're looking at about 10-15% space savings at
|
||||
the expense of 3-5x longer compression time for using bzip2.] Doing
|
||||
on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
|
||||
Pre-compressing status documents in multiple formats would force us to use
|
||||
more memory to hold them.
|
||||
|
||||
Status: Open
|
||||
|
||||
-- Nick Mathewson
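The size tradeoff in item 1 could be measured with a sketch like the one below. The sample input is made up for illustration and is not a real status document, so the numbers it yields say nothing about the 10-15% figure quoted above; only the Python standard library modules are used.

```python
import bz2
import gzip
import zlib

# Hypothetical, highly repetitive sample; a real status document would
# compress differently.
SAMPLE = b"r example 192.0.2.1 9001 0 9030\n" * 200


def compressed_sizes(data):
    """Return the compressed length of `data` in each candidate format."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, 9)),
    }
```

Timing the three calls on real documents would be needed to judge the compression-time cost mentioned in the proposal.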
|
||||
|
||||
|
@ -1,424 +0,0 @@
|
||||
Filename: 100-tor-spec-udp.txt
|
||||
Title: Tor Unreliable Datagram Extension Proposal
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Marc Liberatore
|
||||
Created: 23 Feb 2006
|
||||
Status: Dead
|
||||
|
||||
Overview:
|
||||
|
||||
This is a modified version of the Tor specification written by Marc
|
||||
Liberatore to add UDP support to Tor. For each TLS link, it adds a
|
||||
corresponding DTLS link: control messages and TCP data flow over TLS, and
|
||||
UDP data flows over DTLS.
|
||||
|
||||
This proposal is not likely to be accepted as-is; see comments at the end
|
||||
of the document.
|
||||
|
||||
|
||||
Contents
|
||||
|
||||
0. Introduction
|
||||
|
||||
Tor is a distributed overlay network designed to anonymize low-latency
|
||||
TCP-based applications. The current tor specification supports only
|
||||
TCP-based traffic. This limitation prevents the use of tor to anonymize
|
||||
other important applications, notably voice over IP software. This document
|
||||
is a proposal to extend the tor specification to support UDP traffic.
|
||||
|
||||
The basic design philosophy of this extension is to add support for
|
||||
tunneling unreliable datagrams through tor with as few modifications to the
|
||||
protocol as possible. As currently specified, tor cannot directly support
|
||||
such tunneling, as connections between nodes are built using transport layer
|
||||
security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
|
||||
to the operation of most UDP-based application level protocols.
|
||||
|
||||
Thus, we propose the addition of links between nodes using datagram
|
||||
transport layer security (DTLS). These links allow packets to traverse a
|
||||
route through tor quickly, but their unreliable nature requires minor
|
||||
changes to the tor protocol. This proposal outlines the necessary
|
||||
additions and changes to the tor specification to support UDP traffic.
|
||||
|
||||
We note that a separate set of DTLS links between nodes creates a second
|
||||
overlay, distinct from that composed of TLS links. This separation and
|
||||
resulting decrease in each anonymity set's size will make certain attacks
|
||||
easier. However, it is our belief that VoIP support in tor will
|
||||
dramatically increase its appeal, and correspondingly, the size of its user
|
||||
base, number of deployed nodes, and total traffic relayed. These increases
|
||||
should help offset the loss of anonymity that two distinct networks imply.
|
||||
|
||||
1. Overview of Tor-UDP and its complications
|
||||
|
||||
As described above, this proposal extends the Tor specification to support
|
||||
UDP with as few changes as possible. Tor's overlay network is managed
|
||||
through TLS based connections; we will re-use this control plane to set up
|
||||
and tear down circuits that relay UDP traffic. These circuits will be built atop
|
||||
DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
|
||||
TLS.
|
||||
|
||||
The unreliability of DTLS circuits creates problems for Tor at two levels:
|
||||
|
||||
1. Tor's encryption of the relay layer does not allow independent
|
||||
decryption of individual records. If record N is not received, then
|
||||
record N+1 will not decrypt correctly, as the counter for AES/CTR is
|
||||
maintained implicitly.
|
||||
|
||||
2. Tor's end-to-end integrity checking works under the assumption that
|
||||
all RELAY cells are delivered. This assumption is invalid when cells
|
||||
are sent over DTLS.
|
||||
|
||||
The fix for the first problem is straightforward: add an explicit sequence
|
||||
number to each cell. To fix the second problem, we introduce a
|
||||
system of nonces and hashes to RELAY packets.
|
||||
|
||||
In the following sections, we mirror the layout of the Tor Protocol
|
||||
Specification, presenting the necessary modifications to the Tor protocol as
|
||||
a series of deltas.
|
||||
|
||||
2. Connections
|
||||
|
||||
Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
|
||||
corresponding TLS links, as all control messages are sent over TLS. All
|
||||
implementations MUST support the DTLS ciphersuite "[TODO]".
|
||||
|
||||
DTLS connections are formed using the same protocol as TLS connections.
|
||||
This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
|
||||
as detailed in section 4.6.
|
||||
|
||||
Once a paired TLS/DTLS connection is established, the two sides send cells
|
||||
to one another. All but two types of cells are sent over TLS links. RELAY
|
||||
cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
|
||||
below, are sent over DTLS links. [Should all cells still be 512 bytes long?
|
||||
Perhaps upon completion of a preliminary implementation, we should do a
|
||||
performance evaluation for some class of UDP traffic, such as VoIP. - ML]
|
||||
Cells may be sent embedded in TLS or DTLS records of any size or divided
|
||||
across such records. The framing of these records MUST NOT leak any more
|
||||
information than the above differentiation on the basis of cell type. [I am
|
||||
uncomfortable with this leakage, but don't see any simple, elegant way
|
||||
around it. -ML]
|
||||
|
||||
As with TLS connections, DTLS connections are not permanent.
|
||||
|
||||
3. Cell format
|
||||
|
||||
Each cell contains the following fields:
|
||||
|
||||
CircID [2 bytes]
|
||||
Command [1 byte]
|
||||
Sequence Number [2 bytes]
|
||||
Payload (padded with 0 bytes) [507 bytes]
|
||||
[Total size: 512 bytes]
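The layout above can be sketched as a fixed-size record. This is a minimal illustration, assuming network (big-endian) byte order for the multi-byte fields, which this proposal does not state explicitly; the function names are hypothetical.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507

# CircID (2), Command (1), Sequence Number (2), Payload (507).
# Big-endian byte order is an assumption.
_CELL = struct.Struct(">HBH%ds" % PAYLOAD_LEN)
assert _CELL.size == CELL_LEN


def pack_cell(circ_id, command, seq, payload=b""):
    """Build one 512-byte cell; short payloads are padded with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long")
    # The 's' format pads short byte strings with NUL bytes.
    return _CELL.pack(circ_id, command, seq, payload)


def unpack_cell(cell):
    """Split a 512-byte cell into (circ_id, command, seq, payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are exactly 512 bytes")
    return _CELL.unpack(cell)
```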
|
||||
|
||||
The 'Command' field holds one of the following values:
|
||||
0 -- PADDING (Padding) (See Sec 6.2)
|
||||
1 -- CREATE (Create a circuit) (See Sec 4)
|
||||
2 -- CREATED (Acknowledge create) (See Sec 4)
|
||||
3 -- RELAY (End-to-end data) (See Sec 5)
|
||||
4 -- DESTROY (Stop using a circuit) (See Sec 4)
|
||||
5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
|
||||
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
|
||||
7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
|
||||
8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
|
||||
9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
|
||||
10 -- CREATED_FAST_UDP (UDP circuit created, no PK) (See Sec 4)
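For reference, the command values listed above can be written as an enumeration. The class and helper names below are hypothetical; only the numeric values come from the list.

```python
from enum import IntEnum


class CellCommand(IntEnum):
    PADDING = 0
    CREATE = 1
    CREATED = 2
    RELAY = 3
    DESTROY = 4
    CREATE_FAST = 5
    CREATED_FAST = 6
    CREATE_UDP = 7
    CREATED_UDP = 8
    CREATE_FAST_UDP = 9
    CREATED_FAST_UDP = 10


# The four commands this proposal adds for UDP circuits.
UDP_CIRCUIT_COMMANDS = frozenset({
    CellCommand.CREATE_UDP,
    CellCommand.CREATED_UDP,
    CellCommand.CREATE_FAST_UDP,
    CellCommand.CREATED_FAST_UDP,
})


def is_udp_circuit_command(value):
    """True if `value` is one of the new *_UDP cell commands.

    Raises ValueError for values outside the table above.
    """
    return CellCommand(value) in UDP_CIRCUIT_COMMANDS
```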
|
||||
|
||||
The sequence number allows for AES/CTR decryption of RELAY cells
|
||||
independently of one another; this functionality is required to support
|
||||
cells sent over DTLS. The sequence number is described in more detail in
|
||||
section 4.5.
|
||||
|
||||
[Should the sequence number only appear in RELAY packets? The overhead is
|
||||
small, and I'm hesitant to force more code paths on the implementor. -ML]
|
||||
[There's already a separate relay header that has other material in it,
|
||||
so it wouldn't be the end of the world to move it there if it's
|
||||
appropriate. -RD]
|
||||
|
||||
[Having separate commands for UDP circuits seems necessary, unless we can
|
||||
assume a flag day event for a large number of tor nodes. -ML]
|
||||
|
||||
4. Circuit management
|
||||
|
||||
4.2. Setting circuit keys
|
||||
|
||||
Keys are set up for UDP circuits in the same fashion as for TCP circuits.
|
||||
Each UDP circuit shares keys with its corresponding TCP circuit.
|
||||
|
||||
[If the keys are used for both TCP and UDP connections, how does it
|
||||
work to mix sequence-number-less cells with sequence-numbered cells --
|
||||
how do you know you have the encryption order right? -RD]
|
||||
|
||||
4.3. Creating circuits
|
||||
|
||||
UDP circuits are created as TCP circuits, using the *_UDP cells as
|
||||
appropriate.
|
||||
|
||||
4.4. Tearing down circuits
|
||||
|
||||
UDP circuits are torn down as TCP circuits, using the *_UDP cells as
|
||||
appropriate.
|
||||
|
||||
4.5. Routing relay cells
|
||||
|
||||
When an OR receives a RELAY cell, it checks the cell's circID and
|
||||
determines whether it has a corresponding circuit along that
|
||||
connection. If not, the OR drops the RELAY cell.
|
||||
|
||||
Otherwise, if the OR is not at the OP edge of the circuit (that is,
|
||||
either an 'exit node' or a non-edge node), it de/encrypts the payload
|
||||
with AES/CTR, as follows:
|
||||
'Forward' relay cell (same direction as CREATE):
|
||||
Use Kf as key; decrypt, using sequence number to synchronize
|
||||
ciphertext and keystream.
|
||||
'Back' relay cell (opposite direction from CREATE):
|
||||
Use Kb as key; encrypt, using sequence number to synchronize
|
||||
ciphertext and keystream.
|
||||
Note that in counter mode, decrypt and encrypt are the same operation.
|
||||
[Since the sequence number is only 2 bytes, what do you do when it
|
||||
rolls over? -RD]
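The idea of using the sequence number to synchronize ciphertext and keystream can be illustrated as follows. This is a sketch under stated assumptions: the keystream here is a SHA-256-based stand-in, NOT AES/CTR (the Python standard library has no AES), and mapping sequence number N to keystream offset N * 507 is one possible reading that the proposal does not spell out. Sequence-number rollover, raised in the comment above, is not handled.

```python
import hashlib

PAYLOAD_LEN = 507
_BLOCK = 32  # output size of the SHA-256 stand-in "block cipher"


def keystream(key, offset, length):
    """Return `length` keystream bytes starting at absolute byte `offset`.

    Stand-in for a counter-mode cipher: block i is SHA-256(key || i).
    """
    block, skip = divmod(offset, _BLOCK)
    out = bytearray()
    while len(out) < skip + length:
        out += hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        block += 1
    return bytes(out[skip:skip + length])


def crypt_cell(key, seq, payload):
    """Encrypt or decrypt one payload; in counter mode both are the same XOR.

    Assumption: cell number `seq` uses keystream bytes
    [seq * PAYLOAD_LEN, (seq + 1) * PAYLOAD_LEN).
    """
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("payload must be exactly %d bytes" % PAYLOAD_LEN)
    ks = keystream(key, seq * PAYLOAD_LEN, PAYLOAD_LEN)
    return bytes(a ^ b for a, b in zip(payload, ks))
```

Because each cell's keystream position depends only on its own sequence number, a receiver can decrypt cell N+1 even if cell N was lost, which is the property DTLS links need.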
|
||||
|
||||
Each stream encrypted by a Kf or Kb has a corresponding unique state,
|
||||
captured by a sequence number; the originator of each such stream chooses
|
||||
the initial sequence number randomly, and increments it only with RELAY
|
||||
cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
|
||||
there's no need for counting bytes directly. Right? - ML]
|
||||
[I believe this is true. You'll find out for sure when you try to
|
||||
build it. ;) -RD]
|
||||
|
||||
The OR then decides whether it recognizes the relay cell, by
|
||||
inspecting the payload as described in section 5.1 below. If the OR
|
||||
recognizes the cell, it processes the contents of the relay cell.
|
||||
Otherwise, it passes the decrypted relay cell along the circuit if
|
||||
the circuit continues. If the OR at the end of the circuit
|
||||
encounters an unrecognized relay cell, an error has occurred: the OR
|
||||
sends a DESTROY cell to tear down the circuit.
|
||||
|
||||
When a relay cell arrives at an OP, the OP decrypts the payload
|
||||
with AES/CTR as follows:
|
||||
OP receives data cell:
|
||||
For I=N...1,
|
||||
Decrypt with Kb_I, using the sequence number as above. If the
|
||||
payload is recognized (see section 5.1), then stop and process
|
||||
the payload.
|
||||
|
||||
For more information, see section 5 below.
|
||||
|
||||
4.6. CREATE_UDP and CREATED_UDP cells
|
||||
|
||||
Users set up UDP circuits incrementally. The procedure is similar to that
|
||||
for TCP circuits, as described in section 4.1. In addition to the TLS
|
||||
connection to the first node, the OP also attempts to open a DTLS
|
||||
connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
|
||||
payload in the same format as a CREATE cell. To extend a UDP circuit past
|
||||
the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
|
||||
instructs the last node in the circuit to send a CREATE_UDP cell to extend
|
||||
the circuit.
|
||||
|
||||
The relay payload for an EXTEND_UDP relay cell consists of:
|
||||
Address [4 bytes]
|
||||
TCP port [2 bytes]
|
||||
UDP port [2 bytes]
|
||||
Onion skin [186 bytes]
|
||||
Identity fingerprint [20 bytes]
|
||||
|
||||
The address field and ports denote the IPv4 address and ports of the next OR
|
||||
in the circuit.
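The EXTEND_UDP relay payload described above has a fixed 214-byte layout (4 + 2 + 2 + 186 + 20). A minimal packing sketch, assuming big-endian ports and a dotted-quad address string as input; the function name is hypothetical:

```python
import ipaddress
import struct

ONION_SKIN_LEN = 186
FINGERPRINT_LEN = 20

# Address (4), TCP port (2), UDP port (2), onion skin (186), fingerprint (20).
# Big-endian byte order is an assumption.
_EXTEND_UDP = struct.Struct(">4sHH%ds%ds" % (ONION_SKIN_LEN, FINGERPRINT_LEN))


def pack_extend_udp(ipv4, tcp_port, udp_port, onion_skin, fingerprint):
    """Build the relay payload of an EXTEND_UDP cell."""
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin must be 186 bytes")
    if len(fingerprint) != FINGERPRINT_LEN:
        raise ValueError("identity fingerprint must be 20 bytes")
    address = ipaddress.IPv4Address(ipv4).packed
    return _EXTEND_UDP.pack(address, tcp_port, udp_port,
                            onion_skin, fingerprint)
```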
|
||||
|
||||
The payload for a CREATED_UDP cell or the relay payload for a
|
||||
RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
|
||||
RELAY_EXTENDED cell. Both circuits are established using the same key.
|
||||
|
||||
Note that the existence of a UDP circuit implies the
|
||||
existence of a corresponding TCP circuit, sharing keys, sequence numbers,
|
||||
and any other relevant state.
|
||||
|
||||
4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
|
||||
|
||||
As above, the OP must successfully connect using DTLS before attempting to
|
||||
send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
|
||||
section 4.1.1.
|
||||
|
||||
5. Application connections and stream management
|
||||
|
||||
5.1. Relay cells
|
||||
|
||||
Within a circuit, the OP and the exit node use the contents of RELAY cells
|
||||
to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
|
||||
across circuits. End-to-end commands and UDP packets can be initiated by
|
||||
either edge; streams are initiated by the OP.
|
||||
|
||||
The payload of each unencrypted RELAY cell consists of:
|
||||
Relay command [1 byte]
|
||||
'Recognized' [2 bytes]
|
||||
StreamID [2 bytes]
|
||||
Digest [4 bytes]
|
||||
Length [2 bytes]
|
||||
Data [498 bytes]
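The relay payload fields above can be sketched as an 11-byte header followed by padded data. Big-endian byte order and the function names are assumptions. Note that the widths as listed sum to 509 bytes (11 + 498), while section 3 of this proposal gives a 507-byte cell payload, so the data length is left as a parameter here.

```python
import struct

# Relay command (1), 'Recognized' (2), StreamID (2), Digest (4), Length (2).
_RELAY_HDR = struct.Struct(">BHH4sH")
RELAY_DATA_LEN = 498  # as listed above; 496 would fit a 507-byte payload


def pack_relay(command, stream_id, data, data_len=RELAY_DATA_LEN,
               digest=b"\x00" * 4):
    """Build an unencrypted relay payload; 'Recognized' is always zero."""
    if len(data) > data_len:
        raise ValueError("data too long")
    header = _RELAY_HDR.pack(command, 0, stream_id, digest, len(data))
    return header + data.ljust(data_len, b"\x00")


def unpack_relay(payload):
    """Return (command, recognized, stream_id, digest, data)."""
    command, recognized, stream_id, digest, length = \
        _RELAY_HDR.unpack_from(payload)
    start = _RELAY_HDR.size
    return command, recognized, stream_id, digest, payload[start:start + length]
```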
|
||||
|
||||
The relay commands are:
|
||||
1 -- RELAY_BEGIN [forward]
|
||||
2 -- RELAY_DATA [forward or backward]
|
||||
3 -- RELAY_END [forward or backward]
|
||||
4 -- RELAY_CONNECTED [backward]
|
||||
5 -- RELAY_SENDME [forward or backward]
|
||||
6 -- RELAY_EXTEND [forward]
|
||||
7 -- RELAY_EXTENDED [backward]
|
||||
8 -- RELAY_TRUNCATE [forward]
|
||||
9 -- RELAY_TRUNCATED [backward]
|
||||
10 -- RELAY_DROP [forward or backward]
|
||||
11 -- RELAY_RESOLVE [forward]
|
||||
12 -- RELAY_RESOLVED [backward]
|
||||
13 -- RELAY_BEGIN_UDP [forward]
|
||||
14 -- RELAY_DATA_UDP [forward or backward]
|
||||
15 -- RELAY_EXTEND_UDP [forward]
|
||||
16 -- RELAY_EXTENDED_UDP [backward]
|
||||
17 -- RELAY_DROP_UDP [forward or backward]
|
||||
|
||||
Commands labelled as "forward" must only be sent by the originator
|
||||
of the circuit. Commands labelled as "backward" must only be sent by
|
||||
other nodes in the circuit back to the originator. Commands marked
|
||||
as either can be sent either by the originator or other nodes.
|
||||
|
||||
The 'recognized' field in any unencrypted relay payload is always set to
|
||||
zero.
|
||||
|
||||
The 'digest' field can have two meanings. For all cells sent over TLS
|
||||
connections (that is, all commands and all non-UDP RELAY data), it is
|
||||
computed as the first four bytes of the running SHA-1 digest of all the
|
||||
bytes that have been sent reliably and have been destined for this hop of
|
||||
the circuit or originated from this hop of the circuit, seeded from Df or Db
|
||||
respectively (obtained in section 4.2 above), and including this RELAY
|
||||
cell's entire payload (taken with the digest field set to zero). Cells sent
|
||||
over DTLS connections do not affect this running digest. Each cell sent
|
||||
over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
|
||||
set to the SHA-1 digest of the current RELAY cell's entire payload, with the
|
||||
digest field set to zero. Coupled with a randomly-chosen streamID, this
|
||||
provides per-cell integrity checking on UDP cells.
|
||||
[If you drop malformed UDP relay cells but don't close the circuit,
|
||||
then this 8 bytes of digest is not as strong as what we get in the
|
||||
TCP-circuit side. Is this a problem? -RD]
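The per-cell check for cells sent over DTLS can be sketched as below. Two assumptions are made: the SHA-1 output is truncated to its first four bytes to fit the 4-byte digest field (as in the TLS case), and the digest field sits at byte offset 5 of the relay payload, following the field order in section 5.1. The function names are hypothetical.

```python
import hashlib

DIGEST_START = 5  # after Relay command (1), 'Recognized' (2), StreamID (2)
DIGEST_END = 9    # the digest field is 4 bytes wide


def _with_zero_digest(payload):
    """Return the payload with its digest field set to zero."""
    return payload[:DIGEST_START] + b"\x00" * 4 + payload[DIGEST_END:]


def set_udp_digest(payload):
    """Fill in the digest field of a RELAY_DATA_UDP or RELAY_DROP_UDP payload."""
    digest = hashlib.sha1(_with_zero_digest(payload)).digest()[:4]
    return payload[:DIGEST_START] + digest + payload[DIGEST_END:]


def is_recognized_udp(payload):
    """True if 'Recognized' is zero and the per-cell digest matches."""
    expected = hashlib.sha1(_with_zero_digest(payload)).digest()[:4]
    return (payload[1:3] == b"\x00\x00"
            and payload[DIGEST_START:DIGEST_END] == expected)
```

Unlike the running digest used over TLS, this check depends only on the cell itself, so a dropped cell does not invalidate later ones.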
|
||||
|
||||
When the 'recognized' field of a RELAY cell is zero, and the digest
|
||||
is correct, the cell is considered "recognized" for the purposes of
|
||||
decryption (see section 4.5 above).
|
||||
|
||||
(The digest does not include any bytes from relay cells that do
|
||||
not start or end at this hop of the circuit. That is, it does not
|
||||
include forwarded data. Therefore if 'recognized' is zero but the
|
||||
digest does not match, the running digest at that node should
|
||||
not be updated, and the cell should be forwarded on.)
|
||||
|
||||
All RELAY cells pertaining to the same tunneled TCP stream have the
|
||||
same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
|
||||
cells that affect the entire circuit rather than a particular
|
||||
stream use a StreamID of zero.
|
||||
|
||||
All RELAY cells pertaining to the same UDP tunnel have the same streamID.
|
||||
This streamID is chosen randomly by the OP, but cannot be zero.
|
||||
|
||||
The 'Length' field of a relay cell contains the number of bytes in
|
||||
the relay payload which contain real payload data. The remainder of
|
||||
the payload is padded with NUL bytes.
|
||||
|
||||
If the RELAY cell is recognized but the relay command is not
|
||||
understood, the cell must be dropped and ignored. Its contents
|
||||
still count with respect to the digests, though. [Before
|
||||
0.1.1.10, Tor closed circuits when it received an unknown relay
|
||||
command. Perhaps this will be more forward-compatible. -RD]
|
||||
|
||||
5.2.1. Opening UDP tunnels and transferring data
|
||||
|
||||
To open a new anonymized UDP connection, the OP chooses an open
|
||||
circuit to an exit that may be able to connect to the destination
|
||||
address, selects a random streamID not yet used on that circuit,
|
||||
and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
|
||||
and port of the destination host. The payload format is:
|
||||
|
||||
ADDRESS | ':' | PORT | [00]
|
||||
|
||||
where ADDRESS can be a DNS hostname, or an IPv4 address in
|
||||
dotted-quad format, or an IPv6 address surrounded by square brackets;
|
||||
and where PORT is encoded in decimal.
|
||||
|
||||
[What is the [00] for? -NM]
|
||||
[It's so the payload is easy to parse out with string funcs -RD]
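The RELAY_BEGIN_UDP payload format above can be sketched as a pair of helpers. ASCII encoding is an assumption, and the function names are hypothetical.

```python
def encode_begin_udp(address, port):
    """Encode ADDRESS ':' PORT [00]; bare IPv6 addresses get square brackets."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    if ":" in address and not address.startswith("["):
        address = "[%s]" % address
    return ("%s:%d" % (address, port)).encode("ascii") + b"\x00"


def decode_begin_udp(payload):
    """Return (address, port) from a payload; bytes after the NUL are ignored."""
    text, nul, _ = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing terminating NUL")
    # Split on the last colon so bracketed IPv6 addresses stay intact.
    address, colon, port = text.decode("ascii").rpartition(":")
    if not colon or not address:
        raise ValueError("expected ADDRESS:PORT")
    return address, int(port)
```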
|
||||
|
||||
Upon receiving this cell, the exit node resolves the address as necessary.
|
||||
If the address cannot be resolved, the exit node replies with a RELAY_END
|
||||
cell. (See 5.4 below.) Otherwise, the exit node replies with a
|
||||
RELAY_CONNECTED cell, whose payload is in one of the following formats:
|
||||
The IPv4 address to which the connection was made [4 octets]
|
||||
A number of seconds (TTL) for which the address may be cached [4 octets]
|
||||
or
|
||||
Four zero-valued octets [4 octets]
|
||||
An address type (6) [1 octet]
|
||||
The IPv6 address to which the connection was made [16 octets]
|
||||
A number of seconds (TTL) for which the address may be cached [4 octets]
|
||||
[XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
|
||||
field. No version of Tor currently generates the IPv6 format.]
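The two RELAY_CONNECTED payload formats above can be told apart by length and by the four leading zero octets. A minimal parsing sketch, assuming the caller passes exactly the bytes counted by the relay 'Length' field and that the TTL is big-endian; the function name is hypothetical.

```python
import ipaddress
import struct


def parse_connected(body):
    """Return (address, ttl_seconds) from a RELAY_CONNECTED payload."""
    if len(body) == 8:
        # IPv4 address (4 octets) followed by TTL (4 octets).
        (ttl,) = struct.unpack(">I", body[4:8])
        return ipaddress.IPv4Address(body[:4]), ttl
    if len(body) == 25 and body[:4] == b"\x00" * 4 and body[4] == 6:
        # Four zero octets, address type 6, IPv6 address (16), TTL (4).
        (ttl,) = struct.unpack(">I", body[21:25])
        return ipaddress.IPv6Address(body[5:21]), ttl
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```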
|
||||
|
||||
The OP waits for a RELAY_CONNECTED cell before sending any data.
|
||||
Once a connection has been established, the OP and exit node
|
||||
package UDP data in RELAY_DATA_UDP cells, and upon receiving such
|
||||
cells, echo their contents to the corresponding socket.
|
||||
RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
|
||||
|
||||
RELAY_DROP_UDP cells are long-range dummies; upon receiving such
|
||||
a cell, the OR or OP must drop it.
|
||||
|
||||
5.3. Closing streams
|
||||
|
||||
UDP tunnels are closed in a fashion corresponding to TCP connections.
|
||||
|
||||
6. Flow Control
|
||||
|
||||
UDP streams are not subject to flow control.
|
||||
|
||||
7.2. Router descriptor format.
|
||||
|
||||
The items' formats are as follows:
|
||||
"router" nickname address ORPort SocksPort DirPort UDPPort
|
||||
|
||||
Indicates the beginning of a router descriptor. "address" must be
|
||||
an IPv4 address in dotted-quad format. The last four numbers
indicate the ports (TCP for the first three, UDP for the last) at which this OR exposes
|
||||
functionality. ORPort is a port at which this OR accepts TLS
|
||||
connections for the main OR protocol; SocksPort is deprecated and
|
||||
should always be 0; DirPort is the port at which this OR accepts
|
||||
directory-related HTTP connections; and UDPPort is a port at which
|
||||
this OR accepts DTLS connections for UDP data. If any port is not
|
||||
supported, the value 0 is given instead of a port number.
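As an illustration, the extended "router" line could be parsed along these lines. This is a sketch only: the class and function names are hypothetical, and it accepts only the seven-field form proposed here, not the existing six-field form.

```python
from typing import NamedTuple


class RouterLine(NamedTuple):
    nickname: str
    address: str
    or_port: int
    socks_port: int
    dir_port: int
    udp_port: int


def parse_router_line(line: str) -> RouterLine:
    """Parse '"router" nickname address ORPort SocksPort DirPort UDPPort'."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "router":
        raise ValueError("not a router line with a UDPPort")
    ports = [int(p) for p in fields[3:]]
    # A value of 0 means "not supported", so 0 is accepted here.
    if any(not 0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    return RouterLine(fields[1], fields[2], *ports)
```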
|
||||
|
||||
Other sections:
|
||||
|
||||
What changes need to happen to each node's exit policy to support this? -RD
|
||||
|
||||
Switching to UDP means managing the queues of incoming packets better,
|
||||
so we don't miss packets. How does this interact with doing large public
|
||||
key operations (handshakes) in the same thread? -RD
|
||||
|
||||
========================================================================
|
||||
COMMENTS
|
||||
========================================================================
|
||||
|
||||
[16 May 2006]
|
||||
|
||||
I don't favor this approach; it makes packet traffic partitioned from
|
||||
stream traffic end-to-end. The architecture I'd like to see is:
|
||||
|
||||
A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
|
||||
TCP/TLS for firewall penetration or something. (This also gives us an
|
||||
upgrade path for routing through legacy servers.)
|
||||
|
||||
B Stream traffic is handled with end-to-end per-stream acks/naks and
|
||||
retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
|
||||
a cell isn't retransmitted.
|
||||
|
||||
We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
|
||||
done so, B is more or less inevitable, and we can support end-to-end UDP
|
||||
traffic "for free".
|
||||
|
||||
(Also, there are some details that this draft spec doesn't address. For
|
||||
example, what happens when a UDP packet doesn't fit in a single cell?)
|
||||
|
||||
-NM
|
@ -1,285 +0,0 @@
|
||||
Filename: 101-dir-voting.txt
|
||||
Title: Voting on the Tor Directory System
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: Nov 2006
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview
|
||||
|
||||
This document describes a consensus voting scheme for Tor directories;
|
||||
instead of publishing different network statuses, directories would vote on
|
||||
and publish a single "consensus" network status document.
|
||||
|
||||
This is an open proposal.
|
||||
|
||||
Proposal:
|
||||
|
||||
0. Scope and preliminaries
|
||||
|
||||
This document describes a consensus voting scheme for Tor directories.
|
||||
Once it's accepted, it should be merged with dir-spec.txt. Some
|
||||
preliminaries for authority and caching support should be done during
|
||||
the 0.1.2.x series; the main deployment should come during the 0.2.0.x
|
||||
series.
|
||||
|
||||
0.1. Goals and motivation: voting.
|
||||
|
||||
The current directory system relies on clients downloading separate
|
||||
network status statements from the caches signed by each directory.
|
||||
Clients download a new statement every 30 minutes or so, choosing to
|
||||
replace the oldest statement they currently have.
|
||||
|
||||
This creates a partitioning problem: different clients have different
|
||||
"most recent" networkstatus sources, and different versions of each
|
||||
(since authorities change their statements often).
|
||||
|
||||
It also creates a scaling problem: most of the downloaded networkstatus
|
||||
are probably quite similar, and the redundancy grows as we add more
|
||||
authorities.
|
||||
|
||||
So if we have clients only download a single multiply signed consensus
|
||||
network status statement, we can:
|
||||
- Save bandwidth.
|
||||
- Reduce client partitioning
|
||||
- Reduce client-side and cache-side storage
|
||||
- Simplify client-side voting code (by moving voting away from the
|
||||
client)
|
||||
|
||||
We should try to do this without:
|
||||
- Assuming that client-side or cache-side clocks are more correct
|
||||
than we assume now.
|
||||
- Assuming that authority clocks are perfectly correct.
|
||||
- Degrading badly if a few authorities die or are offline for a bit.
|
||||
|
||||
We do not have to perform well if:
|
||||
- No clique of more than half the authorities can agree about who
|
||||
the authorities are.
|
||||
|
||||
1. The idea.
|
||||
|
||||
Instead of publishing a network status whenever something changes,
|
||||
each authority instead publishes a fresh network status only once per
|
||||
"period" (say, 60 minutes). Authorities either upload this network
|
||||
status (or "vote") to every other authority, or download every other
|
||||
authority's "vote" (see 3.1 below for discussion on push vs pull).
|
||||
|
||||
After an authority has (or has become convinced that it won't be able to
|
||||
get) every other authority's vote, it deterministically computes a
|
||||
consensus networkstatus, and signs it. Authorities download (or are
|
||||
uploaded; see 3.1) one another's signatures, and form a multiply signed
|
||||
consensus. This multiply-signed consensus is what caches cache and what
|
||||
clients download.
|
||||
|
||||
If an authority is down, authorities vote based on what they *can*
|
||||
download/get uploaded.
|
||||
|
||||
If an authority is "a little" down and only some authorities can reach
|
||||
it, authorities try to get its info from other authorities.
|
||||
|
||||
If an authority computes the vote wrong, its signature isn't included on
|
||||
the consensus.
|
||||
|
||||
Clients use a consensus if it is "trusted": signed by more than half the
|
||||
authorities they recognize. If clients can't find any such consensus,
|
||||
they use the most recent trusted consensus they have. If they don't
|
||||
have any trusted consensus, they warn the user and refuse to operate
|
||||
(and if DirServers is not the default, beg the user to adapt the list
|
||||
of authorities).
|
||||
|
||||
2. Details.
|
||||
|
||||
2.0. Versioning
|
||||
|
||||
All documents generated here have version "3" given in their
|
||||
network-status-version entries.
|
||||
|
||||
2.1. Vote specifications
|
||||
|
||||
Votes in v3 are similar to v2 network status documents. We add these
|
||||
fields to the preamble:
|
||||
|
||||
"vote-status" -- the word "vote".
|
||||
|
||||
"valid-until" -- the time when this authority expects to publish its
|
||||
next vote.
|
||||
|
||||
"known-flags" -- a space-separated list of flags that will sometimes
|
||||
be included on "s" lines later in the vote.
|
||||
|
||||
"dir-source" -- as before, except the "hostname" part MUST be the
|
||||
authority's nickname, which MUST be unique among authorities, and
|
||||
MUST match the nickname in the "directory-signature" entry.
|
||||
|
||||
Authorities SHOULD cache their most recently generated votes so they
|
||||
can persist them across restarts. Authorities SHOULD NOT generate
|
||||
another document until valid-until has passed.
|
||||
|
||||
Router entries in the vote MUST be sorted in ascending order by router
|
||||
identity digest. The flags in "s" lines MUST appear in alphabetical
|
||||
order.
|
||||
|
||||
Votes SHOULD be synchronized to half-hour publication intervals (one
|
||||
hour? XXX say more; be more precise.)
|
||||
|
||||
XXXX some way to request older networkstatus docs?
|
||||
|
||||
2.2. Consensus directory specifications
|
||||
|
||||
Consensuses are like v3 votes, except for the following fields:
|
||||
|
||||
"vote-status" -- the word "consensus".
|
||||
|
||||
"published" is the latest of all the published times on the votes.
|
||||
|
||||
"valid-until" is the earliest of all the valid-until times on the
|
||||
votes.
|
||||
|
||||
"dir-source" and "fingerprint" and "dir-signing-key" and "contact"
|
||||
are included for each authority that contributed to the vote.
|
||||
|
||||
"vote-digest" for each authority that contributed to the vote,
|
||||
calculated as for the digest in the signature on the vote. [XXX
|
||||
re-English this sentence]
|
||||
|
||||
"client-versions" and "server-versions" are sorted in ascending
|
||||
order based on version-spec.txt.
|
||||
|
||||
"dir-options" and "known-flags" are not included.
|
||||
[XXX really? why not list the ones that are used in the consensus?
|
||||
For example, right now BadExit is in use, but no servers would be
|
||||
labelled BadExit, and it's still worth knowing that it was considered
|
||||
by the authorities. -RD]
|
||||
|
||||
The fields MUST occur in the following order:
|
||||
"network-status-version"
|
||||
"vote-status"
|
||||
"published"
|
||||
"valid-until"
|
||||
For each authority, sorted in ascending order of nickname, case-
|
||||
insensitively:
|
||||
"dir-source", "fingerprint", "contact", "dir-signing-key",
|
||||
"vote-digest".
|
||||
"client-versions"
|
||||
"server-versions"
|
||||
|
||||
The signatures at the end of the document appear as multiple instances
|
||||
of directory-signature, sorted in ascending order by nickname,
|
||||
case-insensitively.
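The rules for "published", "valid-until", and authority ordering are simple enough to sketch in Python. The dictionary keys and function names below are illustrative assumptions, not part of the proposal.

```python
from datetime import datetime


def consensus_times(votes):
    """votes: iterable of dicts with 'published' and 'valid_until' datetimes."""
    votes = list(votes)
    published = max(v["published"] for v in votes)      # latest published
    valid_until = min(v["valid_until"] for v in votes)  # earliest valid-until
    return published, valid_until


def authority_order(nicknames):
    """Sort authority nicknames in ascending order, ignoring case."""
    return sorted(nicknames, key=str.lower)


# Example with two made-up votes.
example_votes = [
    {"published": datetime(2007, 1, 1, 12, 0), "valid_until": datetime(2007, 1, 1, 13, 0)},
    {"published": datetime(2007, 1, 1, 12, 5), "valid_until": datetime(2007, 1, 1, 12, 50)},
]
print(consensus_times(example_votes))
print(authority_order(["beta", "Alpha", "gamma"]))
```

Because every authority applies the same deterministic rules to the same votes, each should arrive at a byte-identical preamble.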
|
||||
|
||||
A router entry should be included in the result if it is included by more
|
||||
than half of the authorities (total authorities, not just those whose votes
|
||||
we have). A router entry has a flag set if it is included by more than
|
||||
half of the authorities who care about that flag. [XXXX this creates an
|
||||
incentive for attackers to DoS authorities whose votes they don't like.
|
||||
Can we remember what flags people set the last time we saw them? -NM]
|
||||
[Which 'we' are we talking here? The end-users never learn which
|
||||
authority sets which flags. So you're thinking the authorities
|
||||
should record the last vote they saw from each authority and if it's
|
||||
within a week or so, count all the flags that it advertised as 'no'
|
||||
votes? Plausible. -RD]
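One possible reading of the inclusion and flag rules is sketched below in Python. The data layout and function name are hypothetical. The sketch assumes that an authority "cares about" a flag when the flag appears in its known-flags, and that such an authority counts toward the flag's denominator even if it does not list the router. It does not address the open question raised in the comments above.

```python
def consensus_routers(votes, total_authorities):
    """votes: list of dicts with 'known_flags' (set of flag names) and
    'routers' (dict mapping router identity -> set of flags)."""
    result = {}
    identities = set().union(*(v["routers"] for v in votes))
    all_flags = set().union(*(v["known_flags"] for v in votes))
    for ident in sorted(identities):
        listing = [v for v in votes if ident in v["routers"]]
        # Majority is taken over all authorities, not just received votes.
        if 2 * len(listing) <= total_authorities:
            continue
        flags = set()
        for flag in all_flags:
            carers = [v for v in votes if flag in v["known_flags"]]
            setters = [v for v in carers if flag in v["routers"].get(ident, ())]
            if 2 * len(setters) > len(carers):
                flags.add(flag)
        # Flags are emitted in alphabetical order.
        result[ident] = sorted(flags)
    return result
```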
|
||||
|
||||
The signature hash covers from the "network-status-version" line through
|
||||
the characters "directory-signature" in the first "directory-signature"
|
||||
line.
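The extent of the signed material can be illustrated with a short Python sketch. The function names are hypothetical, and the use of SHA-1 is an assumption based on the digest Tor's directory documents used at the time; the paragraph above does not name a hash function.

```python
import hashlib


def signed_portion(document: str) -> bytes:
    """Return the bytes covered by the signature hash."""
    start = document.find("network-status-version")
    # Look for the keyword at the start of a line.
    end = document.find("\ndirectory-signature", start)
    # Assumption: the document begins with the network-status-version line.
    if start != 0 or end < 0:
        raise ValueError("not a signed network status document")
    end += len("\ndirectory-signature")
    return document[:end].encode("ascii")


def signature_digest(document: str) -> bytes:
    # Assumption: SHA-1 is the digest in use.
    return hashlib.sha1(signed_portion(document)).digest()
```

Stopping at the first "directory-signature" keyword means that signatures can be appended later without changing the digest.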
|
||||
|
||||
Consensus directories SHOULD be rejected if they are not signed by more
|
||||
than half of the known authorities.
|
||||
|
||||
2.2.1. Detached signatures
|
||||
|
||||
Assuming full connectivity, every authority should compute and sign the
|
||||
same consensus directory in each period. Therefore, it isn't necessary to
|
||||
download the consensus computed by each authority; instead, the authorities
|
||||
only push/fetch each others' signatures. A "detached signature" document
|
||||
contains a single "consensus-digest" entry and one or more
|
||||
directory-signature entries. [XXXX specify more.]
|
||||
|
||||
2.3. URLs and timelines
|
||||
|
||||
2.3.1. URLs and timeline used for agreement
|
||||
|
||||
An authority SHOULD publish its vote immediately at the start of each voting
|
||||
period. It does this by making it available at
|
||||
http://<hostname>/tor/status-vote/current/authority.z
|
||||
and sending it in an HTTP POST request to each other authority at the URL
|
||||
http://<hostname>/tor/post/vote
|
||||
|
||||
If, N minutes after the voting period has begun, an authority does not have
|
||||
a current statement from another authority, the first authority retrieves
|
||||
the other's statement.
|
||||
|
||||
Once an authority has a vote from another authority, it makes it available
|
||||
at
|
||||
http://<hostname>/tor/status-vote/current/<fp>.z
|
||||
where <fp> is the fingerprint of the other authority's identity key.
|
||||
|
||||
The consensus network status, along with as many signatures as the server
|
||||
currently knows, should be available at
|
||||
http://<hostname>/tor/status-vote/current/consensus.z
|
||||
All of the detached signatures it knows for consensus status should be
|
||||
available at:
|
||||
http://<hostname>/tor/status-vote/current/consensus-signatures.z
|
||||
|
||||
Once an authority has computed and signed a consensus network status, it
|
||||
should send its detached signature to each other authority in an HTTP POST
|
||||
request to the URL:
|
||||
http://<hostname>/tor/post/consensus-signature
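The fallback pull described above might look like the following Python sketch. The function name and arguments are hypothetical. N is left as a parameter because the text does not fix its value, and fetching the missing vote from its own author at the authority.z URL is an assumption; it could equally be fetched from a third authority at the per-fingerprint URL.

```python
def votes_to_fetch(authorities, received, minutes_elapsed, n_minutes):
    """Return the URLs to fetch for votes that have not arrived yet.

    authorities: dict mapping identity fingerprint -> hostname.
    received: set of fingerprints whose current vote we already hold."""
    # Before N minutes have passed, rely on the other authorities' POSTs.
    if minutes_elapsed < n_minutes:
        return []
    return [
        f"http://{host}/tor/status-vote/current/authority.z"
        for fp, host in sorted(authorities.items())
        if fp not in received
    ]
```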
|
||||
|
||||
|
||||
[XXXX Store votes to disk.]
|
||||
|
||||
2.3.2. Serving a consensus directory
|
||||
|
||||
Once the authority is done getting signatures on the consensus directory,
|
||||
it should serve it from:
|
||||
http://<hostname>/tor/status/consensus.z
|
||||
|
||||
Caches SHOULD download consensus directories from an authority and serve
|
||||
them from the same URL.
|
||||
|
||||
2.3.3. Timeline and synchronization
|
||||
|
||||
[XXXX]
|
||||
|
||||
2.4. Distributing routerdescs between authorities
|
||||
|
||||
Consensus will be more meaningful if authorities take steps to make sure
|
||||
that they all have the same set of descriptors _before_ the voting
|
||||
starts. This is safe, since all descriptors are self-certified and
|
||||
timestamped: it's always okay to replace a signed descriptor with a more
|
||||
recent one signed by the same identity.
|
||||
|
||||
In the long run, we might want some kind of sophisticated process here.
|
||||
For now, since authorities already download one another's networkstatus
|
||||
documents and use them to determine what descriptors to download from one
|
||||
another, we can rely on this existing mechanism to keep authorities up to
|
||||
date.
|
||||
|
||||
[We should do a thorough read-through of dir-spec again to make sure
|
||||
that the authorities converge on which descriptor to "prefer" for
|
||||
each router. Right now the decision happens at the client, which is
|
||||
no longer the right place for it. -RD]
|
||||
|
||||
3. Questions and concerns
|
||||
|
||||
3.1. Push or pull?
|
||||
|
||||
The URLs above define a push mechanism for publishing votes and consensus
|
||||
signatures via HTTP POST requests, and a pull mechanism for downloading
|
||||
these documents via HTTP GET requests. As specified, every authority will
|
||||
post to every other. The "download if no copy has been received" mechanism
|
||||
exists only as a fallback.
|
||||
|
||||
4. Migration
|
||||
|
||||
* It would be cool if caches could get ready to download consensus
|
||||
status docs, verify enough signatures, and serve them now. That way
|
||||
once stuff works all we need to do is upgrade the authorities. Caches
|
||||
don't need to verify the correctness of the format so long as it's
|
||||
signed (or maybe multisigned?). We need to make sure that caches back
|
||||
off very quickly from downloading consensus docs until they're
|
||||
actually implemented.
|
||||
|
@ -1,40 +0,0 @@
|
||||
Filename: 102-drop-opt.txt
|
||||
Title: Dropping "opt" from the directory format
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: Jan 2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document proposes a change in the format used to transmit router and
|
||||
directory information.
|
||||
|
||||
This proposal has been accepted, implemented, and merged into dir-spec.txt.
|
||||
|
||||
Proposal:
|
||||
|
||||
The "opt" keyword in Tor's directory formats was originally intended to
|
||||
mean, "it is okay to ignore this entry if you don't understand it"; the
|
||||
default behavior has been "discard a routerdesc if it contains entries you
|
||||
don't recognize."
|
||||
|
||||
But so far, every new flag we have added has been marked 'opt'. It would
|
||||
probably make sense to change the default behavior to "ignore unrecognized
|
||||
fields", and add the statement that clients SHOULD ignore fields they don't
|
||||
recognize. As a meta-principle, we should say that clients and servers
|
||||
MUST NOT have to understand new fields in order to use directory documents
|
||||
correctly.
|
||||
|
||||
Of course, this will make it impossible to say, "The format has changed a
|
||||
lot; discard this quietly if you don't understand it." We could do that by
|
||||
adding a version field.
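The proposed default can be illustrated with a small Python sketch of a tolerant parser. The keyword set and function name are made up for the example and are not Tor's actual parsing code.

```python
KNOWN_KEYWORDS = {"router", "published", "bandwidth"}  # illustrative subset


def parse_entries(lines):
    """Return (keyword, arguments) pairs, skipping unrecognized keywords."""
    entries = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # "opt" no longer changes anything; strip it for old documents.
        if fields[0] == "opt":
            fields = fields[1:]
            if not fields:
                continue
        keyword, args = fields[0], fields[1:]
        if keyword in KNOWN_KEYWORDS:
            entries.append((keyword, args))
        # Unrecognized keywords are ignored rather than invalidating
        # the whole document.
    return entries
```

With this behavior, an entry is handled the same way whether or not it carries the "opt" prefix.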
|
||||
|
||||
Status:
|
||||
|
||||
* We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it
|
||||
once earlier formats are obsolete.
|
||||
|
||||
|
@ -1,206 +0,0 @@
|
||||
Filename: 103-multilevel-keys.txt
|
||||
Title: Splitting identity key from regularly used signing key.
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: Jan 2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document proposes a change in the way identity keys are used, so that
|
||||
highly sensitive keys can be password-protected and seldom loaded into RAM.
|
||||
|
||||
It presents options; it is not yet a complete proposal.
|
||||
|
||||
Proposal:
|
||||
|
||||
Replacing a directory authority's identity key in the event of a compromise
|
||||
would be tremendously annoying. We'd need to tell every client to switch
|
||||
their configuration, or update to a new version with an updated list. So
|
||||
long as some weren't upgraded, they'd be at risk from whoever had
|
||||
compromised the key.
|
||||
|
||||
With this in mind, it's a shame that our current protocol forces us to
|
||||
store identity keys unencrypted in RAM. We need some kind of signing key
|
||||
stored unencrypted, since we need to generate new descriptors/directories
|
||||
and rotate link and onion keys regularly. (And since, of course, we can't
|
||||
ask server operators to be on-hand to enter a passphrase every time we
|
||||
want to rotate keys or sign a descriptor.)
|
||||
|
||||
The obvious solution seems to be to have a signing-only key that lives
|
||||
indefinitely (months or longer) and signs descriptors and link keys, and a
|
||||
separate identity key that's used to sign the signing key. Tor servers
|
||||
could run in one of several modes:
|
||||
1. Identity key stored encrypted. You need to pick a passphrase when
|
||||
you enable this mode, and re-enter this passphrase every time you
|
||||
rotate the signing key.
|
||||
1'. Identity key stored separate. You save your identity key to a
|
||||
floppy, and use the floppy when you need to rotate the signing key.
|
||||
2. All keys stored unencrypted. In this case, we might not want to even
|
||||
*have* a separate signing key. (We'll need to support no-separate-
|
||||
signing-key mode anyway to keep old servers working.)
|
||||
3. All keys stored encrypted. You need to enter a passphrase to start
|
||||
Tor.
|
||||
(Of course, we might not want to implement all of these.)
|
||||
|
||||
Case 1 is probably most usable and secure, if we assume that people don't
|
||||
forget their passphrases or lose their floppies. We could mitigate this a
|
||||
bit by encouraging people to PGP-encrypt their passphrases to themselves,
|
||||
or keep a cleartext copy of their secret key secret-split into a few
|
||||
pieces, or something like that.
|
||||
|
||||
Migration presents another difficulty, especially with the authorities. If
|
||||
we use the current set of identity keys as the new identity keys, we're in
|
||||
the position of having sensitive keys that have been stored on
|
||||
media-of-dubious-encryption up to now. Also, we need to keep old clients
|
||||
(who will expect descriptors to be signed by the identity keys they know
|
||||
and love, and who will not understand signing keys) happy.
|
||||
|
||||
A possible solution:
|
||||
|
||||
One thing to consider is that router identity keys are not very sensitive:
|
||||
if an OR disappears and reappears with a new key, the network treats it as
|
||||
though an old router had disappeared and a new one had joined the network.
|
||||
The Tor network continues unharmed; this isn't a disaster.
|
||||
|
||||
Thus, the ideas above are mostly relevant for authorities.
|
||||
|
||||
The most straightforward solution for the authorities is probably to take
|
||||
advantage of the protocol transition that will come with proposal 101, and
|
||||
introduce a new set of signing _and_ identity keys used only to sign votes
|
||||
and consensus network-status documents. Signing and identity keys could be
|
||||
delivered to users in a separate, rarely changing "keys" document, so that
|
||||
the consensus network-status documents wouldn't need to include N signing
|
||||
keys, N identity keys, and N certifications.
|
||||
|
||||
Note also that there is no reason that the identity/signing keys used by
|
||||
directory authorities would necessarily have to be the same as the identity
|
||||
keys those authorities use in their capacity as routers. Decoupling these
|
||||
keys would give directory authorities the following set of keys:
|
||||
|
||||
Directory authority identity:
|
||||
Highly confidential; stored encrypted and/or offline. Used to
|
||||
identify directory authorities. Shipped with clients. Used to
|
||||
sign Directory authority signing keys.
|
||||
|
||||
Directory authority signing key:
|
||||
Stored online, accessible to regular Tor process. Used to sign
|
||||
votes and consensus directories. Downloaded as part of a "keys"
|
||||
document.
|
||||
|
||||
[Administrators SHOULD rotate their signing keys every month or
|
||||
two, just to keep in practice and keep from forgetting the
|
||||
password to the authority identity.]
|
||||
|
||||
V1-V2 directory authority identity:
|
||||
Stored online, never changed. Used to sign legacy network-status
|
||||
and directory documents.
|
||||
|
||||
Router identity:
|
||||
Stored online, seldom changed. Used to sign server descriptors
|
||||
for this authority in its role as a router. Implicitly certified
|
||||
by being listed in network-status documents.
|
||||
|
||||
Onion key, link key:
|
||||
As in tor-spec.txt
|
||||
|
||||
|
||||
Extensions to Proposal 101.
|
||||
|
||||
Define a new document type, "Key certificate". It contains the
|
||||
following fields, in order:
|
||||
|
||||
"dir-key-certificate-version": As network-status-version. Must be
|
||||
"3".
|
||||
"fingerprint": Hex fingerprint, with spaces, based on the directory
|
||||
authority's identity key.
|
||||
"dir-identity-key": The long-term identity key for this authority.
|
||||
"dir-key-published": The time when this directory's signing key was
|
||||
last changed.
|
||||
"dir-key-expires": A time after which this key is no longer valid.
|
||||
"dir-signing-key": As in proposal 101.
|
||||
"dir-key-certification": A signature of the above fields, in order.
|
||||
The signed material extends from the beginning of
"dir-key-certificate-version" through the newline after
|
||||
"dir-key-certification". The identity key is used to generate
|
||||
this signature.
|
||||
|
||||
These elements together constitute a "key certificate". These are
|
||||
generated offline when starting a v3 authority. Private identity
|
||||
keys SHOULD be stored offline, encrypted, or both. A running
|
||||
authority only needs access to the signing key.
|
||||
|
||||
Unlike other keys currently used by Tor, the authority identity
|
||||
keys and directory signing keys MAY be longer than 1024 bits.
|
||||
(They SHOULD be 2048 bits or longer; they MUST NOT be shorter than
|
||||
1024.)
|
||||
|
||||
Vote documents change as follows:
|
||||
|
||||
A key certificate MUST be included in-line in every vote document. With
|
||||
the exception of "fingerprint", its elements MUST NOT appear in consensus
|
||||
documents.
|
||||
|
||||
Consensus network statuses change as follows:
|
||||
|
||||
Remove dir-signing-key.
|
||||
|
||||
Change "directory-signature" to take a fingerprint of the authority's
|
||||
identity key and a fingerprint of the authority's current signing key
|
||||
rather than the authority's nickname.
|
||||
|
||||
Change "dir-source" to take a fingerprint of the authority's
|
||||
identity key rather than the authority's nickname or hostname.
|
||||
|
||||
Add a new document type:
|
||||
|
||||
A "keys" document contains all currently known key certificates.
|
||||
All authorities serve it at
|
||||
|
||||
http://<hostname>/tor/status/keys.z
|
||||
|
||||
Caches and clients download the keys document whenever they receive a
|
||||
consensus vote that uses a key they do not recognize. Caches download
|
||||
from authorities; clients download from caches.
|
||||
|
||||
Processing votes:
|
||||
|
||||
When receiving a vote, authorities check to see if the key
|
||||
certificate for the voter is different from the one they have. If
|
||||
the key certificate _is_ different, and its dir-key-published is
|
||||
more recent than the most recently known one, and it is
|
||||
well-formed and correctly signed with the correct identity key,
|
||||
then authorities remember it as the new canonical key certificate
|
||||
for that voter.
|
||||
|
||||
A key certificate is invalid if any of the following hold:
|
||||
* The version is unrecognized.
|
||||
* The fingerprint does not match the identity key.
|
||||
* The identity key or the signing key is ill-formed.
|
||||
* The published date is very far in the past or future.
|
||||
|
||||
* The signature is not a valid signature of the key certificate
|
||||
generated with the identity key.
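The invalidity conditions above can be expressed as a checklist in Python. This is a sketch under stated assumptions: the certificate is a pre-parsed dictionary with made-up key names, fingerprint_of and signature_ok are hypothetical caller-supplied helpers, and MAX_SKEW is a placeholder, since the text does not say how far is "very far".

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(days=365)  # placeholder for "very far"; not from the text


def certificate_problems(cert, now: datetime, fingerprint_of, signature_ok):
    """Return the list of reasons a parsed key certificate is invalid.

    fingerprint_of(key) and signature_ok(cert) are hypothetical helpers;
    key parsing and signature checking are outside this sketch."""
    problems = []
    if cert.get("version") != "3":
        problems.append("unrecognized version")
    if cert.get("identity_key") is None or cert.get("signing_key") is None:
        problems.append("ill-formed key")
    elif fingerprint_of(cert["identity_key"]) != cert.get("fingerprint"):
        problems.append("fingerprint mismatch")
    if abs(cert["published"] - now) > MAX_SKEW:
        problems.append("published date implausible")
    if not signature_ok(cert):
        problems.append("bad signature")
    return problems
```

Returning every problem, rather than stopping at the first, makes the result more useful for logging; an empty list means that none of the listed conditions hold.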
|
||||
|
||||
When processing the signatures on consensus, clients and caches act as
|
||||
follows:
|
||||
|
||||
1. Only consider the directory-signature entries whose identity
|
||||
key hashes match trusted authorities.
|
||||
|
||||
2. If any such entries have signing key hashes that match unknown
|
||||
signing keys, download a new keys document.
|
||||
|
||||
3. For every entry with a known (identity key,signing key) pair,
|
||||
check the signature on the document.
|
||||
|
||||
4. If the document has been signed by more than half of the
|
||||
authorities the client recognizes, treat the consensus as
|
||||
correctly signed.
|
||||
|
||||
If not, but the number of entries with known identity keys but
|
||||
unknown signing keys might be enough to make the consensus
|
||||
correctly signed, do not use the consensus, but do not discard
|
||||
it until we have a new keys document.
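The four steps above can be sketched in Python as follows. The function name, tuple layout, and return values are illustrative assumptions, and verify stands in for a real signature check. The final "reject" outcome is also an assumption, since the text only describes the first two cases.

```python
def evaluate_consensus(signatures, trusted, known_keys, verify):
    """signatures: list of (identity_fp, signing_fp, sig) tuples.
    trusted: set of identity fingerprints the client recognizes.
    known_keys: set of (identity_fp, signing_fp) pairs from key certificates.
    verify: hypothetical callable (identity_fp, signing_fp, sig) -> bool.
    Returns "ok", "need-keys", or "reject"."""
    good, unknown = set(), set()
    for identity_fp, signing_fp, sig in signatures:
        if identity_fp not in trusted:
            continue                      # step 1: ignore untrusted signers
        if (identity_fp, signing_fp) not in known_keys:
            unknown.add(identity_fp)      # step 2: a new keys document is needed
        elif verify(identity_fp, signing_fp, sig):
            good.add(identity_fp)         # step 3: signature checks out
    if 2 * len(good) > len(trusted):
        return "ok"                       # step 4: more than half signed
    if 2 * len(good | unknown) > len(trusted):
        return "need-keys"                # keep it until new keys arrive
    return "reject"
```

Counting signers in sets keeps an authority that appears twice from being counted twice toward the majority.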
|
@ -1,183 +0,0 @@
|
||||
Filename: 104-short-descriptors.txt
|
||||
Title: Long and Short Router Descriptors
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: Jan 2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document proposes moving unused-by-clients information from regular
|
||||
router descriptors into a new "extra info" router descriptor.
|
||||
|
||||
Proposal:
|
||||
|
||||
Some of the costliest fields in the current directory protocol are ones
|
||||
that no client actually uses. In particular, the "read-history" and
|
||||
"write-history" fields are used only by the authorities for monitoring the
|
||||
status of the network. If we took them out, the size of a compressed list
|
||||
of all the routers would fall by about 60%. (No other disposable field
|
||||
would save much more than 2%.)
|
||||
|
||||
We propose to remove these fields from descriptors, and have them
|
||||
uploaded as a part of a separate signed "extra info" to the authorities.
|
||||
This document will be signed. A hash of this document will be included in
|
||||
the regular descriptors.
|
||||
|
||||
(We considered another design, where routers would generate and upload a
|
||||
short-form and a long-form descriptor. Only the short-form descriptor would
|
||||
ever be used by anybody for routing. The long-form descriptor would be
|
||||
used only for analytics and other tools. We decided against this because
|
||||
well-behaved tools would need to download short-form descriptors too (as
|
||||
these would be the only ones indexed), and hence get redundant info. Badly
|
||||
behaved tools would download only long-form descriptors, and expose
|
||||
themselves to partitioning attacks.)
|
||||
|
||||
Other disposable fields:
|
||||
|
||||
Clients don't need these fields, but removing them doesn't help bandwidth
|
||||
enough to be worthwhile.
|
||||
contact (save about 1%)
|
||||
fingerprint (save about 3%)
|
||||
|
||||
We could represent these fields more succinctly, but removing them would
|
||||
only save 1%. (!)
|
||||
reject
|
||||
accept
|
||||
(Apparently, exit policies are highly compressible.)
|
||||
|
||||
[Does size-on-disk matter to anybody? Some clients and servers don't
|
||||
have much disk, or have really slow disk (e.g. USB). And we don't
|
||||
store caches compressed right now. -RD]
|
||||
|
||||
Specification:
|
||||
|
||||
1. Extra Info Format.
|
||||
|
||||
An "extra info" descriptor contains the following fields:
|
||||
|
||||
"extra-info" Nickname Fingerprint
|
||||
Identifies what router this is an extra info descriptor for.
|
||||
Fingerprint is encoded in hex (using upper-case letters), with
|
||||
no spaces.
|
||||
|
||||
"published" As currently documented in dir-spec.txt. It MUST match the
|
||||
"published" field of the descriptor published at the same time.
|
||||
|
||||
"read-history"
|
||||
"write-history"
|
||||
As currently documented in dir-spec.txt. Optional.
|
||||
|
||||
"router-signature" NL Signature NL
|
||||
|
||||
A signature of the PKCS1-padded hash of the entire extra info
|
||||
document, taken from the beginning of the "extra-info" line, through
|
||||
the newline after the "router-signature" line. An extra info
|
||||
document is not valid unless the signature is performed with the
|
||||
identity key whose digest matches FINGERPRINT.
|
||||
|
||||
The "extra-info" field is required and MUST appear first. The
|
||||
router-signature field is required and MUST appear last. All others are
|
||||
optional. As for other documents, unrecognized fields must be ignored.
|
||||
|
||||
2. Existing formats
|
||||
|
||||
Implementations that use "read-history" and "write-history" SHOULD
|
||||
continue accepting router descriptors that contain them. (Prior to
|
||||
0.2.0.x, this information was encoded in ordinary router descriptors;
|
||||
in any case they have always been listed as opt, so they should be
|
||||
accepted anyway.)
|
||||
|
||||
Add these fields to router descriptors:
|
||||
|
||||
"extra-info-digest" Digest
|
||||
"Digest" is a hex-encoded digest (using upper-case characters)
|
||||
of the router's extra-info document, as signed in the router's
|
||||
extra-info. (If this field is absent, no extra-info-digest
|
||||
exists.)
|
||||
|
||||
"caches-extra-info"
|
||||
Present if this router is a directory cache that provides
|
||||
extra-info documents, or an authority that handles extra-info
|
||||
documents.
|
||||
|
||||
(Since implementations before 0.1.2.5-alpha required that the "opt"
|
||||
keyword precede any unrecognized entry, these keys MUST be preceded
|
||||
with "opt" until 0.1.2.5-alpha is obsolete.)
|
||||
|
||||
3. New communications rules
|
||||
|
||||
Servers SHOULD generate and upload one extra-info document after each
|
||||
descriptor they generate and upload; no more, no less. Servers MUST
|
||||
upload the new descriptor before they upload the new extra-info.
|
||||
|
||||
Authorities receiving an extra-info document SHOULD verify all of the
|
||||
following:
|
||||
* They have a router descriptor for some server with a matching
|
||||
nickname and identity fingerprint.
|
||||
* That server's identity key has been used to sign the extra-info
|
||||
document.
|
||||
* The extra-info-digest field in the router descriptor matches
|
||||
the digest of the extra-info document.
|
||||
* The published fields in the two documents match.
|
||||
|
||||
Authorities SHOULD drop extra-info documents that do not meet these
|
||||
criteria.
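The four checks above can be sketched as follows. This is an illustrative sketch, not Tor's implementation: the RouterDescriptor and ExtraInfo containers, their field names, and the choice of SHA-1 for the digest are assumptions made for the example, and signature verification is represented by a precomputed boolean.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RouterDescriptor:  # hypothetical container, not a Tor type
    nickname: str
    fingerprint: str        # identity fingerprint, hex
    published: str          # value of the "published" field
    extra_info_digest: str  # upper-case hex from "extra-info-digest"


@dataclass
class ExtraInfo:  # hypothetical container, not a Tor type
    nickname: str
    fingerprint: str
    published: str
    signed_text: bytes      # "extra-info" line through the newline after "router-signature"
    signature_ok: bool      # outcome of verifying the identity-key signature elsewhere


def digest_of(signed_text: bytes) -> str:
    # Assumption: SHA-1, rendered as upper-case hex.
    return hashlib.sha1(signed_text).hexdigest().upper()


def authority_accepts(extra: ExtraInfo, known: Iterable[RouterDescriptor]) -> bool:
    """Return True only if some known descriptor satisfies all four checks."""
    if not extra.signature_ok:
        return False
    want = digest_of(extra.signed_text)
    return any(
        d.nickname == extra.nickname
        and d.fingerprint == extra.fingerprint
        and d.extra_info_digest == want
        and d.published == extra.published
        for d in known
    )
```

An authority following this sketch would drop any document for which authority_accepts returns False.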
|
||||
|
||||
Extra-info documents MAY be uploaded as part of the same HTTP post as
|
||||
the router descriptor, or separately. Authorities MUST accept both
|
||||
methods.
|
||||
|
||||
Authorities SHOULD try to fetch extra-info documents from one another if
|
||||
they do not have one matching the digest declared in a router
|
||||
descriptor.
|
||||
|
||||
Caches that are running locally with a tool that needs to use extra-info
|
||||
documents MAY download and store extra-info documents. They should do
|
||||
so when they notice that the recommended descriptor has an
|
||||
extra-info-digest not matching any extra-info document they currently
|
||||
have. (Caches not running on a host that needs to use extra-info
|
||||
documents SHOULD NOT download or cache them.)
|
||||
|
||||
4. New URLs
|
||||
|
||||
http://<hostname>/tor/extra/d/...
|
||||
http://<hostname>/tor/extra/fp/...
|
||||
http://<hostname>/tor/extra/all[.z]
|
||||
(As for /tor/server/ URLs: supports fetching extra-info documents
|
||||
by their digest, by the fingerprint of their servers, or all
|
||||
at once. When serving by fingerprint, we serve the extra-info
|
||||
that corresponds to the descriptor we would serve by that
|
||||
fingerprint. Only directory authorities are guaranteed to support
|
||||
these URLs.)
|
||||
|
||||
http://<hostname>/tor/extra/authority[.z]
|
||||
(The extra-info document for this router.)
|
||||
|
||||
Extra-info documents are uploaded to the same URLs as regular
|
||||
router descriptors.
|
||||
|
||||
Migration:
|
||||
|
||||
For extra info approach:
|
||||
* First:
|
||||
* Authorities should accept extra info, and support serving it.
|
||||
* Routers should upload extra info once authorities accept it.
|
||||
* Caches should support an option to download and cache it, once
|
||||
authorities serve it.
|
||||
* Tools should be updated to use locally cached information.
|
||||
These tools include:
|
||||
lefkada's exit.py script.
|
||||
tor26's noreply script and general directory cache.
|
||||
https://nighteffect.us/tns/ for its graphs
|
||||
and check with or-talk for the rest, once it's time.
|
||||
|
||||
* Set a cutoff time for including bandwidth in router descriptors, so
|
||||
that tools that use bandwidth info know that they will need to fetch
|
||||
extra info documents.
|
||||
|
||||
* Once tools that want bandwidth info support fetching extra info:
|
||||
* Have routers stop including bandwidth info in their router
|
||||
descriptors.
|
@ -1,325 +0,0 @@
|
||||
Filename: 105-handshake-revision.txt
|
||||
Title: Version negotiation for the Tor protocol.
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson, Roger Dingledine
|
||||
Created: Jan 2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document was extracted from a modified version of tor-spec.txt that we
|
||||
had written before the proposal system went into place. It adds two new
|
||||
cell types to the Tor link connection setup handshake: one used for
|
||||
version negotiation, and another to prevent MITM attacks.
|
||||
|
||||
This proposal is partially implemented, and partially superseded by
|
||||
proposal 130.
|
||||
|
||||
Motivation: Tor versions
|
||||
|
||||
Our *current* approach to versioning the Tor protocol(s) has been as
|
||||
follows:
|
||||
- All changes must be backward compatible.
|
||||
- It's okay to add new cell types, if they would be ignored by previous
|
||||
versions of Tor.
|
||||
- It's okay to add new data elements to cells, if they would be
|
||||
ignored by previous versions of Tor.
|
||||
- For forward compatibility, Tor must ignore cell types it doesn't
|
||||
recognize, and ignore data in those cells it doesn't expect.
|
||||
- Clients can inspect the version of Tor declared in the platform line
|
||||
of a router's descriptor, and use that to learn whether a server
|
||||
supports a given feature. Servers, however, aren't assumed to all
|
||||
know about each other, and so don't know the version of who they're
|
||||
talking to.
|
||||
|
||||
This system has these problems:
|
||||
- It's very hard to change fundamental aspects of the protocol, like the
|
||||
cell format, the link protocol, any of the various encryption schemes,
|
||||
and so on.
|
||||
- The router-to-router link protocol has remained more-or-less frozen
|
||||
for a long time, since we can't easily have an OR use new features
|
||||
unless it knows the other OR will understand them.
|
||||
|
||||
We need to resolve these problems because:
|
||||
- Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will
|
||||
not seem like the best idea for all time.
|
||||
- There are many ideas circulating for multiple cell sizes; while it's
|
||||
not obvious whether these are safe, we can't do them at all without a
|
||||
mechanism to permit them.
|
||||
- There are many ideas circulating for alternative circuit building and
|
||||
cell relay rules: they don't work unless they can coexist in the
|
||||
current network.
|
||||
- If our protocol changes a lot, it's hard to describe any coherent
|
||||
version of it: we need to say "the version that Tor versions W through
|
||||
X use when talking to versions Y through Z". This makes analysis
|
||||
harder.
|
||||
|
||||
Motivation: Preventing MITM attacks
|
||||
|
||||
TLS prevents a man-in-the-middle attacker from reading or changing the
|
||||
contents of a communication. It does not, however, prevent such an
|
||||
attacker from observing timing information. Since timing attacks are some
|
||||
of the most effective against low-latency anonymity nets like Tor, we
|
||||
should take more care to make sure that we're not only talking to who
|
||||
we think we're talking to, but that we're using the network path we
|
||||
believe we're using.
|
||||
|
||||
Motivation: Signed clock information
|
||||
|
||||
It's very useful for Tor instances to know how skewed they are relative
|
||||
to one another. The only way to find out currently has been to download
|
||||
directory information, and check the Date header--but this is not
|
||||
authenticated, and hence subject to modification on the wire. Using
|
||||
BEGIN_DIR to create an authenticated directory stream through an existing
|
||||
circuit is better, but that's an extra step and it might be nicer to
|
||||
learn the information in the course of the regular protocol.
|
||||
|
||||
Proposal:
|
||||
|
||||
1.0. Version numbers
|
||||
|
||||
The node-to-node TLS-based "OR connection" protocol and the multi-hop
|
||||
"circuit" protocol are versioned quasi-independently.
|
||||
|
||||
Of course, some dependencies will continue to exist: Certain versions
|
||||
of the circuit protocol may require a minimum version of the connection
|
||||
protocol to be used. The connection protocol affects:
|
||||
- Initial connection setup, link encryption, transport guarantees,
|
||||
etc.
|
||||
- The allowable set of cell commands
|
||||
- Allowable formats for cells.
|
||||
|
||||
The circuit protocol determines:
|
||||
- How circuits are established and maintained
|
||||
- How cells are decrypted and relayed
|
||||
- How streams are established and maintained.
|
||||
|
||||
Version numbers are incremented for backward-incompatible protocol changes
|
||||
only. Backward-compatible changes are generally implemented by adding
|
||||
additional fields to existing structures; implementations MUST ignore
|
||||
fields they do not expect. Unused portions of cells MUST be set to zero.
|
||||
|
||||
Though versioning the protocol will make it easier to maintain backward
|
||||
compatibility with older versions of Tor, we will nevertheless continue to
|
||||
periodically drop support for older protocols,
|
||||
- to keep the implementation from growing without bound,
|
||||
- to limit the maintenance burden of patching bugs in obsolete Tors,
|
||||
- to limit the testing burden of verifying that many old protocol
|
||||
versions continue to be implemented properly, and
|
||||
- to limit the exposure of the network to protocol versions that are
|
||||
expensive to support.
|
||||
|
||||
The Tor protocol as implemented through the 0.1.2.x Tor series will be
|
||||
called "version 1" in its link protocol and "version 1" in its relay
|
||||
protocol. Versions of the Tor protocol so old as to be incompatible with
|
||||
Tor 0.1.2.x can be considered to be version 0 of each, and are not
|
||||
supported.
|
||||
|
||||
2.1. VERSIONS cells
|
||||
|
||||
When a Tor connection is established, both parties normally send a
|
||||
VERSIONS cell before sending any other cells. (But see below.)
|
||||
|
||||
VersionsLen [2 bytes]
|
||||
Versions [VersionsLen bytes]
|
||||
|
||||
"Versions" is a sequence of VersionsLen bytes. Each value between 1 and
|
||||
127 inclusive represents a single version; current implementations MUST
|
||||
ignore other bytes. Parties should list all of the versions which they
|
||||
are able and willing to support. Parties can only communicate if they
|
||||
have some connection protocol version in common.
|
||||
|
||||
Versions 0.2.0.x-alpha and earlier don't understand VERSIONS cells,
|
||||
and therefore don't support version negotiation. Thus, waiting until
|
||||
the other side has sent a VERSIONS cell won't work for these servers:
|
||||
if the other side sends no cells back, it is impossible to tell
|
||||
whether they
|
||||
have sent a VERSIONS cell that has been stalled, or whether they have
|
||||
dropped our own VERSIONS cell as unrecognized. Therefore, we'll
|
||||
change the TLS negotiation parameters so that old parties can still
|
||||
negotiate, but new parties can recognize each other. Immediately
|
||||
after a TLS connection has been established, the parties check
|
||||
whether the other side negotiated the connection in an "old" way or a
|
||||
"new" way. If either party negotiated in the "old" way, we assume a
|
||||
v1 connection. Otherwise, both parties send VERSIONS cells listing
|
||||
all their supported versions. Upon receiving the other party's
|
||||
VERSIONS cell, the implementation begins using the highest-valued
|
||||
version common to both cells. If the first cell from the other party
|
||||
has a recognized command, and is _not_ a VERSIONS cell, we assume a
|
||||
v1 protocol.
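As a rough illustration of the payload layout and the "highest-valued version common to both cells" rule, here is a minimal sketch. The function names are hypothetical, and the big-endian (network order) encoding of VersionsLen is an assumption; the TLS-level detection of "old" versus "new" negotiation is outside its scope.

```python
import struct
from typing import Iterable, Optional, Set


def encode_versions(versions: Iterable[int]) -> bytes:
    """Build VersionsLen (2 bytes, assumed big-endian) followed by one byte per version."""
    body = bytes(sorted(versions))
    return struct.pack("!H", len(body)) + body


def decode_versions(payload: bytes) -> Set[int]:
    """Read the version list, ignoring bytes outside 1..127 and any cell padding."""
    (length,) = struct.unpack("!H", payload[:2])
    body = payload[2:2 + length]
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours: Iterable[int], theirs: Iterable[int]) -> Optional[int]:
    """Return the highest version both sides listed, or None if there is none."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

For example, negotiate({1, 2, 3}, decode_versions(encode_versions({2, 3, 4}))) yields 3, while two parties with no version in common get None and cannot communicate.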
|
||||
|
||||
(For more detail on the TLS protocol change, see forthcoming draft
|
||||
proposals from Steven Murdoch.)
|
||||
|
||||
Implementations MUST discard VERSIONS cells that are not the first
|
||||
recognized cells sent on a connection.
|
||||
|
||||
The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1
|
||||
byte of command, 509 bytes of payload).
|
||||
|
||||
[NOTE: The VERSIONS cell is assigned the command number 7.]
|
||||
|
||||
2.2. MITM-prevention and time checking
|
||||
|
||||
If we negotiate a v2 connection or higher, the second cell we send SHOULD
|
||||
be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
|
||||
times.
|
||||
|
||||
A NETINFO cell contains:
|
||||
Timestamp [4 bytes]
|
||||
Other OR's address [variable]
|
||||
Number of addresses [1 byte]
|
||||
This OR's addresses [variable]
|
||||
|
||||
Timestamp is the OR's current Unix time, in seconds since the epoch. If
|
||||
an implementation receives time values from many ORs that
|
||||
indicate that its clock is skewed, it SHOULD try to warn the
|
||||
administrator. (We leave the definition of 'many' intentionally vague
|
||||
for now.)
|
||||
|
||||
Before believing the timestamp in a NETINFO cell, implementations
|
||||
SHOULD compare the time at which they received the cell to the time
|
||||
when they sent their VERSIONS cell. If the difference is very large,
|
||||
it is likely that the cell was delayed long enough that its
|
||||
contents are out of date.
|
||||
|
||||
Each address contains Type/Length/Value as used in Section 6.4 of
|
||||
tor-spec.txt. The first address is the one that the party sending
|
||||
the NETINFO cell believes the other has -- it can be used to learn
|
||||
what your IP address is if you have no other hints.
|
||||
The rest of the addresses are the advertised addresses of the party
|
||||
sending the NETINFO cell -- we include them
|
||||
to block a man-in-the-middle attack on TLS that lets an attacker bounce
|
||||
traffic through his own computers to enable timing and packet-counting
|
||||
attacks.
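The NETINFO layout above can be parsed roughly as follows. This is a sketch under stated assumptions: each address is taken to be a one-byte type, a one-byte length, and then the value; the timestamp is taken to be big-endian; and the function names are hypothetical rather than part of Tor.

```python
import struct
from typing import List, Tuple

Address = Tuple[int, bytes]  # (type, raw value)


def parse_address(buf: bytes, offset: int) -> Tuple[Address, int]:
    """Parse one Type/Length/Value address; return it with the next offset."""
    if len(buf) < offset + 2:
        raise ValueError("truncated address header")
    atype, alen = buf[offset], buf[offset + 1]
    value = buf[offset + 2:offset + 2 + alen]
    if len(value) != alen:
        raise ValueError("truncated address value")
    return (atype, value), offset + 2 + alen


def parse_netinfo(payload: bytes) -> Tuple[int, Address, List[Address]]:
    """Return (timestamp, other OR's address, sender's own addresses)."""
    (timestamp,) = struct.unpack_from("!I", payload, 0)
    other, offset = parse_address(payload, 4)
    if len(payload) <= offset:
        raise ValueError("missing address count")
    count = payload[offset]
    offset += 1
    own = []
    for _ in range(count):
        addr, offset = parse_address(payload, offset)
        own.append(addr)
    return timestamp, other, own
```

Trailing zero padding in the cell payload is simply left unread, since parsing stops after the declared number of addresses.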
|
||||
|
||||
A Tor instance should use the other Tor's reported address
|
||||
information as part of logic to decide whether to treat a given
|
||||
connection as suitable for extending circuits to a given address/ID
|
||||
combination. When we get an extend request, we use an
|
||||
existing OR connection if the ID matches, and ANY of the following
|
||||
conditions hold:
|
||||
- The IP matches the requested IP.
|
||||
- We know that the IP we're using is canonical because it was
|
||||
listed in the NETINFO cell.
|
||||
- We know that the IP we're using is canonical because it was
|
||||
listed in the server descriptor.
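The connection-reuse rule above can be expressed as a small predicate. The OrConnection type and its fields are hypothetical names introduced for illustration; the sketch assumes the advertised addresses from NETINFO and from the server descriptor have already been collected as strings.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class OrConnection:  # hypothetical container, not a Tor type
    identity: str                                       # identity digest of the peer
    remote_ip: str                                      # address we are actually connected to
    netinfo_addresses: FrozenSet[str] = frozenset()     # addresses listed in its NETINFO cell
    descriptor_addresses: FrozenSet[str] = frozenset()  # addresses from its server descriptor


def can_reuse_for_extend(conn: OrConnection, requested_id: str, requested_ip: str) -> bool:
    """The ID must match, and at least one of the three IP conditions must hold."""
    if conn.identity != requested_id:
        return False
    return (
        conn.remote_ip == requested_ip
        or conn.remote_ip in conn.netinfo_addresses
        or conn.remote_ip in conn.descriptor_addresses
    )
```

Under this sketch, a connection whose remote address appears in neither list and differs from the requested IP would not be reused, so a new connection would be opened instead.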
|
||||
|
||||
[NOTE: The NETINFO cell is assigned the command number 8.]
|
||||
|
||||
Discussion: Versions versus feature lists
|
||||
|
||||
Many protocols negotiate lists of available features instead of (or in
|
||||
addition to) protocol versions. While it's possible that some amount of
|
||||
feature negotiation could be supported in a later Tor, we should prefer to
|
||||
use protocol versions whenever possible, for reasons discussed in
|
||||
the "Anonymity Loves Company" paper.
|
||||
|
||||
Discussion: Bytes per version, versions per cell
|
||||
|
||||
This document provides for a one-byte count of how many versions a Tor
|
||||
supports, and allows one byte per version. Thus, it can support only
|
||||
254 more versions of the protocol beyond the unallocated v0 and the
|
||||
current v1. If we ever need to split the protocol into 255 incompatible
|
||||
versions, we've probably screwed up badly somewhere.
|
||||
|
||||
Nevertheless, here are two ways we could support more versions:
|
||||
- Change the version count to a two-byte field that counts the number of
|
||||
_bytes_ used, and use a UTF8-style encoding: versions 0 through 127
|
||||
take one byte to encode, versions 128 through 2047 take two bytes to
|
||||
encode, and so on. We wouldn't need to parse any version higher than
|
||||
127 right now, since all bytes used to encode higher versions would
|
||||
have their high bit set.
|
||||
|
||||
We'd still have a limit of 380 simultaneous versions that could be
|
||||
declared in any version. This is probably okay.
|
||||
|
||||
- Decide that if we need to support more versions, we can add a
|
||||
MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec
|
||||
above requires Tors to ignore unrecognized cell types that they get
|
||||
before the first VERSIONS cell, and still allows version negotiation
|
||||
to
|
||||
succeed.
|
||||
|
||||
[Resolution: Reserve the high bit and the v0 value for later use. If
|
||||
we ever have more live versions than we can fit in a cell, we've made a
|
||||
bad design decision somewhere along the line.]
|
||||
|
||||
Discussion: Reducing round-trips
|
||||
|
||||
It might be appealing to see if we can cram more information in the
|
||||
initial VERSIONS cell. For example, the contents of NETINFO will pretty
|
||||
soon be sent by everybody before any more information is exchanged, but
|
||||
decoupling them from the version exchange increases round-trips.
|
||||
|
||||
Instead, we could speculatively include handshaking information at
|
||||
the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind
|
||||
up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore
|
||||
this." This could be extended to opportunistically reduce round trips
|
||||
when possible for future versions when we guess the versions right.
|
||||
|
||||
Of course, we'd need to be careful about using a feature like this:
|
||||
- We don't want to include things that are expensive to compute,
|
||||
like PK signatures or proof-of-work.
|
||||
- We don't want to speculate as a mobile client: it may leak our
|
||||
experience with the server in question.
|
||||
|
||||
Discussion: Advertising versions in routerdescs and networkstatuses.
|
||||
|
||||
In network-statuses:
|
||||
|
||||
The networkstatus "v" line now has the format:
|
||||
"v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST
|
||||
"Circuit" CIRCUIT-VERSION-LIST NL
|
||||
|
||||
LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of
|
||||
supported version numbers. IMPLEMENTATION is the name of the
|
||||
implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the
|
||||
version of the implementation.
|
||||
|
||||
Examples:
|
||||
v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5
|
||||
|
||||
v OtherOR 2000+ Link 3 Circuit 5
|
||||
|
||||
Implementations that release independently of the Tor codebase SHOULD NOT
|
||||
use "Tor" as the value of their IMPLEMENTATION.
|
||||
|
||||
Additional fields on the "v" line MUST be ignored.
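A parser for the proposed "v" line might look like the sketch below. The function name and the returned dictionary keys are hypothetical; the sketch assumes whitespace-separated tokens in exactly the order given above, and it ignores any additional fields as required.

```python
from typing import Dict, List, Union


def parse_v_line(line: str) -> Dict[str, Union[str, List[int]]]:
    """Parse: "v" IMPLEMENTATION IMPL-VERSION "Link" LIST "Circuit" LIST [ignored...]"""
    tokens = line.split()
    if (len(tokens) < 7 or tokens[0] != "v"
            or tokens[3] != "Link" or tokens[5] != "Circuit"):
        raise ValueError("not a recognized v line")
    return {
        "implementation": tokens[1],
        "impl_version": tokens[2],
        "link": [int(v) for v in tokens[4].split(",")],
        "circuit": [int(v) for v in tokens[6].split(",")],
        # tokens[7:] are additional fields and are deliberately ignored
    }
```

Applied to the first example line, this returns link versions [1, 2, 3] and circuit versions [2, 5].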
|
||||
|
||||
In router descriptors:
|
||||
|
||||
The router descriptor should contain a line of the form,
|
||||
"protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT_VERSION_LIST
|
||||
|
||||
Additional fields on the "protocols" line MUST be ignored.
|
||||
|
||||
[Versions of Tor before 0.1.2.5-alpha rejected router descriptors with
|
||||
unrecognized items; the protocols line should be preceded with an "opt"
|
||||
until these Tors are obsolete.]
|
||||
|
||||
Security issues:
|
||||
|
||||
Client partitioning is the big danger when we introduce new versions; if a
|
||||
client supports some very unusual set of protocol versions, it will stand
|
||||
out from others no matter where it goes. If a server supports an unusual
|
||||
version, it will get a disproportionate amount of traffic from clients who
|
||||
prefer that version. We can mitigate this somewhat as follows:
|
||||
|
||||
- Do not have clients prefer any protocol version by default until that
|
||||
version is widespread. (First introduce the new version to servers,
|
||||
and have clients admit to using it only when configured to do so for
|
||||
testing. Then, once many servers are running the new protocol
|
||||
version, enable its use by default.)
|
||||
|
||||
- Do not multiply protocol versions needlessly.
|
||||
|
||||
- Encourage protocol implementors to implement the same protocol version
|
||||
sets as some popular version of Tor.
|
||||
|
||||
- Disrecommend very old/unpopular versions of Tor via the directory
|
||||
authorities' RecommendedVersions mechanism, even if it is still
|
||||
technically possible to use them.
|
||||
|
@ -1,113 +0,0 @@
|
||||
Filename: 106-less-tls-constraint.txt
|
||||
Title: Checking fewer things during TLS handshakes
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 9-Feb-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document proposes that we relax our requirements on the content of
|
||||
X.509 certificates during initial TLS handshakes.
|
||||
|
||||
Motivation:
|
||||
|
||||
Later, we want to try harder to avoid protocol fingerprinting attacks.
|
||||
This means that we'll need to make our connection handshake look closer
|
||||
to a regular HTTPS connection: one certificate on the server side and
|
||||
zero certificates on the client side. For now, about the best we
|
||||
can do is to stop requiring things during handshake that we don't
|
||||
actually use.
|
||||
|
||||
What we check now, and where we check it:
|
||||
|
||||
tor_tls_check_lifetime:
|
||||
peer has certificate
|
||||
notBefore <= now <= notAfter
|
||||
|
||||
tor_tls_verify:
|
||||
peer has at least one certificate
|
||||
There is at least one certificate in the chain
|
||||
At least one of the certificates in the chain is not the one used to
|
||||
negotiate the connection. (The "identity cert".)
|
||||
The certificate _not_ used to negotiate the connection has signed the
|
||||
link cert
|
||||
|
||||
tor_tls_get_peer_cert_nickname:
|
||||
peer has a certificate.
|
||||
certificate has a subjectName.
|
||||
subjectName has a commonName.
|
||||
commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2]
|
||||
|
||||
tor_tls_peer_has_cert:
|
||||
peer has a certificate.
|
||||
|
||||
connection_or_check_valid_handshake:
|
||||
tor_tls_peer_has_cert [1]
|
||||
tor_tls_get_peer_cert_nickname [1]
|
||||
tor_tls_verify [1]
|
||||
If nickname in cert is a known, named router, then its identity digest
|
||||
must be as expected.
|
||||
If we initiated the connection, then we got the identity digest we
|
||||
expected.
|
||||
|
||||
USEFUL THINGS WE COULD DO:
|
||||
|
||||
[1] We could just not force clients to have any certificate at all, let alone
|
||||
an identity certificate. Internally to the code, we could assign the
|
||||
identity_digest field of these or_connections to a random number, or even
|
||||
not add them to the identity_digest->or_conn map.
|
||||
[so if somebody connects with no certs, we let them. and mark them as
|
||||
a client and don't treat them as a server. great. -rd]
|
||||
|
||||
[2] Instead of using a restricted nickname character set that makes our
|
||||
commonName structure look unlike typical SSL certificates, we could treat
|
||||
the nickname as extending from the start of the commonName up to but not
|
||||
including the first non-nickname character.
|
||||
|
||||
Alternatively, we could stop checking commonNames entirely. We don't
|
||||
actually _do_ anything based on the nickname in the certificate, so
|
||||
there's really no harm in letting every router have any commonName it
|
||||
wants.
|
||||
[this is the better choice -rd]
|
||||
[agreed. -nm]
|
||||
|
||||
REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:
|
||||
|
||||
Assuming that we removed the above requirements, we could then (in a later
|
||||
release) have clients not send certificates. Even if we also started
|
||||
making our DNs a little less formulaic, client->server OR connections would
|
||||
still be recognizable by:
|
||||
having a two-certificate chain sent by the server
|
||||
using a particular set of ciphersuites
|
||||
traffic patterns
|
||||
probing the server later
|
||||
|
||||
OTHER IMPLICATIONS:
|
||||
|
||||
If we stop verifying the above requirements:
|
||||
|
||||
It will be slightly (but only slightly) more common to connect to a non-Tor
|
||||
server running TLS, and believe that you're talking to a Tor server (until
|
||||
you send the first cell).
|
||||
|
||||
It will be far easier for non-Tor SSL clients to accidentally connect to
|
||||
Tor servers and speak HTTPS or whatever to them.
|
||||
|
||||
If, in a later release, we have clients not send certificates, and we make
|
||||
DNs less recognizable:
|
||||
|
||||
If clients don't send certs, servers don't need to verify them: win!
|
||||
|
||||
If we remove these restrictions, it will be easier for people to write
|
||||
clients to fuzz our protocol: sorta win!
|
||||
|
||||
If clients don't send certs, they look slightly less like servers.
|
||||
|
||||
OTHER SPEC CHANGES:
|
||||
|
||||
When a client doesn't give us an identity, we should never extend any
|
||||
circuits to it (duh), and we should allow it to set circuit ID however it
|
||||
wants.
|
@ -1,56 +0,0 @@
|
||||
Filename: 107-uptime-sanity-checking.txt
|
||||
Title: Uptime Sanity Checking
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Kevin Bauer & Damon McCoy
|
||||
Created: 8-March-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document describes how to cap the uptime that is used when computing
|
||||
which routers are marked as stable such that highly stable routers cannot
|
||||
be displaced by malicious routers that report extremely high uptime
|
||||
values.
|
||||
|
||||
This is similar to how bandwidth is capped at 1.5MB/s.
|
||||
|
||||
Motivation:
|
||||
|
||||
It has been pointed out that an attacker can displace all stable nodes and
|
||||
entry guard nodes by reporting high uptimes. This is an easy fix that will
|
||||
prevent highly stable nodes from being displaced.
|
||||
|
||||
Security implications:
|
||||
|
||||
It should decrease the effectiveness of routing attacks that report high
|
||||
uptimes while not impacting the normal routing algorithms.
|
||||
|
||||
Specification:
|
||||
|
||||
So we could patch Section 3.1 of dir-spec.txt to say:
|
||||
|
||||
"Stable" -- A router is 'Stable' if it is running, valid, not
|
||||
hibernating, and either its uptime is at least the median uptime for
|
||||
known running, valid, non-hibernating routers, or its uptime is at
|
||||
least 30 days. Routers are never called stable if they are running
|
||||
a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha
|
||||
through 0.1.1.16-rc are stupid this way.)
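To see why the 30-day cap protects honest routers, consider this minimal sketch of the rule. The Router type, its fields, and the use of the ordinary statistical median are assumptions for illustration; an authority may compute its median differently.

```python
import statistics
from dataclasses import dataclass
from typing import Dict, List

UPTIME_CAP = 30 * 24 * 60 * 60  # 30 days, in seconds


@dataclass
class Router:  # hypothetical container, not a Tor type
    nickname: str
    uptime: int                   # declared uptime, in seconds
    running: bool = True
    valid: bool = True
    hibernating: bool = False
    drops_circuits: bool = False  # runs a version known to drop circuits


def is_eligible(r: Router) -> bool:
    return r.running and r.valid and not r.hibernating


def stable_flags(routers: List[Router]) -> Dict[str, bool]:
    """Stable if eligible and uptime >= median of eligible routers, or >= 30 days."""
    eligible = [r for r in routers if is_eligible(r)]
    if not eligible:
        return {r.nickname: False for r in routers}
    median = statistics.median(r.uptime for r in eligible)
    return {
        r.nickname: (
            is_eligible(r)
            and not r.drops_circuits
            and (r.uptime >= median or r.uptime >= UPTIME_CAP)
        )
        for r in routers
    }
```

With the cap in place, a router that has been up for 40 days stays Stable even if attackers push the median far above 40 days by declaring inflated uptimes.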
|
||||
|
||||
Compatibility:
|
||||
|
||||
There should be no compatibility issues due to uptime capping.
|
||||
|
||||
Implementation:
|
||||
|
||||
Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788).
|
||||
|
||||
Discussion:
|
||||
|
||||
Initially, this proposal set the maximum at 60 days, not 30; the 30 day
|
||||
limit and spec wording were suggested by Roger in an or-dev post on 9 March
|
||||
2007.
|
||||
|
||||
This proposal also led to 108-mtbf-based-stability.txt
|
||||
|
@ -1,90 +0,0 @@
|
||||
Filename: 108-mtbf-based-stability.txt
|
||||
Title: Base "Stable" Flag on Mean Time Between Failures
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 10-Mar-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document proposes that we change how directory authorities set the
|
||||
stability flag from inspection of a router's declared Uptime to the
|
||||
authorities' perceived mean time between failure for the router.
|
||||
|
||||
Motivation:
|
||||
|
||||
Clients prefer nodes that the authorities call Stable. This flag is (as
|
||||
of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for
|
||||
uptime. This creates an opportunity for malicious nodes to declare
|
||||
falsely high uptimes in order to get more traffic.
|
||||
|
||||
Spec changes:
|
||||
|
||||
Replace the current rule for setting the Stable flag with:
|
||||
|
||||
"Stable" -- A router is 'Stable' if it is active and its observed Stability
|
||||
for the past month is at or above the median Stability for active routers.
|
||||
Routers are never called stable if they are running a version of Tor
|
||||
known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
|
||||
are stupid this way.)
|
||||
|
||||
Stability shall be defined as the weighted mean length of the runs
|
||||
observed by a given directory authority. A run begins when an authority
|
||||
decides that the server is Running, and ends when the authority decides
|
||||
that the server is not Running. In-progress runs are counted when
|
||||
measuring Stability. When calculating the mean, runs are weighted by
|
||||
$\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
|
||||
$0 < \alpha < 1$. Time when an authority is down does not count toward the
|
||||
length of the run.
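One way to read the definition above is as the weighted mean sum(w_i * len_i) / sum(w_i), with w_i = alpha ** t_i. The sketch below follows that reading. The function name, the value of alpha, and the choice of days as the unit for t are assumptions; the text only requires 0 < alpha < 1. An in-progress run is treated as ending now, so its weight is 1. Authority downtime is not modelled here.

```python
from typing import List, Optional, Tuple

Run = Tuple[float, Optional[float]]  # (start, end) in seconds; end is None if in progress


def stability(runs: List[Run], now: float, alpha: float = 0.95,
              unit: float = 86400.0) -> float:
    """Weighted mean run length, weighting each run by alpha ** (time since it ended)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for start, end in runs:
        effective_end = now if end is None else end
        length = effective_end - start
        elapsed = (now - effective_end) / unit  # t, in the assumed unit (days)
        weight = alpha ** elapsed
        weighted_sum += weight * length
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0
```

A router is then Stable if this value is at or above the median over active routers.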
|
||||
|
||||
Rejected Alternative:
|
||||
|
||||
"A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
|
||||
$d$ such that the router was considered reachable for the entire day
|
||||
$d$ days ago.
|
||||
|
||||
This allows a simpler implementation: every day, we multiply
|
||||
yesterday's Stability by alpha, and if the router was observed to be
|
||||
available every time we looked today, we add 1.
|
||||
|
||||
Instead of "day", we could pick an arbitrary time unit. We should
|
||||
pick alpha to be high enough that long-term stability counts, but low
|
||||
enough that the distant past is eventually forgotten. Something
|
||||
between .8 and .95 seems right.
|
||||
|
||||
(By requiring that routers be up for an entire day to get their
|
||||
stability increased, instead of counting fractions of a day, we
|
||||
capture the notion that stability is more like "probability of
|
||||
staying up for the next hour" than it is like "probability of being
|
||||
up at some randomly chosen time over the next hour." The former
|
||||
notion of stability is far more relevant for long-lived circuits.)
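For comparison, the rejected alternative's daily update is a one-line recurrence. This sketch uses hypothetical names and an alpha picked from the range suggested above.

```python
def daily_update(previous: float, up_all_day: bool, alpha: float = 0.9) -> float:
    """Decay yesterday's Stability by alpha; add 1 if the router was up all day."""
    return previous * alpha + (1.0 if up_all_day else 0.0)


def stability_after(days_up: list, alpha: float = 0.9) -> float:
    """Fold the update over a history of per-day observations, oldest first."""
    value = 0.0
    for up in days_up:
        value = daily_update(value, up, alpha)
    return value
```

Three full days of uptime give 1, then 1.9, then about 2.71 with alpha = 0.9; a single missed day only decays the total rather than resetting it.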
|
||||
|
||||
Limitations:
|
||||
|
||||
Authorities can have false positives and false negatives when trying to
|
||||
tell whether a router is up or down. So long as these aren't terribly
|
||||
wrong, and so long as they aren't significantly biased, we should be able
|
||||
to use them to estimate stability pretty well.
|
||||
|
||||
Probing approaches like the above could miss short incidents of
|
||||
downtime. If we use the router's declared uptime, we could detect
|
||||
these: but doing so would penalize routers who reported their uptime
|
||||
accurately.
|
||||
|
||||
Implementation:
|
||||
|
||||
For now, the easiest way to store this information at authorities
|
||||
would probably be in some kind of periodically flushed flat file.
|
||||
Later, we could move to Berkeley db or something if we really had to.
|
||||
|
||||
For each router, an authority will need to store:
|
||||
The router ID.
|
||||
Whether the router is up.
|
||||
The time when the current run started, if the router is up.
|
||||
The weighted sum length of all previous runs.
|
||||
The time at which the weighted sum length was last weighted down.
|
||||
|
||||
Authorities should probe at random intervals to test whether servers are
|
||||
running.
|
@ -1,92 +0,0 @@
|
||||
Filename: 109-no-sharing-ips.txt
|
||||
Title: No more than one server per IP address.
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Kevin Bauer & Damon McCoy
|
||||
Created: 9-March-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
This document describes a solution to a Sybil attack vulnerability in the
|
||||
directory servers. Currently, it is possible for a single IP address to
|
||||
host an arbitrarily high number of Tor routers. We propose that the
|
||||
directory servers limit the number of Tor routers that may be registered at
|
||||
a particular IP address to some small (fixed) number, perhaps just one Tor
|
||||
router per IP address.
|
||||
|
||||
While Tor never uses more than one server from a given /16 in the same
|
||||
circuit, an attacker with multiple servers in the same place is still
|
||||
dangerous because he can get around the per-server bandwidth cap that is
|
||||
designed to prevent a single server from attracting too much of the overall
|
||||
traffic.
|
||||
|
||||
Motivation:
|
||||
Since it is possible for an attacker to register an arbitrarily large
|
||||
number of Tor routers, it is possible for malicious parties to do this
|
||||
as part of a traffic analysis attack.
|
||||
|
||||
Security implications:
|
||||
This countermeasure will increase the number of IP addresses that an
|
||||
attacker must control in order to carry out traffic analysis.
|
||||
|
||||
Specification:
|
||||
|
||||
For each IP address, each directory authority tracks the number of routers
|
||||
using that IP address, along with their total observed bandwidth. If there
|
||||
are more than MAX_SERVERS_PER_IP servers at some IP, the authority should
|
||||
"disable" all but MAX_SERVERS_PER_IP servers. When choosing which servers
|
||||
to disable, the authority should first disable non-Running servers in
|
||||
increasing order of observed bandwidth, and then should disable Running
|
||||
servers in increasing order of bandwidth.
|
||||
|
||||
[[ We don't actually do this part here. -NM
|
||||
|
||||
If the total observed
|
||||
bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP,
|
||||
the authority should "disable" some of the remaining servers until only one
|
||||
server remains, or until the remaining observed bandwidth of non-"disabled"
|
||||
servers is under MAX_BW_PER_IP.
|
||||
]]
|
||||
|
||||
Servers that are "disabled" MUST be marked as non-Valid and non-Running.
|
||||
|
||||
MAX_SERVERS_PER_IP is 3.
|
||||
|
||||
MAX_BW_PER_IP is 8 MB per s.
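
The server-count rule above can be sketched in Python as follows. This is only an illustration, not the authority code: the `Router` record and the `apply_ip_limit` function are hypothetical names, and the MAX_BW_PER_IP step is left out because, as noted above, that part is not actually done.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_SERVERS_PER_IP = 3  # value given in the specification above


@dataclass
class Router:
    # Hypothetical record; field names are assumptions for this sketch.
    nickname: str
    ip: str
    bandwidth: int        # observed bandwidth
    running: bool = True
    valid: bool = True


def apply_ip_limit(routers, max_per_ip=MAX_SERVERS_PER_IP):
    """Mark surplus routers on each IP as non-Valid and non-Running.

    Returns the list of routers that were disabled.
    """
    by_ip = defaultdict(list)
    for r in routers:
        by_ip[r.ip].append(r)

    disabled = []
    for group in by_ip.values():
        surplus = len(group) - max_per_ip
        if surplus <= 0:
            continue
        # Disable order: non-Running servers first, then Running ones,
        # each in increasing order of observed bandwidth.
        order = sorted(group, key=lambda r: (r.running, r.bandwidth))
        for r in order[:surplus]:
            r.valid = False
            r.running = False
            disabled.append(r)
    return disabled
```

Sorting on the pair `(running, bandwidth)` works because `False` sorts before `True`, so non-Running servers are always considered for disabling before any Running one.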
|
||||
|
||||
Compatibility:
|
||||
|
||||
Upon inspection of a directory server, we found that the following IP
|
||||
addresses have more than one Tor router:
|
||||
|
||||
Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443
|
||||
WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001
|
||||
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
|
||||
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
|
||||
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
|
||||
aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001
|
||||
sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001
|
||||
moria1 18.244.0.188 moria.mit.edu 9001
|
||||
peacetime 18.244.0.188 moria.mit.edu 9100
|
||||
|
||||
There may exist compatibility issues with this proposed fix. Reasons why
|
||||
more than one server would share an IP address include:
|
||||
|
||||
* Testing. moria1, moria2, peacetime, and other morias all run on one
|
||||
computer at MIT, because that way we get testing. Moria1 and moria2 are
|
||||
run by Roger, and peacetime is run by Nick.
|
||||
* NAT. If there are several servers but they port-forward through the same
|
||||
IP address, ... we can hope that the operators coordinate with each
|
||||
other. Also, we should recognize that while they help the network in
|
||||
terms of increased capacity, they don't help as much as they could in
|
||||
terms of location diversity. But our approach so far has been to take
|
||||
what we can get.
|
||||
* People who have more than 1.5MB/s and want to help out more. For
|
||||
example, for a while Tonga was offering 10MB/s and its Tor server
|
||||
would only make use of a bit of it. So Roger suggested that he run
|
||||
two Tor servers, to use more.
|
||||
|
||||
[Note Roger's tweak to this behavior, in
|
||||
http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]
|
||||
|
@ -1,122 +0,0 @@
|
||||
Filename: 110-avoid-infinite-circuits.txt
|
||||
Title: Avoiding infinite length circuits
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 13-Mar-2007
|
||||
Status: Accepted
|
||||
Target: 0.2.1.x
|
||||
Implemented-In: 0.2.1.3-alpha
|
||||
|
||||
History:
|
||||
|
||||
Revised 28 July 2008 by nickm: set K.
|
||||
Revised 3 July 2008 by nickm: rename from relay_extend to
|
||||
relay_early. Revise to current migration plan. Allow K cells
|
||||
over circuit lifetime, not just at start.
|
||||
|
||||
Overview:
|
||||
|
||||
Right now, an attacker can add load to the Tor network by extending a
|
||||
circuit an arbitrary number of times. Every cell that goes down the
|
||||
circuit then adds N times that amount of load in overall bandwidth
|
||||
use. This vulnerability arises because servers don't know their position
|
||||
on the path, so they can't tell how many nodes there are before them
|
||||
on the path.
|
||||
|
||||
We propose a new set of relay cells that are distinguishable by
|
||||
intermediate hops as permitting extend cells. This approach will allow
|
||||
us to put an upper bound on circuit length relative to the number of
|
||||
colluding adversary nodes; but there are some downsides too.
|
||||
|
||||
Motivation:
|
||||
|
||||
The above attack can be used to generally increase load all across the
|
||||
network, or it can be used to target specific servers: by building a
|
||||
circuit back and forth between two victim servers, even a low-bandwidth
|
||||
attacker can soak up all the bandwidth offered by the fastest Tor
|
||||
servers.
|
||||
|
||||
The general attacks could be used as a demonstration that Tor isn't
|
||||
perfect (leading to yet more media articles about "breaking" Tor), and
|
||||
the targeted attacks will come into play once we have a reputation
|
||||
system -- it will be trivial to DoS a server so it can't pass its
|
||||
reputation checks, in turn impacting security.
|
||||
|
||||
Design:
|
||||
|
||||
We should split RELAY cells into two types: RELAY and RELAY_EARLY.
|
||||
|
||||
Only K (say, 10) relay_early cells can be sent across a circuit, and
|
||||
only relay_early cells are allowed to contain extend requests. We
|
||||
still support obscuring the length of the circuit (if more research
|
||||
shows us what to do), because Alice can choose how many of the K to
|
||||
mark as relay_early. Note that relay_early cells *can* contain any
|
||||
sort of data cell; so in effect it's actually the relay type cells
|
||||
that are restricted. By default, she would just send the first K
|
||||
data cells over the stream as relay_early cells, regardless of their
|
||||
actual type.
|
||||
|
||||
(Note that a circuit that is out of relay_early cells MUST NOT be
|
||||
cannibalized later, since it can't extend. Note also that it's always okay
|
||||
to use regular RELAY cells when sending non-EXTEND commands targeted at
|
||||
the first hop of a circuit, since there is no intermediate hop to try to
|
||||
learn the relay command type.)
|
||||
|
||||
Each intermediate server would pass on the same type of cell that it
|
||||
received (either relay or relay_early), and the cell's destination
|
||||
will be able to learn whether it's allowed to contain an Extend request.
|
||||
|
||||
If an intermediate server receives more than K relay_early cells, or
|
||||
if it sees a relay cell that contains an extend request, then it
|
||||
tears down the circuit (protocol violation).
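
The counting rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not Tor's code: `CircuitState`, `handle_cell`, `ProtocolViolation`, and the string cell types are hypothetical names, K = 8 is taken from the Parameters section of this proposal, and the `is_extend` flag is only meaningful at the hop where the cell is recognized, since other hops cannot see the relay command.

```python
K = 8  # maximum relay_early cells per circuit (see Parameters)


class ProtocolViolation(Exception):
    """Raised when the circuit must be torn down."""


class CircuitState:
    """Per-circuit bookkeeping kept by one server (hypothetical)."""

    def __init__(self, max_early=K):
        self.max_early = max_early
        self.early_seen = 0

    def handle_cell(self, cell_type, is_extend=False):
        """Apply the relay_early rules to one incoming cell.

        cell_type is "relay" or "relay_early"; is_extend says whether
        the payload is an extend request.
        """
        if cell_type == "relay_early":
            self.early_seen += 1
            if self.early_seen > self.max_early:
                raise ProtocolViolation("more than K relay_early cells")
        elif cell_type == "relay":
            if is_extend:
                raise ProtocolViolation("extend request in a relay cell")
        else:
            raise ValueError("unknown cell type: %r" % (cell_type,))
```

Note that the Migration section below defers rejecting extend requests in plain relay cells until old clients are obsolete; the sketch shows the end state of the design.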
|
||||
|
||||
Security implications:
|
||||
|
||||
The upside is that this limits the bandwidth amplification factor to
|
||||
K: for an individual circuit to become arbitrary-length, the attacker
|
||||
would need an adversary-controlled node every K hops, and at that
|
||||
point the attack is no worse than if the attacker creates N/K separate
|
||||
K-hop circuits.
|
||||
|
||||
On the other hand, we want to pick a large enough value of K that we
|
||||
don't mind the cap.
|
||||
|
||||
If we ever want to take steps to hide the number of hops in the circuit
|
||||
or a node's position in the circuit, this design probably makes that
|
||||
more complex.
|
||||
|
||||
Migration:
|
||||
|
||||
In 0.2.0, servers speaking v2 or later of the link protocol accept
|
||||
RELAY_EARLY cells, and pass them on. If the next OR in the circuit
|
||||
is not speaking the v2 link protocol, the server relays the cell as
|
||||
a RELAY cell.
|
||||
|
||||
In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2
|
||||
connections. This functionality can be safely backported to
|
||||
0.2.0.x. Clients should pick a random number between (say) K and
|
||||
K-2 to send.
|
||||
|
||||
In 0.2.1.3-alpha, servers close any circuit in which more than K
|
||||
relay_early cells are sent.
|
||||
|
||||
Once all versions that do not send RELAY_EARLY cells are obsolete,
|
||||
servers can begin to reject any EXTEND requests not sent in a
|
||||
RELAY_EARLY cell.
|
||||
|
||||
Parameters:
|
||||
|
||||
Let K = 8, for no terribly good reason.
|
||||
|
||||
Spec:
|
||||
|
||||
[We can formalize this part once we think the design is a good one.]
|
||||
|
||||
Acknowledgements:
|
||||
|
||||
This design has been kicking around since Christian Grothoff and I came
|
||||
up with it at PET 2004. (Nathan Evans, Christian Grothoff's student,
|
||||
is working on implementing a fix based on this design in the summer
|
||||
2007 timeframe.)
|
||||
|
@ -1,153 +0,0 @@
|
||||
Filename: 111-local-traffic-priority.txt
|
||||
Title: Prioritizing local traffic over relayed traffic
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 14-Mar-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
We describe some ways to let Tor users operate as a relay and enforce
|
||||
rate limiting for relayed traffic without impacting their locally
|
||||
initiated traffic.
|
||||
|
||||
Motivation:
|
||||
|
||||
Right now we encourage people who use Tor as a client to configure it
|
||||
as a relay too ("just click the button in Vidalia"). Most of these users
|
||||
are on asymmetric links, meaning they have a lot more download capacity
|
||||
than upload capacity. But if they enable rate limiting too, suddenly
|
||||
they're limited to the same download capacity as upload capacity. And
|
||||
they have to enable rate limiting, or their upstream pipe gets filled
|
||||
up, starts dropping packets, and now their net connection doesn't work
|
||||
even for non-Tor stuff. So they end up turning off the relaying part
|
||||
so they can use Tor (and other applications) again.
|
||||
|
||||
So far this hasn't mattered that much: most of our fast relays are
|
||||
being operated only in relay mode, so the rate limiting makes sense
|
||||
for them. But if we want to be able to attract many more relays in
|
||||
the future, we need to let ordinary users act as relays too.
|
||||
|
||||
Further, as we begin to deploy the blocking-resistance design and we
|
||||
rely on ordinary users to click the "Tor for Freedom" button, this
|
||||
limitation will become a serious stumbling block to getting volunteers
|
||||
to act as bridges.
|
||||
|
||||
The problem:
|
||||
|
||||
Tor implements its rate limiting on the 'read' side by only reading
|
||||
a certain number of bytes from the network in each second. If it has
|
||||
emptied its token bucket, it doesn't read any more from the network;
|
||||
eventually TCP notices and stalls until we resume reading. But if we
|
||||
want to have two classes of service, we can't know what class a given
|
||||
incoming cell will be until we look at it, at which point we've already
|
||||
read it.
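
A minimal Python sketch of read-side token-bucket limiting, to make the problem concrete. The `ReadBucket` class and its method names are hypothetical, and the separate `burst` cap is an assumption of the sketch rather than something stated above.

```python
class ReadBucket:
    """Read-side token bucket: refilled by `rate` bytes per second,
    never holding more than `burst` bytes (hypothetical sketch)."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst

    def refill(self, seconds=1):
        """Add `rate` tokens per elapsed second, capped at `burst`."""
        self.tokens = min(self.burst, self.tokens + self.rate * seconds)

    def allowance(self, wanted):
        """How many bytes we may read right now (0 when empty)."""
        return max(0, min(wanted, self.tokens))

    def spend(self, nbytes):
        """Charge bytes that have already been read from the network."""
        self.tokens -= nbytes
```

The caller has to ask `allowance()` and then `spend()` before it can parse the cell it just read, which is why a single bucket cannot tell local traffic from relayed traffic in advance.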
|
||||
|
||||
Some options:
|
||||
|
||||
Option 1: read when our token bucket is full enough, and if it turns
|
||||
out that what we read was local traffic, then add the tokens back into
|
||||
the token bucket. This will work when local traffic load alternates
|
||||
with relayed traffic load; but it's a poor option in general, because
|
||||
when we're receiving both local and relayed traffic, there are plenty
|
||||
of cases where we'll end up with an empty token bucket, and then we're
|
||||
back where we were before.
|
||||
|
||||
More generally, notice that our problem is easy when a given TCP
|
||||
connection either has entirely local circuits or entirely relayed
|
||||
circuits. In fact, even if they are both present, if one class is
|
||||
entirely idle (none of its circuits have sent or received in the past
|
||||
N seconds), we can ignore that class until it wakes up again. So it
|
||||
only gets complex when a single connection contains active circuits
|
||||
of both classes.
|
||||
|
||||
Next, notice that local traffic uses only the entry guards, whereas
|
||||
relayed traffic likely doesn't. So if we're a bridge handling just
|
||||
a few users, the expected number of overlapping connections would be
|
||||
almost zero, and even if we're a full relay the number of overlapping
|
||||
connections will be quite small.
|
||||
|
||||
Option 2: build separate TCP connections for local traffic and for
|
||||
relayed traffic. In practice this will actually only require a few
|
||||
extra TCP connections: we would only need redundant TCP connections
|
||||
to at most the number of entry guards in use.
|
||||
|
||||
However, this approach has some drawbacks. First, if the remote side
|
||||
wants to extend a circuit to you, how does it know which TCP connection
|
||||
to send it on? We would need some extra scheme to label some connections
|
||||
"client-only" during construction. Perhaps we could do this by seeing
|
||||
whether any circuit was made via CREATE_FAST; but this still opens
|
||||
up a race condition where the other side sends a create request
|
||||
immediately. The only ways I can imagine to avoid the race entirely
|
||||
are to specify our preference in the VERSIONS cell, or to add some
|
||||
sort of "nope, not this connection, why don't you try another rather
|
||||
than failing" response to create cells, or to forbid create cells on
|
||||
connections that you didn't initiate and on which you haven't seen
|
||||
any circuit creation requests yet -- this last one would lead to a bit
|
||||
more connection bloat but doesn't seem so bad. And we already accept
|
||||
this race for the case where directory authorities establish new TCP
|
||||
connections periodically to check reachability, and then hope to hang
|
||||
up on them soon after. (In any case this issue is moot for bridges,
|
||||
since each destination will be one-way with respect to extend requests:
|
||||
either receiving extend requests from bridge users or sending extend
|
||||
requests to the Tor server, never both.)
|
||||
|
||||
The second problem with option 2 is that using two TCP connections
|
||||
reveals that there are two classes of traffic (and probably quickly
|
||||
reveals which is which, based on throughput). Now, it's unclear whether
|
||||
this information is already available to the other relay -- he would
|
||||
easily be able to tell that some circuits are fast and some are rate
|
||||
limited, after all -- but it would be nice to not add even more ways to
|
||||
leak that information. Also, it's less clear that an external observer
|
||||
already has this information if the circuits are all bundled together,
|
||||
and for this case it's worth trying to protect it.
|
||||
|
||||
Option 3: tell the other side about our rate limiting rules. When we
|
||||
establish the TCP connection, specify the different policy classes we
|
||||
have configured. Each time we extend a circuit, specify which policy
|
||||
class that circuit should be part of. Then hope the other side obeys
|
||||
our wishes. (If he doesn't, hang up on him.) Besides the design and
|
||||
coordination hassles involved in this approach, there's a big problem:
|
||||
our rate limiting classes apply to all our connections, not just
|
||||
pairwise connections. How does one server we're connected to know how
|
||||
much of our bucket has already been spent by another? I could imagine
|
||||
a complex and inefficient "ok, now you can send me those two more cells
|
||||
that you've got queued" protocol. I'm not sure how else we could do it.
|
||||
|
||||
(Gosh. How could UDP designs possibly be compatible with rate limiting
|
||||
with multiple bucket sizes?)
|
||||
|
||||
Option 4: put both classes of circuits over a single connection, and
|
||||
keep track of the last time we read or wrote a high-priority cell. If
|
||||
it's been less than N seconds, give the whole connection high priority,
|
||||
else give the whole connection low priority.
|
||||
|
||||
Option 5: put both classes of circuits over a single connection, and
|
||||
play a complex juggling game by periodically telling the remote side
|
||||
what rate limits to set for that connection, so you end up giving
|
||||
priority to the right connections but still stick to roughly your
|
||||
intended bandwidthrate and relaybandwidthrate.
|
||||
|
||||
Option 6: ?
|
||||
|
||||
Prognosis:
|
||||
|
||||
Nick really didn't like option 2 because of the partitioning questions.
|
||||
|
||||
I've put option 4 into place as of Tor 0.2.0.3-alpha.
|
||||
|
||||
In terms of implementation, it will be easy: just add a time_t to
|
||||
or_connection_t that specifies client_used (used by the initiator
|
||||
of the connection to rate limit it differently depending on how
|
||||
recently the time_t was reset). We currently update client_used
|
||||
in three places:
|
||||
- command_process_relay_cell() when we receive a relay cell for
|
||||
an origin circuit.
|
||||
- relay_send_command_from_edge() when we send a relay cell for
|
||||
an origin circuit.
|
||||
- circuit_deliver_create_cell() when we send a create cell.
|
||||
We could probably remove the third case and it would still work,
|
||||
but hey.
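
The client_used bookkeeping for option 4 can be sketched in Python as follows. This is not the code in or_connection_t: the `OrConnection` class, its method names, and the `window_seconds` parameter (the "N seconds" of option 4, whose value is not given here) are assumptions made for illustration.

```python
import time


class OrConnection:
    """Hypothetical stand-in for the per-connection state."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.client_used = None  # time of the last local (origin) cell

    def note_local_cell(self, now=None):
        """Call when a cell for an origin circuit is sent or received."""
        self.client_used = time.time() if now is None else now

    def is_high_priority(self, now=None):
        """True if local traffic used this connection within the window."""
        if self.client_used is None:
            return False
        now = time.time() if now is None else now
        return (now - self.client_used) < self.window
```

A connection that has recently carried local traffic is then treated as high priority as a whole, relayed circuits included, until the window expires.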
|
||||
|
@ -1,165 +0,0 @@
|
||||
Filename: 112-bring-back-pathlencoinweight.txt
|
||||
Title: Bring Back Pathlen Coin Weight
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Mike Perry
|
||||
Created:
|
||||
Status: Superseded
|
||||
Superseded-By: 115
|
||||
|
||||
|
||||
Overview:
|
||||
|
||||
The idea is that users should be able to choose a weight which
|
||||
probabilistically chooses their path lengths to be 2 or 3 hops. This
|
||||
weight will essentially be a biased coin that indicates an
|
||||
additional hop (beyond 2) with probability P. The user should be
|
||||
allowed to choose 0 for this weight to always get 2 hops and 1 to
|
||||
always get 3.
|
||||
|
||||
This value should be modifiable from the controller, and should be
|
||||
available from Vidalia.
|
||||
|
||||
|
||||
Motivation:
|
||||
|
||||
The Tor network is slow and overloaded. Increasingly often I hear
|
||||
stories about friends and friends of friends who are behind firewalls,
|
||||
annoying censorware, or under surveillance that interferes with their
|
||||
productivity and Internet usage, or chills their speech. These people
|
||||
know about Tor, but they choose to put up with the censorship because
|
||||
Tor is too slow to be usable for them. In fact, to download a fresh,
|
||||
complete copy of levine-timing.pdf for the Anonymity Implications
|
||||
section of this proposal over Tor took me 3 tries.
|
||||
|
||||
There are many ways to improve the speed problem, and of course we
|
||||
should and will implement as many as we can. Johannes's GSoC project
|
||||
and my reputation system are longer term, higher-effort things that
|
||||
will still provide benefit independent of this proposal.
|
||||
|
||||
However, reducing the path length to 2 for those who do not need the
|
||||
(questionable) extra anonymity 3 hops provide not only improves
|
||||
their Tor experience but also reduces their load on the Tor network by
|
||||
33%, and can be done in less than 10 lines of code. That's not just
|
||||
Win-Win, it's Win-Win-Win.
|
||||
|
||||
Furthermore, when blocking resistance measures insert an extra relay
|
||||
hop into the equation, 4 hops will certainly be completely unusable
|
||||
for these users, especially since it will be considerably more
|
||||
difficult to balance the load across a dark relay net than balancing
|
||||
the load on Tor itself (which today is still not without its flaws).
|
||||
|
||||
|
||||
Anonymity Implications:
|
||||
|
||||
It has long been established that timing attacks against mix
|
||||
networks are extremely effective, and that regardless of path
|
||||
length, if the adversary has compromised your first and last
|
||||
hop of your path, you can assume they have compromised your
|
||||
identity for that connection.
|
||||
|
||||
In [1], it is demonstrated that for all but the slowest, lossiest
|
||||
networks, error rates for false positives and false negatives were
|
||||
very near zero. Only for constant streams of traffic over slow and
|
||||
(more importantly) extremely lossy network links did the error rate
|
||||
hit 20%. For loss rates typical to the Internet, even the error rate
|
||||
for slow nodes with constant traffic streams was 13%.
|
||||
|
||||
When you take into account that most Tor streams are not constant,
|
||||
but probably much more like their "HomeIP" dataset, which consists
|
||||
mostly of web traffic that exists over finite intervals at specific
|
||||
times, error rates drop to fractions of 1%, even for the "worst"
|
||||
network nodes.
|
||||
|
||||
Therefore, the user has little benefit from the extra hop, assuming
|
||||
the adversary does timing correlation on their nodes. The real
|
||||
protection is the probability of getting both the first and last hop,
|
||||
and this is constant whether the client chooses 2 hops, 3 hops, or 42.
|
||||
|
||||
Partitioning attacks form another concern. Since Tor uses telescoping
|
||||
to build circuits, it is possible to tell a user is constructing only
|
||||
two hop paths at the entry node. It is questionable if this data is
|
||||
actually worth anything though, especially if the majority of users
|
||||
have easy access to this option, and do actually choose their path
|
||||
lengths semi-randomly.
|
||||
|
||||
Nick has postulated that exits may also be able to tell that you are
|
||||
using only 2 hops by the amount of time between sending their
|
||||
RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
|
||||
see from the OP. I doubt that they will be able to make much use
|
||||
of this timing pattern, since it will likely vary widely depending
|
||||
upon the type of node selected for that first hop, and the user's
|
||||
connection rate to that first hop. It is also questionable if this
|
||||
data is worth anything, especially if many users are using this
|
||||
option (and I imagine many will).
|
||||
|
||||
Perhaps most seriously, two hop paths do allow malicious guards
|
||||
to easily fail circuits if they do not extend to their colluding peers
|
||||
for the exit hop. Since guards can detect the number of hops in a
|
||||
path, they could always fail the 3 hop circuits and focus on
|
||||
selectively failing the two hop ones until a peer was chosen.
|
||||
|
||||
I believe currently guards are rotated if circuits fail, which does
|
||||
provide some protection, but this could be changed so that an entry
|
||||
guard is completely abandoned after a certain ratio of extend or
|
||||
general circuit failures with respect to non-failed circuits. This
|
||||
could possibly be gamed to increase guard turnover, but such a game
|
||||
would be much more noticeable than an individual guard failing circuits,
|
||||
though, since it would affect all clients, not just those who chose
|
||||
a particular guard.
|
||||
|
||||
|
||||
Why not fix Pathlen=2?:
|
||||
|
||||
The main reason I am not advocating that we always use 2 hops is that
|
||||
in some situations, timing correlation evidence by itself may not be
|
||||
considered as solid and convincing as an actual, uninterrupted, fully
|
||||
traced path. Are these timing attacks as effective on a real network
|
||||
as they are in simulation? Would an extralegal adversary or authoritarian
|
||||
government even care? In the face of these situation-dependent unknowns,
|
||||
it should be up to the user to decide if this is a concern for them or not.
|
||||
|
||||
It should probably also be noted that even a false positive
|
||||
rate of 1% for a 200k concurrent-user network could mean that for a
|
||||
given node, a given stream could be confused with something like 10
|
||||
users, assuming ~200 nodes carry most of the traffic (ie 1000 users
|
||||
each). Though of course to really know for sure, someone needs to do
|
||||
an attack on a real network, unfortunately.
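
The back-of-the-envelope estimate above can be written out in Python. The function name is hypothetical, and the inputs are exactly the rough figures from the paragraph: 200k concurrent users, about 200 nodes carrying most of the traffic, and a 1% false positive rate.

```python
def confusable_users(total_users, busy_nodes, false_positive_rate):
    """Users at one node whose streams could be mistaken for a given
    stream, assuming users are spread evenly over the busy nodes."""
    users_per_node = total_users / busy_nodes
    return users_per_node * false_positive_rate


estimate = confusable_users(200_000, 200, 0.01)
print(round(estimate))
```

With these inputs each node sees about 1000 users, and 1% of those is roughly 10.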
|
||||
|
||||
|
||||
Implementation:
|
||||
|
||||
new_route_len() can be modified directly with a check of the
|
||||
PathlenCoinWeight option (converted to percent) and a call to
|
||||
crypto_rand_int(0,100) for the weighted coin.
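
A Python sketch of the weighted coin, assuming the weight is given as a probability between 0 and 1. Python's `secrets.randbelow` stands in for the crypto_rand_int call named above; the function body is an illustration, not the actual new_route_len().

```python
import secrets


def new_route_len(pathlen_coin_weight):
    """Return 2 or 3 hops.

    The third hop is added with probability `pathlen_coin_weight`:
    0.0 always gives 2 hops and 1.0 always gives 3.
    """
    if not 0.0 <= pathlen_coin_weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    percent = int(round(pathlen_coin_weight * 100))
    roll = secrets.randbelow(100)  # uniform integer in [0, 100)
    return 3 if roll < percent else 2
```

Comparing with `roll < percent` keeps both endpoints exact: no roll is below 0, and every roll is below 100.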
|
||||
|
||||
The entry_guard_t structure could have num_circ_failed and
|
||||
num_circ_succeeded members such that if it exceeds N% circuit
|
||||
extend failure rate to a second hop, it is removed from the entry list.
|
||||
N should be sufficiently high to avoid churn from normal Tor circuit
|
||||
failure as determined by TorFlow scans.
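
A Python sketch of the failure counters, under stated assumptions: the `EntryGuard` class here is a hypothetical stand-in for entry_guard_t, the threshold N is passed in as `max_failure_percent` because no value is fixed above, and the `min_attempts` floor is an addition of the sketch so that a guard is not dropped after a single failed extend.

```python
class EntryGuard:
    """Hypothetical stand-in for entry_guard_t."""

    def __init__(self, name):
        self.name = name
        self.num_circ_failed = 0
        self.num_circ_succeeded = 0

    def note_extend(self, succeeded):
        """Record the result of one extend to a second hop."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def failure_percent(self):
        total = self.num_circ_failed + self.num_circ_succeeded
        return 100.0 * self.num_circ_failed / total if total else 0.0


def prune_guards(guards, max_failure_percent, min_attempts=20):
    """Return the guards to keep, dropping those over the threshold."""
    kept = []
    for g in guards:
        attempts = g.num_circ_failed + g.num_circ_succeeded
        if attempts >= min_attempts and g.failure_percent() > max_failure_percent:
            continue  # too many failed extends: remove from entry list
        kept.append(g)
    return kept
```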
|
||||
|
||||
The Vidalia option should be presented as a boolean, to minimize confusion
|
||||
for the user. Something like a radiobutton with:
|
||||
|
||||
* "I use Tor for Censorship Resistance, not Anonymity. Speed is more
|
||||
important to me than Anonymity."
|
||||
* "I use Tor for Anonymity. I need extra protection at the cost of speed."
|
||||
|
||||
and then some explanation in the help for exactly what this means, and
|
||||
the risks involved with eliminating the adversary's need for timing attacks
|
||||
with respect to false positives, etc.
|
||||
|
||||
Migration:
|
||||
|
||||
Phase one: Experiment with the proper ratio of circuit failures
|
||||
used to expire garbage or malicious guards via TorFlow.
|
||||
|
||||
Phase two: Re-enable config and modify new_route_len() to add an
|
||||
extra hop if coin comes up "heads".
|
||||
|
||||
Phase three: Make radiobutton in Vidalia, along with help entry
|
||||
that explains in layman's terms the risks involved.
|
||||
|
||||
|
||||
[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
|
@ -1,87 +0,0 @@
|
||||
Filename: 113-fast-authority-interface.txt
|
||||
Title: Simplifying directory authority administration
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created:
|
||||
Status: Superseded
|
||||
|
||||
Overview
|
||||
|
||||
The problem:
|
||||
|
||||
Administering a directory authority is a pain: you need to go through
|
||||
emails and manually add new nodes as "named". When bad things come up,
|
||||
you need to mark nodes (or whole regions) as invalid, badexit, etc.
|
||||
|
||||
This means that mostly, authority admins don't: only 2/4 current authority
|
||||
admins actually bind names or list bad exits, and those two have often
|
||||
complained about how annoying it is to do so.
|
||||
|
||||
Worse, name binding is a common path, but it's a pain in the neck: nobody
|
||||
has done it for a couple of months.
|
||||
|
||||
Digression: who knows what?
|
||||
|
||||
It's trivial for Tor to automatically keep track of all of the
|
||||
following information about a server:
|
||||
name, fingerprint, IP, last-seen time, first-seen time, declared
|
||||
contact.
|
||||
|
||||
All we need to have the administrator set is:
|
||||
- Is this name/fingerprint pair bound?
|
||||
- Is this fingerprint/IP a bad exit?
|
||||
- Is this fingerprint/IP an invalid node?
|
||||
- Is this fingerprint/IP to be rejected?
|
||||
|
||||
The workflow for authority admins has two parts:
|
||||
- Periodically, go through tor-ops and add new names. This doesn't
|
||||
need to be done urgently.
|
||||
- Less often, mark badly behaved servers as badly behaved. This is more
|
||||
urgent.
|
||||
|
||||
Possible solution #1: Web-interface for name binding.
|
||||
|
||||
Deprecate use of the tor-ops mailing list; instead, have operators go to a
|
||||
webform and enter their server info. This would put the information in a
|
||||
standardized format, thus allowing quick, nearly-automated approval and
|
||||
reply.
|
||||
|
||||
Possible solution #2: Self-binding names.
|
||||
|
||||
Peter Palfrader has proposed that names be assigned automatically to nodes
|
||||
that have been up and running and valid for a while.
|
||||
|
||||
Possible solution #3: Self-maintaining approved-routers file
|
||||
|
||||
Mixminion alpha has a neat feature where whenever a new server is seen,
|
||||
a stub line gets added to a configuration file. For Tor, it could look
|
||||
something like this:
|
||||
|
||||
## First seen with this key on 2007-04-21 13:13:14
|
||||
## Stayed up for at least 12 hours on IP 192.168.10.10
|
||||
#RouterName AAAABBBBCCCCDDDDEFEF
|
||||
|
||||
(Note that the implementation needs to parse commented lines to make sure
|
||||
that it doesn't add duplicates, but that's not so hard.)
|
||||
|
||||
To add a router as named, administrators would only need to uncomment the
|
||||
entry. This automatically maintained file could be kept separately from a
|
||||
manually maintained one.
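
A Python sketch of how stub lines might be added without duplicates. It assumes the simplified line format of the example above (a `##` informational comment, then `#Nickname FINGERPRINT` for a commented-out entry); the function names and the regular expression are hypothetical, and a real approved-routers parser may differ.

```python
import re

# An entry line, optionally commented out with a single '#'.
ENTRY = re.compile(r"^#?\s*(\S+)\s+([0-9A-Fa-f]+)$")


def known_fingerprints(lines):
    """Collect fingerprints from entry lines, commented out or not."""
    found = set()
    for line in lines:
        line = line.strip()
        if line.startswith("##"):
            continue  # informational comment, not an entry
        m = ENTRY.match(line)
        if m:
            found.add(m.group(2).upper())
    return found


def add_stub(lines, nickname, fingerprint, first_seen, uptime_note):
    """Append a commented-out stub unless the fingerprint is present.

    Returns True if a stub was added.
    """
    if fingerprint.upper() in known_fingerprints(lines):
        return False
    lines.append("## First seen with this key on %s" % first_seen)
    lines.append("## %s" % uptime_note)
    lines.append("#%s %s" % (nickname, fingerprint))
    return True
```

Because commented-out entries are parsed too, a router stays listed exactly once whether or not the administrator has uncommented its line.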
|
||||
|
||||
This could be combined with solution #2, such that Tor would do the hard
|
||||
work of uncommenting entries for routers that should get Named, but
|
||||
operators could override its decisions.
|
||||
|
||||
Possible solution #4: A separate mailing list for authority operators.
|
||||
|
||||
Right now, the tor-ops list is very high volume. There should be another
|
||||
list that's only for dealing with problems that need prompt action, like
|
||||
marking a router as !badexit.
|
||||
|
||||
Resolution:
|
||||
|
||||
Solution #2 is described in "Proposal 123: Naming authorities
|
||||
automatically create bindings", and that approach is implemented.
|
||||
There are remaining issues in the problem statement above that need
|
||||
their own solutions.
|
@ -1,441 +0,0 @@
|
||||
Filename: 114-distributed-storage.txt
|
||||
Title: Distributed Storage for Tor Hidden Service Descriptors
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Karsten Loesing
|
||||
Created: 13-May-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Change history:
|
||||
|
||||
13-May-2007 Initial proposal
|
||||
14-May-2007 Added changes suggested by Lasse Øverlier
|
||||
30-May-2007 Changed descriptor format, key length discussion, typos
|
||||
09-Jul-2007 Incorporated suggestions by Roger, added status of specification
|
||||
and implementation for upcoming GSoC mid-term evaluation
|
||||
11-Aug-2007 Updated implementation statuses, included non-consecutive
|
||||
replication to descriptor format
|
||||
20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2
|
||||
02-Dec-2007 Closed proposal
|
||||
|
||||
Overview:
|
||||
|
||||
The basic idea of this proposal is to distribute the tasks of storing and
|
||||
serving hidden service descriptors from currently three authoritative
|
||||
directory nodes among a large subset of all onion routers. The three
|
||||
reasons to do this are better robustness (availability), better
|
||||
scalability, and improved security properties. Further,
|
||||
this proposal suggests changes to the hidden service descriptor format to
|
||||
prevent new security threats coming from decentralization and to gain even
|
||||
better security properties.
|
||||
|
||||
Status:
|
||||
|
||||
As of December 2007, the new hidden service descriptor format is implemented
|
||||
and usable. However, servers and clients do not yet make use of descriptor
|
||||
cookies, because there are open usability issues of this feature that might
|
||||
be resolved in proposal 121. Further, hidden service directories do not
|
||||
perform replication by themselves, because (unauthorized) replica fetch
|
||||
requests would allow any attacker to fetch all hidden service descriptors in
|
||||
the system. As neither issue is critical to the functioning of v2
|
||||
descriptors and their distribution, this proposal is considered as Closed.
|
||||
|
||||
Motivation:
|
||||
|
||||
The current design of hidden services exhibits the following performance and
|
||||
security problems:
|
||||
|
||||
First, the three hidden service authoritative directories constitute a
|
||||
performance bottleneck in the system. The directory nodes are responsible for
|
||||
storing and serving all hidden service descriptors. As of May 2007 there are
|
||||
about 1000 descriptors at a time, but this number is assumed to increase in
|
||||
the future. Further, there is no replication protocol for descriptors between
|
||||
the three directory nodes, so that hidden services must ensure the
|
||||
availability of their descriptors by manually publishing them on all
|
||||
directory nodes. Whenever a fourth or fifth hidden service authoritative
|
||||
directory is added, hidden services will need to maintain an equally
|
||||
increasing number of replicas. These scalability issues have an impact on the
|
||||
current usage of hidden services and put an even higher burden on the
|
||||
development of new kinds of applications for hidden services that might
|
||||
require storing even more descriptors.
|
||||
|
||||
Second, besides posing a limitation to scalability, storing all hidden
|
||||
service descriptors on three directory nodes also constitutes a security
|
||||
risk. The directory node operators could easily analyze the publish and fetch
|
||||
requests to derive information on service activity and usage and read the
|
||||
descriptor contents to determine which onion routers work as introduction
|
||||
points for a given hidden service and need to be attacked or threatened to
|
||||
shut it down. Furthermore, the contents of a hidden service descriptor offer
|
||||
only minimal security properties to the hidden service. Whoever becomes aware of
|
||||
the service ID can easily find out whether the service is active at the
|
||||
moment and which introduction points it has. This applies to (former)
|
||||
clients, (former) introduction points, and of course to the directory nodes.
|
||||
It requires only to request the descriptor for the given service ID, which
|
||||
can be performed by anyone anonymously.
|
||||
|
||||
This proposal suggests two major changes to approach the described
|
||||
performance and security problems:
|
||||
|
||||
The first change affects the storage location for hidden service descriptors.
|
||||
Descriptors are distributed among a large subset of all onion routers instead
|
||||
of three fixed directory nodes. Each storing node is responsible for a subset
|
||||
of descriptors for a limited time only. It is not able to choose which
|
||||
descriptors it stores at a certain time, because this is determined by its
|
||||
onion ID which is hard to change frequently and in time (only routers which
|
||||
are stable for a given time are accepted as storing nodes). In order to
|
||||
resist single node failures and untrustworthy nodes, descriptors are
|
||||
replicated among a certain number of storing nodes. A first replication
|
||||
protocol makes sure that descriptors don't get lost when the node population
|
||||
changes; therefore, a storing node periodically requests the descriptors from
|
||||
its siblings. A second replication protocol distributes descriptors among
|
||||
non-consecutive nodes of the ID ring to prevent a group of adversaries from
|
||||
generating new onion keys until they have consecutive IDs to create a 'black
|
||||
hole' in the ring and make random services unavailable. Connections to
|
||||
storing nodes are established by extending existing circuits by one hop to
|
||||
the storing node. This also ensures that contents are encrypted. The effect
|
||||
of this first change is that the probability that a single node operator
|
||||
learns about a certain hidden service is very small and that it is very hard
|
||||
to track a service over time, even when it collaborates with other node
|
||||
operators.
|
||||
|
||||
The second change concerns the content of hidden service descriptors.
|
||||
Obviously, security problems cannot be solved only by decentralizing storage;
|
||||
in fact, they could also get worse if done without caution. First, a
|
||||
descriptor ID needs to change periodically in order to be stored on changing
|
||||
nodes over time. Next, the descriptor ID needs to be computable only for the
|
||||
service's clients, but should be unpredictable for all other nodes. Further,
|
||||
the storing node needs to be able to verify that the hidden service is the
|
||||
true originator of the descriptor with the given ID even though it is not a
|
||||
client. Finally, a storing node should learn as little information as
|
||||
necessary by storing a descriptor, because it might not be as trustworthy as
|
||||
a directory node; for example it does not need to know the list of
|
||||
introduction points. Therefore, a second key is applied that is only known to
|
||||
the hidden service provider and its clients and that is not included in the
|
||||
descriptor. It is used to calculate descriptor IDs and to encrypt the
|
||||
introduction points. This second key can either be given to all clients
|
||||
together with the hidden service ID, or to a group or a single client as
|
||||
an authentication token. In the future this second key could be the result of
|
||||
some key agreement protocol between the hidden service and one or more
|
||||
clients. A new text-based format is proposed for descriptors instead of an
|
||||
extension of the existing binary format for reasons of future extensibility.
|
||||
|
||||
Design:
|
||||
|
||||
The proposed design is described by the required changes to the current
|
||||
design. These requirements are grouped by content, rather than by affected
|
||||
specification documents or code files, and numbered for reference below.
|
||||
|
||||
Hidden service clients, servers, and directories:
|
||||
|
||||
/1/ Create routing list
|
||||
|
||||
All participants can filter the consensus status document received from the
|
||||
directory authorities to one routing list containing only those servers
|
||||
that store and serve hidden service descriptors and which have been running for
|
||||
at least 24 hours. A participant only trusts its own routing list and never
|
||||
learns about routing information from other parties.
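The filtering step in /1/ can be sketched as follows. This is a minimal
illustration, not the Tor implementation: the RouterStatus type, its field
names, and the use of the identity digest as the sort key are assumptions
made for the example.

```python
from dataclasses import dataclass

MIN_UPTIME_SECONDS = 24 * 60 * 60  # "running for at least 24 hours"


@dataclass(frozen=True)
class RouterStatus:
    """Hypothetical, simplified view of one consensus entry."""
    identity: bytes       # identity digest, assumed to be the ring position
    is_hs_dir: bool       # router stores and serves hidden service descriptors
    uptime_seconds: int   # how long the router has been running


def build_routing_list(consensus):
    """Return the eligible hidden service directories, sorted by identity."""
    eligible = [
        r for r in consensus
        if r.is_hs_dir and r.uptime_seconds >= MIN_UPTIME_SECONDS
    ]
    return sorted(eligible, key=lambda r: r.identity)
```

Each participant would run such a filter on its own copy of the consensus,
which matches the requirement that it trusts only its own routing list.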
|
||||
|
||||
/2/ Determine responsible hidden service directory
|
||||
|
||||
All participants can determine the hidden service directory that is
|
||||
responsible for storing and serving a given ID, as well as the hidden
|
||||
service directories that replicate its content. Every hidden service
|
||||
directory is responsible for the descriptor IDs in the interval from
|
||||
its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
|
||||
service directory holds replicas for its n predecessors, where n denotes
|
||||
the number of consecutive replicas. (requires /1/)
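The interval rule in /2/ amounts to a lookup on a sorted ring of IDs. The
sketch below assumes that IDs are byte strings of equal length and that
n_replicas counts the responsible directory plus its successors; both the
function name and that parameter are hypothetical.

```python
from bisect import bisect_left


def responsible_directories(ring, descriptor_id, n_replicas=3):
    """Return the directory IDs that hold descriptor_id.

    ring: sorted list of directory IDs (bytes of equal length).
    The first result is the responsible directory, i.e. the first ID that
    is greater than or equal to descriptor_id, wrapping around the ring.
    The remaining results are its successors, which hold the replicas.
    """
    if not ring:
        return []
    start = bisect_left(ring, descriptor_id) % len(ring)
    count = min(n_replicas, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

Because the interval is exclusive of the predecessor and inclusive of the
directory's own ID, a descriptor ID equal to a directory ID maps to that
directory, which is what bisect_left yields.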
|
||||
|
||||
[/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
|
||||
requests which have not been fulfilled in the course of the implementation
|
||||
of this proposal, but elsewhere.]
|
||||
|
||||
Hidden service directory nodes:
|
||||
|
||||
/5/ Advertise hidden service directory functionality
|
||||
|
||||
Every onion router that has its directory port open can decide whether it
|
||||
wants to store and serve hidden service descriptors by setting a new config
|
||||
option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
|
||||
option being set includes the flag "hidden-service-dir" in its router
|
||||
descriptors that it sends to directory authorities.
|
||||
|
||||
/6/ Accept v2 publish requests, parse and store v2 descriptors
|
||||
|
||||
Hidden service directory nodes accept publish requests for hidden service
|
||||
descriptors and store them to their local memory. (It is not necessary to
|
||||
make descriptors persistent, because after disconnecting, the onion router
|
||||
would not be accepted as storing node anyway, because it has not been
|
||||
running for at least 24 hours.) All requests and replies are formatted as
|
||||
HTTP messages. Requests are directed to the router's directory port and are
|
||||
contained within BEGIN_DIR cells. A hidden service directory node stores a
|
||||
descriptor only when it thinks that it is responsible for storing that
|
||||
descriptor based on its own routing table. Every hidden service directory
|
||||
node is responsible for the descriptor IDs in the interval of its n-th
|
||||
predecessor in the ID circle up to its own ID (n denotes the number of
|
||||
consecutive replicas). (requires /1/)
|
||||
|
||||
/7/ Accept v2 fetch requests
|
||||
|
||||
Same as /6/, but with fetch requests for hidden service descriptors.
|
||||
(requires /2/)
|
||||
|
||||
/8/ Replicate descriptors with neighbors
|
||||
|
||||
A hidden service directory node replicates descriptors from its two
|
||||
predecessors by downloading them once an hour. Further, it checks its
|
||||
routing table periodically for changes. Whenever it realizes that a
|
||||
predecessor has left the network, it establishes a connection to the new
|
||||
n-th predecessor and requests its stored descriptors in the interval of its
|
||||
(n+1)-th predecessor and the requested n-th predecessor. Whenever it
|
||||
realizes that a new onion router has joined with an ID higher than its
|
||||
former n-th predecessor, it adds it to its predecessors and discards all
|
||||
descriptors in the interval of its (n+1)-th and its n-th predecessor.
|
||||
(requires /1/)
|
||||
|
||||
[Dec 02: This function has not been implemented, because arbitrary nodes
|
||||
would have been able to download the entire set of v2 descriptors. An
|
||||
authorized replication request would be necessary. For the moment, the
|
||||
system runs without any directory-side replication. -KL]
|
||||
|
||||
Authoritative directory nodes:
|
||||
|
||||
/9/ Confirm a router's hidden service directory functionality
|
||||
|
||||
Directory nodes include a new flag "HSDir" for routers that decided to
|
||||
provide storage for hidden service descriptors and that have been running for at
|
||||
least 24 hours. The last requirement prevents a node from frequently
|
||||
changing its onion key to become responsible for an identifier it wants to
|
||||
target.
|
||||
|
||||
Hidden service provider:
|
||||
|
||||
/10/ Configure v2 hidden service
|
||||
|
||||
Each hidden service provider that has set the config option
|
||||
"PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
|
||||
descriptors and conform to the v2 connection establishment protocol. When
|
||||
configuring a hidden service, a hidden service provider checks if it has
|
||||
already created a random secret_cookie and a hostname2 file; if not, it
|
||||
creates both of them. (requires /2/)
|
||||
|
||||
/11/ Establish introduction points with fresh key
|
||||
|
||||
If configured to publish only v2 descriptors and no v0/v1 descriptors any
|
||||
more, a hidden service provider that is setting up the hidden service at
|
||||
introduction points does not pass its own public key, but the public key
|
||||
of a freshly generated key pair. It also includes these fresh public keys
|
||||
in the hidden service descriptor together with the other introduction point
|
||||
information. The reason is that the introduction point does not need to and
|
||||
therefore should not know for which hidden service it works, so as to
|
||||
prevent it from tracking the hidden service's activity. (If a hidden
|
||||
service provider supports both v0/v1 and v2 descriptors, v0/v1 clients
|
||||
rely on the fact that all introduction points accept the same public key,
|
||||
so that this new feature cannot be used.)
|
||||
|
||||
/12/ Encode v2 descriptors and send v2 publish requests
|
||||
|
||||
If configured to publish v2 descriptors, a hidden service provider
|
||||
publishes a new descriptor whenever its content changes or a new
|
||||
publication period starts for this descriptor. If the current publication
|
||||
period would only last for less than 60 minutes (= 2 x 30 minutes to allow
|
||||
the server to be 30 minutes behind and the client 30 minutes ahead), the
|
||||
hidden service provider publishes both a current descriptor and one for
|
||||
the next period. Publication is performed by sending the descriptor to all
|
||||
hidden service directories that are responsible for keeping replicas for
|
||||
the descriptor ID. This includes two non-consecutive replicas that are
|
||||
stored at 3 consecutive nodes each. (requires /1/ and /2/)
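The rule for publishing a second descriptor near the end of a period can be
sketched as below. The function name and the period_offset parameter are
assumptions for illustration; the 24-hour period and the 60-minute threshold
come from the text.

```python
PERIOD_SECONDS = 24 * 60 * 60   # the time period changes every 24 hours
OVERLAP_SECONDS = 60 * 60       # 2 x 30 minutes of tolerated clock skew


def periods_to_publish(now, period_offset=0):
    """Return the time-period numbers for which to publish descriptors.

    now: seconds since the epoch.
    period_offset: per-service shift in seconds (hypothetical parameter),
    so that periods do not change at the same moment for all services.
    """
    shifted = now + period_offset
    current = shifted // PERIOD_SECONDS
    remaining = PERIOD_SECONDS - (shifted % PERIOD_SECONDS)
    if remaining < OVERLAP_SECONDS:
        return [current, current + 1]
    return [current]
```

For each returned period, the provider would compute the descriptor IDs of
both non-consecutive replicas and upload to the responsible directories.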
|
||||
|
||||
Hidden service client:
|
||||
|
||||
/13/ Send v2 fetch requests
|
||||
|
||||
A hidden service client that has set the config option
|
||||
"FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
|
||||
addresses by requesting a v2 descriptor from a randomly chosen hidden
|
||||
service directory that is responsible for keeping a replica for the
|
||||
descriptor ID. In total there are six replicas of which the first and the
|
||||
last three are stored on consecutive nodes. The probability of picking one
|
||||
of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
|
||||
fact that the availability will be the highest on the node with next higher
|
||||
ID. A hidden service client relies on the hidden service provider to store
|
||||
two sets of descriptors to compensate clock skew between service and
|
||||
client. (requires /1/ and /2/)
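The weighted choice among three consecutive directories might look like the
sketch below. The text does not state which position receives which weight,
so the ordering used here (the last entry gets 3/6) is an assumption, as is
the function name.

```python
import random


def pick_directory(consecutive_dirs, rng=random):
    """Pick one of three consecutive directories with weights 1:2:3.

    consecutive_dirs: the three directories holding one replica set,
    ordered so that the last entry is the one assumed to have the
    highest availability.
    rng: the random module or a random.Random instance.
    """
    if len(consecutive_dirs) != 3:
        raise ValueError("expected exactly three consecutive directories")
    return rng.choices(consecutive_dirs, weights=[1, 2, 3], k=1)[0]
```

A client would first choose one of the two non-consecutive replica sets and
then apply this choice within the set.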
|
||||
|
||||
/14/ Process v2 fetch reply and parse v2 descriptors
|
||||
|
||||
A hidden service client that has sent a request for a v2 descriptor can
|
||||
parse it and store it to the local cache of rendezvous service descriptors.
|
||||
|
||||
/15/ Establish connection to v2 hidden service
|
||||
|
||||
A hidden service client can establish a connection to a hidden service
|
||||
using a v2 descriptor. This includes using the secret cookie for decrypting
|
||||
the introduction points contained in the descriptor. When contacting an
|
||||
introduction point, the client does not use the public key of the hidden
|
||||
service provider, but the freshly-generated public key that is included in
|
||||
the hidden service descriptor. Whether or not a fresh key is used instead
|
||||
of the key of the hidden service depends on the available protocol versions
|
||||
that are included in the descriptor; by this, connection establishment is
|
||||
to a certain extent decoupled from fetching the descriptor.
|
||||
|
||||
Hidden service descriptor:
|
||||
|
||||
(Requirements concerning the descriptor format are contained in /6/ and /7/.)
|
||||
|
||||
The new v2 hidden service descriptor format looks like this:
|
||||
|
||||
onion-address = h(public-key) + cookie
|
||||
descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
|
||||
descriptor-content = {
|
||||
descriptor-id,
|
||||
version,
|
||||
public-key,
|
||||
h(time-period + cookie + replica),
|
||||
timestamp,
|
||||
protocol-versions,
|
||||
{ introduction-points } encrypted with cookie
|
||||
} signed with private-key
|
||||
|
||||
The "descriptor-id" needs to change periodically in order for the
|
||||
descriptor to be stored on changing nodes over time. It may only be
|
||||
computable by a hidden service provider and all of his clients to prevent
|
||||
unauthorized nodes from tracking the service activity by periodically
|
||||
checking whether there is a descriptor for this service. Finally, the
|
||||
hidden service directory needs to be able to verify that the hidden service
|
||||
provider is the true originator of the descriptor with the given ID.
|
||||
|
||||
Therefore, "descriptor-id" is derived from the "public-key" of the hidden
|
||||
service provider, the current "time-period" which changes every 24 hours,
|
||||
a secret "cookie" shared between hidden service provider and clients, and
|
||||
a "replica" denoting the number of this non-consecutive replica. (The
|
||||
"time-period" is constructed in a way that time periods do not change at
|
||||
the same moment for all descriptors by deriving a value between 0:00 and
|
||||
23:59 hours from h(public-key) and making the descriptors of this hidden
|
||||
service provider expire at that time of the day.) The "descriptor-id" is
|
||||
defined to be 160 bits long. [extending the "descriptor-id" length
|
||||
suggested by LØ]
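The derivation of "descriptor-id" can be sketched as follows. Several details
are assumptions made only for this example: SHA-1 as the hash function h
(chosen because its output is 160 bits), the 4-byte big-endian encoding of
the period number, the one-byte encoding of the replica number, and the way
the per-service offset is derived from the first byte of h(public-key).

```python
import hashlib

PERIOD_SECONDS = 24 * 60 * 60


def h(data):
    """Hash function h; SHA-1 is an assumption (160-bit output)."""
    return hashlib.sha1(data).digest()


def time_period(now, public_key):
    """Period number, shifted per service by a value from h(public-key)."""
    offset = h(public_key)[0] * PERIOD_SECONDS // 256  # 0 to just under 24h
    return (now + offset) // PERIOD_SECONDS


def period_hash(now, public_key, cookie, replica):
    """h(time-period + cookie + replica); the byte encoding is assumed."""
    period = time_period(now, public_key).to_bytes(4, "big")
    return h(period + cookie + bytes([replica]))


def descriptor_id(now, public_key, cookie, replica):
    """h(h(public-key) + h(time-period + cookie + replica))."""
    return h(h(public_key) + period_hash(now, public_key, cookie, replica))
```

Without the cookie, a node cannot compute period_hash and therefore cannot
predict the descriptor IDs of future periods.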
|
||||
|
||||
Only the hidden service provider and the clients are able to generate
|
||||
future "descriptor-ID"s. Hence, the "onion-address" is extended from just
|
||||
the hash value of "public-key" by the secret "cookie". The "public-key" hash is
|
||||
determined to be 80 bits long, whereas the "cookie" is dimensioned to be
|
||||
120 bits long. This makes a total of 200 bits or 40 base32 chars, which is
|
||||
quite a lot to handle for a human, but necessary to provide sufficient
|
||||
protection against an adversary generating a key pair with the same
|
||||
"public-key" hash or guessing the "cookie".
|
||||
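The length arithmetic (80 + 120 = 200 bits, 200 / 5 = 40 base32 characters)
can be checked with a short sketch. Truncating a SHA-1 digest to 80 bits is
an assumption for the example, and the function name is hypothetical.

```python
import base64
import hashlib

KEY_HASH_BYTES = 10   # 80 bits of h(public-key)
COOKIE_BYTES = 15     # 120 bits of secret cookie


def onion_address(public_key, cookie):
    """Return the 40-character base32 form of h(public-key) + cookie."""
    if len(cookie) != COOKIE_BYTES:
        raise ValueError("cookie must be 120 bits (15 bytes)")
    key_hash = hashlib.sha1(public_key).digest()[:KEY_HASH_BYTES]
    return base64.b32encode(key_hash + cookie).decode("ascii").lower()
```

Since 25 bytes is a multiple of 5, the base32 encoding needs no padding
characters.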
|
||||
A hidden service directory can verify that a descriptor was created by the
|
||||
hidden service provider by checking if the "descriptor-id" corresponds to
|
||||
the "public-key" and if the signature can be verified with the
|
||||
"public-key".
|
||||
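The directory-side check can be sketched as below. The directory does not
know the cookie; it only sees the h(time-period + cookie + replica) field of
the descriptor. SHA-1 as h and the caller-supplied verify_signature function
are assumptions of this sketch, not part of the proposal.

```python
import hashlib


def h(data):
    """Hash function h; SHA-1 is an assumption of this sketch."""
    return hashlib.sha1(data).digest()


def accept_descriptor(descriptor_id, public_key, period_hash, body,
                      signature, verify_signature):
    """Return True if the descriptor may be stored.

    period_hash is the h(time-period + cookie + replica) field taken from
    the descriptor. verify_signature(public_key, body, signature) is a
    hypothetical caller-supplied function returning True or False.
    """
    # The ID must be bound to the public key contained in the descriptor.
    if h(h(public_key) + period_hash) != descriptor_id:
        return False
    # The content must be signed with the matching private key.
    return bool(verify_signature(public_key, body, signature))
```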
|
||||
The "introduction-points" that are included in the descriptor are encrypted
|
||||
using the same "cookie" that is shared between hidden service provider and
|
||||
clients. [correction to use another key than h(time-period + cookie) as
|
||||
encryption key for introduction points made by LØ]
|
||||
|
||||
A new text-based format is proposed for descriptors instead of an extension
|
||||
of the existing binary format for reasons of future extensibility.
|
||||
|
||||
Security implications:
|
||||
|
||||
The security implications of the proposed changes are grouped by the roles of
|
||||
nodes that could perform attacks or on which attacks could be performed.
|
||||
|
||||
Attacks by authoritative directory nodes
|
||||
|
||||
Authoritative directory nodes are no longer the single places in the
|
||||
network that know about a hidden service's activity and introduction
|
||||
points. Thus, they cannot perform attacks using this information, e.g.
|
||||
track a hidden service's activity or usage pattern or attack its
|
||||
introduction points. Formerly, it would only require a single corrupted
|
||||
authoritative directory operator to perform such an attack.
|
||||
|
||||
Attacks by hidden service directory nodes
|
||||
|
||||
A hidden service directory node could misuse a stored descriptor to track a
|
||||
hidden service's activity and usage pattern by clients. Though there is no
|
||||
countermeasure against this kind of attack, it is very expensive to track a
|
||||
certain hidden service over time. An attacker would need to run a large
|
||||
number of stable onion routers that work as hidden service directory nodes
|
||||
to have a good probability to become responsible for its changing
|
||||
descriptor IDs. For each period, the probability is:
|
||||
|
||||
1-(N-c choose r)/(N choose r) for N-c>=r, and 1 otherwise,
with N as the total number of hidden service directories, c as the
number of compromised nodes, and r as the number of replicas
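The per-period tracking probability can be evaluated directly. The function
and parameter names below are chosen for the example only.

```python
from math import comb


def tracking_probability(n_dirs, compromised, replicas):
    """1 - C(N-c, r) / C(N, r) for N-c >= r, and 1 otherwise."""
    if n_dirs - compromised < replicas:
        return 1.0
    honest_only = comb(n_dirs - compromised, replicas)
    return 1.0 - honest_only / comb(n_dirs, replicas)
```

The second term is the chance that all r replicas land on honest
directories, so the result is the chance that at least one lands on a
compromised one.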
|
||||
|
||||
The hidden service directory nodes could try to make a certain hidden
|
||||
service unavailable to its clients. Therefore, they could discard all
|
||||
stored descriptors for that hidden service and reply to clients that there
|
||||
is no descriptor for the given ID or return an old or false descriptor
|
||||
content. The client would detect a false descriptor, because it could not
|
||||
contain a correct signature. But an old content or an empty reply could
|
||||
confuse the client. Therefore, the countermeasure is to replicate
|
||||
descriptors among a small number of hidden service directories, e.g. 5.
|
||||
The probability of a group of collaborating nodes to make a hidden service
|
||||
completely unavailable is in each period:
|
||||
|
||||
(c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise,
with N as the total number of hidden service directories, c as the
number of compromised nodes, and r as the number of replicas
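The probability that colluding directories hold every replica can be
computed in the same way. The function and parameter names are again
hypothetical.

```python
from math import comb


def unavailability_probability(n_dirs, compromised, replicas):
    """C(c, r) / C(N, r) for c >= r and N >= r, and 0 otherwise."""
    if compromised < replicas or n_dirs < replicas:
        return 0.0
    return comb(compromised, replicas) / comb(n_dirs, replicas)
```

The value falls quickly as r grows, which is the argument for replicating
each descriptor on several directories.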
|
||||
|
||||
A hidden service directory could try to find out which introduction points
|
||||
are working on behalf of a hidden service. In contrast to the previous
|
||||
design, this is not possible anymore, because this information is encrypted
|
||||
to the clients of a hidden service.
|
||||
|
||||
Attacks on hidden service directory nodes
|
||||
|
||||
An anonymous attacker could try to swamp a hidden service directory with
|
||||
false descriptors for a given descriptor ID. This is prevented by requiring
|
||||
that descriptors are signed.
|
||||
|
||||
Anonymous attackers could swamp a hidden service directory with correct
|
||||
descriptors for non-existing hidden services. There is no countermeasure
|
||||
against this attack. However, the creation of valid descriptors is more
|
||||
expensive than verification and storage in local memory. This should make
|
||||
this kind of attack unattractive.
|
||||
|
||||
Attacks by introduction points
|
||||
|
||||
Current or former introduction points could try to gain information on the
|
||||
hidden service they serve. But due to the fresh key pair that is used by
|
||||
the hidden service, this attack is not possible anymore.
|
||||
|
||||
Attacks by clients
|
||||
|
||||
Current or former clients could track a hidden service's activity, attack
|
||||
its introduction points, or determine the responsible hidden service
|
||||
directory nodes and attack them. There is nothing that could prevent them
|
||||
from doing so, because honest clients need the full descriptor content to
|
||||
establish a connection to the hidden service. At the moment, the only
|
||||
countermeasure against dishonest clients is to change the secret cookie and
|
||||
pass it only to the honest clients.
|
||||
|
||||
Compatibility:
|
||||
|
||||
The proposed design is meant to replace the current design for hidden service
|
||||
descriptors and their storage in the long run.
|
||||
|
||||
There should be a first transition phase in which both the current design
|
||||
and the proposed design are served in parallel. Onion routers should start
|
||||
serving as hidden service directories, and hidden service providers and
|
||||
clients should make use of the new design if both sides support it. Hidden
|
||||
service providers should be allowed to publish descriptors of the current
|
||||
format in parallel, and authoritative directories should continue storing and
|
||||
serving these descriptors.
|
||||
|
||||
After the first transition phase, hidden service providers should stop
|
||||
publishing descriptors on authoritative directories, and hidden service
|
||||
clients should not try to fetch descriptors from the authoritative
|
||||
directories. However, the authoritative directories should continue serving
|
||||
hidden service descriptors for a second transition phase. As of this point,
|
||||
all v2 config options should be set to a default value of 1.
|
||||
|
||||
After the second transition phase, the authoritative directories should stop
|
||||
serving hidden service descriptors.
|
||||
|
@ -1,387 +0,0 @@
|
||||
Filename: 115-two-hop-paths.txt
|
||||
Title: Two Hop Paths
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Mike Perry
|
||||
Created:
|
||||
Status: Dead
|
||||
Supersedes: 112
|
||||
|
||||
|
||||
Overview:
|
||||
|
||||
The idea is that users should be able to choose if they would like
|
||||
to have either two or three hop paths through the tor network.
|
||||
|
||||
Let us be clear: the users who would choose this option should be
|
||||
those that are concerned with IP obfuscation only: i.e., they would not be
|
||||
targets of a resource-intensive multi-node attack. It is sometimes said
|
||||
that these users should find some other network to use other than Tor.
|
||||
This is a foolish suggestion: more users improve the security of everyone,
|
||||
and the current small userbase size is a critical hindrance to
|
||||
anonymity, as is discussed below and in [1].
|
||||
|
||||
This value should be modifiable from the controller, and should be
|
||||
available from Vidalia.
|
||||
|
||||
|
||||
Motivation:
|
||||
|
||||
The Tor network is slow and overloaded. Increasingly often I hear
|
||||
stories about friends and friends of friends who are behind firewalls,
|
||||
annoying censorware, or under surveillance that interferes with their
|
||||
productivity and Internet usage, or chills their speech. These people
|
||||
know about Tor, but they choose to put up with the censorship because
|
||||
Tor is too slow to be usable for them. In fact, to download a fresh,
|
||||
complete copy of levine-timing.pdf for the Theoretical Argument
|
||||
section of this proposal over Tor took me 3 tries.
|
||||
|
||||
Furthermore, the biggest current problem with Tor's anonymity for
|
||||
those who really need it is not someone attacking the network to
|
||||
discover who they are. It's instead the extreme danger that so few
|
||||
people use Tor because it's so slow, that those who do use it have
|
||||
essentially no confusion set.
|
||||
|
||||
The recent case where the professor and the rogue Tor user were the
|
||||
only Tor users on campus, and thus suspected in an incident involving
|
||||
Tor and that University underscores this point: "That was why the police
|
||||
had come to see me. They told me that only two people on our campus were
|
||||
using Tor: me and someone they suspected of engaging in an online scam.
|
||||
The detectives wanted to know whether the other user was a former
|
||||
student of mine, and why I was using Tor"[1].
|
||||
|
||||
Not only does Tor provide no anonymity if you use it to be anonymous
|
||||
but are obviously from a certain institution, location or circumstance,
|
||||
it is also dangerous to use Tor for risk of being accused of having
|
||||
something significant enough to hide to be willing to put up with
|
||||
the horrible performance as opposed to using some weaker alternative.
|
||||
|
||||
There are many ways to improve the speed problem, and of course we
|
||||
should and will implement as many as we can. Johannes's GSoC project
|
||||
and my reputation system are longer term, higher-effort things that
|
||||
will still provide benefit independent of this proposal.
|
||||
|
||||
However, reducing the path length to 2 for those who do not need the
|
||||
extra anonymity 3 hops provide not only improves their Tor experience
|
||||
but also reduces their load on the Tor network by 33%, and should
|
||||
increase adoption of Tor by a good deal. That's not just Win-Win, it's
|
||||
Win-Win-Win.
|
||||
|
||||
|
||||
Who will enable this option?
|
||||
|
||||
This is the crux of the proposal. Admittedly, there is some anonymity
|
||||
loss and some degree of decreased investment required on the part of
|
||||
the adversary to attack 2 hop users versus 3 hop users, even if it is
|
||||
minimal and limited mostly to up-front costs and false positives.
|
||||
|
||||
The key questions are:
|
||||
|
||||
1. Are these users in a class such that their risk is significantly
|
||||
less than the amount of this anonymity loss?
|
||||
|
||||
2. Are these users able to identify themselves?
|
||||
|
||||
Many users of Tor are not at risk of an adversary capturing c/n
|
||||
nodes of the network just to see what they do. These users use Tor to
|
||||
circumvent aggressive content filters, or simply to keep their IP out of
|
||||
marketing and search engine databases. Most content filters have no
|
||||
interest in running Tor nodes to catch violators, and marketers
|
||||
certainly would never consider such a thing, both on a cost basis and a
|
||||
legal one.
|
||||
|
||||
In a sense, this represents an alternate threat model against these
|
||||
users who are not at risk for Tor's normal threat model.
|
||||
|
||||
It should be evident to these users that they fall into this class. All
|
||||
that should be needed is a radio button
|
||||
|
||||
* "I use Tor for local content filter circumvention and/or IP obfuscation,
|
||||
not anonymity. Speed is more important to me than high anonymity.
|
||||
No one will make considerable efforts to determine my real IP."
|
||||
* "I use Tor for anonymity and/or national-level, legally enforced
|
||||
censorship. It is possible effort will be taken to identify
|
||||
me, including but not limited to network surveillance. I need more
|
||||
protection."
|
||||
|
||||
and then some explanation in the help for exactly what this means, and
|
||||
the risks involved with eliminating the adversary's need for timing
|
||||
attacks with respect to false positives. Ultimately, the decision is a
|
||||
simple one that can be made without this information, however. The user
|
||||
does not need Paul Syverson to instruct them on the deep magic of Onion
|
||||
Routing to make this decision. They just need to know why they use Tor.
|
||||
If they use it just to stay out of marketing databases and/or bypass a
|
||||
local content filter, two hops is plenty. This is likely the vast
|
||||
majority of Tor users, and many non-users we would like to bring on
|
||||
board.
|
||||
|
||||
So, having established this class of users, let us now go on to
|
||||
examine theoretical and practical risks we place them at, and determine
|
||||
if these risks violate the users' needs, or introduce additional risk
|
||||
to node operators who may be subject to requests from law enforcement
|
||||
to track users who need 3 hops, but use 2 because they enjoy the
|
||||
thrill of Russian roulette.
|
||||
|
||||
|
||||
Theoretical Argument:
|
||||
|
||||
It has long been established that timing attacks against mix
|
||||
and onion networks are extremely effective, and that regardless
|
||||
of path length, if the adversary has compromised your first and
|
||||
last hop of your path, you can assume they have compromised your
|
||||
identity for that connection.
|
||||
|
||||
In fact, it was demonstrated that for all but the slowest, lossiest
|
||||
networks, error rates for false positives and false negatives were
|
||||
very near zero[2]. Only for constant streams of traffic over slow and
|
||||
(more importantly) extremely lossy network links did the error rate
|
||||
hit 20%. For loss rates typical to the Internet, even the error rate
|
||||
for slow nodes with constant traffic streams was 13%.
|
||||
|
||||
When you take into account that most Tor streams are not constant,
|
||||
but probably much more like their "HomeIP" dataset, which consists
|
||||
mostly of web traffic that exists over finite intervals at specific
|
||||
times, error rates drop to fractions of 1%, even for the "worst"
|
||||
network nodes.
|
||||
|
||||
Therefore, the user has little benefit from the extra hop, assuming
|
||||
the adversary does timing correlation on their nodes. Since timing
|
||||
correlation is simply an implementation issue and is most likely
|
||||
a single up-front cost (and one that is likely quite a bit cheaper
|
||||
than the cost of the machines purchased to host the nodes to mount
|
||||
an attack), the real protection is the low probability of getting
|
||||
both the first and last hop of a client's stream.
|
||||
|
||||
|
||||
Practical Issues:
|
||||
|
||||
Theoretical issues aside, there are several practical issues with the
|
||||
implementation of Tor that need to be addressed to ensure that
|
||||
identity information is not leaked by the implementation.
|
||||
|
||||
Exit policy issues:
|
||||
|
||||
If a client chooses an exit with a very restrictive exit policy
|
||||
(such as an IP or IP range), the first hop then knows a good deal
|
||||
about the destination. For this reason, clients should not select
|
||||
exits that match their destination IP with anything other than "*".
|
||||
|
||||
Partitioning:
|
||||
|
||||
Partitioning attacks form another concern. Since Tor uses telescoping
|
||||
to build circuits, it is possible to tell a user is constructing only
|
||||
two hop paths at the entry node and on the local network. An external
|
||||
adversary can potentially differentiate 2 and 3 hop users, and decide
|
||||
that all IP addresses connecting to Tor and using 3 hops have something
|
||||
to hide, and should be scrutinized more closely or outright apprehended.
|
||||
|
||||
One solution to this is to use the "leaky-circuit" method of attaching
|
||||
streams: The user always creates 3-hop circuits, but if the option
|
||||
is enabled, they always exit from their 2nd hop. The ideal solution
|
||||
would be to create a RELAY_SHISHKABOB cell which contains onion
|
||||
skins for every host along the path, but this requires protocol
|
||||
changes at the nodes to support.
|
||||
|
||||
Guard nodes:
|
||||
|
||||
Since guard nodes can rotate due to client relocation, network
|
||||
failure, node upgrades and other issues, if you amortize the risk a
|
||||
mobile, dialup, or otherwise intermittently connected user is exposed to
|
||||
over any reasonable duration of Tor usage (on the order of a year), it
|
||||
is the same with or without guard nodes. Assuming an adversary has
|
||||
c%/n% of network bandwidth, and guards rotate on average with period R,
|
||||
statistically speaking, it's merely a question of whether the user wishes
|
||||
their risk to be concentrated with probability c/n over an expected
|
||||
period of R*c, and probability 0 over an expected period of R*(n-c),
|
||||
versus a continuous risk of (c/n)^2. So statistically speaking, guards
|
||||
only create a time-tradeoff of risk over the long run for normal Tor
|
||||
usage. Rotating guards do not reduce risk for normal client usage long
|
||||
term.[3]
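
The time-tradeoff argument above can be checked with a small calculation. This is only a sketch under the paragraph's own simplifying assumptions (the adversary holds the same fraction c/n of entry and exit capacity, and hops are chosen independently); the function names are illustrative and are not part of Tor.

```python
from fractions import Fraction


def risk_without_guards(c, n):
    """Per-circuit chance that both the entry and the exit are hostile."""
    f = Fraction(c, n)
    return f * f


def risk_with_guards(c, n):
    """Long-run average risk when the entry hop is a rotating guard.

    For a share c/n of guard periods the guard is hostile, and each circuit
    built in such a period is compromised with probability c/n (hostile
    exit). In all other periods the risk is 0.
    """
    f = Fraction(c, n)
    share_of_time_exposed = f
    risk_while_exposed = f
    return share_of_time_exposed * risk_while_exposed


c, n = 5, 100
assert risk_with_guards(c, n) == risk_without_guards(c, n)
```

Under these assumptions the two averages coincide, which is the sense in which guards only move risk around in time.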
|
||||
|
||||
On the other hand, assuming a more stable method of guard selection
|
||||
and preservation is devised, or a more stable client side network than
|
||||
my own is typical (which rotates guards frequently due to network issues
|
||||
and moving about), guard nodes provide a tradeoff in the form of c/n% of
|
||||
the users being "sacrificial users" who are exposed to high risk O(c/n)
|
||||
of identification, while the rest of the network is exposed to zero
|
||||
risk.
|
||||
|
||||
The nature of Tor makes it likely an adversary will take a "shock and
|
||||
awe" approach to suppressing Tor by rounding up a few users whose
|
||||
browsing activity has been observed to be made into examples, in an
|
||||
attempt to prove that Tor is not perfect.
|
||||
|
||||
Since this "shock and awe" attack can be applied with or without guard
|
||||
nodes, stable guard nodes do offer a measure of accountability of sorts.
|
||||
If a user was using a small set of guard nodes and knows them well, and
|
||||
then is suddenly apprehended as a result of Tor usage, having a fixed
|
||||
set of entry points to suspect is a lot better than suspecting the whole
|
||||
network. Conversely, it can also give non-apprehended users comfort
|
||||
that they are likely to remain safe indefinitely with their set of (now
|
||||
presumably trusted) guards. This is probably the most beneficial
|
||||
property of reliable guards: they deter the adversary from mounting
|
||||
"shock and awe" attacks because the surviving users will not
|
||||
be intimidated, but instead made more confident. Of course, guards need to
|
||||
be made much more stable and users need to be encouraged to know their
|
||||
guards for this property to really take effect.
|
||||
|
||||
This beneficial property of client vigilance also carries over to an
|
||||
active adversary, except in this case instead of relying on the user
|
||||
to remember their guard nodes and somehow communicate them after
|
||||
apprehension, the code can alert them to the presence of an active
|
||||
adversary before they are apprehended. But only if they use guard nodes.
|
||||
|
||||
So let's consider the active adversary: Two hop paths allow malicious
|
||||
guards to get considerably more benefit from failing circuits if they do
|
||||
not extend to their colluding peers for the exit hop. Since guards can
|
||||
detect the number of hops in a path via either timing or by statistical
|
||||
analysis of the exit policy of the 2nd hop, they can perform this attack
|
||||
predominantly against 2 hop users.
|
||||
|
||||
This can be addressed by completely abandoning an entry guard after a
|
||||
certain ratio of extend or general circuit failures with respect to
|
||||
non-failed circuits. The proper value for this ratio can be determined
|
||||
experimentally with TorFlow. There is the possibility that the local
|
||||
network can abuse this feature to cause certain guards to be dropped,
|
||||
but they can do that anyways with the current Tor by just making guards
|
||||
they don't like unreachable. With this mechanism, Tor will complain
|
||||
loudly if any guard's failure rate exceeds the expected rate in any failure
|
||||
case, local or remote.
|
||||
|
||||
Eliminating guards entirely would actually not address this issue due
|
||||
to the time-tradeoff nature of risk. In fact, it would just make it
|
||||
worse. Without guard nodes, it becomes much more difficult for clients
|
||||
to become alerted to Tor entry points that are failing circuits to make
|
||||
sure that they only devote bandwidth to carry traffic for streams where
|
||||
they observe both ends. Yet the rogue entry points are still able to
|
||||
significantly increase their success rates by failing circuits.
|
||||
|
||||
For this reason, guard nodes should remain enabled for 2 hop users,
|
||||
at least until an IP-independent, undetectable guard scanner can
|
||||
be created. TorFlow can scan for failing guards, but after a while,
|
||||
its unique behavior gives away the fact that its IP is a scanner and
|
||||
it can be given selective service.
|
||||
|
||||
Consideration of risks for node operators:
|
||||
|
||||
There is a serious risk for two hop users in the form of guard
|
||||
profiling. If an adversary running an exit node notices that a
|
||||
particular site is always visited from a fixed previous hop, it is
|
||||
likely that this is a two hop user using a certain guard, which could be
|
||||
monitored to determine their identity. Thus, for the protection of both
|
||||
2 hop users and node operators, 2 hop users should limit their guard
|
||||
duration to a sufficient number of days to verify reliability of a node,
|
||||
but not much more. This duration can be determined experimentally by
|
||||
TorFlow.
|
||||
|
||||
Considering a Tor client builds on average 144 circuits/day (10
|
||||
minutes per circuit), if the adversary owns c/n% of exits on the
|
||||
network, they can expect to see 144*c/n circuits from this user, or
|
||||
about 14 minutes of usage per day per percentage of network penetration.
|
||||
Since it will take several occurrences of user-linkable exit content
|
||||
from the same predecessor hop for the adversary to have any confidence
|
||||
this is a 2 hop user, it is very unlikely that any sort of demands made
|
||||
upon the predecessor node would be guaranteed to be effective (i.e., it
|
||||
actually was a guard), let alone be executed in time to apprehend the
|
||||
user before they rotated guards.
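
The arithmetic in this paragraph can be reproduced as follows. This is a sketch that takes the 10-minute circuit lifetime from the text as given; the names are illustrative.

```python
MINUTES_PER_CIRCUIT = 10
CIRCUITS_PER_DAY = 24 * 60 // MINUTES_PER_CIRCUIT  # 144 circuits per day


def observed_per_day(exit_fraction):
    """Expected circuits and minutes of usage seen by hostile exits per day.

    exit_fraction is the share of exits the adversary controls (0.01 = 1%).
    """
    circuits = CIRCUITS_PER_DAY * exit_fraction
    minutes = circuits * MINUTES_PER_CIRCUIT
    return circuits, minutes


circuits, minutes = observed_per_day(0.01)
```

For one percent of exits this gives roughly 1.44 circuits, or about 14 minutes of usage, per day.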
|
||||
|
||||
The reverse risk also warrants consideration. If a malicious guard has
|
||||
orders to surveil Mike Perry, it can determine Mike Perry is using two
|
||||
hops by observing his tendency to choose a 2nd hop with a viable exit
|
||||
policy. This can be done relatively quickly, unfortunately, and
|
||||
indicates Mike Perry should spend some of his time building real 3 hop
|
||||
circuits through the same guards, to require them to at least wait for
|
||||
him to actually use Tor to determine his style of operation, rather than
|
||||
collect this information from his passive building patterns.
|
||||
|
||||
However, to actively determine where Mike Perry is going, the guard
|
||||
will need to require logging ahead of time at multiple exit nodes that
|
||||
he may use over the course of the few days while he is at that guard,
|
||||
and correlate the usage times of the exit node with Mike Perry's
|
||||
activity at that guard for the few days he uses it. At this point, the
|
||||
adversary is mounting a scale and method of attack (widespread logging,
|
||||
timing attacks) that works pretty much just as effectively against 3
|
||||
hops, so exit node operators are exposed to no additional danger than
|
||||
they otherwise normally are.
|
||||
|
||||
|
||||
Why not fix Pathlen=2?:
|
||||
|
||||
The main reason I am not advocating that we always use 2 hops is that
|
||||
in some situations, timing correlation evidence by itself may not be
|
||||
considered as solid and convincing as an actual, uninterrupted, fully
|
||||
traced path. Are these timing attacks as effective on a real network as
|
||||
they are in simulation? Maybe the circuit multiplexing of Tor can serve
|
||||
to frustrate them to a degree? Would an extralegal adversary or
|
||||
authoritarian government even care? In the face of these situation
|
||||
dependent unknowns, it should be up to the user to decide if this is
|
||||
a concern for them or not.
|
||||
|
||||
It should probably also be noted that even a false positive
|
||||
rate of 1% for a 200k concurrent-user network could mean that for a
|
||||
given node, a given stream could be confused with something like 10
|
||||
users, assuming ~200 nodes carry most of the traffic (ie 1000 users
|
||||
each). Though of course to really know for sure, someone needs to do
|
||||
an attack on a real network, unfortunately.
|
||||
|
||||
Additionally, at some point cover traffic schemes may be implemented to
|
||||
frustrate timing attacks on the first hop. It is possible some expert
|
||||
users may do this ad-hoc already, and may wish to continue using 3 hops
|
||||
for this reason.
|
||||
|
||||
|
||||
Implementation:
|
||||
|
||||
new_route_len() can be modified directly with a check of the
|
||||
Pathlen option. However, circuit construction logic should be
|
||||
altered so that both 2 hop and 3 hop users build the same types of
|
||||
circuits, and the option should ultimately govern circuit selection,
|
||||
not construction. This improves coverage against guard nodes being
|
||||
able to passively profile users who aren't even using Tor.
|
||||
PathlenCoinWeight, anyone? :)
|
||||
|
||||
The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
|
||||
needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
|
||||
ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
|
||||
circuit_is_acceptable.
|
||||
|
||||
The leaky exit is trickier still. handle_control_attachstream
|
||||
does allow paths to exit at a given hop. Presumably something similar
|
||||
can be done in connection_ap_handshake_process_socks, and elsewhere?
|
||||
Circuit construction would also have to be performed such that the
|
||||
2nd hop's exit policy is what is considered, not the 3rd's.
|
||||
|
||||
The entry_guard_t structure could have num_circ_failed and
|
||||
num_circ_succeeded members such that if it exceeds F% circuit
|
||||
extend failure rate to a second hop, it is removed from the entry list.
|
||||
|
||||
F should be sufficiently high to avoid churn from normal Tor circuit
|
||||
failure as determined by TorFlow scans.
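
The bookkeeping described above might look roughly like the following. This is a Python sketch rather than Tor's C code: the counter names come from the text, while GuardStats, should_drop_guard, and the min_circuits floor are assumptions added for illustration (the floor avoids dropping a guard on a handful of samples).

```python
from dataclasses import dataclass


@dataclass
class GuardStats:
    """Hypothetical per-guard counters mirroring the proposed members."""
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def record(self, succeeded):
        """Count one circuit extend attempt through this guard."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def total(self):
        return self.num_circ_failed + self.num_circ_succeeded

    def failure_rate(self):
        """Fraction of attempts that failed; 0.0 when nothing was recorded."""
        total = self.total()
        return self.num_circ_failed / total if total else 0.0


def should_drop_guard(stats, max_failure_rate, min_circuits=20):
    """Return True once the failure rate exceeds F (given as a fraction)."""
    if stats.total() < min_circuits:
        return False  # too few samples to judge (assumed safeguard)
    return stats.failure_rate() > max_failure_rate
```

The value of max_failure_rate corresponds to F in the text and would have to come from measurement, as the proposal says.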
|
||||
|
||||
The Vidalia option should be presented as a radio button.
|
||||
|
||||
|
||||
Migration:
|
||||
|
||||
Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
|
||||
circuit ability, and 2-3 hop circuit selection logic governed by
|
||||
Pathlen.
|
||||
|
||||
Phase 2: Experiment to determine the proper ratio of circuit
|
||||
failures used to expire garbage or malicious guards via TorFlow
|
||||
(pending Bug #440 backport+adoption).
|
||||
|
||||
Phase 3: Implement guard expiration code to kick off failure-prone
|
||||
guards and warn the user. Cap 2 hop guard duration to a proper number
|
||||
of days determined sufficient to establish guard reliability (to be
|
||||
determined by TorFlow).
|
||||
|
||||
Phase 4: Make radiobutton in Vidalia, along with help entry
|
||||
that explains in layman's terms the risks involved.
|
||||
|
||||
Phase 5: Allow user to specify path length by HTTP URL suffix.
|
||||
|
||||
|
||||
[1] http://p2pnet.net/story/11279
|
||||
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
|
||||
[3] Proof available upon request ;)
|
@ -1,120 +0,0 @@
|
||||
Filename: 116-two-hop-paths-from-guard.txt
|
||||
Title: Two hop paths from entry guards
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Michael Lieberman
|
||||
Created: 26-Jun-2007
|
||||
Status: Dead
|
||||
|
||||
This proposal is related to (but different from) Mike Perry's proposal 115
|
||||
"Two Hop Paths."
|
||||
|
||||
Overview:
|
||||
|
||||
Volunteers who run entry guards should have the option of using only 2
|
||||
additional tor nodes when constructing their own tor circuits.
|
||||
|
||||
While the option of two hop paths should perhaps be extended to every client
|
||||
(as discussed in Mike Perry's thread), I believe the anonymity properties of
|
||||
two hop paths are particularly well-suited to client computers that are also
|
||||
serving as entry guards.
|
||||
|
||||
First I will describe the details of the strategy, as well as possible
|
||||
avenues of attack. Then I will list advantages and disadvantages. Then, I
|
||||
will discuss some possibly safer variations of the strategy, and finally
|
||||
some implementation issues.
|
||||
|
||||
Details:
|
||||
|
||||
Suppose Alice is an entry guard, and wants to construct a two hop circuit.
|
||||
Alice chooses a middle node at random (not using the entry guard strategy),
|
||||
and gains anonymity by having her traffic look just like traffic from
|
||||
someone else using her as an entry guard.
|
||||
|
||||
Can Alice's middle node figure out that she is the initiator of the traffic? I
|
||||
can think of four possible approaches for distinguishing traffic from Alice
|
||||
from traffic through Alice:
|
||||
|
||||
1) Notice that communication from Alice comes too fast: Experimentation is
|
||||
needed to determine if traffic from Alice can be distinguished from traffic
|
||||
from a computer with a decent link to Alice.
|
||||
|
||||
2) Monitor Alice's network traffic to discover the lack of incoming packets
|
||||
at the appropriate times. If an adversary has this ability, then Alice
|
||||
already has problems in the current system, because the adversary can run a
|
||||
standard timing attack on Alice's traffic.
|
||||
|
||||
3) Notice that traffic from Alice is unique in some way such that if Alice
|
||||
was just one of 3 entry guards for this traffic, then the traffic should be
|
||||
coming from two other entry guards as well. An example of "unique traffic"
|
||||
could be always sending 117 packets every 3 minutes to an exit node that
|
||||
exits to port 4661. However, if such patterns existed with sufficient
|
||||
precision, then it seems to me that Tor already has a problem. (This "unique
|
||||
traffic" may not be a problem if clients often end up choosing a single
|
||||
entry guard because their other two are down. Does anyone know if this is
|
||||
the case?)
|
||||
|
||||
4) First, control the middle node *and* some other part of the traffic,
|
||||
using standard attacks on a two hop circuit without entry nodes (my recent
|
||||
paper on Browser-Based Attacks would work well for this
|
||||
http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With
|
||||
control of the circuit, we can now cause "unique traffic" as in 3).
|
||||
Alternatively, if we know something about Alice independently, and we can
|
||||
see what websites are being visited, we might be able to guess that she is
|
||||
the kind of person that would visit those websites.
|
||||
|
||||
Anonymity Advantages:
|
||||
|
||||
-Alice never has the problem of choosing a malicious entry guard. In some
|
||||
sense, Alice acts as her own entry guard.
|
||||
|
||||
Anonymity Disadvantages:
|
||||
|
||||
-If Alice's traffic is identified as originating from herself (see above for
|
||||
how hard that might be), then she has the anonymity of a 2 hop circuit
|
||||
without entry guards.
|
||||
|
||||
Additional advantages:
|
||||
|
||||
-A discussion of the latency advantages of two hop circuits is going on in
|
||||
Mike Perry's thread already.
|
||||
-Also, we can advertise this change as "Run an entry guard and decrease your
|
||||
own Tor latency." This incentive has the potential to add nodes to the
|
||||
network, improving the network as a whole.
|
||||
|
||||
Safer variations:
|
||||
|
||||
To solve the "unique traffic" problem, Alice could use two hop paths only
|
||||
1/3 of the time, and choose 2 other entry guards for the other 2/3 of the
|
||||
time. All the advantages are now 1/3 as useful (possibly more, if the other
|
||||
2 entry guards are not always up).
|
||||
|
||||
To solve the problem that Alice's responses are too fast, Alice could delay
|
||||
her responses (ideally based on some real data of response time when Alice
|
||||
is used as an entry guard). This loses most of the speed advantages of the two
|
||||
hop path, but if Alice is a fast entry guard, it doesn't lose everything. It
|
||||
also still has the (arguable) anonymity advantage that Alice doesn't have to
|
||||
worry about having a malicious entry guard.
|
||||
|
||||
Implementation details:
|
||||
For Alice to remain anonymous using this strategy, she has to actually be
|
||||
acting as an entry guard for other nodes. This means the two hop option can
|
||||
only be available to whatever high-performance threshold is currently set on
|
||||
entry guards. Alice may need to somehow check her own current status as an
|
||||
entry guard before choosing this two hop strategy.
|
||||
|
||||
Another thing to consider: suppose Alice is also an exit node. If the
|
||||
fraction of exit nodes in existence is too small, she may rarely or never be
|
||||
chosen as an entry guard. It would be sad if we offered an incentive to run
|
||||
an entry guard that didn't extend to exit nodes. I suppose clients of Exit
|
||||
nodes could pull the same trick, and bypass using Tor altogether (zero hop
|
||||
paths), though that has additional issues.*
|
||||
|
||||
Mike Lieberman
|
||||
MIT
|
||||
|
||||
*Why we shouldn't recommend Exit nodes pull the same trick:
|
||||
1) Exit nodes would suffer heavily from the problem of "unique traffic"
|
||||
mentioned above.
|
||||
2) It would give governments an incentive to confiscate exit nodes to see if
|
||||
they are pulling this trick.
|
@ -1,412 +0,0 @@
|
||||
Filename: 117-ipv6-exits.txt
|
||||
Title: IPv6 exits
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: coderman
|
||||
Created: 10-Jul-2007
|
||||
Status: Accepted
|
||||
Target: 0.2.1.x
|
||||
|
||||
Overview
|
||||
|
||||
Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6
|
||||
addresses. This proposal does not imply any IPv6 support for OR
|
||||
traffic, only exit and name resolution.
|
||||
|
||||
|
||||
Contents
|
||||
|
||||
0. Motivation
|
||||
|
||||
As the IPv4 address space becomes more scarce there is increasing
|
||||
effort to provide Internet services via the IPv6 protocol. Many
|
||||
hosts are available at IPv6 endpoints which are currently
|
||||
inaccessible for Tor users.
|
||||
|
||||
Extending Tor to support IPv6 exit streams and IPv6 DNS name
|
||||
resolution will allow users of the Tor network to access these hosts.
|
||||
This capability would be present for those who do not currently have
|
||||
IPv6 access, thus increasing the utility of Tor and furthering
|
||||
adoption of IPv6.
|
||||
|
||||
|
||||
1. Design
|
||||
|
||||
1.1. General design overview
|
||||
|
||||
There are three main components to this proposal. The first is a
|
||||
method for routers to advertise their ability to exit IPv6 traffic.
|
||||
The second is the manner in which routers resolve names to IPv6
|
||||
addresses. Last but not least is the method in which clients
|
||||
communicate with Tor to resolve and connect to IPv6 endpoints
|
||||
anonymously.
|
||||
|
||||
1.2. Router IPv6 exit support
|
||||
|
||||
In order to specify exit policies and IPv6 capability new directives
|
||||
in the Tor configuration will be needed. If a router advertises IPv6
|
||||
exit policies in its descriptor this will signal the ability to
|
||||
provide IPv6 exit. There are a number of additional default deny
|
||||
rules associated with this new address space which are detailed in
|
||||
the addendum.
|
||||
|
||||
When Tor is started on a host it should check for the presence of a
|
||||
global unicast IPv6 address and if present include the default IPv6
|
||||
exit policies and any user specified IPv6 exit policies.
|
||||
|
||||
If a user provides IPv6 exit policies but no global unicast IPv6
|
||||
address is available Tor should generate a warning and not publish the
|
||||
IPv6 policies in the router descriptor.
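
The startup check described in the last two paragraphs could be sketched as below. This is an illustration in Python, not Tor's implementation: the function names, the warning text, and the choice to list user rules before the defaults are all assumptions, and the rule strings are treated as opaque.

```python
import ipaddress


def has_global_unicast_ipv6(local_addresses):
    """Return True if any local address is a global unicast IPv6 address."""
    for text in local_addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 6 and addr.is_global and not addr.is_multicast:
            return True
    return False


def ipv6_policies_to_publish(local_addresses, default_rules, user_rules):
    """Return (rules_to_publish, warning_or_None).

    With a global unicast IPv6 address, publish user rules followed by the
    defaults (ordering is an assumption). Without one, publish nothing and
    warn only if the user configured IPv6 rules.
    """
    if has_global_unicast_ipv6(local_addresses):
        return list(user_rules) + list(default_rules), None
    warning = None
    if user_rules:
        warning = ("IPv6 exit policy configured, but no global unicast "
                   "IPv6 address is available")
    return [], warning
```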
|
||||
|
||||
It should be noted that IPv4 mapped IPv6 addresses are not valid exit
|
||||
destinations. This mechanism is mainly used to interoperate with
|
||||
both IPv4 and IPv6 clients on the same socket. Any attempts to use
|
||||
an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for
|
||||
IPv4, must be refused.
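
A minimal sketch of the refusal rule, using Python's standard ipaddress module; the function name is illustrative and not taken from Tor.

```python
import ipaddress


def is_valid_ipv6_exit_destination(text):
    """Return False for non-IPv6 input and for IPv4-mapped IPv6 addresses."""
    addr = ipaddress.ip_address(text)
    if addr.version != 6:
        return False
    # ipv4_mapped is None unless the address lies in ::ffff:0:0/96.
    return addr.ipv4_mapped is None
```

Rejecting the mapped form here means an IPv4 destination cannot be smuggled past the IPv4 exit policy by writing it as an IPv6 address.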
|
||||
|
||||
1.3. DNS name resolution of IPv6 addresses (AAAA records)
|
||||
|
||||
In addition to exit support for IPv6 TCP connections, a method to
|
||||
resolve domain names to their respective IPv6 addresses is also
|
||||
needed. This is accomplished in the existing DNS system via AAAA
|
||||
records. Routers will perform both A and AAAA requests when
|
||||
resolving a name so that the client can utilize an IPv6 endpoint when
|
||||
available or preferred.
|
||||
|
||||
To avoid potential problems with caching DNS servers that behave
|
||||
poorly all NXDOMAIN responses to AAAA requests should be ignored if a
|
||||
successful response is received for an A request. This implies that
|
||||
both AAAA and A requests will always be performed for each name
|
||||
resolution.
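
The NXDOMAIN rule above can be expressed as a small merge step over the two lookups. This is a sketch only: the ("ok" | "nxdomain", addresses) result shape and the function name are assumptions made for the example, and no real resolver is called.

```python
def combine_answers(a_result, aaaa_result):
    """Merge the results of an A and an AAAA lookup for one name.

    Each argument is a (status, addresses) pair with status "ok" or
    "nxdomain". Returns (status, ipv4_addresses, ipv6_addresses).
    """
    a_status, a_addrs = a_result
    aaaa_status, aaaa_addrs = aaaa_result
    if a_status == "ok":
        # A succeeded: an AAAA NXDOMAIN only means "no IPv6 address".
        v6 = list(aaaa_addrs) if aaaa_status == "ok" else []
        return "ok", list(a_addrs), v6
    if aaaa_status == "ok":
        return "ok", [], list(aaaa_addrs)
    return "nxdomain", [], []
```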
|
||||
|
||||
For reverse lookups on IPv6 addresses, like that used for
|
||||
RESOLVE_PTR, Tor will perform the necessary PTR requests via
|
||||
IP6.ARPA.
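
For reference, the PTR query name for an address is built from its nibbles in reverse order. The sketch below uses the reverse_pointer property of Python's ipaddress module; the wrapper name is illustrative.

```python
import ipaddress


def ptr_query_name(text):
    """Return the ip6.arpa (or in-addr.arpa) name used for a PTR lookup."""
    return ipaddress.ip_address(text).reverse_pointer
```

An IPv6 address yields 32 dot-separated hexadecimal nibbles followed by ip6.arpa, while an IPv4 address yields the familiar in-addr.arpa form.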
|
||||
|
||||
All routers which perform DNS resolution on behalf of clients
|
||||
(RELAY_RESOLVE) should perform and respond with both A and AAAA
|
||||
resources.
|
||||
|
||||
[NOTE: In a future version, when we extend the behavior of RESOLVE to
|
||||
encapsulate more of real DNS, it will make sense to allow more
|
||||
flexibility here. -nickm]
|
||||
|
||||
1.4. Client interaction with IPv6 exit capability
|
||||
|
||||
1.4.1. Usability goals
|
||||
|
||||
There are a number of behaviors which Tor can provide when
|
||||
interacting with clients that will improve the usability of IPv6 exit
|
||||
capability. These behaviors are designed to make it simple for
|
||||
clients to express a preference for IPv6 transport and utilize IPv6
|
||||
host services.
|
||||
|
||||
1.4.2. SOCKSv5 IPv6 client behavior
|
||||
|
||||
The SOCKS version 5 protocol supports IPv6 connections. When using
|
||||
SOCKSv5 with hostnames it is difficult to determine if a client
|
||||
wishes to use an IPv4 or IPv6 address to connect to the desired host
|
||||
if it resolves to both address types.
|
||||
|
||||
In order to make this more intuitive the SOCKSv5 protocol can be
|
||||
supported on a local IPv6 endpoint, [::1] port 9050 for example.
|
||||
When a client requests a connection to the desired host via an IPv6
|
||||
SOCKS connection Tor will prefer IPv6 addresses when resolving the
|
||||
host name and connecting to the host.
|
||||
|
||||
Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS
|
||||
connection will return IPv6 addresses when available, and fall back
|
||||
to IPv4 addresses if not.
|
||||
|
||||
[NOTE: This means that SocksListenAddress and DNSListenAddress should
|
||||
support IPv6 addresses. Perhaps there should also be a general option
|
||||
to have listeners that default to 127.0.0.1 and 0.0.0.0 listen
|
||||
additionally or instead on ::1 and :: -nickm]
|
||||
|
||||
1.4.3. MAPADDRESS behavior
|
||||
|
||||
The MAPADDRESS capability supports clients that may not be able to
|
||||
use the SOCKSv4a or SOCKSv5 hostname support to resolve names via
|
||||
Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as
|
||||
well.
|
||||
|
||||
When a client requests an address mapping from the wildcard IPv6
|
||||
address, [::0], the server will respond with a unique local IPv6
|
||||
address on success. It is important to note that there may be two
|
||||
mappings for the same name if both an IPv4 and IPv6 address are
|
||||
associated with the host. In this case a CONNECT to a mapped IPv6
|
||||
address should prefer IPv6 for the connection to the host, if
|
||||
available, while CONNECT to a mapped IPv4 address will prefer IPv4.
|
||||
|
||||
It should be noted that IPv6 does not provide the concept of a host
|
||||
local subnet, like 127.0.0.0/8 in IPv4. For this reason integration
|
||||
of Tor with IPv6 clients should consider a firewall or filter rule to
|
||||
drop unique local addresses to or from the network when possible.
|
||||
These packets should not be routed, however, keeping them off the
|
||||
subnet entirely is worthwhile.
|
||||
|
||||
1.4.3.1. Generating unique local IPv6 addresses
|
||||
|
||||
The usual manner of generating a unique local IPv6 address is to
|
||||
select a Global ID part randomly, along with a Subnet ID, and sharing
|
||||
this prefix among the communicating parties who each have their own
|
||||
distinct Interface ID. In this style a given Tor instance might
|
||||
select a random Global and Subnet ID and provide MAPADDRESS
|
||||
assignments with a random Interface ID as needed. This has the
|
||||
potential to associate unique Global/Subnet identifiers with a given
|
||||
Tor instance and may expose attacks against the anonymity of Tor
|
||||
users.
|
||||
|
||||
To avoid this potential problem entirely, MAPADDRESS must always
|
||||
generate the Global, Subnet, and Interface IDs randomly for each
|
||||
request. It is also highly suggested that explicitly specifying an
|
||||
IPv6 source address instead of the wildcard address not be supported
|
||||
to ensure that a good random address is used.
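
One way to generate such an address is sketched below. It assumes the locally assigned unique local prefix fd00::/8 (RFC 4193) and draws all remaining 120 bits, covering the Global ID, Subnet ID, and Interface ID, from a cryptographic random source on every call. The function name is illustrative.

```python
import ipaddress
import secrets

ULA_PREFIX = 0xFD << 120  # fd00::/8, the locally assigned half of fc00::/7


def random_unique_local_address():
    """Return a fresh unique local IPv6 address with no reused fields."""
    # 40-bit Global ID + 16-bit Subnet ID + 64-bit Interface ID = 120 bits.
    random_bits = secrets.randbits(120)
    return ipaddress.IPv6Address(ULA_PREFIX | random_bits)
```

Because nothing is shared between calls, two mappings handed out by the same instance have no prefix in common beyond fd00::/8.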
|
||||
|
||||
1.4.4. DNSProxy IPv6 client behavior
|
||||
|
||||
A new capability in recent Tor versions is the transparent DNS proxy.
|
||||
This feature will need to return both A and AAAA resource records
|
||||
when responding to client name resolution requests.
|
||||
|
||||
The transparent DNS proxy should also support reverse lookups for
|
||||
IPv6 addresses. It is suggested that any such requests to the
|
||||
deprecated IP6.INT domain should be translated to IP6.ARPA instead.
|
||||
This translation is not likely to be used and is of low priority.
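
The suggested translation amounts to a suffix rewrite on the query name. A minimal sketch, with an illustrative function name:

```python
def normalize_reverse_name(qname):
    """Rewrite a deprecated ip6.int reverse name to its ip6.arpa form."""
    name = qname.rstrip(".")  # drop the trailing root dot, if present
    suffix = ".ip6.int"
    if name.lower().endswith(suffix):
        return name[: -len(suffix)] + ".ip6.arpa"
    return name
```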
|
||||
|
||||
It would be nice to support DNS over IPv6 transport as well, however,
|
||||
this is not likely to be used and is of low priority.
|
||||
|
||||
1.4.5. TransPort IPv6 client behavior
|
||||
|
||||
Tor also provides transparent TCP proxy support via the Trans*
|
||||
directives in the configuration. The TransListenAddress directive
|
||||
should accept an IPv6 address in addition to IPv4 so that IPv6 TCP
|
||||
connections can be transparently proxied.
|
||||
|
||||
1.5. Additional changes
|
||||
|
||||
The RedirectExit option should be deprecated rather than extending
|
||||
this feature to IPv6.
|
||||
|
||||
|
||||
2. Spec changes
|
||||
|
||||
2.1. Tor specification
|
||||
|
||||
In '6.2. Opening streams and transferring data' the following should
|
||||
be changed to indicate IPv6 exit capability:
|
||||
|
||||
"No version of Tor currently generates the IPv6 format."
|
||||
|
||||
In '6.4. Remote hostname lookup' the following should be updated to
|
||||
reflect use of ip6.arpa in addition to in-addr.arpa.
|
||||
|
||||
"For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an
|
||||
in-addr.arpa address."
|
||||
|
||||
In 'A.1. Differences between spec and implementation' the following
|
||||
should be updated to indicate IPv6 exit capability:
|
||||
|
||||
"The current codebase has no IPv6 support at all."
|
||||
|
||||
[NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an
|
||||
ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2
|
||||
type that can hold an ipv6 address, since the way we encode ipv6
|
||||
addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6")
|
||||
is a bit dumb. -nickm]
|
||||
[Actually, the length field lets us distinguish EXITPOLICY. -nickm]
|
||||
|
||||
2.2. Directory specification
|
||||
|
||||
In '2.1. Router descriptor format' a new set of directives is needed
|
||||
for IPv6 exit policy. The existing accept/reject directives should
|
||||
be clarified to indicate IPv4 or wildcard address relevance. The new
|
||||
IPv6 directives will be in the form of:
|
||||
|
||||
"accept6" exitpattern NL
|
||||
"reject6" exitpattern NL
|
||||
|
||||
The section describing accept6/reject6 should explain that the
|
||||
presence of accept6 or reject6 exit policies in a router descriptor
|
||||
signals the ability of that router to exit IPv6 traffic (according to
|
||||
IPv6 exit policies).
|
||||
|
||||
The "[::]/0" notation is used to represent "all IPv6 addresses".
|
||||
"[::0]/0" may also be used for this representation.
|
||||
|
||||
If a user specifies a 'reject6 [::]/0:*' policy in the Tor
|
||||
configuration this will be interpreted as forcing no IPv6 exit
|
||||
support and no accept6/reject6 policies will be included in the
|
||||
published descriptor. This will prevent IPv6 exit if the router host
|
||||
has a global unicast IPv6 address present.
|
||||
|
||||
It is important to note that a wildcard address in an accept or
|
||||
reject policy applies to both IPv4 and IPv6 addresses.
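
The interaction between the address families and the wildcard can be illustrated with a small first-match evaluator. This is a sketch, not Tor's policy code. Rules are written as (keyword, address pattern, port pattern) tuples for the example, and two behaviours are assumptions not stated in the text: a wildcard inside accept6/reject6 is taken to cover IPv6 only, and a destination matching no rule is refused.

```python
import ipaddress


def rule_matches(rule, dest, port):
    """Return True if one (keyword, addr_pattern, port_pattern) rule applies."""
    keyword, addr_pat, port_pat = rule
    addr = ipaddress.ip_address(dest)
    if addr_pat == "*":
        # Wildcard in accept/reject covers both families; in
        # accept6/reject6 we assume it covers IPv6 only.
        if keyword.endswith("6") and addr.version != 6:
            return False
    else:
        # Strip the brackets used around IPv6 literals, e.g. "[::]/0".
        net = ipaddress.ip_network(addr_pat.replace("[", "").replace("]", ""))
        if net.version != addr.version or addr not in net:
            return False
    return port_pat == "*" or int(port_pat) == port


def exit_allowed(policy, dest, port):
    """Apply rules in order; the first matching rule decides."""
    for rule in policy:
        if rule_matches(rule, dest, port):
            return rule[0].startswith("accept")
    return False  # assumed default when nothing matches


policy = [
    ("reject6", "[2001:db8::]/32", "*"),
    ("accept6", "[::]/0", "80"),
    ("reject", "*", "*"),
]
```

With this policy, the final wildcard reject catches both IPv4 destinations and any IPv6 destination not already accepted on port 80.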
|
||||
|
||||
2.3. Control specification
|
||||
|
||||
In '3.8. MAPADDRESS' the potential to have two addresses for a given
|
||||
name should be explained. The method for generating unique local
|
||||
addresses for IPv6 mappings needs explanation as described above.
|
||||
|
||||
When IPv6 addresses are used in this document they should include the
|
||||
brackets for consistency. For example, the null IPv6 address should
|
||||
be written as "[::0]" and not "::0". The control commands will
|
||||
expect the same syntax as well.
|
||||
|
||||
In '3.9. GETINFO' the "address" command should return both public
|
||||
IPv4 and IPv6 addresses if present. These addresses should be
|
||||
separated via \r\n.
|
||||
|
||||
|
||||
2.4. Tor SOCKS extensions
|
||||
|
||||
In '2. Name lookup' a description of IPv6 address resolution is
|
||||
needed for SOCKSv5 as described above. IPv6 addresses should be
|
||||
supported in both the RESOLVE and RESOLVE_PTR extensions.
|
||||
|
||||
A new section describing the ability to accept SOCKSv5 clients on a
|
||||
local IPv6 address to indicate a preference for IPv6 transport as
|
||||
described above is also needed. The behavior of Tor SOCKSv5 proxy
|
||||
with an IPv6 preference should be explained, for example, preferring
|
||||
IPv6 transport to a named host with both IPv4 and IPv6 addresses
|
||||
available (A and AAAA records).
|
||||
|
||||
|
||||
3. Questions and concerns
|
||||
|
||||
3.1. DNS A6 records
|
||||
|
||||
A6 is explicitly avoided in this document. There are potential
|
||||
reasons for implementing this, however, the inherent complexity of
|
||||
the protocol and resolvers make this unappealing. Is there a
|
||||
compelling reason to consider A6 as part of IPv6 exit support?
|
||||
|
||||
[IMO not till anybody needs it. -nickm]
|
||||
|
||||
3.2. IPv4 and IPv6 preference
|
||||
|
||||
The design above tries to infer a preference for IPv4 or IPv6
|
||||
transport based on client interactions with Tor. It might be useful
|
||||
to provide more explicit control over this preference. For example,
|
||||
an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts
|
||||
in CONNECT requests while the current implementation would assume an
|
||||
IPv4 preference. Should more explicit control be available, through
|
||||
either configuration directives or control commands?
|
||||
|
||||
Many applications support an inet6-only or prefer-family type option
|
||||
that provides the user manual control over address preference. This
|
||||
could be provided as a Tor configuration option.
|
||||
|
||||
An explicit preference is still possible by resolving names and then
|
||||
CONNECTing to an IPv4 or IPv6 address as desired; however, not all
|
||||
client applications may have this option available.
|
||||
|
||||
3.3. Support for IPv6 only transparent proxy clients
|
||||
|
||||
It may be useful to support IPv6 only transparent proxy clients using
|
||||
IPv4 mapped IPv6 like addresses. This would require transparent DNS
|
||||
proxy using IPv6 transport and the ability to map A record responses
|
||||
into IPv4 mapped IPv6 like addresses in the manner described in the
|
||||
"NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The
|
||||
transparent TCP proxy would thus need to detect these mapped addresses
|
||||
and connect to the desired IPv4 host.
|
||||
|
||||
The IPv6 prefix used for this purpose must not be the actual IPv4
|
||||
mapped IPv6 address prefix, though the manner in which IPv4 addresses
|
||||
are embedded in IPv6 addresses would be the same.
|
||||
|
||||
The lack of any IPv6 only hosts which would use this transparent proxy
|
||||
method makes this a lot of work for very little gain. Is there a
|
||||
compelling reason to support this NAT-PT like capability?
|
||||
|
||||
3.4. IPv6 DNS and older Tor routers
|
||||
|
||||
It is expected that many routers will continue to run with older
|
||||
versions of Tor when the IPv6 exit capability is released. Clients
|
||||
who wish to use IPv6 will need to route RELAY_RESOLVE requests to the
|
||||
newer routers which will respond with both A and AAAA resource
|
||||
records when possible.
|
||||
|
||||
One way to do this is to route RELAY_RESOLVE requests to routers with
|
||||
IPv6 exit policies published; however, this would not utilize current
|
||||
routers that can resolve IPv6 addresses even if they can't exit such
|
||||
traffic.
|
||||
|
||||
There was also concern expressed about the ability of existing clients
|
||||
to cope with new RELAY_RESOLVE responses that contain IPv6 addresses.
|
||||
If this breaks backward compatibility, a new request type may be
|
||||
necessary, like RELAY_RESOLVE6, or some other mechanism of indicating
|
||||
the ability to parse IPv6 responses when making the request.
|
||||
|
||||
3.5. IPv4 and IPv6 bindings in MAPADDRESS
|
||||
|
||||
It may be troublesome to try and support two distinct address mappings
|
||||
for the same name in the existing MAPADDRESS implementation. If this
|
||||
cannot be accommodated then the behavior should replace existing
|
||||
mappings with the new address regardless of family. A warning when
|
||||
this occurs would be useful to assist clients who encounter problems
|
||||
when both an IPv4 and IPv6 application are using MAPADDRESS for the
|
||||
same names concurrently, causing lost connections for one of them.
|
||||
|
||||
4. Addendum
|
||||
|
||||
4.1. Sample IPv6 default exit policy
|
||||
|
||||
reject 0.0.0.0/8
|
||||
reject 169.254.0.0/16
|
||||
reject 127.0.0.0/8
|
||||
reject 192.168.0.0/16
|
||||
reject 10.0.0.0/8
|
||||
reject 172.16.0.0/12
|
||||
reject6 [0000::]/8
|
||||
reject6 [0100::]/8
|
||||
reject6 [0200::]/7
|
||||
reject6 [0400::]/6
|
||||
reject6 [0800::]/5
|
||||
reject6 [1000::]/4
|
||||
reject6 [4000::]/3
|
||||
reject6 [6000::]/3
|
||||
reject6 [8000::]/3
|
||||
reject6 [A000::]/3
|
||||
reject6 [C000::]/3
|
||||
reject6 [E000::]/4
|
||||
reject6 [F000::]/5
|
||||
reject6 [F800::]/6
|
||||
reject6 [FC00::]/7
|
||||
reject6 [FE00::]/9
|
||||
reject6 [FE80::]/10
|
||||
reject6 [FEC0::]/10
|
||||
reject6 [FF00::]/8
|
||||
reject *:25
|
||||
reject *:119
|
||||
reject *:135-139
|
||||
reject *:445
|
||||
reject *:1214
|
||||
reject *:4661-4666
|
||||
reject *:6346-6429
|
||||
reject *:6699
|
||||
reject *:6881-6999
|
||||
accept *:*
|
||||
# accept6 [2000::]/3:* is implied
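
The reject6 lines above can be sanity-checked mechanically. The following is a minimal sketch using Python's standard ipaddress module; the names REJECT6, is_rejected6, and covers_everything_but_global_unicast are illustrative and not part of Tor. It checks that the listed prefixes, together with 2000::/3, merge into the whole IPv6 space, which is what makes the implied accept6 rule the only remaining case.

```python
import ipaddress

# reject6 prefixes copied from the sample policy above.
REJECT6 = [
    "0000::/8", "0100::/8", "0200::/7", "0400::/6", "0800::/5", "1000::/4",
    "4000::/3", "6000::/3", "8000::/3", "A000::/3", "C000::/3", "E000::/4",
    "F000::/5", "F800::/6", "FC00::/7", "FE00::/9", "FE80::/10", "FEC0::/10",
    "FF00::/8",
]
REJECT6_NETS = [ipaddress.ip_network(p) for p in REJECT6]
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


def is_rejected6(addr: str) -> bool:
    """Return True if addr falls inside any reject6 prefix."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in REJECT6_NETS)


def covers_everything_but_global_unicast() -> bool:
    """Check that the reject6 prefixes plus 2000::/3 merge into ::/0."""
    merged = list(ipaddress.collapse_addresses(REJECT6_NETS + [GLOBAL_UNICAST]))
    return merged == [ipaddress.ip_network("::/0")]
```

The sketch only looks at addresses; the port-based reject lines of the policy are not modelled.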
|
||||
|
||||
4.2. Additional resources
|
||||
|
||||
'DNS Extensions to Support IP Version 6'
|
||||
http://www.ietf.org/rfc/rfc3596.txt
|
||||
|
||||
'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
|
||||
http://www.ietf.org/rfc/rfc2874.txt
|
||||
|
||||
'SOCKS Protocol Version 5'
|
||||
http://www.ietf.org/rfc/rfc1928.txt
|
||||
|
||||
'Unique Local IPv6 Unicast Addresses'
|
||||
http://www.ietf.org/rfc/rfc4193.txt
|
||||
|
||||
'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
|
||||
http://www.iana.org/assignments/ipv6-address-space
|
||||
|
||||
'Network Address Translation - Protocol Translation (NAT-PT)'
|
||||
http://www.ietf.org/rfc/rfc2766.txt
|
@ -1,86 +0,0 @@
|
||||
Filename: 118-multiple-orports.txt
|
||||
Title: Advertising multiple ORPorts at once
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 09-Jul-2007
|
||||
Status: Accepted
|
||||
Target: 0.2.1.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document is a proposal for servers to advertise multiple
|
||||
address/port combinations for their ORPort.
|
||||
|
||||
Motivation:
|
||||
|
||||
Sometimes servers want to support multiple ports for incoming
|
||||
connections, either in order to support multiple address families, to
|
||||
better use multiple interfaces, or to support a variety of
|
||||
FascistFirewallPorts settings. This is easy to set up now, but
|
||||
there's no way to advertise it to clients.
|
||||
|
||||
New descriptor syntax:
|
||||
|
||||
We add a new line in the router descriptor, "or-address". This line
|
||||
can occur zero, one, or multiple times. Its format is:
|
||||
|
||||
or-address SP ADDRESS ":" PORTLIST NL
|
||||
|
||||
ADDRESS = IPV6ADDR / IPV4ADDR
|
||||
IPV6ADDR = an IPv6 address, surrounded by square brackets.
|
||||
IPV4ADDR = an IPv4 address, represented as a dotted quad.
|
||||
PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
|
||||
PORTSPEC = PORT | PORT "-" PORT
|
||||
|
||||
[This is the regular format for specifying sets of addresses and
|
||||
ports in Tor.]
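
As an illustration of the grammar above, here is a minimal parsing sketch in Python. The function name parse_or_address and the returned tuple layout are assumptions made for this example; the regular expression only checks the overall shape of the address and does not validate it fully.

```python
import re

# IPv4 dotted quad or bracketed IPv6 address, then ":" and a port list.
_OR_ADDRESS = re.compile(
    r"^or-address (?P<addr>\[[0-9A-Fa-f:.]+\]|\d{1,3}(?:\.\d{1,3}){3})"
    r":(?P<ports>\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)$"
)


def parse_or_address(line: str):
    """Parse one "or-address" line into (address, [(low, high), ...])."""
    m = _OR_ADDRESS.match(line.rstrip("\n"))
    if m is None:
        raise ValueError("malformed or-address line: %r" % line)
    ranges = []
    for spec in m.group("ports").split(","):
        low, _, high = spec.partition("-")
        lo, hi = int(low), int(high or low)  # a single PORT is a one-port range
        if not (1 <= lo <= hi <= 65535):
            raise ValueError("bad port range: %r" % spec)
        ranges.append((lo, hi))
    return m.group("addr"), ranges
```

For example, parse_or_address("or-address [2001:db8::1]:9001,9030-9040") yields the bracketed address and the two port ranges (9001, 9001) and (9030, 9040).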
|
||||
|
||||
New OR behavior:
|
||||
|
||||
We add two more options to supplement ORListenAddress:
|
||||
ORPublishedListenAddress, and ORPublishAddressSet. The former
|
||||
listens on an address-port combination and publishes it in addition
|
||||
to the regular address. The latter advertises a set of address-port
|
||||
combinations, but does not listen on them. [To use this option, the
|
||||
server operator should set up port forwarding to the regular ORPort,
|
||||
as for example with firewall rules.]
|
||||
|
||||
Servers should extend their testing to include advertised addresses
|
||||
and ports. No address or port should be advertised until it's been
|
||||
tested. [This might get expensive in practice.]
|
||||
|
||||
New authority behavior:
|
||||
|
||||
Authorities should spot-test descriptors, and reject any where a
|
||||
substantial part of the addresses can't be reached.
|
||||
|
||||
New client behavior:
|
||||
|
||||
When connecting to another server, clients SHOULD pick an
|
||||
address-port combination at random as supported by their
|
||||
reachableaddresses. If a client has a connection to a server at one
|
||||
address, it SHOULD use that address for any simultaneous connections
|
||||
to that server. Clients SHOULD use the canonical address for any
|
||||
server when generating extend cells.
|
||||
|
||||
Not addressed here:
|
||||
|
||||
* There's no reason to listen on multiple dirports; current Tors
|
||||
mostly don't connect directly to the dirport anyway.
|
||||
|
||||
* It could be advantageous to list something about extra addresses in
|
||||
the network-status document. This would, however, eat space there.
|
||||
More analysis is needed, particularly in light of proposal 141
|
||||
("Download server descriptors on demand")
|
||||
|
||||
Dependencies:
|
||||
|
||||
Testing for canonical connections needs to be implemented before it's
|
||||
safe to use this proposal.
|
||||
|
||||
|
||||
Notes 3 July:
|
||||
- Write up the simple version of this. No ranges needed yet. No
|
||||
networkstatus changes yet.
|
||||
|
@ -1,142 +0,0 @@
|
||||
Filename: 119-controlport-auth.txt
|
||||
Title: New PROTOCOLINFO command for controllers
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 14-Aug-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
Here we describe how to help controllers locate the cookie
|
||||
authentication file when authenticating to Tor, so we can a) require
|
||||
authentication by default for Tor controllers and b) still keep
|
||||
things usable. Also, we propose an extensible, general-purpose mechanism
|
||||
for controllers to learn about a Tor instance's protocol and
|
||||
authentication requirements before authenticating.
|
||||
|
||||
The Problem:
|
||||
|
||||
When we first added the controller protocol, we wanted to make it
|
||||
easy for people to play with it, so by default we didn't require any
|
||||
authentication from controller programs. We allowed requests only from
|
||||
localhost as a stopgap measure for security.
|
||||
|
||||
Due to an increasing number of vulnerabilities based on this approach,
|
||||
it's time to add authentication in default configurations.
|
||||
|
||||
We have a number of goals:
|
||||
- We want the default Vidalia bundles to transparently work. That
|
||||
means we don't want the users to have to type in or know a password.
|
||||
- We want to allow multiple controller applications to connect to the
|
||||
control port. So if Vidalia is launching Tor, it can't just keep the
|
||||
secrets to itself.
|
||||
|
||||
Right now there are three authentication approaches supported
|
||||
by the control protocol: NULL, CookieAuthentication, and
|
||||
HashedControlPassword. See Sec 5.1 in control-spec.txt for details.
|
||||
|
||||
There are a couple of challenges here. The first is: if the controller
|
||||
launches Tor, how should we teach Tor what authentication approach
|
||||
it should require, and the secret that goes along with it? Next is:
|
||||
how should this work when the controller attaches to an existing Tor,
|
||||
rather than launching Tor itself?
|
||||
|
||||
Cookie authentication seems most amenable to letting multiple controller
|
||||
applications interact with Tor. But that brings in yet another question:
|
||||
how does the controller guess where to look for the cookie file,
|
||||
without first knowing what DataDirectory Tor is using?
|
||||
|
||||
Design:
|
||||
|
||||
We should add a new controller command PROTOCOLINFO that can be sent
|
||||
as a valid first command (the others being AUTHENTICATE and QUIT). If
|
||||
PROTOCOLINFO is sent as the first command, the second command must be
|
||||
either a successful AUTHENTICATE or a QUIT.
|
||||
|
||||
If the initial command sequence is not valid, Tor closes the connection.
|
||||
|
||||
|
||||
Spec:
|
||||
|
||||
C: "PROTOCOLINFO" *(SP PIVERSION) CRLF
|
||||
S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF
|
||||
|
||||
InfoLine = AuthLine / VersionLine / OtherLine
|
||||
|
||||
AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
|
||||
*(SP "COOKIEFILE=" AuthCookieFile) CRLF
|
||||
VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF
|
||||
|
||||
AuthMethod =
|
||||
"NULL" / ; No authentication is required
|
||||
"HASHEDPASSWORD" / ; A controller must supply the original password
|
||||
"COOKIE" ; A controller must supply the contents of a cookie
|
||||
|
||||
AuthCookieFile = QuotedString
|
||||
TorVersion = QuotedString
|
||||
|
||||
OtherLine = "250-" Keyword [SP Arguments] CRLF
|
||||
|
||||
For example:
|
||||
|
||||
C: PROTOCOLINFO CRLF
|
||||
S: "250+PROTOCOLINFO 1" CRLF
|
||||
S: "250-AUTH METHODS=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
|
||||
S: "250-VERSION Tor="0.2.0.5-alpha"" CRLF
|
||||
S: "250 OK" CRLF
|
||||
|
||||
Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
|
||||
with keywords they do not recognize. Controllers MUST ignore extraneous
|
||||
data on any InfoLine.
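
A controller-side reading of these rules might look like the following sketch. The function name parse_protocolinfo, the result dictionary, and the simplified unquoting (a backslash followed by any character yields that character) are assumptions made for illustration; they are not taken from an existing controller library.

```python
import re

# KEY=value pairs; a value is either a quoted string or a run of non-spaces.
_KV = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def _unquote(value):
    """Strip surrounding quotes and resolve backslash escapes literally."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value


def parse_protocolinfo(reply_lines):
    """Collect AUTH and VERSION data from the lines of a PROTOCOLINFO reply."""
    info = {"methods": [], "cookiefile": None, "tor_version": None}
    for line in reply_lines:
        line = line.rstrip("\r\n")
        if not line.startswith("250-"):
            continue  # skips the opening line and the closing "250 OK"
        keyword, _, rest = line[4:].partition(" ")
        fields = {k.upper(): _unquote(v) for k, v in _KV.findall(rest)}
        if keyword == "AUTH":
            info["methods"] = [m for m in fields.get("METHODS", "").split(",") if m]
            info["cookiefile"] = fields.get("COOKIEFILE")
        elif keyword == "VERSION":
            info["tor_version"] = fields.get("TOR")
        # Any other keyword is an OtherLine and is ignored.
    return info
```

Because lines are dispatched by keyword, the order in which Tor sends its InfoLines does not matter, and unknown keywords or extra fields are silently skipped.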
|
||||
|
||||
PIVERSION is there in case we drastically change the syntax one day. For
|
||||
now it should always be "1", for the controller protocol. Controllers MAY
|
||||
provide a list of the protocol versions they support; Tor MAY select a
|
||||
version that the controller does not support.
|
||||
|
||||
Right now only two "topics" (AUTH and VERSION) are included, but more
|
||||
may be included in the future. Controllers must accept lines with
|
||||
unexpected topics.
|
||||
|
||||
|
||||
|
||||
AuthMethod is used to specify one or more control authentication
|
||||
methods that Tor currently accepts.
|
||||
|
||||
AuthCookieFile specifies the absolute path and filename of the
|
||||
authentication cookie that Tor is expecting and is provided iff
|
||||
the METHODS field contains the method "COOKIE". Controllers MUST handle
|
||||
escape sequences inside this string.
|
||||
|
||||
The VERSION line contains the Tor version.
|
||||
|
||||
[What else might we want to include that could be useful? -RD]
|
||||
|
||||
Compatibility:
|
||||
|
||||
Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed
|
||||
command. Earlier Tors don't know about this command but don't hang
|
||||
up. That means controllers will need a mechanism for distinguishing
|
||||
whether they're talking to a Tor that speaks PROTOCOLINFO or not.
|
||||
|
||||
I suggest that the controllers attempt a PROTOCOLINFO. Then:
|
||||
- If it works, great. Authenticate as required.
|
||||
- If they get hung up on, reconnect and do a NULL AUTHENTICATE.
|
||||
- If it's unrecognized but they're not hung up on, do a NULL
|
||||
AUTHENTICATE.
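
The three cases above amount to a small decision table. A minimal sketch, assuming a hypothetical Probe enumeration for the outcome of the first PROTOCOLINFO attempt and a next_step helper (neither name comes from Tor or any controller):

```python
from enum import Enum


class Probe(Enum):
    """Possible outcomes of sending PROTOCOLINFO as the first command."""
    OK = "ok"                      # Tor answered with a PROTOCOLINFO reply
    HUNG_UP = "hung_up"            # Tor closed the connection
    UNRECOGNIZED = "unrecognized"  # Tor rejected the command but stayed connected


def next_step(outcome, methods=()):
    """Return (reconnect_first, auth_method) following the three rules above."""
    if outcome is Probe.OK:
        # Simplification: take the first advertised method. A real controller
        # would choose according to which secrets it can actually obtain.
        return (False, methods[0] if methods else "NULL")
    if outcome is Probe.HUNG_UP:
        return (True, "NULL")
    return (False, "NULL")
```

Only the hung-up case requires a fresh connection before the NULL AUTHENTICATE is sent.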
|
||||
|
||||
Unsolved problems:
|
||||
|
||||
If Torbutton wants to be a Tor controller one day... talking TCP is
|
||||
bad enough, but reading from the filesystem is even harder. Is there
|
||||
a way to let simple programs work with the controller port without
|
||||
needing all the auth infrastructure?
|
||||
|
||||
Once we put this approach in place, the next vulnerability we see will
|
||||
involve an attacker somehow getting read access to the victim's files
|
||||
--- and then we're back where we started. This means we still need
|
||||
to think about how to demand password-based authentication without
|
||||
bothering the user about it.
|
||||
|
@ -1,85 +0,0 @@
|
||||
Filename: 120-shutdown-descriptors.txt
|
||||
Title: Shutdown descriptors when Tor servers stop
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 15-Aug-2007
|
||||
Status: Dead
|
||||
|
||||
[Proposal dead as of 11 Jul 2008. The point of this proposal was to give
|
||||
routers a good way to get out of the networkstatus early, but proposal
|
||||
138 (already implemented) has achieved this.]
|
||||
|
||||
Overview:
|
||||
|
||||
Tor servers should publish a last descriptor whenever they shut down,
|
||||
to let others know that they are no longer offering service.
|
||||
|
||||
The Problem:
|
||||
|
||||
The main reason for this is in reaction to Internet services that want
|
||||
to treat connections from the Tor network differently. Right now,
|
||||
if a user experiments with turning on the "relay" functionality, he
|
||||
is punished by being locked out of some websites, some IRC networks,
|
||||
etc --- and this lockout persists for several days even after he turns
|
||||
the server off.
|
||||
|
||||
Design:
|
||||
|
||||
During the "slow shutdown" period if exiting, or shortly after the
|
||||
user sets his ORPort back to 0 if not exiting, Tor should publish a
|
||||
final descriptor with the following characteristics:
|
||||
|
||||
1) Exit policy is listed as "reject *:*"
|
||||
2) It includes a new entry called "opt shutdown 1"
|
||||
|
||||
The first step is so current blacklists will no longer list this node
|
||||
as exiting to whatever the service is.
|
||||
|
||||
The second step is so directory authorities can avoid wasting time
|
||||
doing reachability testing. Authorities should automatically not list
|
||||
as Running any router whose latest descriptor says it shut down.
|
||||
|
||||
[I originally had in mind a third step --- Advertised bandwidth capacity
|
||||
is listed as "0" --- so current Tor clients will skip over this node
|
||||
when building most circuits. But since clients won't fetch descriptors
|
||||
from nodes not listed as Running, this step seems pointless. -RD]
|
||||
|
||||
Spec:
|
||||
|
||||
TBD but should be pretty straightforward.
|
||||
|
||||
Security issues:
|
||||
|
||||
Now external people can learn exactly when a node stopped offering
|
||||
relay service. How bad is this? I can see a few minor attacks based
|
||||
on this knowledge, but on the other hand as it is we don't really take
|
||||
any steps to keep this information secret.
|
||||
|
||||
Overhead issues:
|
||||
|
||||
We are creating more descriptors that want to be remembered. However,
|
||||
since the router won't be marked as Running, ordinary clients won't
|
||||
fetch the shutdown descriptors. Caches will, though. I hope this is ok.
|
||||
|
||||
Implementation:
|
||||
|
||||
To make things easy, we should publish the shutdown descriptor only
|
||||
on controlled shutdown (SIGINT as opposed to SIGTERM). That would
|
||||
leave enough time for publishing that we probably wouldn't need any
|
||||
extra synchronization code.
|
||||
|
||||
If that turns out to be too unintuitive for users, I could imagine doing
|
||||
it on SIGTERMs too, and just delaying exit until we had successfully
|
||||
published to at least one authority, at which point we'd hope that it
|
||||
propagated from there.
|
||||
|
||||
Acknowledgements:
|
||||
|
||||
tup suggested this idea.
|
||||
|
||||
Comments:
|
||||
|
||||
2) Maybe add a rule "Don't do this for hibernation if we expect to wake
|
||||
up before the next consensus is published"?
|
||||
- NM 9 Oct 2007
|
@ -1,778 +0,0 @@
|
||||
Filename: 121-hidden-service-authentication.txt
|
||||
Title: Hidden Service Authentication
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
|
||||
Christoph Weingarten
|
||||
Created: 10-Sep-2007
|
||||
Status: Finished
|
||||
Implemented-In: 0.2.1.x
|
||||
|
||||
Change history:
|
||||
|
||||
26-Sep-2007 Initial proposal for or-dev
|
||||
08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007
|
||||
15-Dec-2007 Rewrote complete proposal for better readability, modified
|
||||
authentication protocol, merged in personal notes
|
||||
24-Dec-2007 Replaced misleading term "authentication" by "authorization"
|
||||
and added some clarifications (comments by Sven Kaffille)
|
||||
28-Apr-2008 Updated most parts of the concrete authorization protocol
|
||||
04-Jul-2008 Add a simple algorithm to delay descriptor publication for
|
||||
different clients of a hidden service
|
||||
19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay
|
||||
protection for INTRODUCE2 cells (1.3), described limitations
|
||||
for auth protocols (1.6), improved hidden service protocol
|
||||
without client authorization (2.1), added second, more
|
||||
scalable authorization protocol (2.2), rewrote existing
|
||||
authorization protocol (2.3); changes based on discussion
|
||||
with Nick
|
||||
31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent
|
||||
abuse.
|
||||
01-Aug-2008 Use first part of Diffie-Hellman handshake for replay
|
||||
protection instead of rendezvous cookie.
|
||||
01-Aug-2008 Remove improved hidden service protocol without client
|
||||
authorization (2.1). It might get implemented in proposal
|
||||
142.
|
||||
|
||||
Overview:
|
||||
|
||||
This proposal deals with a general infrastructure for performing
|
||||
authorization (not necessarily implying authentication) of requests to
|
||||
hidden services at three points: (1) when downloading and decrypting
|
||||
parts of the hidden service descriptor, (2) at the introduction point,
|
||||
and (3) at Bob's Tor client before contacting the rendezvous point. A
|
||||
service provider will be able to restrict access to his service at these
|
||||
three points to authorized clients only. Further, the proposal contains
|
||||
specific authorization protocols as instances that implement the
|
||||
presented authorization infrastructure.
|
||||
|
||||
This proposal is based on v2 hidden service descriptors as described in
|
||||
proposal 114 and introduced in version 0.2.0.10-alpha.
|
||||
|
||||
The proposal is structured as follows: The next section motivates the
|
||||
integration of authorization mechanisms in the hidden service protocol.
|
||||
Then we describe a general infrastructure for authorization in hidden
|
||||
services, followed by specific authorization protocols for this
|
||||
infrastructure. At the end we discuss a number of attacks and non-attacks
|
||||
as well as compatibility issues.
|
||||
|
||||
Motivation:
|
||||
|
||||
The majority of hidden services do not require client authorization
|
||||
now and won't do so in the future. To the contrary, many clients would
|
||||
not want to be (pseudonymously) identifiable by the service (though this
|
||||
is unavoidable to some extent), but rather use the service
|
||||
anonymously. These services are not addressed by this proposal.
|
||||
|
||||
However, there may be certain services which are intended to be accessed
|
||||
by a limited set of clients only. A possible application might be a
|
||||
wiki or forum that should only be accessible to a closed user group.
|
||||
Another, less intuitive example might be a real-time communication
|
||||
service, where someone provides a presence and messaging service only to
|
||||
his buddies. Finally, a possible application would be a personal home
|
||||
server that should be remotely accessed by its owner.
|
||||
|
||||
Performing authorization for a hidden service within the Tor network, as
|
||||
proposed here, offers a range of advantages compared to allowing all
|
||||
client connections in the first instance and deferring authorization to
|
||||
the transported protocol:
|
||||
|
||||
(1) Reduced traffic: Unauthorized requests would be rejected as early as
|
||||
possible, thereby reducing the overall traffic in the network generated
|
||||
by establishing circuits and sending cells.
|
||||
|
||||
(2) Better protection of service location: Unauthorized clients could not
|
||||
force Bob to create circuits to their rendezvous points, thus preventing
|
||||
the attack described by Øverlier and Syverson in their paper "Locating
|
||||
Hidden Servers" even without the need for guards.
|
||||
|
||||
(3) Hiding activity: Apart from performing the actual authorization, a
|
||||
service provider could also hide the mere presence of his service from
|
||||
unauthorized clients by not providing hidden service descriptors to
|
||||
them, rejecting unauthorized requests already at the introduction
|
||||
point (ideally without leaking presence information at any of these
|
||||
points), or not answering unauthorized introduction requests.
|
||||
|
||||
(4) Better protection of introduction points: When providing hidden
|
||||
service descriptors to authorized clients only and encrypting the
|
||||
introduction points as described in proposal 114, the introduction points
|
||||
would be unknown to unauthorized clients and thereby protected from DoS
|
||||
attacks.
|
||||
|
||||
(5) Protocol independence: Authorization could be performed for all
|
||||
transported protocols, regardless of their own capabilities to do so.
|
||||
|
||||
(6) Ease of administration: A service provider running multiple hidden
|
||||
services would be able to configure access at a single place uniformly
|
||||
instead of doing so for all services separately.
|
||||
|
||||
(7) Optional QoS support: Bob could adapt his node selection algorithm
|
||||
for building the circuit to Alice's rendezvous point depending on a
|
||||
previously guaranteed QoS level, thus providing better latency or
|
||||
bandwidth for selected clients.
|
||||
|
||||
A disadvantage of performing authorization within the Tor network is
|
||||
that a hidden service cannot make use of authorization data in
|
||||
the transported protocol. Tor hidden services were designed to be
|
||||
independent of the transported protocol. Therefore it's only possible to
|
||||
either grant or deny access to the whole service, but not to specific
|
||||
resources of the service.
|
||||
|
||||
Authorization often implies authentication, i.e. proving one's identity.
|
||||
However, when performing authorization within the Tor network, untrusted
|
||||
points should not gain any useful information about the identities of
|
||||
communicating parties, neither server nor client. A crucial challenge is
|
||||
to remain anonymous towards directory servers and introduction points.
|
||||
However, trying to hide identity from the hidden service is a futile
|
||||
task, because a client would never know if he is the only authorized
|
||||
client and therefore perfectly identifiable. Therefore, hiding client
|
||||
identity from the hidden service is not an aim of this proposal.
|
||||
|
||||
The current implementation of hidden services does not provide any kind
|
||||
of authorization. The hidden service descriptor version 2, introduced by
|
||||
proposal 114, was designed to use a descriptor cookie for downloading and
|
||||
decrypting parts of the descriptor content, but this feature is not yet
|
||||
in use. Further, most relevant cell formats specified in rend-spec
|
||||
contain fields for authorization data, but those fields are neither
|
||||
implemented nor do they suffice entirely.
|
||||
|
||||
Details:
|
||||
|
||||
1. General infrastructure for authorization to hidden services
|
||||
|
||||
We spotted three possible authorization points in the hidden service
|
||||
protocol:
|
||||
|
||||
(1) when downloading and decrypting parts of the hidden service
|
||||
descriptor,
|
||||
(2) at the introduction point, and
|
||||
(3) at Bob's Tor client before contacting the rendezvous point.
|
||||
|
||||
The general idea of this proposal is to allow service providers to
|
||||
restrict access to some or all of these points to authorized clients
|
||||
only.
|
||||
|
||||
1.1. Client authorization at directory
|
||||
|
||||
Since the implementation of proposal 114 it is possible to combine a
|
||||
hidden service descriptor with a so-called descriptor cookie. When this is done,
|
||||
the descriptor cookie becomes part of the descriptor ID, thus having an
|
||||
effect on the storage location of the descriptor. Someone who has learned
|
||||
about a service, but is not aware of the descriptor cookie, won't be able
|
||||
to determine the descriptor ID and download the current hidden service
|
||||
descriptor; he won't even know whether the service has uploaded a
|
||||
descriptor recently. Descriptor IDs are calculated as follows (see
|
||||
section 1.2 of rend-spec for the complete specification of v2 hidden
|
||||
service descriptors):
|
||||
|
||||
descriptor-id =
|
||||
H(service-id | H(time-period | descriptor-cookie | replica))
|
||||
|
||||
Currently, service-id is equivalent to permanent-id which is calculated
|
||||
as in the following formula. But in principle it could be any public
|
||||
key.
|
||||
|
||||
permanent-id = H(permanent-key)[:10]
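
To make the two formulas concrete, here is a minimal Python sketch. It assumes that H is SHA-1, that "|" denotes byte concatenation, that time-period is encoded as a 4-byte big-endian integer, and that replica is a single byte; these encodings are assumptions of this example, and rend-spec remains the authoritative source for them.

```python
import hashlib
import struct


def permanent_id(permanent_key: bytes) -> bytes:
    """permanent-id = H(permanent-key)[:10]."""
    return hashlib.sha1(permanent_key).digest()[:10]


def descriptor_id(service_id: bytes, time_period: int,
                  descriptor_cookie: bytes, replica: int) -> bytes:
    """descriptor-id = H(service-id | H(time-period | descriptor-cookie | replica))."""
    secret_part = hashlib.sha1(
        struct.pack(">I", time_period) + descriptor_cookie + bytes([replica])
    ).digest()
    return hashlib.sha1(service_id + secret_part).digest()
```

Since the cookie enters the inner hash, someone who knows the service-id but not the cookie cannot compute where the descriptor is stored.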
|
||||
|
||||
The second purpose of the descriptor cookie is to encrypt the list of
|
||||
introduction points, including optional authorization data. Hence, the
|
||||
hidden service directories won't learn any introduction information from
|
||||
storing a hidden service descriptor. This feature is implemented but
|
||||
unused at the moment. So this proposal will harness the advantages
|
||||
of proposal 114.
|
||||
|
||||
The descriptor cookie can be used for authorization by keeping it secret
|
||||
from everyone but authorized clients. A service could then decide whether
|
||||
to publish hidden service descriptors using that descriptor cookie later
|
||||
on. An authorized client being aware of the descriptor cookie would be
|
||||
able to download and decrypt the hidden service descriptor.
|
||||
|
||||
The number of concurrently used descriptor cookies for one hidden service
|
||||
is not restricted. A service could use a single descriptor cookie for all
|
||||
users, a distinct cookie per user, or something in between, like one
|
||||
cookie per group of users. The choice depends on the specific protocol and
|
||||
on how a service provider applies it.
|
||||
|
||||
Two or more hidden service descriptors for different groups or users
|
||||
should not be uploaded at the same time. A directory node could conclude
|
||||
easily that the descriptors were issued by the same hidden service, thus
|
||||
being able to link the two groups or users. Therefore, descriptors for
|
||||
different users or clients that ought to be stored on the same directory
|
||||
are delayed, so that only one descriptor is uploaded to a directory at a
|
||||
time. The remaining descriptors are uploaded with a delay of up to
|
||||
30 seconds.
|
||||
Further, descriptors for different groups or users that are to be stored
|
||||
on different directories are delayed for a random time of up to 30
|
||||
seconds to hide relations from colluding directories. Certainly, this
|
||||
does not prevent linking entirely, but it makes it somewhat harder.
|
||||
There is a conflict between hiding links between clients and making a
|
||||
service available in a timely manner.
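
One possible reading of this delay rule is sketched below. The function schedule_uploads, its (client, directory) input format, and the choice of an independent uniform delay per upload are assumptions of this example rather than the algorithm Tor implements.

```python
import random
from collections import defaultdict

MAX_DELAY = 30.0  # seconds, the upper bound named in the text


def schedule_uploads(uploads, rng=random):
    """Map each (client, directory) upload to a delay in seconds.

    Every upload gets its own random delay in [0, MAX_DELAY]. Delays for the
    same directory are drawn separately, so that directory receives the
    service's descriptors at different moments rather than all at once.
    """
    by_directory = defaultdict(list)
    for client, directory in uploads:
        by_directory[directory].append(client)

    delays = {}
    for directory, clients in by_directory.items():
        # Sorted random offsets, handed out to the clients in shuffled order.
        offsets = sorted(rng.uniform(0, MAX_DELAY) for _ in clients)
        order = list(clients)
        rng.shuffle(order)
        for client, offset in zip(order, offsets):
            delays[(client, directory)] = offset
    return delays
```

Passing a seeded random.Random instance as rng makes the schedule reproducible for testing. As the text notes, longer or more spread-out delays would hide links better but would also postpone availability of the service.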
|
||||
|
||||
Although this part of the proposal is meant to describe a general
|
||||
infrastructure for authorization, changing the way of using the
|
||||
descriptor cookie to look up hidden service descriptors, e.g. applying
|
||||
some sort of asymmetric crypto system, would require in-depth changes
|
||||
that would be incompatible with v2 hidden service descriptors. In
|
||||
contrast, using another key for en-/decrypting the introduction point
|
||||
part of a hidden service descriptor, e.g. a different symmetric key or
|
||||
asymmetric encryption, would be easy to implement and compatible with v2
|
||||
hidden service descriptors as understood by hidden service directories
|
||||
(clients and services would have to be upgraded anyway for using the new
|
||||
features).
|
||||
|
||||
An adversary could try to abuse the fact that introduction points can be
|
||||
encrypted by storing arbitrary, unrelated data in the hidden service
|
||||
directory. This abuse can be limited by setting a hard descriptor size
|
||||
limit, forcing the adversary to split data into multiple chunks. There
|
||||
are some limitations that make splitting data across multiple descriptors
|
||||
unattractive: 1) The adversary would not be able to choose descriptor IDs
|
||||
freely and would therefore have to implement his own indexing
|
||||
structure. 2) Validity of descriptors is limited to at most 24 hours
|
||||
after which descriptors need to be republished.
|
||||
|
||||
The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data.
|
||||
A large descriptor with 7 introduction points and 5 kilobytes of
|
||||
authorization data would be 11724 bytes in size. The upper size limit of
|
||||
descriptors should be set to 20 kilobytes, which limits the effect of
|
||||
abuse while retaining enough flexibility in designing authorization
|
||||
protocols.
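
The size estimate can be reproduced with a few lines of Python. The constant and function names are illustrative, and the sketch assumes that one kilobyte means 1024 bytes, which is the reading under which the figure of 11724 bytes comes out exactly.

```python
BASE_BYTES = 745                  # fixed part of a descriptor
PER_INTRO_POINT_BYTES = 837       # cost of each introduction point
MAX_DESCRIPTOR_BYTES = 20 * 1024  # proposed upper limit


def descriptor_size(num_ipos: int, auth_data_bytes: int) -> int:
    """Size in bytes: 745 + num_ipos * 837 + auth_data."""
    return BASE_BYTES + num_ipos * PER_INTRO_POINT_BYTES + auth_data_bytes


def fits_limit(num_ipos: int, auth_data_bytes: int) -> bool:
    """True if a descriptor of this shape stays within the proposed limit."""
    return descriptor_size(num_ipos, auth_data_bytes) <= MAX_DESCRIPTOR_BYTES
```

With 7 introduction points and 5 * 1024 bytes of authorization data this gives 745 + 5859 + 5120 = 11724 bytes, comfortably below the 20480-byte limit.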
|
||||
|
||||
1.2. Client authorization at introduction point
|
||||
|
||||
The next possible authorization point after downloading and decrypting
|
||||
a hidden service descriptor is the introduction point. It may be important
|
||||
for authorization, because it offers the last chance to hide the presence
|
||||
of a hidden service from unauthorized clients. Further, performing
|
||||
authorization at the introduction point might reduce traffic in the
|
||||
network, because unauthorized requests would not be passed to the
|
||||
hidden service. This applies to those clients who are aware of a
|
||||
descriptor cookie and thereby of the hidden service descriptor, but do
|
||||
not have authorization data to pass the introduction point or access the
|
||||
service (such a situation might occur when authorization data for
|
||||
authorization at the directory is not issued on a per-user basis, but
|
||||
authorization data for authorization at the introduction point is).
|
||||
|
||||
It is important to note that the introduction point must be considered
|
||||
untrustworthy, and therefore cannot replace authorization at the hidden
|
||||
service itself. Nor should the introduction point learn any sensitive
|
||||
identifiable information from either the service or the client.
|
||||
|
||||
In order to perform authorization at the introduction point, three
|
||||
message formats need to be modified: (1) v2 hidden service descriptors,
|
||||
(2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.
|
||||
|
||||
A v2 hidden service descriptor needs to contain authorization data that
|
||||
is introduction-point-specific and sometimes also authorization data
|
||||
that is introduction-point-independent. Therefore, v2 hidden service
|
||||
descriptors as specified in section 1.2 of rend-spec already contain two
|
||||
reserved fields "intro-authorization" and "service-authorization"
|
||||
(originally, the names of these fields were "...-authentication")
|
||||
containing an authorization type number and arbitrary authorization
|
||||
data. We propose that authorization data consists of base64 encoded
|
||||
objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and
|
||||
"-----END MESSAGE-----". This will increase the size of hidden service
|
||||
descriptors, but this is allowed since there is no strict upper limit.
|
||||
|
||||
The current ESTABLISH_INTRO cells as described in section 1.3 of
rend-spec do not contain either authorization data or version
information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO
cells adding these two items as follows:

  V      Format byte: set to 255               [1 octet]
  V      Version byte: set to 1                [1 octet]
  KL     Key length                            [2 octets]
  PK     Bob's public key                      [KL octets]
  HS     Hash of session info                  [20 octets]
  AUTHT  The auth type that is supported       [1 octet]
  AUTHL  Length of auth data                   [2 octets]
  AUTHD  Auth data                             [variable]
  SIG    Signature of above information        [variable]

From the format it is possible to determine the maximum allowed size for
authorization data: given the fact that cells are 512 octets long, of
which 498 octets are usable (see section 6.1 of tor-spec), and assuming
1024-bit (128-octet) keys, there are 215 octets left for
authorization data. Hence, authorization protocols are bound to use no
more than these 215 octets, regardless of the number of clients that
shall be authenticated at the introduction point. Otherwise, one would
need to send multiple ESTABLISH_INTRO cells or split them up, which we do
not specify here.

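The 215-octet figure can be reproduced from the field list. The sketch
below is a worked calculation, not Tor code: the function name is
hypothetical, and it assumes that the signature is as long as the
128-octet key, which is the assumption under which the numbers in the
text add up.

```python
# Worked calculation for the v1 ESTABLISH_INTRO cell. The signature
# length of 128 octets is an assumption (same length as a 1024-bit key).
USABLE_OCTETS = 498  # usable payload of a 512-octet cell, as stated above


def establish_intro_auth_budget(key_octets: int = 128,
                                sig_octets: int = 128) -> int:
    """Return the octets left for AUTHD after all fixed-size fields."""
    fixed = (
        1             # format byte
        + 1           # version byte
        + 2           # KL
        + key_octets  # PK
        + 20          # HS
        + 1           # AUTHT
        + 2           # AUTHL
        + sig_octets  # SIG
    )
    return USABLE_OCTETS - fixed


print(establish_intro_auth_budget())
```

Longer keys or signatures shrink the budget accordingly, which is why
the limit is tied to the 1024-bit assumption.
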
In order to understand a v1 ESTABLISH_INTRO cell, the implementation of
a relay must have a certain Tor version. Hidden services need to be able
to identify relays that are capable of understanding the new v1 cell
format and performing authorization. We propose to use the version number
that is contained in networkstatus documents to find capable
introduction points.

The current INTRODUCE1 cell as described in section 1.8 of rend-spec is
not designed to carry authorization data and has no version number either.
Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size,
seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This
makes it impossible to distinguish unversioned INTRODUCE1 cells from any
later format. In particular, it is not possible to introduce some kind of
format and version byte for newer versions of this cell. That's probably
where the comment "[XXX011 want to put intro-level auth info here, but no
version. crap. -RD]" that was part of rend-spec some time ago comes from.

We propose that new versioned INTRODUCE1 cells use the new cell type 41
RELAY_INTRODUCE1V (where V stands for versioned):

  Cleartext
    V      Version byte: set to 1              [1 octet]
    PK_ID  Identifier for Bob's PK             [20 octets]
    AUTHT  The auth type that is included      [1 octet]
    AUTHL  Length of auth data                 [2 octets]
    AUTHD  Auth data                           [variable]
  Encrypted to Bob's PK:
    (RELAY_INTRODUCE2 cell)

The maximum length of contained authorization data depends on the length
of the contained INTRODUCE2 cell. A calculation follows below when
describing the INTRODUCE2 cell format we propose to use.

1.3. Client authorization at hidden service

The time when a hidden service receives an INTRODUCE2 cell constitutes
the last possible authorization point during the hidden service
protocol. Performing authorization here is easier than at the other two
authorization points, because there are no possibly untrusted entities
involved.

In general, a client that is successfully authorized at the introduction
point should be granted access at the hidden service, too. Otherwise, the
client would receive a positive INTRODUCE_ACK cell from the introduction
point and conclude that it may connect to the service, but the request
will be dropped without notice. This would appear as a failure to
clients. Therefore, the number of cases in which a client successfully
passes the introduction point but fails at the hidden service should be
zero. However, this does not lead to the conclusion that the
authorization data used at the introduction point and the hidden service
must be the same, but only that both sets of authorization data should
lead to the same authorization result.

Authorization data is transmitted from client to server via an
INTRODUCE2 cell that is forwarded by the introduction point. There are
versions 0 to 2 specified in section 1.8 of rend-spec, but none of these
contains fields for carrying authorization data. We propose a slightly
modified version of v3 INTRODUCE2 cells that is specified in section
1.8.1 and which is not implemented as of December 2007. In contrast to
the specified v3 we avoid specifying (and implementing) IPv6 capabilities,
because Tor relays will be required to support IPv4 addresses for a long
time to come, so this seems unnecessary at the moment. The
proposed format of v3 INTRODUCE2 cells is as follows:

  VER    Version byte: set to 3.               [1 octet]
  AUTHT  The auth type that is used            [1 octet]
  AUTHL  Length of auth data                   [2 octets]
  AUTHD  Auth data                             [variable]
  TS     Timestamp (seconds since 1-1-1970)    [4 octets]
  IP     Rendezvous point's address            [4 octets]
  PORT   Rendezvous point's OR port            [2 octets]
  ID     Rendezvous point identity ID          [20 octets]
  KLEN   Length of onion key                   [2 octets]
  KEY    Rendezvous point onion key            [KLEN octets]
  RC     Rendezvous cookie                     [20 octets]
  g^x    Diffie-Hellman data, part 1           [128 octets]

The maximum possible length of authorization data is related to the
enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with a
1024-bit (128-octet) public key without any authorization data
occupies 306 octets (AUTHL is only used when AUTHT has a value != 0),
plus 58 octets for hybrid public key encryption (see
section 5.1 of tor-spec on hybrid encryption of CREATE cells). The
surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110
of the 498 available octets free, which must be shared between
authorization data to the introduction point _and_ to the hidden
service.

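The budget calculation can be written out as follows. This is a sketch
with hypothetical names that takes the 306-octet figure from the text as
given. For comparison it also sums the field widths exactly as listed in
the v3 table (without AUTHL and AUTHD), which comes to 310 octets
because of the 4-octet TS field; the sketch makes both numbers visible
without deciding which one is intended.

```python
# Worked calculation for the shared authorization budget. Names are
# hypothetical; the figures come from the surrounding text and table.
USABLE_OCTETS = 498
HYBRID_OVERHEAD = 58                   # hybrid public key encryption
INTRODUCE1V_OVERHEAD = 1 + 20 + 1 + 2  # V, PK_ID, AUTHT, AUTHL
STATED_INTRODUCE2_OCTETS = 306         # figure stated in the text

# Field widths as listed in the v3 INTRODUCE2 table, with AUTHT = 0 so
# that AUTHL and AUTHD are absent, and a 128-octet onion key.
LISTED_FIELDS = {
    "VER": 1, "AUTHT": 1, "TS": 4, "IP": 4, "PORT": 2,
    "ID": 20, "KLEN": 2, "KEY": 128, "RC": 20, "g^x": 128,
}


def shared_auth_budget(introduce2_octets: int) -> int:
    """Octets left for auth data to intro point and service together."""
    return (USABLE_OCTETS - introduce2_octets
            - HYBRID_OVERHEAD - INTRODUCE1V_OVERHEAD)


print(shared_auth_budget(STATED_INTRODUCE2_OCTETS))    # budget per the text
print(sum(LISTED_FIELDS.values()))                     # sum of listed fields
print(shared_auth_budget(sum(LISTED_FIELDS.values()))) # budget per the table
```

Either way, roughly a hundred octets remain, and they have to be split
between the introduction point and the hidden service.
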
When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has
provided valid authorization data to him. He also requires that the
timestamp is no more than 30 minutes in the past or future and that the
first part of the Diffie-Hellman handshake has not been used in the past
60 minutes to prevent replay attacks by rogue introduction points. (The
reason for not using the rendezvous cookie to detect replays---even
though it is only sent once in the current design---is that it might be
desirable to re-use rendezvous cookies for multiple introduction requests
in the future.) If all checks pass, Bob builds a circuit to the provided
rendezvous point. Otherwise he drops the cell.

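The timestamp and replay checks could be sketched as below. This is a
minimal illustration, not Tor's implementation: the class and method
names are hypothetical, the check of the authorization data itself is
left out, and times are plain integers in seconds.

```python
# Minimal sketch of the freshness and replay checks described above.
# ReplayCache and accept() are hypothetical names for illustration.
MAX_CLOCK_SKEW = 30 * 60   # timestamp may differ by at most 30 minutes
REPLAY_WINDOW = 60 * 60    # remember DH public values for 60 minutes


class ReplayCache:
    def __init__(self):
        self._seen = {}  # g^x bytes -> time at which it was accepted

    def accept(self, timestamp: int, dh_part1: bytes, now: int) -> bool:
        """Return True if the cell passes both checks, else False."""
        # Forget handshakes that are older than the replay window.
        cutoff = now - REPLAY_WINDOW
        self._seen = {k: t for k, t in self._seen.items() if t > cutoff}

        if abs(now - timestamp) > MAX_CLOCK_SKEW:
            return False  # too far in the past or future: drop
        if dh_part1 in self._seen:
            return False  # same g^x seen recently: treat as replay
        self._seen[dh_part1] = now
        return True


cache = ReplayCache()
print(cache.accept(timestamp=1000, dh_part1=b"gx-1", now=1000))  # fresh
print(cache.accept(timestamp=1000, dh_part1=b"gx-1", now=1010))  # replay
```

Keying the cache on g^x rather than on the rendezvous cookie follows
the reasoning given in the parenthetical remark above.
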
1.4. Summary of authorization data fields

In summary, the proposed descriptor format and cell formats provide the
following fields for carrying authorization data:

(1) The v2 hidden service descriptor contains:
    - a descriptor cookie that is used for the lookup process, and
    - an arbitrary encryption scheme to ensure authorization to access
      introduction information (currently symmetric encryption with the
      descriptor cookie).

(2) For performing authorization at the introduction point we can use:
    - the fields intro-authorization and service-authorization in
      hidden service descriptors,
    - a maximum of 215 octets in the ESTABLISH_INTRO cell, and
    - one part of 110 octets in the INTRODUCE1V cell.

(3) For performing authorization at the hidden service we can use:
    - the fields intro-authorization and service-authorization in
      hidden service descriptors,
    - the other part of 110 octets in the INTRODUCE2 cell.

It will also still be possible to access a hidden service without any
authorization or only use a part of the authorization infrastructure.
However, this requires considering all parts of the infrastructure. For
example, authorization at the introduction point relying on confidential
intro-authorization data transported in the hidden service descriptor
cannot be performed without using an encryption scheme for introduction
information.

1.5. Managing authorization data at servers and clients

In order to provide authorization data at the hidden service and the
authenticated clients, we propose to use files---either the Tor
configuration file or separate files. The exact format of these special
files depends on the authorization protocol used.

Currently, rend-spec contains the proposal to encode client-side
authorization data in the URL, like in x.y.z.onion. This was never used
and is also a bad idea, because in case of HTTP the requested URL may be
contained in the Host and Referer fields.

1.6. Limitations for authorization protocols

There are two limitations of the current hidden service protocol for
authorization protocols that shall be identified here.

1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2
   restrict the amount of data that can be used for authorization.
   This forces authorization protocols that require per-user
   authorization data at the introduction point to restrict the number
   of authorized clients artificially. A possible solution could be to
   split contents among multiple cells and reassemble them at the
   introduction points.

2. The current hidden service protocol does not specify cell types to
   perform interactive authorization between client and introduction
   point or hidden service. If there should be an authorization
   protocol that requires interaction, new cell types would have to be
   defined and integrated into the hidden service protocol.


2. Specific authorization protocol instances

In the following we present two specific authorization protocols that
make use of (parts of) the new authorization infrastructure:

1. The first protocol allows a service provider to restrict access
   to clients with a previously received secret key only, but does not
   attempt to hide service activity from others.

2. The second protocol, albeit feasible only for a limited set of about
   16 clients, performs client authorization and hides service activity
   from everyone but the authorized clients.

These two protocol instances extend the existing hidden service protocol
version 2. Hidden services that perform client authorization may run in
parallel to other services running versions 0, 2, or both.

2.1. Service with large-scale client authorization

The first client authorization protocol aims at performing access control
while consuming as few additional resources as possible. A service
provider should be able to permit access to a large number of clients
while denying access for everyone else. However, the price for
scalability is that the service won't be able to hide its activity from
unauthorized or formerly authorized clients.

The main idea of this protocol is to encrypt the introduction-point part
in hidden service descriptors to authorized clients using symmetric keys.
This ensures that nobody but authorized clients can learn which
introduction points a service currently uses, and that nobody can send a
valid INTRODUCE1 message without knowing the introduction key. Therefore,
a subsequent authorization at the introduction point is not required.

A service provider generates symmetric "descriptor cookies" for his
clients and distributes them outside of Tor. The suggested key size is
128 bits, so that descriptor cookies can be encoded in 22 base64 chars
(which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
authorization type (here: "0") and allow a client to distinguish this
authorization protocol from others like the one proposed below).
Typically, the contact information for a hidden service using this
authorization protocol looks like this:

  v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz

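One way to pack a 128-bit cookie and a 4-bit authorization type into 22
base64 characters is sketched below. The bit layout is an assumption
made for this example (the type sits in the high nibble of a 17th
octet); this section does not specify the exact layout, and the
function names are hypothetical.

```python
import base64


def encode_contact_cookie(cookie: bytes, auth_type: int) -> str:
    """Pack a 16-octet cookie plus a 4-bit type into 22 base64 chars."""
    if len(cookie) != 16:
        raise ValueError("descriptor cookie must be 16 octets (128 bits)")
    if not 0 <= auth_type < 16:
        raise ValueError("authorization type must fit into 4 bits")
    # Assumed layout: type in the high nibble of a 17th octet, so that
    # cookie and type occupy the first 132 bits of the encoding.
    raw = cookie + bytes([auth_type << 4])
    # 17 octets encode to 23 chars plus "="; the 23rd char carries only
    # zero bits, so the first 22 chars hold all 132 bits.
    return base64.b64encode(raw).decode("ascii")[:22]


def decode_contact_cookie(text: str):
    """Inverse of encode_contact_cookie(); returns (cookie, auth_type)."""
    if len(text) != 22:
        raise ValueError("expected exactly 22 base64 characters")
    raw = base64.b64decode(text + "A=", validate=True)
    return raw[:16], raw[16] >> 4


cookie = bytes(range(16))
encoded = encode_contact_cookie(cookie, auth_type=0)
print(len(encoded), decode_contact_cookie(encoded) == (cookie, 0))
```

The round trip shows that 22 characters are enough for both values,
which matches the 22 * 6 = 132 bit count.
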
When generating a hidden service descriptor, the service encrypts the
introduction-point part with a single randomly generated symmetric
128-bit session key using AES-CTR as described for v2 hidden service
descriptors in rend-spec. Afterwards, the service encrypts the session
key to all descriptor cookies using AES. Authorized clients should be
able to efficiently find the session key that is encrypted for them, so
4-octet client IDs are generated from the descriptor cookie and the
initialization vector. Fake entries are added so that descriptors always
contain a number of encrypted session keys that is a multiple of 16.
Encrypted session keys are ordered by client IDs in order to conceal
addition or removal of authorized clients by the service provider.

  ATYPE  Authorization type: set to 1.                        [1 octet]
  ALEN   Number of clients := 1 + ((clients - 1) div 16)      [1 octet]
    for each symmetric descriptor cookie:
      ID     Client ID: H(descriptor cookie | IV)[:4]         [4 octets]
      SKEY   Session key encrypted with descriptor cookie     [16 octets]
    (end of client-specific part)
  RND    Random data       [(15 - ((clients - 1) mod 16)) * 20 octets]
  IV     AES initialization vector                            [16 octets]
  IPOS   Intro points, encrypted with session key      [remaining octets]

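The padding rule and the client ID can be illustrated as follows. This
is a sketch under stated assumptions: H is taken to be SHA-1, the
function names are hypothetical, and the AES encryption of the session
key is left out entirely.

```python
import hashlib

ENTRY_OCTETS = 4 + 16  # ID plus SKEY per client entry


def padded_layout(clients: int):
    """Return (ALEN, number of fake entries, RND length in octets)."""
    if clients < 1:
        raise ValueError("at least one authorized client is required")
    alen = 1 + (clients - 1) // 16      # number of 16-entry blocks
    fake = 15 - ((clients - 1) % 16)    # entries needed to fill the block
    return alen, fake, fake * ENTRY_OCTETS


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    """First 4 octets of H(descriptor cookie | IV); H assumed SHA-1."""
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]


for n in (1, 16, 17):
    alen, fake, rnd = padded_layout(n)
    # Real plus fake entries always fill a whole number of blocks.
    print(n, alen, fake, rnd, (n + fake) == 16 * alen)
```

Because every fake entry has the same 20-octet size as a real one, an
observer only learns the number of clients rounded up to a multiple
of 16.
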
An authorized client needs to configure Tor to use the descriptor cookie
when accessing the hidden service. To do so, a user adds the contact
information that she received from the service provider to her torrc
file. Upon downloading a hidden service descriptor, Tor finds the
encrypted introduction-point part and attempts to decrypt it using the
configured descriptor cookie. (In the rare event of two or more client
IDs being equal, a client tries to decrypt all of them.)

Upon sending the introduction, the client includes her descriptor cookie
as auth type "1" in the INTRODUCE2 cell that she sends to the service.
The hidden service checks whether the included descriptor cookie is
authorized to access the service and either responds to the introduction
request, or not.

2.2. Authorization for limited number of clients

A second, more sophisticated client authorization protocol goes the extra
mile of hiding service activity from unauthorized clients. Otherwise
identical to the preceding authorization protocol, the second protocol
publishes a separate hidden service descriptor for each user and
encrypts the introduction-point part of each descriptor to a
single client. This allows the service to stop publishing descriptors for
removed clients. As long as a removed client cannot link descriptors
issued for other clients to the service, it cannot infer service
activity any more. The downside of this approach is limited scalability.
Even though the distributed storage of descriptors (cf. proposal 114)
tackles the problem of limited scalability to a certain extent, this
protocol should not be used for services with more than 16 clients. (In
fact, Tor should refuse to advertise services for more than this number
of clients.)

A hidden service generates an asymmetric "client key" and a symmetric
"descriptor cookie" for each client. The client key is used as
replacement for the service's permanent key, so that the service uses a
different identity for each of its clients. The descriptor cookie is used
to store descriptors at changing directory nodes that are unpredictable
for anyone but service and client, to encrypt the introduction-point
part, and to be included in INTRODUCE2 cells. Once the service has
created client key and descriptor cookie, it communicates them to the
client outside of Tor. The contact information string looks similar to
the one used by the preceding authorization protocol (with the only
difference that it has "1" encoded as auth-type in the remaining 4 of
132 bits instead of "0" as before).

When creating a hidden service descriptor for an authorized client, the
hidden service uses the client key and descriptor cookie to compute
secret ID part and descriptor ID:

  secret-id-part = H(time-period | descriptor-cookie | replica)

  descriptor-id = H(client-key[:10] | secret-id-part)

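The two formulas translate into code roughly as follows. Several
details are assumptions made for this sketch and are not stated in this
section: H is taken to be SHA-1, time-period is serialized as 4 octets
in network byte order, replica as a single octet, and client-key is
whatever octet string the notation client-key[:10] slices. The function
names are hypothetical.

```python
import hashlib
import struct


def secret_id_part(time_period: int, descriptor_cookie: bytes,
                   replica: int) -> bytes:
    """H(time-period | descriptor-cookie | replica), H assumed SHA-1."""
    data = (struct.pack(">I", time_period)  # assumed 4-octet big-endian
            + descriptor_cookie
            + bytes([replica]))             # assumed single octet
    return hashlib.sha1(data).digest()


def descriptor_id(client_key: bytes, secret: bytes) -> bytes:
    """H(client-key[:10] | secret-id-part), H assumed SHA-1."""
    return hashlib.sha1(client_key[:10] + secret).digest()


cookie = b"\x01" * 16
key = b"K" * 20  # placeholder octets standing in for the client key
ids = [descriptor_id(key, secret_id_part(1234, cookie, r)) for r in (0, 1)]
print(len(ids[0]), ids[0] != ids[1])
```

Since the descriptor cookie enters the hash, only parties that know it
can predict where a descriptor will be stored.
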
The hidden service also replaces permanent-key in the descriptor with
client-key and encrypts introduction-points with the descriptor cookie.

  ATYPE  Authorization type: set to 2.                   [1 octet]
  IV     AES initialization vector                       [16 octets]
  IPOS   Intro points, encr. with descriptor cookie      [remaining octets]

When uploading descriptors, the hidden service needs to make sure that
descriptors for different clients are not uploaded at the same time (cf.
Section 1.1), which is also a limiting factor for the number of clients.

When a client is requested to establish a connection to a hidden service,
it looks up whether it has any authorization data configured for that
service. If the user has configured authorization data for authorization
protocol "2", the descriptor ID is determined as described above.
Upon receiving a descriptor, the client decrypts the
introduction-point part using its descriptor cookie. Further, the client
includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
it sends to the service.

2.3. Hidden service configuration

A hidden service that is meant to perform client authorization adds a
new option HiddenServiceAuthorizeClient to its hidden service
configuration. This option contains the authorization type, which is
either "1" for the protocol described in 2.1 or "2" for the protocol in
2.2, and a comma-separated list of human-readable client names, so that
Tor can create authorization data for these clients:

  HiddenServiceAuthorizeClient auth-type client-name,client-name,...

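A parser for the option value might look like the sketch below. It is
an illustration only: the function name is hypothetical, this is not
how Tor parses its configuration, and the 16-client cap for
authorization type "2" is taken from the recommendation in Section 2.2.

```python
MAX_CLIENTS_TYPE_2 = 16  # limit suggested in Section 2.2


def parse_authorize_client(value: str):
    """Parse 'auth-type client-name,client-name,...' into (type, names)."""
    parts = value.split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected an auth type and a client list")
    auth_type_text, names_text = parts
    if auth_type_text not in ("1", "2"):
        raise ValueError("auth type must be 1 or 2")
    auth_type = int(auth_type_text)

    names = [name.strip() for name in names_text.split(",")]
    if any(not name for name in names):
        raise ValueError("empty client name in list")
    if auth_type == 2 and len(names) > MAX_CLIENTS_TYPE_2:
        raise ValueError("protocol 2 supports at most 16 clients")
    return auth_type, names


print(parse_authorize_client("1 alice,bob"))
```

Rejecting oversized client lists at configuration time keeps a
misconfigured service from publishing more descriptors than intended.
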
If this option is configured, HiddenServiceVersion is automatically
reconfigured to contain only version numbers of 2 or higher.

Tor stores all generated authorization data for the authorization
protocols described in Sections 2.1 and 2.2 in a new file using the
following file format:

  "client-name" human-readable client identifier NL
  "descriptor-cookie" 128-bit key ^= 22 base64 chars NL

If the authorization protocol of Section 2.2 is used, Tor also generates
and stores the following data:

  "client-key" NL a public key in PEM format

2.4. Client configuration

Clients need to make their authorization data known to Tor using another
configuration option that contains a service name (mainly for the sake of
convenience), the service address, and the descriptor cookie that is
required to access a hidden service (the authorization protocol number is
encoded in the descriptor cookie):

  HidServAuth service-name service-address descriptor-cookie

Security implications:

In the following we want to discuss possible attacks by dishonest
entities in the presented infrastructure and specific protocol. These
security implications would have to be verified once more when adding
another protocol. The dishonest entities (theoretically) include the
hidden service itself, the authenticated clients, hidden service directory
nodes, introduction points, and rendezvous points. The relays that are
part of circuits used during protocol execution, but never learn about
the exchanged descriptors or cells by design, are not considered.
Obviously, this list makes no claim to be complete. The discussed attacks
are sorted by the difficulty to perform them, in ascending order,
starting with roles that everyone could attempt to take and ending with
partially trusted entities abusing the trust put in them.

(1) A hidden service directory could attempt to conclude presence of a
service from the existence of a locally stored hidden service descriptor:
This passive attack is possible only for a single client-service
relation, because descriptors need to contain a publicly visible
signature of the service using the client key.
A possible protection would be to increase the number of hidden service
directories in the network.

(2) A hidden service directory could try to break the descriptor cookies
of locally stored descriptors: This attack can be performed offline. The
only useful countermeasure against it might be using safe passwords that
are generated by Tor.

[passwords? where did those come in? -RD]

(3) An introduction point could try to identify the pseudonym of the
hidden service on behalf of which it operates: This is impossible by
design, because the service uses a fresh public key for every
establishment of an introduction point (see proposal 114) and the
introduction point receives a fresh introduction cookie, so that there is
no identifiable information about the service that the introduction point
could learn. The introduction point cannot even tell if client accesses
belong to the same client or not, nor can it know the total number of
authorized clients. The only information might be the pattern of
anonymous client accesses, but that is hardly enough to reliably identify
a specific service.

(4) An introduction point could want to learn the identities of accessing
clients: This is also impossible by design, because all clients use the
same introduction cookie for authorization at the introduction point.

(5) An introduction point could try to replay a correct INTRODUCE1 cell
to other introduction points of the same service, e.g. in order to force
the service to create a huge number of useless circuits: This attack is
not possible by design, because INTRODUCE1 cells are encrypted using a
freshly created introduction key that is only known to authorized
clients.

(6) An introduction point could attempt to replay a correct INTRODUCE2
cell to the hidden service, e.g. for the same reason as in the last
attack: This attack is stopped by the fact that a service will drop
INTRODUCE2 cells containing a DH handshake it has seen recently.

(7) An introduction point could block client requests by sending either
positive or negative INTRODUCE_ACK cells back to the client, but without
forwarding INTRODUCE2 cells to the server: This attack is an annoyance
for clients, because they might wait for a timeout to elapse until trying
another introduction point. However, this attack is not introduced by
performing authorization and it cannot be targeted towards a specific
client. A countermeasure might be for the server to periodically perform
introduction requests to his own service to see if introduction points
are working correctly.

(8) The rendezvous point could attempt to identify either server or
client: This remains impossible as it was before, because the
rendezvous cookie does not contain any identifiable information.

(9) An authenticated client could swamp the server with valid INTRODUCE1
and INTRODUCE2 cells, e.g. in order to force the service to create
useless circuits to rendezvous points; as opposed to an introduction
point replaying the same INTRODUCE2 cell, a client could include a new
rendezvous cookie for every request: The countermeasure for this attack
is the restriction to 10 connection establishments per client per hour.

Compatibility:

An implementation of this proposal would require changes to hidden
services and clients to process authorization data and encode and
understand the new formats. However, both services and clients would
remain compatible with regular hidden services without authorization.

Implementation:

The implementation of this proposal can be divided into a number of
changes to hidden service and client side. There are no
changes necessary on directory, introduction, or rendezvous nodes. All
changes are marked with either [service] or [client] to denote on which
side they need to be made.

/1/ Configure client authorization [service]

- Parse configuration option HiddenServiceAuthorizeClient containing
  authorized client names.
- Load previously created client keys and descriptor cookies.
- Generate missing client keys and descriptor cookies, add them to
  client_keys file.
- Rewrite the hostname file.
- Keep client keys and descriptor cookies of authorized clients in
  memory.
[- In case of reconfiguration, mark which client authorizations were
  added and whether any were removed. This can be used later when
  deciding whether to rebuild introduction points and publish new
  hidden service descriptors. Not implemented yet.]

/2/ Publish hidden service descriptors [service]

- Create and upload hidden service descriptors for all authorized
  clients.
[- See /1/ for the case of reconfiguration.]

/3/ Configure permission for hidden services [client]

- Parse configuration option HidServAuth containing service
  authorization, store authorization data in memory.

/5/ Fetch hidden service descriptors [client]

- Look up client authorization upon receiving a hidden service request.
- Request hidden service descriptor ID including client key and
  descriptor cookie. Only request v2 descriptors, no v0.

/6/ Process hidden service descriptor [client]

- Decrypt introduction points with descriptor cookie.

/7/ Create introduction request [client]

- Include descriptor cookie in INTRODUCE2 cell to introduction point.
- Pass descriptor cookie around between involved connections and
  circuits.

/8/ Process introduction request [service]

- Read descriptor cookie from INTRODUCE2 cell.
- Check whether descriptor cookie is authorized for access, including
  checking access counters.
- Log access for accountability.

@ -1,138 +0,0 @@
Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
Version: $Revision$
Last-Modified: $Date$
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
Implemented-In: 0.2.0.x

1. Overview:

Tor's directory authorities can give certain servers a "Named" flag
in the network-status entry, when they want to bind that nickname to
that identity key. This allows clients to specify a nickname rather
than an identity fingerprint and still be certain they're getting the
"right" server. As dir-spec.txt describes it,

  Name X is bound to identity Y if at least one binding directory lists
  it, and no directory binds X to some other Y'.

In practice, clients can refer to servers by nickname whether they are
Named or not; if they refer to nicknames that aren't Named, a complaint
shows up in the log asking them to use the identity key in the future
--- but it still works.

The problem? Imagine a Tor server with nickname Bob. Bob and his
identity fingerprint are registered in tor26's approved-routers
file, but none of the other authorities registered him. Imagine
there are several other unregistered servers also with nickname Bob
("the imposters").

While Bob is online, all is well: a) tor26 gives a Named flag to
the real one, and refuses to list the other ones; and b) the other
authorities list the imposters but don't give them a Named flag. Clients
who have all the network-statuses can compute which one is the real Bob.

But when the real Bob disappears and his descriptor expires? tor26
continues to refuse to list any of the imposters, and the other
authorities continue to list the imposters. Clients don't have any
idea that there exists a Named Bob, so they can ask for server Bob and
get one of the imposters. (A warning will also appear in their log,
but so what.)

2. The stopgap solution:
|
||||
|
||||
tor26 should start accepting and listing the imposters, but it should
|
||||
assign them a new flag: "Unnamed".
|
||||
|
||||
This would produce three cases in terms of assigning flags in the consensus
|
||||
networkstatus:
|
||||
|
||||
i) a router gets the Named flag in the v3 networkstatus if
|
||||
a) it's the only router with that nickname that has the Named flag
|
||||
out of all the votes, and
|
||||
b) no vote lists it as Unnamed
|
||||
else,
|
||||
ii) a router gets the Unnamed flag if
|
||||
a) some vote lists a different router with that nickname as Named, or
|
||||
b) at least one vote lists it as Unnamed, or
|
||||
c) there are other routers with the same nickname that are Unnamed
|
||||
else,
|
||||
iii) the router neither gets a Named nor an Unnamed flag.
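The three cases above can be sketched as follows. This is only an illustration, not the authorities' voting code: the vote data model (a dict per authority mapping identity to nickname and flags) and the function name are hypothetical, and rule ii.c is read here as "some vote lists another router with this nickname as Unnamed".

```python
def consensus_name_flags(votes):
    """Return {identity: "Named" | "Unnamed" | None} for every voted router.

    votes: list of dicts, one per authority, mapping an identity
    fingerprint to (nickname, set_of_flags). Hypothetical data model.
    """
    nickname_of = {}
    named = {}     # nickname -> identities that some vote marks Named
    unnamed = {}   # nickname -> identities that some vote marks Unnamed
    for vote in votes:
        for identity, (nickname, flags) in vote.items():
            nickname_of[identity] = nickname
            if "Named" in flags:
                named.setdefault(nickname, set()).add(identity)
            if "Unnamed" in flags:
                unnamed.setdefault(nickname, set()).add(identity)

    result = {}
    for identity, nickname in nickname_of.items():
        n = named.get(nickname, set())
        u = unnamed.get(nickname, set())
        if n == {identity} and identity not in u:
            # Case i: the only Named router with this nickname, never Unnamed.
            result[identity] = "Named"
        elif (n - {identity}) or identity in u or (u - {identity}):
            # Case ii: a rival is Named, or this router or a rival is Unnamed.
            result[identity] = "Unnamed"
        else:
            # Case iii: neither flag.
            result[identity] = None
    return result
```

With the real Bob offline, an authority that still votes one imposter as Unnamed is enough for every other Bob to end up Unnamed under this reading.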
|
||||
|
||||
(This whole proposal is meant only for v3 dir flags; we shouldn't try
|
||||
to backport it to the v2 dir world.)
|
||||
|
||||
Then client behavior is:
|
||||
|
||||
a) If there's a Bob with a Named flag, pick that one.
|
||||
else b) If the Bobs don't have the Unnamed flag (notice that they should
|
||||
either all have it, or none), pick one of them and warn.
|
||||
else c) They all have the Unnamed flag -- no router found.
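A minimal sketch of that client-side lookup, assuming the client already has the consensus entries for the requested nickname as (identity, flags) pairs. The function name and the "take the first candidate" choice are hypothetical; the proposal only says to pick one and warn.

```python
def choose_by_nickname(candidates):
    """Pick a router for a nickname lookup.

    candidates: list of (identity, set_of_flags) sharing the nickname.
    Returns (identity_or_None, warn), where warn means a log warning
    should be emitted because the choice is not backed by a binding.
    """
    # a) A Named router wins outright.
    for identity, flags in candidates:
        if "Named" in flags:
            return identity, False
    # b) Otherwise use a router that is not Unnamed, and warn.
    usable = [identity for identity, flags in candidates
              if "Unnamed" not in flags]
    if usable:
        return usable[0], True
    # c) Every candidate is Unnamed (or there are none): no router found.
    return None, False
```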
|
||||
|
||||
3. Problems not solved by this stopgap:
|
||||
|
||||
3.1. Naming authorities can go offline.
|
||||
|
||||
If tor26 is the only authority that provides a binding for Bob, when
|
||||
tor26 goes offline we're back in our previous situation -- the imposters
|
||||
can be referenced with a mere ignorable warning in the client's log.
|
||||
|
||||
If some other authority Names a different Bob, and tor26 goes offline,
|
||||
then that other Bob becomes the unique Named Bob.
|
||||
|
||||
So be it. We should try to solve these one day, but there's no clear way
|
||||
to do it that doesn't destroy usability in other ways, and if we want
|
||||
to get the Unnamed flag into v3 network statuses we should add it soon.
|
||||
|
||||
3.2. V3 dir spec magnifies brief discrepancies.
|
||||
|
||||
Another point to notice is if tor26 names Bob(1), doesn't know about
|
||||
Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag
|
||||
even if it should (and Bob(1) is not around).
|
||||
|
||||
Right now, in v2 dirs, the case where an authority doesn't know about
|
||||
a server but the other authorities do know is rare. That's because
|
||||
authorities periodically ask for other networkstatuses and then fetch
|
||||
descriptors that are missing.
|
||||
|
||||
With v3, if that window occurs at the wrong time, it is extended for the
|
||||
entire period. We could solve this by making the voting more complex,
|
||||
but that doesn't seem worth it.
|
||||
|
||||
[3.3. Tor26 is only one tor26.
|
||||
|
||||
We need more naming authorities, possibly with some kind of auto-naming
|
||||
feature. This is out-of-scope for this proposal -NM]
|
||||
|
||||
4. Changes to the v2 directory
|
||||
|
||||
Previously, v2 authorities that had a binding for a server named Bob did
|
||||
not list any other server named Bob. This will change too:
|
||||
|
||||
Version 2 authorities will start listing all routers they know about,
|
||||
whether they conflict with a name-binding or not: Servers for which
|
||||
this authority has a binding will continue to be marked Named,
|
||||
additionally all other servers of that nickname will be listed without the
|
||||
Named flag (i.e. there will be no Unnamed flag in v2 status documents).
|
||||
|
||||
Clients already should handle having a named Bob alongside unnamed
|
||||
Bobs correctly, and having the unnamed Bobs in the status file even
|
||||
without the named server is no worse than the current status quo where
|
||||
clients learn about those servers from other authorities.
|
||||
|
||||
The benefit of this is that an authority's opinion on a server like
|
||||
Guard, Stable, Fast etc. can now be learned by clients even if that
|
||||
specific authority has reserved that server's name for somebody else.
|
||||
|
||||
5. Other benefits:
|
||||
|
||||
This new flag will allow people to operate servers that happen to have
|
||||
the same nickname as somebody who registered their server two years ago
|
||||
and left soon after. Right now there are dozens of nicknames that are
|
||||
registered on all three binding directory authorities, yet haven't been
|
||||
running for years. While it's bad that these nicknames are effectively
|
||||
blacklisted from the network, the really bad part is that this logic
|
||||
is really unintuitive to prospective new server operators.
|
||||
|
@ -1,56 +0,0 @@
|
||||
Filename: 123-autonaming.txt
|
||||
Title: Naming authorities automatically create bindings
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Peter Palfrader
|
||||
Created: 2007-10-11
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
Tor's directory authorities can give certain servers a "Named" flag
|
||||
in the network-status entry, when they want to bind that nickname to
|
||||
that identity key. This allows clients to specify a nickname rather
|
||||
than an identity fingerprint and still be certain they're getting the
|
||||
"right" server.
|
||||
|
||||
Authority operators name a server by adding their nickname and
|
||||
identity fingerprint to the 'approved-routers' file. Historically
|
||||
being listed in the file was required for a router, at first for being
|
||||
listed in the directory at all, and later in order to be used by
|
||||
clients as a first or last hop of a circuit.
|
||||
|
||||
Adding identities to the list of named routers so far has been a
|
||||
manual, time consuming, and boring job. Given that and the fact that
|
||||
the Tor network works just fine without named routers, the last
|
||||
authority to keep a current binding list stopped updating it well over
|
||||
half a year ago.
|
||||
|
||||
Naming, if it were done, would serve a useful purpose however in that
|
||||
users can have a reasonable expectation that the exit server Bob they
|
||||
are using in their http://www.google.com.bob.exit/ URL is the same
|
||||
Bob every time.
|
||||
|
||||
Proposal:
|
||||
I propose that identity<->name binding be completely automated:
|
||||
|
||||
New bindings should be added after the router has been around for a
|
||||
bit and their name has not been used by other routers; similarly, names
|
||||
that have not appeared on the network for a long time should be freed
|
||||
in case a new router wants to use it.
|
||||
|
||||
The following rules are suggested:
|
||||
i) If a named router has not been online for half a year, the
|
||||
identity<->name binding for that name is removed. The nickname
|
||||
is free to be taken by other routers now.
|
||||
ii) If a router claims a certain nickname and
|
||||
a) has been on the network for at least two weeks, and
|
||||
b) that nickname is not yet linked to a different router, and
|
||||
c) no other router has wanted that nickname in the last month,
|
||||
a new binding should be created for this router and its desired
|
||||
nickname.
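The two rules could be run periodically by an external tool along these lines. This is a sketch under stated assumptions: the history record layout, the function name, and the exact day counts chosen for "half a year" (182 days) and "a month" (30 days) are hypothetical, and a bound router with no history record is left alone.

```python
from datetime import datetime, timedelta

HALF_YEAR = timedelta(days=182)   # assumed length of "half a year"
TWO_WEEKS = timedelta(days=14)
ONE_MONTH = timedelta(days=30)    # assumed length of "a month"


def update_bindings(bindings, history, now):
    """Apply rules i and ii once and return the new binding table.

    bindings: {nickname: identity}
    history:  {identity: (nickname, first_seen, last_seen)}
    """
    updated = dict(bindings)

    # Rule i: drop bindings whose router has been gone for half a year.
    for nickname, identity in bindings.items():
        record = history.get(identity)
        if record is not None and now - record[2] >= HALF_YEAR:
            del updated[nickname]

    # Rule ii: bind unclaimed nicknames to sufficiently old, unrivalled routers.
    for identity, (nickname, first_seen, _last_seen) in history.items():
        if nickname in updated:
            continue                      # b) name already linked
        if now - first_seen < TWO_WEEKS:
            continue                      # a) not around long enough
        rivals = [other for other, (name, _first, seen) in history.items()
                  if other != identity and name == nickname
                  and now - seen <= ONE_MONTH]
        if rivals:
            continue                      # c) someone else wanted it recently
        updated[nickname] = identity
    return updated
```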
|
||||
|
||||
This automaton does not necessarily need to live in the Tor code; it
|
||||
can do its job just as well when it's an external tool.
|
||||
|
@ -1,315 +0,0 @@
|
||||
Filename: 124-tls-certificates.txt
|
||||
Title: Blocking resistant TLS certificate usage
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Steven J. Murdoch
|
||||
Created: 2007-10-25
|
||||
Status: Superseded
|
||||
|
||||
Overview:
|
||||
|
||||
To be less distinguishable from HTTPS web browsing, only Tor servers should
|
||||
present TLS certificates. This should be done whilst maintaining backwards
|
||||
compatibility with Tor nodes which present and expect client certificates, and
|
||||
while preserving existing security properties. This specification describes
|
||||
the negotiation protocol, what certificates should be presented during the TLS
|
||||
negotiation, and how to move the client authentication within the encrypted
|
||||
tunnel.
|
||||
|
||||
Motivation:
|
||||
|
||||
In Tor's current TLS [1] handshake, both client and server present a
|
||||
two-certificate chain. Since TLS performs authentication prior to establishing
|
||||
the encrypted tunnel, the contents of these certificates are visible to an
|
||||
eavesdropper. In contrast, during normal HTTPS web browsing, the server
|
||||
presents a single certificate, signed by a root CA and the client presents no
|
||||
certificate. Hence it is possible to distinguish Tor from HTTPS by identifying
|
||||
this pattern.
|
||||
|
||||
To resist blocking based on traffic identification, Tor should behave as close
|
||||
to HTTPS as possible, i.e. servers should offer a single certificate and not
|
||||
request a client certificate; clients should present no certificate. This
|
||||
presents two difficulties: clients are no longer authenticated and servers are
|
||||
authenticated by the connection key, rather than identity key. The link
|
||||
protocol must thus be modified to preserve the old security semantics.
|
||||
|
||||
Finally, in order to maintain backwards compatibility, servers must correctly
|
||||
identify whether the client supports the modified certificate handling. This
|
||||
is achieved by modifying the cipher suites that clients advertise support
|
||||
for. These cipher suites are selected to be similar to those chosen by web
|
||||
browsers, in order to resist blocking based on client hello.
|
||||
|
||||
Terminology:
|
||||
|
||||
Initiator: OP or OR which initiates a TLS connection ("client" in TLS
|
||||
terminology)
|
||||
|
||||
Responder: OR which receives an incoming TLS connection ("server" in TLS
|
||||
terminology)
|
||||
|
||||
Version negotiation and cipher suite selection:
|
||||
|
||||
In the modified TLS handshake, the responder does not request a certificate
|
||||
from the initiator. This request would normally occur immediately after the
|
||||
responder receives the client hello (the first message in a TLS handshake) and
|
||||
so the responder must decide whether to request a certificate based only on
|
||||
the information in the client hello. This is achieved by examining the cipher
|
||||
suites in the client hello.
|
||||
|
||||
List 1: cipher suite lists offered by version 0/1 Tor
|
||||
|
||||
From src/common/tortls.c, revision 12086:
|
||||
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
|
||||
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
|
||||
SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
|
||||
|
||||
Client hello sent by initiator:
|
||||
|
||||
Initiators supporting version 2 of the Tor connection protocol MUST
|
||||
offer a different cipher suite list from those sent by pre-version 2
|
||||
Tors, contained in List 1. To maintain compatibility with older Tor
|
||||
versions and common browsers, the cipher suite list MUST include
|
||||
support for:
|
||||
|
||||
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
|
||||
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
|
||||
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
|
||||
|
||||
Client hello received by responder/server hello sent by responder:
|
||||
|
||||
Responders supporting version 2 of the Tor connection protocol should compare
|
||||
the cipher suite list in the client hello with those in List 1. If it matches
|
||||
any in the list then the responder should assume that the initiator supports
|
||||
version 1, and thus should maintain the version 1 behavior, i.e. send a
|
||||
two-certificate chain, request a client certificate and do not send or expect
|
||||
a VERSIONS cell [2].
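The responder's decision can be sketched as a lookup against List 1. The mapping of the OpenSSL names in List 1 to the code points 0x0033 and 0x0016 is an assumption taken from the "Tor v1 default" and "Tor v1 fallback" labels in Appendix A; the function name is hypothetical.

```python
# List 1 expressed as TLS cipher suite code points (assumed mapping):
#   0x0033 = DHE_RSA_WITH_AES_128_CBC_SHA, 0x0016 = DHE_RSA_WITH_3DES_EDE_CBC_SHA
V1_CIPHER_LISTS = {
    (0x0033,),
    (0x0033, 0x0016),
    (0x0016,),
}


def responder_protocol_version(client_suites):
    """Return 1 if the client hello matches a pre-version-2 Tor, else 2.

    client_suites: cipher suite code points in the order the client sent them.
    """
    return 1 if tuple(client_suites) in V1_CIPHER_LISTS else 2
```

Matching on the exact ordered list, rather than on membership, is deliberate in this sketch: a version 2 initiator must also offer these suites, so only the whole list can tell the two apart.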
|
||||
|
||||
Otherwise, the responder should assume version 2 behavior and select a cipher
|
||||
suite following TLS [1] behavior, i.e. select the first entry from the client
|
||||
hello cipher list which is acceptable. Responders MUST NOT select any suite
|
||||
that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits,
|
||||
or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT
|
||||
allow other SSLv3 ciphersuites.
|
||||
|
||||
Should no mutually acceptable cipher suite be found, the connection MUST be
|
||||
closed.
|
||||
|
||||
If the responder is implementing version 2 of the connection protocol it
|
||||
SHOULD send a server certificate with random contents. The organizationName
|
||||
field MUST NOT be "Tor", "TOR" or "t o r".
|
||||
|
||||
Server certificate received by initiator:
|
||||
|
||||
If the server certificate has an organizationName of "Tor", "TOR" or "t o r",
|
||||
the initiator should assume that the responder does not support version 2 of
|
||||
the connection protocol. In that case the initiator should respond following
|
||||
version 1, i.e. send a two-certificate client chain and do not send or expect
|
||||
a VERSIONS cell.
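On the initiator side the check is a plain string comparison. A sketch, assuming the organizationName has already been extracted from the server certificate as a string (the function name is hypothetical):

```python
V1_ORGANIZATION_NAMES = {"Tor", "TOR", "t o r"}


def initiator_protocol_version(organization_name):
    """Return 1 if the responder's certificate marks it as a v1 Tor, else 2."""
    return 1 if organization_name in V1_ORGANIZATION_NAMES else 2
```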
|
||||
|
||||
[SJM: We could also use the fact that a client certificate request was sent]
|
||||
|
||||
If the server hello contains a ciphersuite which does not comply with the key
|
||||
length requirements above, even if it was one offered in the client hello, the
|
||||
connection MUST be closed. This will only occur if the responder is not a Tor
|
||||
server.
|
||||
|
||||
Backward compatibility:
|
||||
|
||||
v1 Initiator, v1 Responder: No change
|
||||
v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
|
||||
v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator
|
||||
detects v1 server certificate and continues with v1 protocol
|
||||
v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator
|
||||
detects v2 server certificate and continues with v2 protocol.
|
||||
|
||||
Additional link authentication process:
|
||||
|
||||
Following VERSION and NETINFO negotiation, both responder and
|
||||
initiator MUST send a certification chain in a CERT cell. If one
|
||||
party does not have a certificate, the CERT cell MUST still be sent,
|
||||
but with a length of zero.
|
||||
|
||||
A CERT cell is a variable length cell, of the format
|
||||
CircID [2 bytes]
|
||||
Command [1 byte]
|
||||
Length [2 bytes]
|
||||
Payload [<length> bytes]
|
||||
|
||||
CircID MUST be set to 0x0000
|
||||
Command is [SJM: TODO]
|
||||
Length is the length of the payload
|
||||
Payload contains 0 or more certificates, each is of the format:
|
||||
Cert_Length [2 bytes]
|
||||
Certificate [<cert_length> bytes]
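The layout above could be encoded and decoded as in the sketch below. The command value is still a TODO in this proposal, so it is passed in as a parameter; network (big-endian) byte order is assumed, as elsewhere in the Tor cell format, and the function names are hypothetical.

```python
import struct


def pack_cert_cell(command, certificates):
    """Build a CERT cell from a list of encoded certificates (bytes)."""
    payload = b"".join(struct.pack("!H", len(cert)) + cert
                       for cert in certificates)
    # CircID is always 0x0000; Length covers the whole payload.
    return struct.pack("!HBH", 0x0000, command, len(payload)) + payload


def unpack_cert_cell(cell):
    """Parse a CERT cell and return (command, [certificate bytes, ...])."""
    if len(cell) < 5:
        raise ValueError("cell shorter than its header")
    circ_id, command, length = struct.unpack("!HBH", cell[:5])
    if circ_id != 0x0000:
        raise ValueError("CircID must be 0x0000")
    payload = cell[5:5 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    certificates = []
    offset = 0
    while offset < length:
        if offset + 2 > length:
            raise ValueError("truncated Cert_Length")
        (cert_length,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if offset + cert_length > length:
            raise ValueError("truncated certificate")
        certificates.append(payload[offset:offset + cert_length])
        offset += cert_length
    return command, certificates
```

A party without a certificate sends the same cell with an empty list, which yields a Length of zero.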
|
||||
|
||||
Each certificate MUST sign the one preceding it. The initiator MUST
|
||||
place its connection certificate first; the responder, having
|
||||
already sent its connection certificate as part of the TLS handshake
|
||||
MUST place its identity certificate first.
|
||||
|
||||
Initiators who send a CERT cell MUST follow that with a LINK_AUTH
|
||||
cell to prove that they possess the corresponding private key.
|
||||
|
||||
A LINK_AUTH cell is fixed-length, of the format:
|
||||
CircID [2 bytes]
|
||||
Command [1 byte]
|
||||
Length [2 bytes]
|
||||
Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes]
|
||||
|
||||
CircID MUST be set to 0x0000
|
||||
Command is [SJM: TODO]
|
||||
Length is the valid portion of the payload
|
||||
Payload is of the format:
|
||||
Signature version [1 byte]
|
||||
Signature [<length> - 1 bytes]
|
||||
Padding [PAYLOAD_LEN - <length> - 2 bytes]
|
||||
|
||||
Signature version: Identifies the type of signature, currently 0x00
|
||||
Signature: Digital signature under the initiator's connection key of the
|
||||
following item, in PKCS #1 block type 1 [3] format:
|
||||
|
||||
HMAC-SHA1, using the TLS master secret as key, of the
|
||||
following elements concatenated:
|
||||
- The signature version (0x00)
|
||||
- The NUL terminated ASCII string: "Tor initiator certificate verification"
|
||||
- client_random, as sent in the Client Hello
|
||||
- server_random, as sent in the Server Hello
|
||||
- SHA-1 hash of the initiator connection certificate
|
||||
- SHA-1 hash of the responder connection certificate
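The value that gets signed can be computed as in this sketch. It covers only the HMAC step; the RSA signature under the initiator's connection key is left out. The two certificate hashes are taken as precomputed inputs, since how they are derived is described separately in the notes. The function name is hypothetical.

```python
import hashlib
import hmac

SIGNATURE_VERSION = b"\x00"
LABEL = b"Tor initiator certificate verification\x00"  # NUL terminated


def link_auth_mac(master_secret, client_random, server_random,
                  initiator_cert_hash, responder_cert_hash):
    """Return the HMAC-SHA1 value that the initiator signs in LINK_AUTH."""
    data = (SIGNATURE_VERSION + LABEL + client_random + server_random
            + initiator_cert_hash + responder_cert_hash)
    return hmac.new(master_secret, data, hashlib.sha1).digest()
```

Binding both randoms and both certificate hashes under the master secret ties the signature to this one TLS session, so it cannot be replayed on another connection.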
|
||||
|
||||
Security checks:
|
||||
|
||||
- Before sending a LINK_AUTH cell, a node MUST ensure that the TLS
|
||||
connection is authenticated by the responder key.
|
||||
- For the handshake to have succeeded, the initiator MUST confirm:
|
||||
- That the TLS handshake was authenticated by the
|
||||
responder connection key
|
||||
- That the responder connection key was signed by the first
|
||||
certificate in the CERT cell
|
||||
- That each certificate in the CERT cell was signed by the
|
||||
following certificate, with the exception of the last
|
||||
- That the last certificate in the CERT cell is the expected
|
||||
identity certificate for the node being connected to
|
||||
- For the handshake to have succeeded, the responder MUST confirm
|
||||
either:
|
||||
A) - A zero length CERT cell was sent and no LINK_AUTH cell was
|
||||
sent
|
||||
In which case the responder shall treat the identity of the
|
||||
initiator as unknown
|
||||
or
|
||||
B) - That the LINK_AUTH MAC contains a signature by the first
|
||||
certificate in the CERT cell
|
||||
- That the MAC signed matches the expected value
|
||||
- That each certificate in the CERT cell was signed by the
|
||||
following certificate, with the exception of the last
|
||||
In which case the responder shall treat the identity of the
|
||||
initiator as that of the last certificate in the CERT cell
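The responder's two cases can be sketched as below. The cryptographic checks are passed in as callbacks because they depend on the certificate library in use; those callback names, the function name, and the use of None for "identity unknown" are all hypothetical.

```python
def responder_check_initiator(certs, link_auth_sent, link_auth_ok, signed_by):
    """Return the initiator's identity certificate, or None if unknown.

    certs:          certificates from the initiator's CERT cell, in order
    link_auth_sent: whether a LINK_AUTH cell was received
    link_auth_ok:   callback(first_cert) -> True if LINK_AUTH carries a valid
                    signature by first_cert over the expected MAC
    signed_by:      callback(cert, signer) -> True if signer signed cert
    Raises ValueError if the handshake must be rejected.
    """
    # Case A: zero-length CERT cell and no LINK_AUTH cell.
    if not certs and not link_auth_sent:
        return None
    # Any other mix of missing pieces is a failed handshake.
    if not certs or not link_auth_sent:
        raise ValueError("CERT and LINK_AUTH cells do not match")
    # Case B: signature by the first certificate over the expected MAC.
    if not link_auth_ok(certs[0]):
        raise ValueError("bad LINK_AUTH signature or MAC")
    # Each certificate must be signed by the following one, except the last.
    for cert, signer in zip(certs, certs[1:]):
        if not signed_by(cert, signer):
            raise ValueError("broken certificate chain")
    return certs[-1]
```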
|
||||
|
||||
Protocol summary:
|
||||
|
||||
1. I(nitiator) <-> R(esponder): TLS handshake, including responder
|
||||
authentication under connection certificate R_c
|
||||
2. I <-> R: VERSION and NETINFO negotiation
|
||||
3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
|
||||
4. I -> R: CERT (Initiator connection certificate I_c,
|
||||
Initiator identity certificate I_i (which signs I_c))
|
||||
5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret,
|
||||
"Tor initiator certificate verification" ||
|
||||
client_random || server_random ||
|
||||
I_c hash || R_c hash))
|
||||
|
||||
Notes: I -> R doesn't need to wait for R_i before sending its own
|
||||
messages (reduces round-trips).
|
||||
Certificate hash is calculated like identity hash in CREATE cells.
|
||||
Initiator signature is calculated in a similar way to Certificate
|
||||
Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
|
||||
If I is an OP, a zero length certificate chain may be sent in step 4;
|
||||
in which case step 5 is not performed.
|
||||
|
||||
Rationale:
|
||||
|
||||
- Version and netinfo negotiation before authentication: The version cell needs
|
||||
to come before the rest of the protocol, since we may choose to alter
|
||||
the rest at some later point, e.g. switch to a different MAC/signature scheme.
|
||||
It is useful to keep the NETINFO and VERSION cells close to each other, since
|
||||
the time between them is used to check if there is a delay-attack. Still, a
|
||||
server might want to not act on NETINFO data from an initiator until the
|
||||
authentication is complete.
|
||||
|
||||
Appendix A: Cipher suite choices
|
||||
|
||||
This specification intentionally does not put any constraints on the
|
||||
TLS ciphersuite lists presented by clients, other than a minimum
|
||||
required for compatibility. However, to maximize blocking
|
||||
resistance, ciphersuite lists should be carefully selected.
|
||||
|
||||
Recommended client ciphersuite list
|
||||
|
||||
Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h
|
||||
|
||||
0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
|
||||
0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
|
||||
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA
|
||||
0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
|
||||
0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA
|
||||
0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
|
||||
0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
|
||||
0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
|
||||
0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
|
||||
0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
|
||||
0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
|
||||
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA
|
||||
0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA
|
||||
0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
|
||||
0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
|
||||
0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA
|
||||
0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
|
||||
0x0004: SSL_RSA_WITH_RC4_128_MD5
|
||||
0x0005: SSL_RSA_WITH_RC4_128_SHA
|
||||
0x002f: TLS_RSA_WITH_AES_128_CBC_SHA
|
||||
0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
|
||||
0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
|
||||
0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
|
||||
0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
|
||||
0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
|
||||
Order specified in:
|
||||
http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47
|
||||
|
||||
Recommended options:
|
||||
0x0000: Server Name Indication [4]
|
||||
0x000a: Supported Elliptic Curves [5]
|
||||
0x000b: Supported Point Formats [5]
|
||||
|
||||
Recommended compression:
|
||||
0x00
|
||||
|
||||
Recommended server ciphersuite selection:
|
||||
|
||||
The responder should select the first entry in this list which is
|
||||
listed in the client hello:
|
||||
|
||||
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ]
|
||||
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ]
|
||||
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ]
|
||||
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ]
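A sketch of that selection, using the four code points listed above in the responder's order of preference. The function name is hypothetical; returning None stands for "no mutually acceptable suite", in which case the connection must be closed.

```python
# Responder preference order from the list above.
SERVER_PREFERENCE = [0x0039, 0x0033, 0x0016, 0x0013]


def select_cipher_suite(client_suites):
    """Return the first preferred suite the client offered, or None."""
    offered = set(client_suites)
    for suite in SERVER_PREFERENCE:
        if suite in offered:
            return suite
    return None  # caller closes the connection
```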
|
||||
|
||||
References:
|
||||
|
||||
[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF
|
||||
|
||||
[2] Version negotiation for the Tor protocol, Tor proposal 105
|
||||
|
||||
[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1:
|
||||
RSA Cryptography Specifications Version 1.5", RFC 2313,
|
||||
March 1998.
|
||||
|
||||
[4] TLS Extensions, RFC 3546
|
||||
|
||||
[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS)
|
||||
|
||||
% <!-- Local IspellDict: american -->
|
@ -1,293 +0,0 @@
|
||||
Filename: 125-bridges.txt
|
||||
Title: Behavior for bridge users, bridge relays, and bridge authorities
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 11-Nov-2007
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
0. Preface
|
||||
|
||||
This document describes the design decisions around support for bridge
|
||||
users, bridge relays, and bridge authorities. It acts as an overview
|
||||
of the bridge design and deployment for developers, and it also tries
|
||||
to point out limitations in the current design and implementation.
|
||||
|
||||
For more details on what all of these mean, look at blocking.tex in
|
||||
/doc/design-paper/
|
||||
|
||||
1. Bridge relays
|
||||
|
||||
Bridge relays are just like normal Tor relays except they don't publish
|
||||
their server descriptors to the main directory authorities.
|
||||
|
||||
1.1. PublishServerDescriptor
|
||||
|
||||
To configure your relay to be a bridge relay, just add
|
||||
BridgeRelay 1
|
||||
PublishServerDescriptor bridge
|
||||
to your torrc. This will cause your relay to publish its descriptor
|
||||
to the bridge authorities rather than to the default authorities.
|
||||
|
||||
Alternatively, you can say
|
||||
BridgeRelay 1
|
||||
PublishServerDescriptor 0
|
||||
which will cause your relay to not publish anywhere. This could be
|
||||
useful for private bridges.
|
||||
|
||||
1.2. Exit policy
|
||||
|
||||
Bridge relays should use an exit policy of "reject *:*". This is
|
||||
because they only need to relay traffic between the bridge users
|
||||
and the rest of the Tor network, so there's no need to let people
|
||||
exit directly from them.
|
||||
|
||||
1.3. RelayBandwidthRate / RelayBandwidthBurst
|
||||
|
||||
We invented the RelayBandwidth* options for this situation: Tor clients
|
||||
who want to allow relaying too. See proposal 111 for details. Relay
|
||||
operators should feel free to rate-limit their relayed traffic.
|
||||
|
||||
1.4. Helping the user with port forwarding, NAT, etc.
|
||||
|
||||
Just as for operating normal relays, our documentation and hints for
|
||||
how to make your ORPort reachable are inadequate for normal users.
|
||||
|
||||
We need to work harder on this step, perhaps in 0.2.2.x.
|
||||
|
||||
1.5. Vidalia integration
|
||||
|
||||
Vidalia has turned its "Relay" settings page into a tri-state
|
||||
"Don't relay" / "Relay for the Tor network" / "Help censored users".
|
||||
|
||||
If you click the third choice, it forces your exit policy to reject *:*.
|
||||
|
||||
If all the bridges end up on port 9001, that's not so good. On the
|
||||
other hand, putting the bridges on a low-numbered port in the Unix
|
||||
world requires jumping through extra hoops. The current compromise is
|
||||
that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
|
||||
other platforms.
|
||||
|
||||
At the bottom of the relay config settings window, Vidalia displays
|
||||
the bridge identifier to the operator (see Section 3.1) so he can pass
|
||||
it on to bridge users.
|
||||
|
||||
1.6. What if the default ORPort is already used?
|
||||
|
||||
If the user already has a webserver or some other application
|
||||
bound to port 443, then Tor will fail to bind it and complain to the
|
||||
user, probably in a cryptic way. Rather than just working on a better
|
||||
error message (though we should do this), we should consider an
|
||||
"ORPort auto" option that tells Tor to try to find something that's
|
||||
bindable and reachable. This would also help us tolerate ISPs that
|
||||
filter incoming connections on port 80 and port 443. But this should
|
||||
be a different proposal, and can wait until 0.2.2.x.
|
||||
|
||||
2. Bridge authorities.
|
||||
|
||||
Bridge authorities are like normal directory authorities, except they
|
||||
don't create their own network-status documents or votes. So if you
|
||||
ask an authority for a network-status document or consensus, they
|
||||
behave like a directory mirror: they give you one from one of the main
|
||||
authorities. But if you ask the bridge authority for the descriptor
|
||||
corresponding to a particular identity fingerprint, it will happily
|
||||
give you the latest descriptor for that fingerprint.
|
||||
|
||||
To become a bridge authority, add these lines to your torrc:
|
||||
AuthoritativeDirectory 1
|
||||
BridgeAuthoritativeDir 1
|
||||
|
||||
Right now there's one bridge authority, running on the Tonga relay.
|
||||
|
||||
2.1. Exporting bridge-purpose descriptors
|
||||
|
||||
We've added a new purpose for server descriptors: the "bridge"
|
||||
purpose. With the new router-descriptors file format that includes
|
||||
annotations, it's easy to look through it and find the bridge-purpose
|
||||
descriptors.
|
||||
|
||||
Currently we export the bridge descriptors from Tonga to the
|
||||
BridgeDB server, so it can give them out according to the policies
|
||||
in blocking.pdf.
|
||||
|
||||
2.2. Reachability/uptime testing
|
||||
|
||||
Right now the bridge authorities do active reachability testing of
|
||||
bridges, so we know which ones to recommend for users.
|
||||
|
||||
But in the design document, we suggested that bridges should publish
|
||||
anonymously (i.e. via Tor) to the bridge authority, so somebody watching
|
||||
the bridge authority can't just enumerate all the bridges. But if we're
|
||||
doing active measurement, the game is up. Perhaps we should back off on
|
||||
this goal, or perhaps we should do our active measurement anonymously?
|
||||
|
||||
Answering this issue is scheduled for 0.2.1.x.
|
||||
|
||||
2.3. Migrating to multiple bridge authorities
|
||||
|
||||
Having only one bridge authority is both a trust bottleneck (if you
|
||||
break into one place you learn about every single bridge we've got)
|
||||
and a robustness bottleneck (when it's down, bridge users become sad).
|
||||
|
||||
Right now if we put up a second bridge authority, all the bridges would
|
||||
publish to it, and (assuming the code works) bridge users would query
|
||||
a random bridge authority. This resolves the robustness bottleneck,
|
||||
but makes the trust bottleneck even worse.
|
||||
|
||||
In 0.2.2.x and later we should think about better ways to have multiple
|
||||
bridge authorities.
|
||||
|
||||
3. Bridge users.
|
||||
|
||||
Bridge users are like ordinary Tor users except they use encrypted
|
||||
directory connections by default, and they use bridge relays as both
|
||||
entry guards (their first hop) and directory guards (the source of
|
||||
all their directory information).
|
||||
|
||||
To become a bridge user, add the following line to your torrc:
|
||||
|
||||
UseBridges 1
|
||||
|
||||
and then add at least one "Bridge" line to your torrc based on the
|
||||
format below.
|
||||
|
||||
3.1. Format of the bridge identifier.
|
||||
|
||||
The canonical format for a bridge identifier contains an IP address,
|
||||
an ORPort, and an identity fingerprint:
|
||||
bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
|
||||
|
||||
However, the identity fingerprint can be left out, in which case the
|
||||
bridge user will connect to that relay and use it as a bridge regardless
|
||||
of what identity key it presents:
|
||||
bridge 128.31.0.34:9009
|
||||
This might be useful for cases where only short bridge identifiers
|
||||
can be communicated to bridge users.
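The two accepted forms could be parsed as in the sketch below. It is an illustration rather than Tor's config parser: the function name is hypothetical, only IPv4 addresses are handled, and the fingerprint is assumed to be 40 hex digits with optional spaces, as in the example above.

```python
import ipaddress

HEX_DIGITS = set("0123456789ABCDEF")


def parse_bridge_line(line):
    """Parse a "bridge" line into (address, port, fingerprint_or_None)."""
    parts = line.split()
    if len(parts) < 2 or parts[0].lower() != "bridge":
        raise ValueError("not a bridge line")
    host, sep, port_text = parts[1].rpartition(":")
    if not sep:
        raise ValueError("missing ORPort")
    address = str(ipaddress.IPv4Address(host))  # raises ValueError if invalid
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("ORPort out of range")
    # The fingerprint is optional and may be written in space-separated groups.
    fingerprint = "".join(parts[2:]).upper() or None
    if fingerprint is not None:
        if len(fingerprint) != 40 or not set(fingerprint) <= HEX_DIGITS:
            raise ValueError("fingerprint must be 40 hex digits")
    return address, port, fingerprint
```

When the fingerprint is None, the caller would accept whatever identity key the relay at that address presents, as described above.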
|
||||
|
||||
In a future version we may also support bridge identifiers that are
|
||||
only a key fingerprint:
|
||||
bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
|
||||
and the bridge user can fetch the latest descriptor from the bridge
|
||||
authority (see Section 3.4).
|
||||
|
||||
3.2. Bridges as entry guards

For now, bridge users add their bridge relays to their list of "entry
guards" (see path-spec.txt for background on entry guards). They are
managed by the entry guard algorithms exactly as if they were normal
entry guards -- their keys and timing get cached in the "state" file,
etc. This means that when the Tor user starts up with "UseBridges"
disabled, he will skip past the bridge entries since they won't be
listed as up and usable in his networkstatus consensus. But to be clear,
the "entry_guards" list doesn't currently distinguish guards by purpose.

Internally, each bridge user keeps a smartlist of "bridge_info_t"
that reflects the "bridge" lines from his torrc along with a download
schedule (see Section 3.5 below). When he starts Tor, he attempts
to fetch a descriptor for each configured bridge (see Section 3.4
below). When he succeeds at getting a descriptor for one of the bridges
in his list, he adds it directly to the entry guard list using the
normal add_an_entry_guard() interface. Once a bridge descriptor has
been added, should_delay_dir_fetches() will stop delaying further
directory fetches, and the user begins to bootstrap his directory
information from that bridge (see Section 3.3).

Currently bridge users cache their bridge descriptors to the
"cached-descriptors" file (annotated with purpose "bridge"), but
they don't make any attempt to reuse descriptors they find in this
file. The theory is that either the bridge is available now, in which
case you can get a fresh descriptor, or it's not, in which case an
old descriptor won't do you much good.

We could disable writing out the bridge lines to the state file, if
we think this is a problem.

As an exception, if we get an application request when we have one
or more bridge descriptors but we believe none of them are running,
we mark them all as running again. This is similar to the exception
already in place to help long-idle Tor clients realize they should
fetch fresh directory information rather than just refuse requests.

3.3. Bridges as directory guards

In addition to using bridges as the first hop in their circuits, bridge
users also use them to fetch directory updates. Other than initial
bootstrapping to find a working bridge descriptor (see Section 3.4
below), all further non-anonymized directory fetches will be redirected
to the bridge.

This means that bridge relays need to have cached answers for all
questions the bridge user might ask. This makes the upgrade path
tricky --- for example, if we migrate to a v4 directory design, the
bridge user would need to keep using v3 so long as his bridge relays
only knew how to answer v3 queries.

In a future design, for cases where the user has enough information
to build circuits yet the chosen bridge doesn't know how to answer a
given query, we might teach bridge users to make an anonymized request
to a more suitable directory server.

3.4. How bridge users get their bridge descriptor

Bridge users can fetch bridge descriptors in two ways: by going directly
to the bridge and asking for "/tor/server/authority", or by going to
the bridge authority and asking for "/tor/server/fp/ID". By default,
they will only try the direct queries. If the user sets
    UpdateBridgesFromAuthority 1
in his config file, then he will try querying the bridge authority
first for bridges where he knows a digest (if he only knows an IP
address and ORPort, then his only option is a direct query).

If the user has at least one working bridge, then he will do further
queries to the bridge authority through a full three-hop Tor circuit.
But when bootstrapping, he will make a direct begin_dir-style connection
to the bridge authority.

As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
from the bridge authority and it returns a 404 not found, the user
will automatically fall back to trying a direct query. Therefore it is
recommended that bridge users always set UpdateBridgesFromAuthority,
since at worst it will delay their fetches a little bit and notify
the bridge authority of the identity fingerprint (but not location)
of their intended bridges.

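To make the fetch order in Section 3.4 concrete, here is a minimal
Python sketch. It is not Tor's implementation: the Bridge class and
fetch_plan() are hypothetical names invented for illustration, and the
sketch only decides which sources to try and in what order. It does not
model the choice between a direct begin_dir-style connection and a
three-hop circuit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Bridge:
    """Hypothetical stand-in for one "bridge" line from torrc."""
    address: str                       # "IP:ORPort"
    fingerprint: Optional[str] = None  # identity digest, if the user knows it


def fetch_plan(bridge: Bridge, update_from_authority: bool) -> List[Tuple[str, str]]:
    """Return the ordered (target, url) attempts for one bridge descriptor.

    The direct query is always possible. The bridge authority is tried
    first only when UpdateBridgesFromAuthority is set and a digest is
    known; the direct query then serves as the fallback on a 404.
    """
    direct = ("bridge", "/tor/server/authority")
    if update_from_authority and bridge.fingerprint:
        digest = bridge.fingerprint.replace(" ", "")
        return [("authority", "/tor/server/fp/" + digest), direct]
    return [direct]
```

A bridge configured with only an address and ORPort always yields the
single direct attempt, whatever the config option says.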
3.5. Bridge descriptor retry schedule

Bridge users try to fetch a descriptor for each bridge (using the
steps in Section 3.4 above) on startup. Whenever they receive a
bridge descriptor, they reschedule a new descriptor download for 1
hour from then.

If, on the other hand, the fetch fails, they try again after 15 minutes
for the first retry, after another 15 minutes for the second retry, and
after 60 minutes for each subsequent retry.

In 0.2.2.x we should come up with some smarter retry schedules.

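The schedule in Section 3.5 fits in a few lines. This is only a sketch
of the numbers given above; the function name and the idea of tracking
a count of consecutive failures are assumptions, not taken from the Tor
source.

```python
def next_fetch_delay_minutes(consecutive_failures: int) -> int:
    """Minutes to wait before the next descriptor fetch for one bridge.

    consecutive_failures is 0 right after a successful fetch.
    """
    if consecutive_failures == 0:
        return 60   # success: refresh the descriptor in an hour
    if consecutive_failures <= 2:
        return 15   # first and second retries come quickly
    return 60       # after that, back off to hourly retries
```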
3.6. Vidalia integration

Vidalia 0.0.16 has a checkbox in its Network config window called
"My ISP blocks connections to the Tor network." Users who click that
box change their configuration to:
    UseBridges 1
    UpdateBridgesFromAuthority 1
and should specify at least one Bridge identifier.

3.7. Do we need a second layer of entry guards?

If the bridge user uses the bridge as its entry guard, then the
triangulation attacks from Lasse and Paul's Oakland paper work to
locate the user's bridge(s).

Worse, this is another way to enumerate bridges: if the bridge users
keep rotating through second hops, then if you run a few fast servers
(and avoid getting considered an Exit or a Guard) you'll quickly get
a list of the bridges in active use.

That's probably the strongest reason why bridge users will need to
pick second-layer guards. Would this mean bridge users should switch
to four-hop circuits?

We should figure this out in the 0.2.1.x timeframe.

@ -1,412 +0,0 @@
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
Version: $Revision$
Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
Implemented-In: 0.2.0.x

0. Status

In 0.2.0.x, this proposal is implemented to the extent needed to
address its motivations. See notes below with the text "RESOLUTION"
for details.

1. Background and motivation

Right now we can keep a rough count of Tor users, both total and by
country, by watching connections to a single directory mirror. Being
able to get usage estimates is useful both for our funders (to
demonstrate progress) and for our own development (so we know how
quickly we're scaling and can design accordingly, and so we know which
countries and communities to focus on more). This need for information
is the only reason we haven't deployed "directory guards" (think of
them like entry guards but for directory information; in practice,
it would seem that Tor clients should simply use their entry guards
as their directory guards; see also proposal 125).

With the move toward bridges, we will no longer be able to track Tor
clients that use bridges, since they use their bridges as directory
guards. Further, we need to be able to learn which bridges stop seeing
use from certain countries (and are thus likely blocked), so we can
avoid giving them out to other users in those countries.

Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays
and circuits on its 'network map', and it performs anonymized GeoIP
lookups to its central servers to know where to put the dots. Vidalia
caches answers it gets -- to reduce delay, to reduce overhead on
the network, and to reduce anonymity issues where users reveal their
knowledge about the network through which IP addresses they ask about.

But with the advent of bridges, Tor clients are asking about IP
addresses that aren't in the main directory. In particular, bridge
users inform the central Vidalia servers about each bridge as they
discover it and their Vidalia tries to map it.

Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's
own IP address, so it can provide a more useful map.

Finally, Vidalia's central servers leave users open to partitioning
attacks, even if they can't target specific users. Further, as we
start using GeoIP results for more operational or security-relevant
goals, such as avoiding or including particular countries in circuits,
it becomes more important that users can't be singled out in terms of
their IP-to-country mapping beliefs.

2. The available GeoIP databases

There are at least two classes of GeoIP database out there: "IP to
country", which tells us the country code for the IP address but
no more details, and "IP to city", which tells us the country code,
the name of the city, and some basic latitude/longitude guesses.

A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252
bytes. A typical line is:
    "205500992","208605279","US","USA","UNITED STATES"
    http://ip-to-country.webhosting.info/node/view/5

Similarly, the MaxMind GeoLite Country database is also about 500KB
compressed.
    http://www.maxmind.com/app/geolitecountry

The MaxMind GeoLite City database gives more fine-grained detail like
geo coordinates and city name. Vidalia currently makes use of this
information. On the other hand it's 16MB compressed. A typical line is:
    206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
    http://www.maxmind.com/app/geolitecity

There are other databases out there, like
    http://www.hostip.info/faq.html
    http://www.webconfs.com/ip-to-city.php
that want more attention, but for now let's assume that all the db's
are around this size.

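As an illustration of how the ip-to-country.csv format in Section 2
could be used, here is a small Python sketch. It assumes that the first
two fields of each line are the inclusive start and end of an IPv4
range written as 32-bit integers, which matches the sample line but is
not spelled out in this proposal. The function names are invented for
the example.

```python
import bisect
import csv
import io
import ipaddress
from typing import List, Optional, Tuple

Range = Tuple[int, int, str]  # (first address, last address, country code)


def load_ranges(csv_text: str) -> List[Range]:
    """Parse ip-to-country.csv lines into a list sorted by range start."""
    ranges = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        ranges.append((int(row[0]), int(row[1]), row[2]))
    ranges.sort()
    return ranges


def lookup_country(ranges: List[Range], ip: str) -> Optional[str]:
    """Return the two-letter country code for ip, or None if unlisted."""
    n = int(ipaddress.IPv4Address(ip))
    # For brevity the list of starts is rebuilt on every call; a real
    # implementation would build it once.
    starts = [r[0] for r in ranges]
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0 and ranges[i][0] <= n <= ranges[i][1]:
        return ranges[i][2]
    return None
```

A lookup is a binary search over the range starts, so even the full
database can be queried locally and privately, which is the point of
Section 4 below.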
3. What we'd like to solve

Goal #1a: Tor relays collect IP-to-country user stats and publish
sanitized versions.
Goal #1b: Tor bridges collect IP-to-country user stats and publish
sanitized versions.

Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better
mapping.
Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user
can pick countries for her paths.

Goal #3: Vidalia doesn't do external lookups on bridge relay addresses.

Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city
for better mapping.

Goal #5: Reduce partitioning opportunities where Vidalia central
servers can give different (distinguishing) responses.

4. Solution overview

Our goal is to allow Tor relays, bridges, and clients to learn enough
GeoIP information so they can do local private queries.

4.1. The IP-to-country db

Directory authorities should publish a "geoip" file that contains
IP-to-country mappings. Directory caches will mirror it, and Tor clients
and relays (including bridge relays) will fetch it. Thus we can solve
goals 1a and 1b (publish sanitized usage info). Controllers could also
use this to solve goal 2b (choosing path by country attributes). It
also solves goal 4 (learning the Tor client's country), though for
huge countries like the US we'd still need to decide where the "middle"
should be when we're mapping that address.

The IP-to-country details are described further in Sections 5 and
6 below.

[RESOLUTION: The geoip file in 0.2.0.x is not distributed through
Tor. Instead, it is shipped with the bundle.]

4.2. The IP-to-city db

In an ideal world, the IP-to-city db would be small enough that we
could distribute it in the above manner too. But for now, it is too
large. Here's where the design choice forks.

Option A: Vidalia should continue doing its anonymized IP-to-city
queries. Thus we can achieve goals 2a and 2b. We would solve goal
3 by only doing lookups on descriptors that are purpose "general"
(see Section 4.2.1 for how). We would leave goal 5 unsolved.

Option B: Each directory authority should keep an IP-to-city db,
look up the value for each router it lists, and include that line in
the router's network-status entry. The network-status consensus would
then use the line that appears in the majority of votes. This approach
also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups
at all now), and goal 5 (reduced partitioning risks).

Option B has the advantage that Vidalia can simplify its operation,
and the advantage that this consensus IP-to-city data is available to
other controllers besides just Vidalia. But it has the disadvantage
that the networkstatus consensus becomes larger, even though most of
the GeoIP information won't change from one consensus to the next. Is
there another reasonable location for it that can provide similar
consensus security properties?

[RESOLUTION: IP-to-city is not supported.]

4.2.1. Controllers can query for router annotations

Vidalia needs to stop doing queries on bridge relay IP addresses.
It could do that by only doing lookups on descriptors that are in
the networkstatus consensus, but that precludes designs like Blossom
that might want to map its relay locations. The best answer is that it
should learn the router annotations, with a new controller 'getinfo'
command:
    "GETINFO desc-annotations/id/<OR identity>"
which would respond with something like
    @downloaded-at 2007-11-29 08:06:38
    @source "128.31.0.34"
    @purpose bridge

[We could also make the answer include the digest for the router in
question, which would enable us to ask GETINFO router-annotations/all.
Is this worth it? -RD]

Then Vidalia can avoid doing lookups on descriptors with purpose
"bridge". Even better would be to add a new annotation "@private true"
so Vidalia can know how to handle new purposes that we haven't created
yet. Vidalia could special-case "bridge" for now, for compatibility
with the current 0.2.0.x-alphas.

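A controller-side sketch of the check described in Section 4.2.1 might
look like the following Python. It assumes the annotation lines arrive
exactly as in the example reply above, one "@key value" per line. Both
function names are made up for illustration, and treating a missing
purpose as "do not look up" is a conservative assumption of this sketch
rather than something the proposal specifies.

```python
from typing import Dict


def parse_annotations(reply_text: str) -> Dict[str, str]:
    """Turn "@key value" lines into a dict, dropping surrounding quotes."""
    annotations = {}
    for line in reply_text.splitlines():
        line = line.strip()
        if not line.startswith("@"):
            continue  # ignore anything that is not an annotation line
        key, _, value = line[1:].partition(" ")
        annotations[key] = value.strip('"')
    return annotations


def may_do_external_lookup(annotations: Dict[str, str]) -> bool:
    """Decide whether a controller may send this router's IP to a GeoIP server."""
    if annotations.get("private") == "true":
        return False  # the proposed "@private true" annotation
    purpose = annotations.get("purpose")
    if purpose is None:
        return False  # unknown purpose: be conservative (assumption)
    return purpose != "bridge"  # special-case bridges, as suggested above
```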
4.3. Recommendation

My overall recommendation is that we should implement 4.1 soon
(e.g. early in 0.2.1.x), and we can go with 4.2 option A for now,
with the hope that later we discover a better way to distribute the
IP-to-city info and can switch to 4.2 option B.

Below we discuss more how to go about achieving 4.1.

5. Publishing and caching the GeoIP (IP-to-country) database

Each v3 directory authority should put a copy of the "geoip" file in
its datadirectory. Then its network-status votes should include a hash
of this file (Recommended-geoip-hash: %s), and the resulting consensus
directory should specify the consensus hash.

There should be a new URL for fetching this geoip db (by "current.z"
for testing purposes, and by hash.z for typical downloads). Authorities
should fetch and serve the one listed in the consensus, even when they
vote for their own. This would argue for storing the cached version
in a better filename than "geoip".

Directory mirrors should keep a copy of this file available via the
same URLs.

We assume that the file would change at most a few times a month. Should
Tor ship with a bootstrap geoip file? An out-of-date geoip file may
open you up to partitioning attacks, but for the most part it won't
be that different.

There should be a config option to disable updating the geoip file,
in case users want to use their own file (e.g. they have a proprietary
GeoIP file they prefer to use). In that case we leave it up to the
user to update his geoip file out-of-band.

[XXX Should consider forward/backward compatibility, e.g. if we want
to move to a new geoip file format. -RD]

[RESOLUTION: Not done over Tor.]

6. Controllers use the IP-to-country db for mapping and for path building

Down the road, Vidalia could use the IP-to-country mappings for placing
on its map:
- The location of the client
- The location of the bridges, or other relays not in the
  networkstatus, on the map.
- Any relays that it doesn't yet have an IP-to-city answer for.

Other controllers can also use it to set EntryNodes, ExitNodes, etc
in a per-country way.

To support these features, we need to export the IP-to-country data
via the Tor controller protocol.

Is it sufficient just to add a new GETINFO command?
    GETINFO ip-to-country/128.31.0.34
    250+ip-to-country/128.31.0.34="US","USA","UNITED STATES"

[RESOLUTION: Not done now, except for the getinfo command.]

6.1. Other interfaces

Robert Hogan has also suggested a

    GETINFO relays-by-country/cn

as well as torrc options for ExitCountryCodes, EntryCountryCodes,
ExcludeCountryCodes, etc.

[RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.]

7. Relays and bridges use the IP-to-country db for usage summaries

Once bridges have a GeoIP database locally, they can start to publish
sanitized summaries of client usage -- how many users they see and from
what countries. This might also be a more useful way for ordinary Tor
relays to convey the level of usage they see, which would allow us to
switch to using directory guards for all users by default.

But how to safely summarize this information without opening too many
anonymity leaks?

7.1. Attacks to think about

First, note that we need to have a large enough time window that we're
not aiding correlation attacks much. I hope 24 hours is enough. So
that means no publishing stats until you've been up at least 24 hours.
And you can't publish follow-up stats more often than every 24 hours,
or people could look at the differential.

Second, note that we need to be sufficiently vague about the IP
addresses we're reporting. We are hoping that just specifying the
country will be vague enough. But a) what about active attacks where
we convince a bridge to use a GeoIP db that labels each suspect IP
address as a unique country? We have to assume that the consensus GeoIP
db won't be malicious in this way. And b) could such singling-out
attacks occur naturally, for example because of countries that have
a very small IP space? We should investigate that.

7.2. Granularity of users

Do we only want to report countries that have a sufficient anonymity set
(that is, number of users) for the day? For example, we might avoid
listing any countries that have seen fewer than five addresses over
the 24 hour period. This approach would be helpful in reducing the
singling-out opportunities -- in the extreme case, we could imagine a
situation where one blogger from the Sudan used Tor on a given day, and
we can discover which entry guard she used.

But I fear that especially for bridges, seeing only one hit from a
given country in a given day may be quite common.

As a compromise, we should start out with an "Other" category in
the reported stats, which is the sum of unlisted countries; if that
category is consistently interesting, we can think harder about how
to get the right data from it safely.

But note that bridge summaries will not be made public individually,
since doing so would help people enumerate bridges. Whereas summaries
from normal relays will be public. So perhaps that means we can afford
to be more specific in bridge summaries? In particular, I'm thinking the
"other" category should be used by public relays but not for bridges
(or if it is, used with a lower threshold).

Even for countries that have many Tor users, we might not want to be
too specific about how many users we've seen. For example, we might
round down the number of users we report to the nearest multiple of 5.
My instinct for now is that this won't be that useful.

7.3. Other issues

Another note: we'll likely be overreporting in the case of users with
dynamic IP addresses: if they rotate to a new address over the course
of the day, we'll count them twice. So be it.

7.4. Where to publish the summaries?

We designed extrainfo documents for information like this. So they
should just be more entries in the extrainfo doc.

But if we want to publish summaries every 24 hours (no more often,
no less often), aren't we tied to the router descriptor publishing
schedule? That is, if we publish a new router descriptor at the 18
hour mark, and nothing much has changed at the 24 hour mark, won't
the new descriptor get dropped as being "cosmetically similar", and
then nobody will know to ask about the new extrainfo document?

One solution would be to make and remember the 24 hour summary at the
24 hour mark, but not actually publish it anywhere until we happen to
publish a new descriptor for other reasons. If we happen to go down
before publishing a new descriptor, then so be it, at least we tried.

7.5. What if the relay is unreachable or goes to sleep?

Even if you've been up for 24 hours, if you were hibernating for 18
of them, then we're not getting as much fuzziness as we'd like. So
I guess that means that we need a 24-hour period of being "awake"
before we're willing to publish a summary. A similar attack works if
you've been awake but unreachable for the first 18 of the 24 hours. As
another example, a bridge that's on a laptop might be suspended for
some of each day.

This implies that some relays and bridges will never publish summary
stats, because they're not ever reliably working for 24 hours in
a row. If a significant percentage of our reporters end up being in
this boat, we should investigate whether we can accumulate 24 hours of
"usefulness", even if there are holes in the middle, and publish based
on that.

What other issues are like this? It seems that just moving to a new
IP address shouldn't be a reason to cancel stats publishing, assuming
we were usable at each address.

7.6. IP addresses that aren't in the geoip db

Some IP addresses aren't in the public geoip databases. In particular,
I've found that a lot of African countries are missing, but there
are also some common ones in the US that are missing, like parts of
Comcast. We could just lump unknown IP addresses into the "other"
category, but it might be useful to gather a general sense of how many
lookups are failing entirely, by adding a separate "Unknown" category.

We could also contribute back to the geoip db, by letting bridges set
a config option to report the actual IP addresses that failed their
lookup. Then the bridge authority operators can manually make sure
the correct answer will be in later geoip files. This config option
should be disabled by default.

7.7. Bringing it all together

So here's the plan:

24 hours after starting up (modulo Section 7.5 above), bridges and
relays should construct a daily summary of client countries they've
seen, including the above "Unknown" category (Section 7.6) as well.

Non-bridge relays lump all countries with fewer than K (e.g. K=5) users
into the "Other" category (see Sec 7.2 above), whereas bridge relays are
willing to list a country even when it has only one user for the day.

Whenever we have a daily summary on record, we include it in our
extrainfo document whenever we publish one. The daily summary we
remember locally gets replaced with a newer one when another 24
hours pass.

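The plan in Section 7.7 can be sketched in Python as follows. This is a
reading of the text above, not the code that shipped in Tor. The names
daily_summary and lookup are hypothetical, lookup is assumed to return
a country code or None, and keeping "Unknown" as its own category even
below the threshold K is an assumption of this sketch.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional


def daily_summary(addresses: Iterable[str],
                  lookup: Callable[[str], Optional[str]],
                  is_bridge: bool,
                  k: int = 5) -> Dict[str, int]:
    """Build the per-country client counts for one 24-hour period."""
    per_country = Counter()
    for addr in set(addresses):          # count each address only once
        per_country[lookup(addr) or "Unknown"] += 1

    if is_bridge:
        # Bridges list every country, even one with a single user.
        return dict(per_country)

    summary = {}
    other = 0
    for country, users in per_country.items():
        if country != "Unknown" and users < k:
            other += users               # fold small countries into "Other"
        else:
            summary[country] = users
    if other:
        summary["Other"] = other
    return summary
```

With six US addresses, two from one small country, and one address
missing from the db, a public relay would report only US, Other, and
Unknown, while a bridge would name the small country.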
7.8. Some forward secrecy

How should we remember addresses locally? If we convert them into
country-codes immediately, we will count them again if we see them
again. On the other hand, we don't really want to keep a list hanging
around of all IP addresses we've seen in the past 24 hours.

Step one is that we should never write this stuff to disk. Keeping it
only in ram will make things somewhat better. Step two is to avoid
keeping any timestamps associated with it: rather than a rolling
24-hour window, which would require us to remember the various times
we've seen that address, we can instead just throw out the whole list
every 24 hours and start over.

We could hash the addresses, and then compare hashes when deciding if
we've seen a given address before. We could even do keyed hashes. Or
Bloom filters. But if our goal is to defend against an adversary
who steals a copy of our ram while we're running and then does
guess-and-check on whatever blob we're keeping, we're in bad shape.

We could drop the last octet of the IP address as soon as we see
it. That would cause us to undercount some users from cablemodem and
DSL networks that have a high density of Tor users. And it wouldn't
really help that much -- indeed, the extent to which it does help is
exactly the extent to which it makes our stats less useful.

Other ideas?

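The two steps in Section 7.8 (keep the addresses only in memory, and
discard the whole list every 24 hours instead of tracking per-address
times) could be sketched like this in Python. The class and method
names are invented for the example. A real implementation would build
the daily summary before clearing, which this sketch leaves out.

```python
import time
from typing import Optional


class SeenAddresses:
    """In-memory set of client addresses, discarded wholesale every period."""

    PERIOD = 24 * 60 * 60  # seconds

    def __init__(self, now: Optional[float] = None) -> None:
        # The only timestamp kept is the start of the current period;
        # nothing is recorded per address and nothing is written to disk.
        self._started = time.time() if now is None else now
        self._seen = set()

    def note(self, addr: str, now: Optional[float] = None) -> None:
        """Record that addr connected, starting a new period if one elapsed."""
        now = time.time() if now is None else now
        if now - self._started >= self.PERIOD:
            self._seen.clear()   # throw out the whole list and start over
            self._started = now
        self._seen.add(addr)

    def count(self) -> int:
        """Number of distinct addresses seen in the current period."""
        return len(self._seen)
```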
@ -1,157 +0,0 @@
Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
Version: $Revision$
Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-02
Status: Draft

1. Overview

Some countries and networks block connections to the Tor website. As
time goes by, this will remain a problem and it may even become worse.

We have a big pile of mirrors (google for "Tor mirrors"), but few of
our users think to try a search like that. Also, many of these mirrors
might be automatically blocked since their pages contain words that
might cause them to get banned. And lastly, we can imagine a future
where the blockers are aware of the mirror list too.

Here we describe a new set of URLs for Tor's DirPort that will relay
connections from users to the official Tor download site. Rather than
trying to cache a bunch of new Tor packages (which is a hassle in terms
of keeping them up to date, and a hassle in terms of drive space used),
we instead just proxy the requests directly to Tor's /dist page.

Specifically, we should support

    GET /tor/dist/$1

and

    GET /tor/website/$1

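As a rough illustration of the two URL forms, the Python sketch below
maps a DirPort request path onto an upstream URL. The upstream base
URLs are assumptions chosen for the example (the proposal only says
"Tor's /dist page" and the website), and upstream_url() is a
hypothetical helper, not an existing Tor function.

```python
from typing import Optional

# Assumed upstream locations; the proposal does not give exact URLs.
UPSTREAM = {
    "/tor/dist/": "https://www.torproject.org/dist/",
    "/tor/website/": "https://www.torproject.org/",
}


def upstream_url(request_path: str) -> Optional[str]:
    """Map "/tor/dist/$1" or "/tor/website/$1" to the URL to relay to.

    Returns None for any other path, and for paths containing ".."
    segments, so the mirror cannot be steered outside the two trees.
    """
    for prefix, base in UPSTREAM.items():
        if request_path.startswith(prefix):
            rest = request_path[len(prefix):]
            if ".." in rest.split("/"):
                return None
            return base + rest
    return None
```

Restricting the mapping to these two prefixes keeps the feature from
turning into a general-purpose proxy, a concern raised again in
Section 6.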
2. Direct connections, one-hop circuits, or three-hop circuits?

We could relay the connections directly to the download site -- but
this produces recognizable outgoing traffic on the bridge or cache's
network, which will probably surprise our nice volunteers. (Is this
a good enough reason to discard the direct connection idea?)

Even if we don't do direct connections, should we do a one-hop
begindir-style connection to the mirror site (make a one-hop circuit
to it, then send a 'begindir' cell down the circuit), or should we do
a normal three-hop anonymized connection?

If these mirrors are mainly bridges, doing either a direct or a one-hop
connection creates another way to enumerate bridges. That would argue
for three-hop. On the other hand, downloading a 10+ megabyte installer
through a normal Tor circuit can't be fun. But if you're already getting
throttled a lot because you're in the "relayed traffic" bucket, you're
going to have to accept a slow transfer anyway. So three-hop it is.

Speaking of which, we would want to label this connection
as "relay" traffic for the purposes of rate limiting; see
connection_counts_as_relayed_traffic() and or_conn->client_used. This
will be a bit tricky though, because these connections will use the
bridge's guards.

3. Scanning resistance

One other goal we'd like to achieve, or at least not hinder, is making
it hard to scan large swaths of the Internet to look for responses
that indicate a bridge.

In general this is a really hard problem, so we shouldn't demand to
solve it here. But we can note that some bridges should open their
DirPort (and offer this functionality), and others shouldn't. Then
some bridges provide a download mirror while others can remain
scanning-resistant.

4. Integrity checking

If we serve this stuff in plaintext from the bridge, anybody in between
the user and the bridge can intercept and modify it. The bridge can too.

If we do an anonymized three-hop connection, the exit node can also
intercept and modify the exe it sends back.

Are we setting ourselves up for rogue exit relays, or rogue bridges,
that trojan our users?

Answer #1: Users need to do pgp signature checking. Not a very good
answer, a) because it's complex, and b) because they don't know the
right signing keys in the first place.

Answer #2: The mirrors could exit from a specific Tor relay, using the
'.exit' notation. This would make connections a bit more brittle, but
would resolve the rogue exit relay issue. We could even round-robin
among several, and the list could be dynamic -- for example, all the
relays with an Authority flag that allow exits to the Tor website.

Answer #3: The mirrors should connect to the main distribution site
via SSL. That way the exit relay can't influence anything.

Answer #4: We could suggest that users only use trusted bridges for
fetching a copy of Tor. Hopefully they heard about the bridge from a
trusted source rather than from the adversary.

Answer #5: What if the adversary is trawling for Tor downloads by
network signature -- either by looking for known bytes in the binary,
or by looking for "GET /tor/dist/"? It would be nice to encrypt the
connection from the bridge user to the bridge. And we can! The bridge
already supports TLS. Rather than initiating a TLS renegotiation after
connecting to the ORPort, the user should actually request a URL. Then
the ORPort can either pass the connection off as a linked conn to the
dirport, or renegotiate and become a Tor connection, depending on how
the client behaves.

5. Linked connections: at what level should we proxy?

Check out the connection_ap_make_link() function, as called from
directory.c. Tor clients use this to create a "fake" socks connection
back to themselves, and then they attach a directory request to it,
so they can launch directory fetches via Tor. We can piggyback on
this feature.

We need to decide if we're going to be passing the bytes back and
forth between the web browser and the main distribution site, or if
we're going to be actually acting like a proxy (parsing out the file
they want, fetching that file, and serving it back).

Advantages of proxying without looking inside:
- We don't need to build any sort of http support (including
  continues, partial fetches, etc etc).
Disadvantages:
- If the browser thinks it's speaking http, are there easy ways
  to pass the bytes to an https server and have everything work
  correctly? At the least, it would seem that the browser would
  complain about the cert. More generally, ssl wants to be negotiated
  before the URL and headers are sent, yet we need to read the URL
  and headers to know that this is a mirror request; so we have an
  ordering problem here.
- Makes it harder to do caching later on, if we don't look at what
  we're relaying. (It might be useful down the road to cache the
  answers to popular requests, so we don't have to keep getting
  them again.)

6. Outstanding problems
|
||||
|
||||
1) HTTP proxies already exist. Why waste our time cloning one
|
||||
badly? When we clone existing stuff, we usually regret it.
|
||||
|
||||
2) It's overbroad. We only seem to need a secure get-a-tor feature,
|
||||
and instead we're contemplating building a locked-down HTTP proxy.
|
||||
|
||||
3) It's going to add a fair bit of complexity to our code. We do
|
||||
not currently implement HTTPS. We'd need to refactor lots of the
|
||||
low-level connection stuff so that "SSL" and "Cell-based" were no
|
||||
longer synonymous.
|
||||
|
||||
4) It's still unclear how effective this proposal would be in
|
||||
practice. You need to know that this feature exists, which means
|
||||
somebody needs to tell you about a bridge (mirror) address and tell
|
||||
you how to use it. And if they're doing that, they could (e.g.) tell
|
||||
you about a gmail autoresponder address just as easily, and then you'd
|
||||
get better authentication of the Tor program to boot.
|
||||
|
@ -1,66 +0,0 @@
|
||||
Filename: 128-bridge-families.txt
|
||||
Title: Families of private bridges
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 2007-12-xx
|
||||
Status: Dead
|
||||
|
||||
1. Overview
|
||||
|
||||
Proposal 125 introduced the basic notion of how bridge authorities,
|
||||
bridge relays, and bridge users should behave. But it doesn't get into
|
||||
the various mechanisms of how to distribute bridge relay addresses to
|
||||
bridge users.
|
||||
|
||||
One of the mechanisms we have in mind is called 'families of bridges'.
|
||||
If a bridge user knows about only one private bridge, and that bridge
|
||||
shuts off for the night or gets a new dynamic IP address, the bridge
|
||||
user is out of luck and needs to re-bootstrap manually or wait and
|
||||
hope it comes back. On the other hand, if the bridge user knows about
|
||||
a family of bridges, then as long as one of those bridges is still
|
||||
reachable his Tor client can automatically learn about where the
|
||||
other bridges have gone.
|
||||
|
||||
So in this design, a single volunteer could run multiple coordinated
|
||||
bridges, or a group of volunteers could each run a bridge. We abstract
|
||||
out the details of how these volunteers find each other and decide to
|
||||
set up a family.
|
||||
|
||||
2. Other notes.
|
||||
|
||||
somebody needs to run a bridge authority
|
||||
|
||||
it needs to have a torrc option to publish networkstatuses of its bridges
|
||||
|
||||
it should also do reachability testing just of those bridges
|
||||
|
||||
people ask for the bridge networkstatus by asking for a url that
|
||||
contains a password. (it's safe to do this because of begin_dir.)
|
||||
|
||||
so the bridge users need to know a) a password, and b) a bridge
|
||||
authority line.
|
||||
|
||||
the bridge users need to know the bridge authority line.
|
||||
|
||||
the bridge authority needs to know the password.
|
||||
|
||||
3. Current state
|
||||
|
||||
I implemented a BridgePassword config option. Bridge authorities
|
||||
should set it, and users who want to use those bridge authorities
|
||||
should set it.
|
||||
|
||||
Now there is a new directory URL "/tor/networkstatus-bridges" that
|
||||
directory mirrors serve if BridgeAuthoritativeDir is set and it's a
|
||||
begin_dir connection. It looks for the header
|
||||
Authorization: Basic %s
|
||||
where %s is the base-64 bridge password.
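As a rough illustration of the header described above, the following Python sketch builds the line a client would have to send. It assumes the password alone is base64-encoded, which is all the text above states; the function name is hypothetical and is not part of Tor.

```python
import base64


def bridge_auth_header(bridge_password: bytes) -> str:
    """Build the Authorization header line for /tor/networkstatus-bridges.

    Assumption: only the password is base64-encoded (no "user:" prefix),
    matching the wording "%s is the base-64 bridge password".
    """
    encoded = base64.b64encode(bridge_password).decode("ascii")
    return "Authorization: Basic " + encoded


if __name__ == "__main__":
    print(bridge_auth_header(b"secret"))
```

A client would send this header only over a begin_dir connection, since that is what makes passing the password in the request safe.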
|
||||
|
||||
I never got around to teaching clients how to set the header though,
|
||||
so it may or may not work, and may or may not do what we ultimately want.
|
||||
|
||||
I've marked this proposal dead; it really never should have left the
|
||||
ideas/ directory. Somebody should pick it up sometime and finish the
|
||||
design and implementation.
|
||||
|
@ -1,116 +0,0 @@
|
||||
Filename: 129-reject-plaintext-ports.txt
|
||||
Title: Block Insecure Protocols by Default
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Kevin Bauer & Damon McCoy
|
||||
Created: 2008-01-15
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
Below is a proposal to mitigate insecure protocol use over Tor.
|
||||
|
||||
This document 1) demonstrates the extent to which insecure protocols are
|
||||
currently used within the Tor network, and 2) proposes a simple solution
|
||||
to prevent users from unknowingly using these insecure protocols. By
|
||||
insecure, we consider protocols that explicitly leak sensitive user names
|
||||
and/or passwords, such as POP, IMAP, Telnet, and FTP.
|
||||
|
||||
Motivation:
|
||||
|
||||
As part of a general study of Tor use in 2006/2007 [1], we attempted to
|
||||
understand what types of protocols are used over Tor. While we observed an
|
||||
enormous volume of Web and Peer-to-peer traffic, we were surprised by the
|
||||
number of insecure protocols that were used over Tor. For example, over an
|
||||
8 day observation period, we observed the following number of connections
|
||||
over insecure protocols:
|
||||
|
||||
POP and IMAP: 10,326 connections
|
||||
Telnet: 8,401 connections
|
||||
FTP: 3,788 connections
|
||||
|
||||
Each of the above-listed protocols exchanges user name and password
|
||||
information in plain-text. As an upper bound, we could have observed
|
||||
22,515 user names and passwords. This observation echoes the reports of
|
||||
a Tor router logging and posting e-mail passwords in August 2007 [2]. The
|
||||
response from the Tor community has been to further educate users
|
||||
about the dangers of using insecure protocols over Tor. However, we
|
||||
recently repeated our Tor usage study from last year and noticed that the
|
||||
trend in insecure protocol use has not declined. Therefore, we propose that
|
||||
additional steps be taken to protect naive Tor users from inadvertently
|
||||
exposing their identities (and even passwords) over Tor.
|
||||
|
||||
Security Implications:
|
||||
|
||||
This proposal is intended to improve Tor's security by limiting the
|
||||
use of insecure protocols.
|
||||
|
||||
Roger added: By adding these warnings for only some of the risky
|
||||
behavior, users may do other risky behavior, not get a warning, and
|
||||
believe that it is therefore safe. But overall, I think it's better
|
||||
to warn for some of it than to warn for none of it.
|
||||
|
||||
Specification:
|
||||
|
||||
As an initial step towards mitigating the use of the above-mentioned
|
||||
insecure protocols, we propose that the default ports for each respective
|
||||
insecure service be blocked at the Tor client's socks proxy. These default
|
||||
ports include:
|
||||
|
||||
23 - Telnet
|
||||
109 - POP2
|
||||
110 - POP3
|
||||
143 - IMAP
|
||||
|
||||
Notice that FTP is not included in the proposed list of ports to block. This
|
||||
is because FTP is often used anonymously, i.e., without any identifying
|
||||
user name or password.
|
||||
|
||||
This blocking scheme can be implemented as a set of flags in the client's
|
||||
torrc configuration file:
|
||||
|
||||
BlockInsecureProtocols 0|1
|
||||
WarnInsecureProtocols 0|1
|
||||
|
||||
When the warning flag is activated, a message should be displayed to
|
||||
the user similar to the message given when Tor's socks proxy is given an IP
|
||||
address rather than resolving a host name.
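The interaction of the two proposed flags can be sketched as follows. This is only an illustration in Python under stated assumptions: the function name and the warning text are hypothetical, and the port set is the one listed above.

```python
# Default ports of the plaintext protocols named in this proposal.
PLAINTEXT_PORTS = frozenset({23, 109, 110, 143})


def socks_request_action(port, block_insecure, warn_insecure):
    """Return (allow, warning) for a SOCKS request to `port`.

    block_insecure and warn_insecure stand for the proposed
    BlockInsecureProtocols and WarnInsecureProtocols flags.
    """
    if port not in PLAINTEXT_PORTS:
        return True, None
    warning = None
    if warn_insecure:
        # Hypothetical wording; the proposal does not fix the message text.
        warning = (f"Port {port} is commonly used by a protocol that "
                   "sends passwords in plain text.")
    return (not block_insecure), warning


if __name__ == "__main__":
    print(socks_request_action(110, block_insecure=True, warn_insecure=True))
```

Keeping the two flags independent lets a user be warned without being blocked, or blocked silently.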
|
||||
|
||||
We recommend that the default torrc configuration file block insecure
|
||||
protocols and provide a warning to the user to explain the behavior.
|
||||
|
||||
Finally, there are many popular web pages that do not offer secure
|
||||
login features, such as MySpace, and it would be prudent to provide
|
||||
additional rules to Privoxy to attempt to protect users from unknowingly
|
||||
submitting their login credentials in plain-text.
|
||||
|
||||
Compatibility:
|
||||
|
||||
None, as the proposed changes are to be implemented in the client.
|
||||
|
||||
References:
|
||||
|
||||
[1] Shining Light in Dark Places: A Study of Anonymous Network Usage.
|
||||
University of Colorado Technical Report CU-CS-1032-07. August 2007.
|
||||
|
||||
[2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise.
|
||||
http://www.wired.com/politics/security/news/2007/09/embassy_hacks.
|
||||
Wired. September 10, 2007.
|
||||
|
||||
Implementation:
|
||||
|
||||
Roger added this feature in
|
||||
http://archives.seul.org/or/cvs/Jan-2008/msg00182.html
|
||||
He also added a status event for Vidalia to recognize attempts to use
|
||||
vulnerable-plaintext ports, so it can help the user understand what's
|
||||
going on and how to fix it.
|
||||
|
||||
Next steps:
|
||||
|
||||
a) Vidalia should learn to recognize this controller status event,
|
||||
so we don't leave users out in the cold when we enable this feature.
|
||||
|
||||
b) We should decide which ports to reject by default. The current
|
||||
consensus is 23,109,110,143 -- the same set that we warn for now.
|
||||
|
@ -1,186 +0,0 @@
|
||||
Filename: 130-v2-conn-protocol.txt
|
||||
Title: Version 2 Tor connection protocol
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 2007-10-25
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This proposal describes the significant changes to be made in the v2
|
||||
Tor connection protocol.
|
||||
|
||||
This proposal relates to other proposals as follows:
|
||||
|
||||
It refers to and supersedes:
|
||||
Proposal 124: Blocking resistant TLS certificate usage
|
||||
It refers to aspects of:
|
||||
Proposal 105: Version negotiation for the Tor protocol
|
||||
|
||||
|
||||
In summary, the Tor connection protocol has been in need of a redesign
|
||||
for a while. This proposal describes how we can add to the Tor
|
||||
protocol:
|
||||
|
||||
- A new TLS handshake (to achieve blocking resistance without
|
||||
breaking backward compatibility)
|
||||
- Version negotiation (so that future connection protocol changes
|
||||
can happen without breaking compatibility)
|
||||
- The actual changes in the v2 Tor connection protocol.
|
||||
|
||||
Motivation:
|
||||
|
||||
For motivation, see proposal 124.
|
||||
|
||||
Proposal:
|
||||
|
||||
0. Terminology
|
||||
|
||||
The version of the Tor connection protocol implemented up to now is
|
||||
"version 1". This proposal describes "version 2".
|
||||
|
||||
"Old" or "Older" versions of Tor are ones not aware that version 2
|
||||
of this protocol exists;
|
||||
"New" or "Newer" versions are ones that are.
|
||||
|
||||
The connection initiator is referred to below as the Client; the
|
||||
connection responder is referred to below as the Server.
|
||||
|
||||
1. The revised TLS handshake.
|
||||
|
||||
For motivation, see proposal 124. This is a simplified version of the
|
||||
handshake that uses TLS's renegotiation capability in order to avoid
|
||||
some of the extraneous steps in proposal 124.
|
||||
|
||||
The Client connects to the Server and, as in ordinary TLS, sends a
|
||||
list of ciphers. Older versions of Tor will send only ciphers from
|
||||
the list:
|
||||
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
|
||||
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
|
||||
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
|
||||
Clients that support the revised handshake will send the recommended
|
||||
list of ciphers from proposal 124, in order to emulate the behavior of
|
||||
a web browser.
|
||||
|
||||
If the server notices that the list of ciphers contains only ciphers
|
||||
from this list, it proceeds with Tor's version 1 TLS handshake as
|
||||
documented in tor-spec.txt.
|
||||
|
||||
(The server may also notice cipher lists used by other implementations
|
||||
of the Tor protocol (in particular, the BouncyCastle default cipher
|
||||
list as used by some Java-based implementations), and whitelist them.)
|
||||
|
||||
On the other hand, if the server sees a list of ciphers that could not
|
||||
have been sent from an older implementation (because it includes other
|
||||
ciphers, and does not match any known-old list), the server sends a
|
||||
reply containing a single connection certificate, constructed as for
|
||||
the link certificate in the v1 Tor protocol. The subject names in
|
||||
this certificate SHOULD NOT have any strings to identify them as
|
||||
coming from a Tor server. The server does not ask the client for
|
||||
certificates.
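The server-side decision described above can be sketched in Python as follows. This is a simplified model, not Tor's implementation: the function name is hypothetical, and matching a known-old list by exact order is an assumption, since the text does not say how such lists are compared.

```python
# Ciphers that older (v1-only) Tor clients send, as listed above.
V1_CIPHERS = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def server_handshake_version(client_ciphers, known_old_lists=()):
    """Return 1 for the v1 handshake, 2 for the revised handshake.

    known_old_lists holds whitelisted cipher lists of other old
    implementations (for example a BouncyCastle default list).
    """
    offered = tuple(client_ciphers)
    # Only ciphers from the v1 list: an old client.
    # (An empty list trivially counts as v1 in this sketch.)
    if all(cipher in V1_CIPHERS for cipher in offered):
        return 1
    # Assumption: a whitelisted list must match exactly, order included.
    if any(offered == tuple(old) for old in known_old_lists):
        return 1
    return 2
```

In the v2 case the server would then reply with a single connection certificate and would not request client certificates.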
|
||||
|
||||
Old Servers will (mostly) ignore the cipher list and respond as in the v1
|
||||
protocol, sending back a two-certificate chain.
|
||||
|
||||
After the Client gets a response from the server, it checks for the
|
||||
number of certificates it received. If there are two certificates,
|
||||
the client assumes a V1 connection and proceeds as in tor-spec.txt.
|
||||
But if there is only one certificate, the client assumes a V2 or later
|
||||
protocol and continues.
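The client-side check is small enough to state as code. The Python sketch below uses a hypothetical function name; raising an error for any other chain length is an assumption, since the text above covers only the one- and two-certificate cases.

```python
def client_link_protocol(cert_chain_length):
    """Classify the server's reply by the number of certificates it sent."""
    if cert_chain_length == 2:
        return "v1"            # proceed as in tor-spec.txt
    if cert_chain_length == 1:
        return "v2-or-later"   # continue with TLS renegotiation
    # Assumption: other chain lengths are treated as an error.
    raise ValueError(f"unexpected certificate count: {cert_chain_length}")
```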
|
||||
|
||||
At this point, the client has established a TLS connection with the
|
||||
server, but the parties have not been authenticated: the server hasn't
|
||||
sent its identity certificate, and the client hasn't sent any
|
||||
certificates at all. To fix this, the client begins a TLS session
|
||||
renegotiation. This time, the server continues with two certificates
|
||||
as usual, and asks for certificates so that the client will send
|
||||
certificates of its own. Because the TLS connection has been
|
||||
established, all of this is encrypted. (The certificate sent by the
|
||||
server in the renegotiated connection need not be the same as that
|
||||
sent in the original connection.)
|
||||
|
||||
The server MUST NOT write any data until the client has renegotiated.
|
||||
|
||||
Once the renegotiation is finished, the server and client check one
|
||||
another's certificates as in V1. Now they are mutually authenticated.
|
||||
|
||||
1.1. Revised TLS handshake: implementation notes.
|
||||
|
||||
It isn't so easy to adjust server behavior based on the client's
|
||||
ciphersuite list. Here's how we can do it using OpenSSL. This is a
|
||||
bit of an abuse of the OpenSSL APIs, but it's the best we can do, and
|
||||
we won't have to do it forever.
|
||||
|
||||
We can use OpenSSL's SSL_set_info_callback() to register a function to
|
||||
be called when the state changes. The type/state tuple of
|
||||
SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A
|
||||
happens when we have completely parsed the client hello, and are about
|
||||
to send a response. From this callback, we can check the cipherlist
|
||||
and act accordingly:
|
||||
|
||||
* If the ciphersuite list indicates a v1 protocol, we set the
|
||||
verify mode to SSL_VERIFY_NONE with a callback (so we get
|
||||
certificates).
|
||||
|
||||
* If the ciphersuite list indicates a v2 protocol, we set the
|
||||
verify mode to SSL_VERIFY_NONE with no callback (so we get
|
||||
no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that
|
||||
we send only 1 certificate in the response).
|
||||
|
||||
Once the handshake is done, the server clears the
|
||||
SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1
|
||||
protocol. It then starts reading.
|
||||
|
||||
The other problem to take care of is missing ciphers and OpenSSL's
|
||||
cipher sorting algorithms. The two main issues are a) OpenSSL doesn't
|
||||
support some of the default ciphers that Firefox advertises, and b)
|
||||
OpenSSL sorts the list of ciphers it offers in a different way than
|
||||
Firefox sorts them, so unless we fix that Tor will still look different
|
||||
than Firefox.
|
||||
[XXXX more on this.]
|
||||
|
||||
|
||||
1.2. Compatibility for clients using libraries less hackable than OpenSSL.
|
||||
|
||||
As discussed in proposal 105, servers advertise which protocol
|
||||
versions they support in their router descriptors. Clients can simply
|
||||
behave as v1 clients when connecting to servers that do not support
|
||||
link version 2 or higher, and as v2 clients when connecting to servers
|
||||
that do support link version 2 or higher.
|
||||
|
||||
(Servers can't use this strategy because we do not assume that servers
|
||||
know one another's capabilities when connecting.)
|
||||
|
||||
2. Version negotiation.
|
||||
|
||||
Version negotiation proceeds as described in proposal 105, except as
|
||||
follows:
|
||||
|
||||
* Version negotiation only happens if the TLS handshake as described
|
||||
above completes.
|
||||
|
||||
* The TLS renegotiation must be finished before the client sends a
|
||||
VERSIONS cell; the server sends its VERSIONS cell in response.
|
||||
|
||||
* The VERSIONS cell uses the following variable-width format:
|
||||
Circuit [2 octets; set to 0]
|
||||
Command [1 octet; set to 7 for VERSIONS]
|
||||
Length [2 octets; big-endian]
|
||||
Data [Length bytes]
|
||||
|
||||
The Data in the cell is a series of big-endian two-byte integers.
|
||||
|
||||
* It is not allowed to negotiate V1 connections once the v2 protocol
|
||||
has been used. If this happens, Tor instances should close the
|
||||
connection.
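The variable-width VERSIONS cell layout given in the list above (2-octet circuit ID set to 0, 1-octet command set to 7, 2-octet big-endian length, then a series of big-endian two-byte integers) can be encoded and decoded as in this Python sketch. The function names are hypothetical, and the strictness of the error handling is an assumption.

```python
import struct

VERSIONS_COMMAND = 7  # command value for VERSIONS cells, per the list above


def pack_versions_cell(versions):
    """Encode supported link versions as a variable-width VERSIONS cell."""
    data = b"".join(struct.pack(">H", version) for version in versions)
    # Header: circuit ID (always 0), command, payload length; all big-endian.
    return struct.pack(">HBH", 0, VERSIONS_COMMAND, len(data)) + data


def unpack_versions_cell(cell):
    """Decode a VERSIONS cell into a list of version numbers."""
    circ_id, command, length = struct.unpack(">HBH", cell[:5])
    if circ_id != 0 or command != VERSIONS_COMMAND:
        raise ValueError("not a VERSIONS cell")
    data = cell[5:5 + length]
    if len(data) != length or length % 2 != 0:
        raise ValueError("truncated or odd-length VERSIONS payload")
    return [version for (version,) in struct.iter_unpack(">H", data)]
```

For example, a peer advertising versions 1 and 2 would send a 5-byte header followed by 4 bytes of data.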
|
||||
|
||||
3. The rest of the "v2" protocol
|
||||
|
||||
Once a v2 protocol has been negotiated, NETINFO cells are exchanged
|
||||
as in proposal 105, and communications begin as per tor-spec.txt.
|
||||
Until NETINFO cells have been exchanged, the connection is not open.
|
||||
|
||||
|
@ -1,150 +0,0 @@
|
||||
Filename: 131-verify-tor-usage.txt
|
||||
Title: Help users to verify they are using Tor
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Steven J. Murdoch
|
||||
Created: 2008-01-25
|
||||
Status: Needs-Revision
|
||||
|
||||
Overview:
|
||||
|
||||
Websites for checking whether a user is accessing them via Tor are a
|
||||
very helpful aid to configuring web browsers correctly. Existing
|
||||
solutions have both false positives and false negatives when
|
||||
checking if Tor is being used. This proposal will discuss how to
|
||||
modify Tor so as to make testing more reliable.
|
||||
|
||||
Motivation:
|
||||
|
||||
Currently deployed websites for detecting Tor use work by comparing
|
||||
the client IP address for a request with a list of known Tor nodes.
|
||||
This approach is generally effective, but suffers from both false
|
||||
positives and false negatives.
|
||||
|
||||
If a user has a Tor exit node installed, or just happens to have
|
||||
been allocated an IP address previously used by a Tor exit node, any
|
||||
web requests will be incorrectly flagged as coming from Tor. If any
|
||||
customer of an ISP which implements a transparent proxy runs an exit
|
||||
node, all other users of the ISP will be flagged as Tor users.
|
||||
|
||||
Conversely, if the exit node chosen by a Tor user has not yet been
|
||||
recorded by the Tor checking website, requests will be incorrectly
|
||||
flagged as not coming via Tor.
|
||||
|
||||
The only reliable way to tell whether Tor is being used or not is for
|
||||
the Tor client to flag this to the browser.
|
||||
|
||||
Proposal:
|
||||
|
||||
A DNS name should be registered and point to an IP address
|
||||
controlled by the Tor project and likely to remain so for the
|
||||
useful lifetime of a Tor client. A web server should be placed
|
||||
at this IP address.
|
||||
|
||||
Tor should be modified to treat requests to port 80, at the
|
||||
specified DNS name or IP address specially. Instead of opening a
|
||||
circuit, it should respond to a HTTP request with a helpful web
|
||||
page:
|
||||
|
||||
- If the request to open a connection was to the domain name, the web
|
||||
page should state that Tor is working properly.
|
||||
- If the request was to the IP address, the web page should state
|
||||
that there is a DNS-leakage vulnerability.
|
||||
|
||||
If the request goes through to the real web server, the page
|
||||
should state that Tor has not been set up properly.
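The dispatch rule above can be sketched in Python. The hostname and IP address below are placeholders (a reserved .invalid name and a documentation-range address), not values chosen by the Tor project, and the function name is hypothetical.

```python
import ipaddress

# Placeholder values; the proposal does not name the real ones.
CHECK_HOSTNAME = "check.example.invalid"
CHECK_IP = ipaddress.ip_address("192.0.2.1")


def intercept_check_request(host, port):
    """Return which page Tor should serve itself, or None to not intercept.

    "success" means the browser passed the hostname to Tor;
    "dnsleak" means the browser resolved the name itself and passed an IP.
    """
    if port != 80:
        return None
    if host.lower() == CHECK_HOSTNAME:
        return "success"
    try:
        if ipaddress.ip_address(host) == CHECK_IP:
            return "dnsleak"
    except ValueError:
        pass  # host is some other hostname, not an IP literal
    return None
```

The third outcome, "Tor has not been set up properly", never reaches this function: it is served by the real web server when the request bypasses Tor.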
|
||||
|
||||
Extensions:
|
||||
|
||||
Identifying proxy server:
|
||||
|
||||
If needed, other applications between the web browser and Tor (e.g.
|
||||
Polipo and Privoxy) could piggyback on the same mechanism to flag
|
||||
whether they are in use. All three possible web pages should include
|
||||
a machine-readable placeholder, into which another program could
|
||||
insert their own message.
|
||||
|
||||
For example, the webpage returned by Tor to indicate a successful
|
||||
configuration could include the following HTML:
|
||||
<h2>Connection chain</h2>
|
||||
<ul>
|
||||
<li>Tor 0.1.2.14-alpha</li>
|
||||
<!-- Tor Connectivity Check: success -->
|
||||
</ul>
|
||||
|
||||
When the proxy server observes this string, in response to a request
|
||||
for the Tor connectivity check web page, it would prepend its own
|
||||
message, resulting in the following being returned to the web
|
||||
browser:
|
||||
<h2>Connection chain</h2>
|
||||
<ul>
|
||||
<li>Tor 0.1.2.14-alpha</li>
|
||||
<li>Polipo version 1.0.4</li>
|
||||
<!-- Tor Connectivity Check: success -->
|
||||
</ul>
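A proxy's side of this exchange could look like the following Python sketch. It handles only the success marker shown in the example above; the function name is hypothetical, and leaving pages without the marker untouched is an assumption.

```python
import html

MARKER = "<!-- Tor Connectivity Check: success -->"


def add_proxy_entry(page, proxy_description):
    """Insert a list item for this proxy just before the placeholder."""
    head, marker, tail = page.partition(MARKER)
    if not marker:
        return page  # placeholder absent: pass the page through unchanged
    entry = "<li>" + html.escape(proxy_description) + "</li>\n"
    return head + entry + marker + tail
```

Because each proxy inserts its entry directly before the placeholder, entries appear in order from Tor outward to the browser.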
|
||||
|
||||
Checking external connectivity:
|
||||
|
||||
If Tor intercepts a request, and returns a response itself, the user
|
||||
will not actually confirm whether Tor is able to build a successful
|
||||
circuit. It may then be advantageous to include an image in the web
|
||||
page which is loaded from a different domain. If this is able to be
|
||||
loaded then the user will know that external connectivity through
|
||||
Tor works.
|
||||
|
||||
Automatic Firefox Notification:
|
||||
|
||||
All forms of the website should return valid XHTML and have a
|
||||
hidden link with an id attribute "TorCheckResult" and a target
|
||||
property that can be queried to determine the result. For example,
|
||||
a hidden link would convey success like this:
|
||||
|
||||
<a id="TorCheckResult" target="success" href="/"></a>
|
||||
|
||||
failure like this:
|
||||
|
||||
<a id="TorCheckResult" target="failure" href="/"></a>
|
||||
|
||||
and DNS leaks like this:
|
||||
|
||||
<a id="TorCheckResult" target="dnsleak" href="/"></a>
|
||||
|
||||
Firefox extensions such as Torbutton would then be able to
|
||||
issue an XMLHttpRequest for the page and query the result
|
||||
with resultXML.getElementById("TorCheckResult").target
|
||||
to automatically report the Tor status to the user when
|
||||
they first attempt to enable Tor activity, or whenever
|
||||
they request a check from the extension preferences window.
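The extension itself would do this in JavaScript, but the lookup is easy to model. The Python sketch below parses the returned XHTML and reads the target attribute of the hidden link; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def tor_check_result(xhtml_text):
    """Return "success", "failure" or "dnsleak", or None if no link is found."""
    root = ET.fromstring(xhtml_text)  # requires well-formed XHTML
    for element in root.iter():
        if element.get("id") == "TorCheckResult":
            return element.get("target")
    return None
```

This is why the proposal requires valid XHTML: a strict XML parser rejects a malformed page outright.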
|
||||
|
||||
If the check website is to be themed with heavy graphics and/or
|
||||
extensive documentation, the check result itself should be
|
||||
contained in a separate lightweight iframe that extensions can
|
||||
request via an alternate url.
|
||||
|
||||
Security and resiliency implications:
|
||||
|
||||
What attacks are possible?
|
||||
|
||||
If the IP address used for this feature moves there will be two
|
||||
consequences:
|
||||
- A new website at this IP address will remain inaccessible over
|
||||
Tor
|
||||
- Tor users who are leaking DNS will be informed that Tor is not
|
||||
working, rather than that it is active but leaking DNS
|
||||
We should thus attempt to find an IP address which we reasonably
|
||||
believe can remain static.
|
||||
|
||||
Open issues:
|
||||
|
||||
If a Tor version which does not support this extra feature is used,
|
||||
the webpage returned will indicate that Tor is not being used. Can
|
||||
this be safely fixed?
|
||||
|
||||
Related work:
|
||||
|
||||
The proposed mechanism is very similar to config.privoxy.org. The
|
||||
most significant difference is that if the web browser is
|
||||
misconfigured, Tor will only get an IP address. Even in this case,
|
||||
Tor should be able to respond with a webpage to notify the user of how
|
||||
to fix the problem. This also implies that Tor must be told of the
|
||||
special IP address, and so must be effectively permanent.
|
@ -1,147 +0,0 @@
|
||||
Filename: 132-browser-check-tor-service.txt
|
||||
Title: A Tor Web Service For Verifying Correct Browser Configuration
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Robert Hogan
|
||||
Created: 2008-03-08
|
||||
Status: Draft
|
||||
|
||||
Overview:
|
||||
|
||||
Tor should operate a primitive web service on the loopback network device
|
||||
that tests the operation of the user's browser, privacy proxy and Tor client.
|
||||
The tests are performed by serving unique, randomly generated elements in
|
||||
image URLs embedded in static HTML. The images are only displayed if the DNS
|
||||
and HTTP requests for them are routed through Tor, otherwise the 'alt' text
|
||||
may be displayed. The proposal assumes that 'alt' text is not displayed on
|
||||
all browsers so suggests that text and links should accompany each image
|
||||
advising the user on next steps in case the test fails.
|
||||
|
||||
The service is primarily for the use of controllers, since presumably users
|
||||
aren't going to want to edit text files and then type something exotic like
|
||||
127.0.0.1:9999 into their address bar. In the main use case the controller
|
||||
will have configured the actual port for the webservice so will know where
|
||||
to direct the request. It would also be the responsibility of the controller
|
||||
to ensure the webservice is available, and tor is running, before allowing
|
||||
the user to access the page through their browser.
|
||||
|
||||
Motivation:
|
||||
|
||||
This is a complementary approach to proposal 131. It overcomes some of the
|
||||
limitations of the approach described in proposal 131: reliance
|
||||
on a permanent, real IP address and compatibility with older versions of
|
||||
Tor. Unlike 131, it is not as useful to Tor users who are not running a
|
||||
controller.
|
||||
|
||||
Objective:
|
||||
|
||||
Provide a reliable means of helping users to determine if their Tor
|
||||
installation, privacy proxy and browser are properly configured for
|
||||
anonymous browsing.
|
||||
|
||||
Proposal:
|
||||
|
||||
When configured to do so, Tor should run a basic web service available
|
||||
on a configured port on 127.0.0.1. The purpose of this web service is to
|
||||
serve a number of basic test images that will allow the user to determine
|
||||
if their browser is properly configured and that Tor is working normally.
|
||||
|
||||
The service can consist of a single web page with two columns. The left
|
||||
column contains images, the right column contains advice on what the
|
||||
display/non-display of the column means.
|
||||
|
||||
The rest of this proposal assumes that the service is running on port
|
||||
9999. The port should be configurable, and configuring the port enables the
|
||||
service. The service must run on 127.0.0.1.
|
||||
|
||||
In all the examples below [uniquesessionid] refers to a random, base64
|
||||
encoded string that is unique to the URL it is contained in. Tor only ever
|
||||
stores the most recently generated [uniquesessionid] for each URL, storing 3
|
||||
in total. Tor should generate a [uniquesessionid] for each of the test URLs
|
||||
below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm.
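One possible way to generate and store the identifiers is sketched below in Python. The test names, the 12-byte length and the URL-safe base64 alphabet are assumptions made for illustration; the proposal only asks for a random, base64-encoded string per test URL.

```python
import base64
import secrets

# Hypothetical labels for the three test URLs described in this proposal.
TEST_NAMES = ("dns", "proxy", "connectivity")


def new_session_ids():
    """Generate one fresh random identifier per test URL."""
    return {
        name: base64.urlsafe_b64encode(secrets.token_bytes(12)).decode("ascii")
        for name in TEST_NAMES
    }


class SessionIds:
    """Keeps only the most recent identifier for each test URL (3 in total)."""

    def __init__(self):
        self.current = {}

    def on_index_request(self):
        # Called for every HTTP GET of index.htm; older ids are discarded.
        self.current = new_session_ids()
        return self.current

    def matches(self, name, candidate):
        return self.current.get(name) == candidate
```

Replacing the whole mapping on every index.htm request is what keeps the stored total at three.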
|
||||
|
||||
The most suitable image for each test case is an implementation decision.
|
||||
Tor will need to store and serve images for the first and second test
|
||||
images, and possibly the third (see 'Open Issues').
|
||||
|
||||
1. DNS Request Test Image
|
||||
|
||||
This is a HTML element embedded in the page served by Tor at
|
||||
http://127.0.0.1:9999:
|
||||
|
||||
<IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see
|
||||
this text, your browser's DNS requests are not being routed through Tor."
|
||||
width="200" height="200" align="middle" border="2">
|
||||
|
||||
If the browser's DNS request for [uniquesessionid] is routed through Tor,
|
||||
Tor will intercept the request and return 127.0.0.1 as the resolved IP
|
||||
address. This will shortly be followed by a HTTP request from the browser
|
||||
for http://127.0.0.1:9999/torlogo.jpg. This request should be served with
|
||||
the appropriate image.
|
||||
|
||||
If the browser's DNS request for [uniquesessionid] is not routed through Tor
|
||||
the browser may display the 'alt' text specified in the html element. The
|
||||
HTML served by Tor should also contain text accompanying the image to advise
|
||||
users what it means if they do not see an image. It should also provide a
|
||||
link to click that provides information on how to remedy the problem. This
|
||||
behaviour also applies to the images described in 2. and 3. below, so should
|
||||
be assumed there as well.
|
||||
|
||||
|
||||
2. Proxy Configuration Test Image
|
||||
|
||||
This is a HTML element embedded in the page served by Tor at
|
||||
http://127.0.0.1:9999:
|
||||
|
||||
<IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see
|
||||
this text, your browser is not configured to work with Tor." width="200"
|
||||
height="200" align="middle" border="2">
|
||||
|
||||
If the HTTP request for the resource [uniquesessionid].jpg is received by
|
||||
Tor it will serve the appropriate image in response. It should serve this
|
||||
image itself, without attempting to retrieve anything from the Internet.
|
||||
|
||||
If Tor can identify the name of the proxy application requesting the
|
||||
resource then it could store and serve an image identifying the proxy to the
|
||||
user.
|
||||
|
||||
3. Tor Connectivity Test Image
|
||||
|
||||
This is a HTML element embedded in the page served by Tor at
|
||||
http://127.0.0.1:9999:
|
||||
|
||||
<IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you
|
||||
can see this text, your Tor installation cannot connect to the Internet."
|
||||
width="200" height="200" align="middle" border="2">
|
||||
|
||||
The referenced image should actually exist on the Tor project website. If
|
||||
Tor receives the request for the above resource it should remove the random
|
||||
base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt
|
||||
to retrieve the real image.
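The rewrite step for the third test can be sketched as follows in Python. The function name is hypothetical, and returning None for a request that does not carry the expected identifier is an assumption.

```python
def real_image_path(request_path, session_id):
    """Strip the "[uniquesessionid]-" prefix from a requested path.

    Returns the path of the real image on the project website, or None
    if the request does not carry the expected identifier.
    """
    prefix = "/" + session_id + "-"
    if not request_path.startswith(prefix):
        return None
    return "/" + request_path[len(prefix):]
```

For instance, with the identifier abc123, a request for /abc123-torlogo.jpg maps to /torlogo.jpg.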
|
||||
|
||||
Even on a fully operational Tor client this test may not always succeed. The
|
||||
user should be advised that one or more attempts to retrieve this image may
|
||||
be necessary to confirm a genuine problem.
|
||||
|
||||
Open Issues:
|
||||
|
||||
The final connectivity test relies on an externally maintained resource; if
|
||||
this resource becomes unavailable the connectivity test will always fail.
|
||||
Either the text accompanying the test should advise of this possibility or
|
||||
Tor clients should be advised of the location of the test resource in the
|
||||
main network directory listings.
|
||||
|
||||
Any number of misconfigurations may make the web service unreachable; it is
|
||||
the responsibility of the user's controller to recognize these and assist
|
||||
the user in eliminating them. Tor can mitigate against the specific
|
||||
misconfiguration of routing HTTP traffic to 127.0.0.1 to Tor itself by
|
||||
serving such requests through the SOCKS port as well as the configured web
|
||||
service port.
|
||||
|
||||
Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping'
|
||||
them. It already inspects for raw IP addresses (to warn of DNS leaks) but
|
||||
maybe the behaviour proposed here is qualitatively different. Maybe this is
|
||||
an unwelcome precedent that can be used to beat the project over the head in
|
||||
future. Or maybe it's not such a bad thing, Tor is merely attempting to make
|
||||
normally invalid resource requests valid for a given purpose.
|
||||
|
@ -1,128 +0,0 @@
|
||||
Filename: 133-unreachable-ors.txt
|
||||
Title: Incorporate Unreachable ORs into the Tor Network
|
||||
Author: Robert Hogan
|
||||
Created: 2008-03-08
|
||||
Status: Draft
|
||||
|
||||
Overview:
|
||||
|
||||
Propose a scheme for harnessing the bandwidth of ORs who cannot currently
|
||||
participate in the Tor network because they can only make outbound
|
||||
TCP connections.
|
||||
|
||||
Motivation:
|
||||
|
||||
Restrictive local and remote firewalls are preventing many willing
|
||||
candidates from becoming ORs on the Tor network. These
|
||||
ORs have a casual interest in joining the network but their operator is not
|
||||
sufficiently motivated or adept to complete the necessary router or firewall
|
||||
configuration. The Tor network is losing out on their bandwidth. At the
|
||||
moment we don't even know how many such 'candidate' ORs there are.
|
||||
|
||||
|
||||
Objective:
|
||||
|
||||
1. Establish how many ORs are unable to qualify for publication because
|
||||
they cannot establish that their ORPort is reachable.
|
||||
|
||||
2. Devise a method for making such ORs available to clients for circuit
|
||||
building without prejudicing their anonymity.
|
||||
|
||||
Proposal:
|
||||
|
||||
ORs whose ORPort reachability testing fails a specified number of
|
||||
consecutive times should:
|
||||
1. Enlist themselves with the authorities setting a 'Fallback' flag. This
|
||||
flag indicates that the OR is up and running but cannot connect to
|
||||
itself.
|
||||
2. Open an orconn with all ORs whose fingerprint begins with the same
|
||||
byte as their own. The management of this orconn will be transferred
|
||||
entirely to the OR at the other end.
|
||||
3. The fallback OR should update its router status to contain the
|
||||
'Running' flag if it has managed to open an orconn with 3/4 of the ORs
|
||||
with an FP beginning with the same byte as its own.
|
||||
|
||||
Tor ORs who are contacted by fallback ORs requesting an orconn should:
|
||||
1. Accept the orconn until they have reached a defined limit of orconn
|
||||
connections with fallback ORs.
|
||||
2. Should only accept such orconn requests from listed fallback ORs who
|
||||
have an FP beginning with the same byte as its own.
|
||||
|
||||
Tor clients can include fallback ORs in the network by doing the
|
||||
following:
|
||||
1. When building a circuit, observe the fingerprint of each node they
|
||||
wish to connect to.
|
||||
2. When randomly selecting a node from the set of all eligible nodes,
|
||||
add all published, running fallback nodes to the set where the first
|
||||
byte of the fingerprint matches the previous node in the circuit.
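The client-side rule above might be sketched as follows. This is an illustration only: nodes are modelled as dicts with hypothetical "fingerprint" (hex string) and "running" keys, and the first byte of a fingerprint is taken to be its first two hex digits.

```python
import random


def candidate_next_hops(prev_fp, published, fallbacks):
    """Eligible published nodes plus running fallback nodes whose
    fingerprint shares its first byte with the previous hop."""
    first_byte = prev_fp[:2].upper()  # two hex digits encode one byte
    extra = [
        node for node in fallbacks
        if node["running"] and node["fingerprint"][:2].upper() == first_byte
    ]
    return list(published) + extra


def pick_next_hop(prev_fp, published, fallbacks, rng=random):
    """Uniform random choice over the extended candidate set."""
    return rng.choice(candidate_next_hops(prev_fp, published, fallbacks))
```

Uniform selection is a simplification here; the proposal does not say how fallback nodes should be weighted against published ones.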
|
||||
|
||||
Anonymity Implications:
|
||||
|
||||
At least some, and possibly all, nodes on the network will have a set
|
||||
of nodes that only they and a few others can build circuits on.
|
||||
|
||||
1. This means that fallback ORs might be unsuitable for use as middlemen
|
||||
nodes, because if the exit node is the attacker it knows that the
|
||||
number of nodes that could be the entry guard in the circuit is
|
||||
reduced to roughly 1/256th of the network, or worse 1/256th of all
|
||||
nodes listed as Guards. For the same reason, fallback nodes would
|
||||
appear to be unsuitable for two-hop circuits.
|
||||
|
||||
2. This is not a problem if fallback ORs are always exit nodes. If
|
||||
the fallback OR is an attacker it will not be able to reduce the
|
||||
set of possible nodes for the entry guard any further than a normal,
|
||||
published OR.
|
||||
|
||||
Possible Attacks/Open Issues:
|
||||
|
||||
1. Gaming Node Selection
|
||||
Does running a fallback OR customized for a specific set of published ORs
|
||||
improve an attacker's chances of seeing traffic from that set of published
|
||||
ORs? Would such a strategy be any more effective than running published
|
||||
ORs with other 'attractive' properties?
|
||||
|
||||
2. DOS Attack
|
||||
An attacker could prevent all other legitimate fallback ORs with a
|
||||
given byte-1 in their FP from functioning by running 20 or 30 fallback ORs
|
||||
and monopolizing all available fallback slots on the published ORs.
|
||||
This same attacker would then be in a position to monopolize all the
|
||||
traffic of the fallback ORs on that byte-1 network segment. I'm not sure
|
||||
what this would allow such an attacker to do.
|
||||
|
||||
3. Circuit-Sniffing
|
||||
An observer watching exit traffic from a fallback server will know that the
|
||||
previous node in the circuit is one of a very small, identifiable
|
||||
subset of the total ORs in the network. To establish the full path of the
|
||||
circuit they would only have to watch the exit traffic from the fallback
|
||||
OR and all the traffic from the 20 or 30 ORs it is likely to be connected
|
||||
to. This means it is substantially easier to establish all members of a
|
||||
circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e.
|
||||
1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560
|
||||
or so ORs on the network). The same mechanism that allows the client to
|
||||
expect a specific fallback OR to be available from a specific published OR
|
||||
allows an attacker to prepare his ground.
|
||||
|
||||
Mitigant:
|
||||
In terms of the resources and access required to monitor 2000 to 3000
|
||||
nodes, the effort of the adversary is not significantly diminished when he
|
||||
is only interested in 20 or 30. It is hard to see how an adversary who can
|
||||
obtain access to a randomly selected portion of the Tor network would face
|
||||
any new or qualitatively different obstacles in attempting to access much
|
||||
of the rest of it.
|
||||
|
||||
|
||||
Implementation Issues:
|
||||
|
||||
The number of ORs this proposal would add to the Tor network is not known.
|
||||
This is because there is no mechanism at present for recording unsuccessful
|
||||
attempts to become an OR. If the proposal is considered promising it may be
|
||||
worthwhile to issue an alpha series release where candidate ORs post a
|
||||
primitive fallback descriptor to the authority directories. This fallback
|
||||
descriptor would not contain any other flag that would make it eligible for
|
||||
selection by clients. It would act solely as a means of sizing the number of
|
||||
Tor instances that try and fail to become ORs.
|
||||
|
||||
The upper limit on the number of orconns from fallback ORs a normal,
|
||||
published OR should be willing to accept is an open question. Is one
|
||||
hundred, mostly idle, such orconns too onerous?
|
||||
|
@ -1,105 +0,0 @@
|
||||
Filename: 134-robust-voting.txt
|
||||
Title: More robust consensus voting with diverse authority sets
|
||||
Author: Peter Palfrader
|
||||
Created: 2008-04-01
|
||||
Status: Accepted
|
||||
Target: 0.2.2.x
|
||||
|
||||
Overview:
|
||||
|
||||
A means to arrive at a valid directory consensus even when voters
|
||||
disagree on who is an authority.
|
||||
|
||||
|
||||
Motivation:
|
||||
|
||||
Right now there are about five authoritative directory servers in the
|
||||
Tor network, though this number is expected to rise to about 15 eventually.
|
||||
|
||||
Adding a new authority requires synchronized action from all operators of
|
||||
directory authorities so that at any time during the update at least half of
|
||||
all authorities are running and agree on who is an authority. The latter
|
||||
requirement is there so that the authorities can arrive at a common
|
||||
consensus: Each authority builds the consensus based on the votes from
|
||||
all authorities it recognizes, and so a different set of recognized
|
||||
authorities will lead to a different consensus document.
|
||||
|
||||
|
||||
Objective:
|
||||
|
||||
The modified voting procedure outlined in this proposal obsoletes the
|
||||
requirement for most authorities to exactly agree on the list of
|
||||
authorities.
|
||||
|
||||
|
||||
Proposal:
|
||||
|
||||
The vote document each authority generates contains a list of
|
||||
authorities recognized by the generating authority. This will be
|
||||
a list of authority identity fingerprints.
|
||||
|
||||
Authorities will accept votes from and serve/mirror votes also for
|
||||
authorities they do not recognize. (Votes contain the signing key, the
|
||||
authority key, and the certificate linking them so they can be
|
||||
verified even without knowing the authority beforehand.)
|
||||
|
||||
Before building the consensus we will check which votes to use for
|
||||
building:
|
||||
|
||||
1) We build a directed graph of which authority/vote recognizes
|
||||
whom.
|
||||
2) (Parts of the graph that aren't reachable, directly or
|
||||
indirectly, from any authorities we recognize can be discarded
|
||||
immediately.)
|
||||
3) We find the largest fully connected subgraph.
|
||||
(Should there be more than one subgraph of the same size there
|
||||
needs to be some arbitrary ordering so we always pick the same.
|
||||
E.g. pick the one who has the smaller (XOR of all votes' digests)
|
||||
or something.)
|
||||
4) If we are part of that subgraph, great. This is the list of
|
||||
votes we build our consensus with.
|
||||
5) If we are not part of that subgraph, remove all the nodes that
|
||||
are part of it and go to 3.
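Steps 1 to 5 can be sketched as below. The sketch makes several assumptions that the proposal leaves open: votes is a mapping from an authority ID to the set of IDs that authority's vote recognizes, "fully connected" means every member recognizes every other member, and ties between equally large subgraphs are broken by taking the lexicographically smallest set of IDs instead of an XOR of vote digests. The brute-force search is only reasonable for the roughly 15 authorities mentioned in the motivation.

```python
from itertools import combinations


def reachable(votes, start):
    """Step 2: authorities with a vote that can be reached, directly or
    indirectly, from the authorities in `start`."""
    seen, stack = set(), [a for a in start if a in votes]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(n for n in votes[node] if n in votes and n not in seen)
    return seen


def largest_clique(votes, nodes):
    """Step 3: largest set in which everyone recognizes everyone else."""
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for group in combinations(ordered, size):
            if all(b in votes[a] for a in group for b in group if a != b):
                return set(group)  # first hit is lexicographically smallest
    return set()


def votes_to_use(votes, me):
    """Steps 4 and 5: repeat until we land in a subgraph that contains us."""
    nodes = reachable(votes, votes[me] | {me})
    while nodes:
        clique = largest_clique(votes, nodes)
        if me in clique:
            return clique
        nodes -= clique
    return set()
```

With four old authorities A to D and a new authority E that only A and B recognize so far, votes_to_use returns {A, B, C, D} for A and {E} for E.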
|
||||
|
||||
Using this procedure authorities that are updated to recognize a
|
||||
new authority will continue voting with the old group until a
|
||||
sufficient number has been updated to arrive at a consensus with
|
||||
the recently added authority.
|
||||
|
||||
In fact, the old set of authorities will probably be voting among
|
||||
themselves until all but one has been updated to recognize the
|
||||
new authority. Then which set of votes is used for consensus
|
||||
building depends on which of the two equally large sets gets
|
||||
ordered before the other in step (3) above.
|
||||
|
||||
It is necessary to continue with the process in (5) even if we
|
||||
are not in the largest subgraph. Otherwise one rogue authority
|
||||
could create a number of extra votes (by new authorities) so that
|
||||
everybody stops at 5 and no consensus is built, even though it would
|
||||
be trusted by all clients.
|
||||
|
||||
|
||||
Anonymity Implications:
|
||||
|
||||
The author does not believe this proposal to have anonymity
|
||||
implications.
|
||||
|
||||
|
||||
Possible Attacks/Open Issues/Some thinking required:
|
||||
|
||||
Q: Can a number (less than or exactly half) of the authorities cause an honest
|
||||
authority to vote for "their" consensus rather than the one that would
|
||||
result were all authorities taken into account?
|
||||
|
||||
|
||||
Q: Can a set of votes from external authorities, i.e., of whom we trust either
|
||||
none or at least not all, cause us to change the set of consensus makers we
|
||||
pick?
|
||||
A: Yes, if other authorities decide they would rather build a consensus with them
|
||||
then they'll be thrown out in step 3. But that's ok since those other
|
||||
authorities will never vote with us anyway.
|
||||
If we trust none of them then we throw them out even sooner, so no harm done.
|
||||
|
||||
Q: Can this ever force us to build a consensus with authorities we do not
|
||||
recognize?
|
||||
A: No, we can never build a fully connected set with them in step 3.
|
@ -1,283 +0,0 @@
|
||||
Filename: 135-private-tor-networks.txt
|
||||
Title: Simplify Configuration of Private Tor Networks
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Karsten Loesing
|
||||
Created: 29-Apr-2008
|
||||
Status: Closed
|
||||
Target: 0.2.1.x
|
||||
Implemented-In: 0.2.1.2-alpha
|
||||
|
||||
Change history:
|
||||
|
||||
29-Apr-2008 Initial proposal for or-dev
|
||||
19-May-2008 Included changes based on comments by Nick to or-dev and
|
||||
added a section for test cases.
|
||||
18-Jun-2008 Changed testing-network-only configuration option names.
|
||||
|
||||
Overview:
|
||||
|
||||
Configuring a private Tor network has become a time-consuming and
|
||||
error-prone task with the introduction of the v3 directory protocol. In
|
||||
addition to that, operators of private Tor networks need to set an
|
||||
increasing number of non-trivial configuration options, and it is hard
|
||||
to keep FAQ entries describing this task up-to-date. In this proposal we
|
||||
(1) suggest (optionally) accelerating the timing of the v3 directory voting
|
||||
process and (2) introduce an umbrella config option specifically aimed at
|
||||
creating private Tor networks.
|
||||
|
||||
Design:
|
||||
|
||||
1. Accelerate Timing of v3 Directory Voting Process
|
||||
|
||||
Tor has reasonable defaults for setting up a large, Internet-scale
|
||||
network with comparably high latencies and possibly wrong server clocks.
|
||||
However, those defaults are bad when it comes to quickly setting up a
|
||||
private Tor network for testing, either on a single node or LAN (things
|
||||
might be different when creating a test network on PlanetLab or
|
||||
something). Some time constraints should be made configurable for private
|
||||
networks. The general idea is to accelerate everything that has to do
|
||||
with propagation of directory information, but nothing else, so that a
|
||||
private network is available as soon as possible. (As a possible
|
||||
safeguard, changing these configuration values could be made dependent on
|
||||
the umbrella configuration option introduced in 2.)
|
||||
|
||||
1.1. Initial Voting Schedule
|
||||
|
||||
When a v3 directory does not know any consensus, it assumes an initial,
|
||||
hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and
|
||||
DistDelay of 5 minutes. This is important for multiple, simultaneously
|
||||
restarted directory authorities to meet at a common time and create an
|
||||
initial consensus. Unfortunately, this means that it may take up to half
|
||||
an hour (or even more) for a private Tor network to bootstrap.
|
||||
|
||||
We propose to make these three time constants configurable (note that
|
||||
V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an
|
||||
effect on the _initial_ voting schedule, but only on the schedule that a
|
||||
directory authority votes for). This can be achieved by introducing three
|
||||
new configuration options: TestingV3AuthInitialVotingInterval,
|
||||
TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay.
|
||||
|
||||
As first safeguards, Tor should only accept configuration values for
|
||||
TestingV3AuthInitialVotingInterval that divide evenly into the default
|
||||
value of 30 minutes. The effect is that even if people misconfigured
|
||||
their directory authorities, they would meet at the default values at the
|
||||
latest. The second safeguard is to allow configuration only when the
|
||||
umbrella configuration option TestingTorNetwork is set.
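The two safeguards for TestingV3AuthInitialVotingInterval could be checked roughly as follows. This is a sketch, not Tor's validation code; the function name and the use of seconds as the unit are assumptions.

```python
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60  # seconds, the hard-coded default


def initial_voting_interval_ok(seconds, testing_tor_network):
    """Return True if `seconds` is acceptable as the initial voting interval."""
    if not testing_tor_network:
        # Second safeguard: outside testing networks only the default is allowed.
        return seconds == DEFAULT_INITIAL_VOTING_INTERVAL
    # First safeguard: the value must divide evenly into 30 minutes, so that
    # misconfigured authorities still meet on the default schedule.
    return seconds > 0 and DEFAULT_INITIAL_VOTING_INTERVAL % seconds == 0
```

Five minutes (300 seconds) passes the check in a testing network, while seven minutes (420 seconds) does not, because 1800 is not a multiple of 420.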
|
||||
|
||||
1.2. Immediately Provide Reachability Information (Running flag)
|
||||
|
||||
The default behavior of a directory authority is to provide the Running
|
||||
flag only after the authority has been available for at least 30 minutes. The
|
||||
rationale is that before that time, an authority simply cannot deliver
|
||||
useful information about other running nodes. But for private Tor
|
||||
networks this may be different. This is currently implemented in the code
|
||||
as:
|
||||
|
||||
/** If we've been around for less than this amount of time, our
|
||||
* reachability information is not accurate. */
|
||||
#define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60)
|
||||
|
||||
There should be another configuration option
|
||||
TestingAuthDirTimeToLearnReachability with a default value of 30 minutes
|
||||
that can be changed when running testing Tor networks, e.g. to 0 minutes.
|
||||
The configuration value would simply replace the quoted constant. Again,
|
||||
changing this option could be safeguarded by requiring the umbrella
|
||||
configuration option TestingTorNetwork to be set.
|
||||
|
||||
1.3. Reduce Estimated Descriptor Propagation Time
|
||||
|
||||
Tor currently assumes that it takes up to 10 minutes until router
|
||||
descriptors are propagated from the authorities to directory caches.
|
||||
This is not very useful for private Tor networks, and we want to be able
|
||||
to reduce this time, so that clients can download router descriptors in a
|
||||
timely manner.
|
||||
|
||||
/** Clients don't download any descriptor this recent, since it will
|
||||
* probably not have propagated to enough caches. */
|
||||
#define ESTIMATED_PROPAGATION_TIME (10*60)
|
||||
|
||||
We suggest introducing a new config option
|
||||
TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes,
|
||||
but that can be set to any lower non-negative value, e.g. 0 minutes. The
|
||||
same safeguards as in 1.2 could be used here, too.
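The effect of the new option on the client-side check can be illustrated with a small sketch. The function name is hypothetical and times are plain seconds since the epoch; it only mirrors the rule stated in the comment quoted above.

```python
def old_enough_to_download(published, now, propagation_time=10 * 60):
    """Return True if a descriptor published at `published` is old enough
    that enough caches are assumed to have it by `now`."""
    return published + propagation_time <= now
```

With the default of 10 minutes, a descriptor published 100 seconds ago is skipped; with the value set to 0, as proposed for testing networks, it is downloaded immediately.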
|
||||
|
||||
2. Umbrella Option for Setting Up Private Tor Networks
|
||||
|
||||
Setting up a private Tor network requires a number of specific settings
|
||||
that are not required or useful when running Tor in the public Tor
|
||||
network. Instead of writing down these options in a FAQ entry, there
|
||||
should be a single configuration option, e.g. TestingTorNetwork, that
|
||||
changes all required settings at once. Newer Tor versions would keep the
|
||||
set of configuration options up-to-date. It should still remain possible
|
||||
to manually overwrite the settings that the umbrella configuration option
|
||||
affects.
|
||||
|
||||
The following configuration options are set by TestingTorNetwork:
|
||||
|
||||
- ServerDNSAllowBrokenResolvConf 1
|
||||
Ignore the situation that private relays are not aware of any name
|
||||
servers.
|
||||
|
||||
- DirAllowPrivateAddresses 1
|
||||
Allow router descriptors containing private IP addresses.
|
||||
|
||||
- EnforceDistinctSubnets 0
|
||||
Permit building circuits with relays in the same subnet.
|
||||
|
||||
- AssumeReachable 1
|
||||
Omit self-testing for reachability.
|
||||
|
||||
- AuthDirMaxServersPerAddr 0
|
||||
- AuthDirMaxServersPerAuthAddr 0
|
||||
Permit an unlimited number of nodes on the same IP address.
|
||||
|
||||
- ClientDNSRejectInternalAddresses 0
|
||||
Accept DNS responses that resolve to private IP addresses.
|
||||
|
||||
- ExitPolicyRejectPrivate 0
|
||||
Allow exiting to private IP addresses. (This one is a matter of
|
||||
taste---it might be dangerous to make this a default in a private
|
||||
network, although people setting up private Tor networks should know
|
||||
what they are doing.)
|
||||
|
||||
- V3AuthVotingInterval 5 minutes
|
||||
- V3AuthVoteDelay 20 seconds
|
||||
- V3AuthDistDelay 20 seconds
|
||||
Accelerate voting schedule after first consensus has been reached.
|
||||
|
||||
- TestingV3AuthInitialVotingInterval 5 minutes
|
||||
- TestingV3AuthInitialVoteDelay 20 seconds
|
||||
- TestingV3AuthInitialDistDelay 20 seconds
|
||||
Accelerate initial voting schedule until first consensus is reached.
|
||||
|
||||
- TestingAuthDirTimeToLearnReachability 0 minutes
|
||||
Consider routers as Running from the start of running an authority.
|
||||
|
||||
- TestingEstimatedDescriptorPropagationTime 0 minutes
|
||||
Clients try downloading router descriptors from directory caches,
|
||||
even when the descriptors are less than 10 minutes old.
|
||||
|
||||
In addition to changing the defaults for these configuration options,
|
||||
TestingTorNetwork can only be set when a user has manually configured
|
||||
DirServer lines.
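How the umbrella option interacts with explicitly set values can be modelled with the sketch below. It covers only the Testing* options whose normal defaults are stated in this proposal, uses seconds throughout, and is an illustration of the intended behavior, not Tor's option-handling code; effective_options and has_custom_dirservers are hypothetical names.

```python
# Defaults stated in the proposal, in seconds.
NORMAL_DEFAULTS = {
    "TestingTorNetwork": 0,
    "TestingV3AuthInitialVotingInterval": 30 * 60,
    "TestingV3AuthInitialVoteDelay": 5 * 60,
    "TestingV3AuthInitialDistDelay": 5 * 60,
    "TestingAuthDirTimeToLearnReachability": 30 * 60,
    "TestingEstimatedDescriptorPropagationTime": 10 * 60,
}
TESTING_DEFAULTS = {
    "TestingV3AuthInitialVotingInterval": 5 * 60,
    "TestingV3AuthInitialVoteDelay": 20,
    "TestingV3AuthInitialDistDelay": 20,
    "TestingAuthDirTimeToLearnReachability": 0,
    "TestingEstimatedDescriptorPropagationTime": 0,
}


def effective_options(explicit, has_custom_dirservers):
    """Combine defaults with explicitly configured values."""
    testing = explicit.get("TestingTorNetwork", 0) == 1
    if testing and not has_custom_dirservers:
        raise ValueError("TestingTorNetwork needs a non-default set of DirServers")
    options = dict(NORMAL_DEFAULTS)
    if testing:
        options.update(TESTING_DEFAULTS)  # umbrella option changes the defaults
    for name, value in explicit.items():
        if name in TESTING_DEFAULTS and not testing:
            raise ValueError(f"{name} may only be changed in testing Tor networks")
        options[name] = value  # explicit settings still override the umbrella
    return options
```

Resetting an option corresponds to dropping it from the explicit mapping, which makes the testing-network default apply again.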
|
||||
|
||||
Test:
|
||||
|
||||
The implementation of this proposal must pass the following tests:
|
||||
|
||||
1. Set TestingTorNetwork and see if dependent configuration options are
|
||||
correctly changed.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
|
||||
250-TestingTorNetwork=1
|
||||
250 TestingAuthDirTimeToLearnReachability=0
|
||||
QUIT
|
||||
|
||||
2. Set TestingTorNetwork and a dependent configuration value to see if
|
||||
the provided value is used for the dependent option.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
|
||||
TestingAuthDirTimeToLearnReachability 5
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
|
||||
250-TestingTorNetwork=1
|
||||
250 TestingAuthDirTimeToLearnReachability=5
|
||||
QUIT
|
||||
|
||||
3. Start with TestingTorNetwork set and change a dependent configuration
|
||||
option later on.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
SETCONF TestingAuthDirTimeToLearnReachability=5
|
||||
GETCONF TestingAuthDirTimeToLearnReachability
|
||||
250 TestingAuthDirTimeToLearnReachability=5
|
||||
QUIT
|
||||
|
||||
4. Start with TestingTorNetwork set and a dependent configuration value,
|
||||
and reset that dependent configuration value. The result should be
|
||||
the testing-network specific default value.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
|
||||
TestingAuthDirTimeToLearnReachability 5
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
GETCONF TestingAuthDirTimeToLearnReachability
|
||||
250 TestingAuthDirTimeToLearnReachability=5
|
||||
RESETCONF TestingAuthDirTimeToLearnReachability
|
||||
GETCONF TestingAuthDirTimeToLearnReachability
|
||||
250 TestingAuthDirTimeToLearnReachability=0
|
||||
QUIT
|
||||
|
||||
5. Leave TestingTorNetwork unset and check if dependent configuration
|
||||
options are left unchanged.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
|
||||
250-TestingTorNetwork=0
|
||||
250 TestingAuthDirTimeToLearnReachability=1800
|
||||
QUIT
|
||||
|
||||
6. Leave TestingTorNetwork unset, but set dependent configuration option
|
||||
which should fail.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
|
||||
TestingAuthDirTimeToLearnReachability 0
|
||||
[warn] Failed to parse/validate config:
|
||||
TestingAuthDirTimeToLearnReachability may only be changed in testing
|
||||
Tor networks!
|
||||
|
||||
7. Start with TestingTorNetwork unset and change dependent configuration
|
||||
option later on which should fail.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
SETCONF TestingAuthDirTimeToLearnReachability=0
|
||||
513 Unacceptable option value: TestingAuthDirTimeToLearnReachability
|
||||
may only be changed in testing Tor networks!
|
||||
|
||||
8. Start with TestingTorNetwork unset and set it later on which should
|
||||
fail.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
SETCONF TestingTorNetwork=1
|
||||
553 Transition not allowed: While Tor is running, changing
|
||||
TestingTorNetwork is not allowed.
|
||||
|
||||
9. Start with TestingTorNetwork set and unset it later on which should
|
||||
fail.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
|
||||
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
|
||||
telnet 127.0.0.1 9051
|
||||
AUTHENTICATE
|
||||
RESETCONF TestingTorNetwork
|
||||
513 Unacceptable option value: TestingV3AuthInitialVotingInterval may
|
||||
only be changed in testing Tor networks!
|
||||
|
||||
10. Set TestingTorNetwork, but do not provide an alternate DirServer
|
||||
which should fail.
|
||||
|
||||
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1
|
||||
[warn] Failed to parse/validate config: TestingTorNetwork may only be
|
||||
configured in combination with a non-default set of DirServers.
|
||||
|
@ -1,100 +0,0 @@
|
||||
Filename: 136-legacy-keys.txt
|
||||
Title: Mass authority migration with legacy keys
|
||||
Author: Nick Mathewson
|
||||
Created: 13-May-2008
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.0.x
|
||||
|
||||
Overview:
|
||||
|
||||
This document describes a mechanism to change the keys of more than
|
||||
half of the directory servers at once without breaking old clients
|
||||
and caches immediately.
|
||||
|
||||
Motivation:
|
||||
|
||||
If a single authority's identity key is believed to be compromised,
|
||||
the solution is obvious: remove that authority from the list,
|
||||
generate a new certificate, and treat the new cert as belonging to a
|
||||
new authority. This approach works fine so long as less than 1/2 of
|
||||
the authority identity keys are bad.
|
||||
|
||||
Unfortunately, the mass-compromise case is possible if there is a
|
||||
sufficiently bad bug in Tor or in any OS used by a majority of v3
|
||||
authorities. Let's be prepared for it!
|
||||
|
||||
We could simply stop using the old keys and start using new ones,
|
||||
and tell all clients running insecure versions to upgrade.
|
||||
Unfortunately, this breaks our caching system pretty badly, since
|
||||
caches won't cache a consensus that they don't believe in. It would
|
||||
be nice to have everybody become secure the moment they upgrade to a
|
||||
version listing the new authority keys, _without_ breaking upgraded
|
||||
clients until the caches upgrade.
|
||||
|
||||
So, let's come up with a way to provide a time window where the
|
||||
consensuses are signed with the new keys and with the old.
|
||||
|
||||
Design:
|
||||
|
||||
We allow directory authorities to list a single "legacy key"
|
||||
fingerprint in their votes. Each authority may add a single legacy
|
||||
key. The format for this line is:
|
||||
|
||||
legacy-dir-key FINGERPRINT
|
||||
|
||||
We describe a new consensus method for generating directory
|
||||
consensuses. This method is consensus method "3".
|
||||
|
||||
When the authorities decide to use method "3" (as described in 3.4.1
|
||||
of dir-spec.txt), for every included vote with a legacy-dir-key line,
|
||||
the consensus includes an extra dir-source line. The fingerprint in
|
||||
this extra line is as in the legacy-dir-key line. The ports and
|
||||
addresses are in the dir-source line. The nickname is as in the
|
||||
dir-source line, with the string "-legacy" appended.
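Deriving the extra line from an existing one might look like the sketch below. It assumes the dir-source line has the keyword first, the nickname second, and the identity fingerprint third, with addresses and ports following; check dir-spec.txt for the exact field layout before relying on this. The function name is hypothetical.

```python
def legacy_dir_source(dir_source_line, legacy_fingerprint):
    """Build the extra dir-source line for a vote with a legacy-dir-key."""
    keyword, nickname, _identity, *rest = dir_source_line.split()
    if keyword != "dir-source":
        raise ValueError("not a dir-source line")
    # Nickname gets "-legacy" appended, the fingerprint comes from the
    # legacy-dir-key line, and addresses and ports are copied unchanged.
    return " ".join([keyword, nickname + "-legacy", legacy_fingerprint, *rest])
```

The original dir-source line stays in the consensus; this line is added next to it.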
|
||||
|
||||
[We need to include this new dir-source line because the code
|
||||
won't accept or preserve signatures from authorities not listed
|
||||
as contributing to the consensus.]
|
||||
|
||||
Authorities using legacy dir keys include two signatures on their
|
||||
consensuses: one generated with a signing key signed with their real
|
||||
signing key, and another generated with a signing key signed with
|
||||
another signing key attested to by their identity key. These
|
||||
signing keys MUST be different. Authorities MUST serve both
|
||||
certificates if asked.
|
||||
|
||||
Process:
|
||||
|
||||
In the event of a mass key failure, we'll use the following
|
||||
(ugly) procedure:
|
||||
- All affected authorities generate new certificates and identity
|
||||
keys, and circulate their new dirserver lines. They copy their old
|
||||
certificates and old broken keys, but put them in new "legacy
|
||||
key files".
|
||||
- At the earliest time that can be arranged, the authorities
|
||||
replace their signing keys, identity keys, and certificates
|
||||
with the new uncompromised versions, and update to the new list
|
||||
of dirserver lines.
|
||||
- They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
|
||||
- Now, new consensuses will be generated using the new keys, but
|
||||
the results will also be signed with the old keys.
|
||||
- Clients and caches are told they need to upgrade, and given a
|
||||
time window to do so.
|
||||
- At the end of the time window, authorities remove the
|
||||
V3DirAdvertiseLegacyKey option.
|
||||
|
||||
Notes:
|
||||
|
||||
It might be good to get caches to cache consensuses that they do not
|
||||
believe in. I'm not sure of the best way to do this.
|
||||
|
||||
It's a superficially neat idea to have new signing keys and have
|
||||
them signed by the new and by the old authority identity keys. This
|
||||
breaks some code, though, and doesn't actually gain us anything,
|
||||
since we'd still need to include each signature twice.
|
||||
|
||||
It's also a superficially neat idea, if identity keys and signing
|
||||
keys are compromised, to at least replace all the signing keys.
|
||||
I don't think this gains us anything either, though.
|
||||
|
||||
|
@ -1,237 +0,0 @@
|
||||
Filename: 137-bootstrap-phases.txt
|
||||
Title: Keep controllers informed as Tor bootstraps
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 07-Jun-2008
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.x
|
||||
|
||||
1. Overview.
|
||||
|
||||
Tor has many steps to bootstrapping directory information and
|
||||
initial circuits, but from the controller's perspective we just have
|
||||
a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
|
||||
slow connections or with connectivity problems can wait a long time
|
||||
staring at the yellow onion, wondering if it will ever change color.
|
||||
|
||||
This proposal describes a new client status event so Tor can give
|
||||
more details to the controller. Section 2 describes the changes to the
|
||||
controller protocol; Section 3 describes Tor's internal bootstrapping
|
||||
phases when everything is going correctly; Section 4 describes when
|
||||
Tor detects a problem and issues a bootstrap warning; Section 5 covers
|
||||
suggestions for how controllers should display the results.
|
||||
|
||||
2. Controller event syntax.
|
||||
|
||||
The generic status event is:
|
||||
|
||||
"650" SP StatusType SP StatusSeverity SP StatusAction
|
||||
[SP StatusArguments] CRLF
|
||||
|
||||
So in this case we send
|
||||
650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
|
||||
PROGRESS=num TAG=Keyword SUMMARY=String \
|
||||
[WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]
|
||||
|
||||
The arguments MAY appear in any order. Controllers MUST accept unrecognized
|
||||
arguments.
|
||||
|
||||
"Progress" gives a number between 0 and 100 for how far through
|
||||
the bootstrapping process we are. "Summary" is a string that can be
|
||||
displayed to the user to describe the *next* task that Tor will tackle,
|
||||
i.e., the task it is working on after sending the status event. "Tag"
|
||||
is an optional string that controllers can use to recognize bootstrap
|
||||
phases from Section 3, if they want to do something smarter than just
|
||||
blindly displaying the summary string.
|
||||
|
||||
The severity describes whether this is a normal bootstrap phase
|
||||
(severity notice) or an indication of a bootstrapping problem
|
||||
(severity warn). If severity warn, it should also include a "warning"
|
||||
argument string with any hints Tor has to offer about why it's having
|
||||
troubles bootstrapping, a "reason" string that lists one of the reasons
|
||||
allowed in the ORConn event, a "count" number that tells how many
|
||||
bootstrap problems there have been so far at this phase, and a
|
||||
"recommendation" keyword to indicate how the controller ought to react.
|
||||
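A controller could split such an event line into its arguments roughly as follows. This is only a sketch: the function name and the returned dictionary layout are illustrative and not part of the protocol. It accepts arguments in any order and keeps unrecognized ones, as required above.

```python
import shlex

def parse_bootstrap_event(line):
    """Parse one '650 STATUS_CLIENT <severity> BOOTSTRAP ...' line.

    Returns a dict of arguments, or None if the line is not a
    bootstrap status event.
    """
    # shlex handles the quoted strings, e.g. SUMMARY="Connecting to ..."
    tokens = shlex.split(line.strip())
    if len(tokens) < 4 or tokens[:2] != ["650", "STATUS_CLIENT"]:
        return None
    severity, action = tokens[2], tokens[3]
    if severity not in ("NOTICE", "WARN") or action != "BOOTSTRAP":
        return None
    event = {"SEVERITY": severity}
    for token in tokens[4:]:
        key, sep, value = token.partition("=")
        if sep:
            # Arguments may come in any order; unknown keys are kept.
            event[key] = value
    for numeric in ("PROGRESS", "COUNT"):
        if numeric in event:
            event[numeric] = int(event[numeric])
    return event
```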
|
||||
3. The bootstrap phases.
|
||||
|
||||
This section describes the various phases currently reported by
|
||||
Tor. Controllers should not assume that the percentages and tags listed
|
||||
here will continue to match up, or even that the tags will stay in
|
||||
the same order. Some phases might also be skipped (not reported) if the
|
||||
associated bootstrap step is already complete, or if the phase no longer
|
||||
is necessary. Only "starting" and "done" are guaranteed to exist in all
|
||||
future versions.
|
||||
|
||||
Current Tor versions enter these phases in order, monotonically;
|
||||
future Tors MAY revisit earlier stages.
|
||||
|
||||
Phase 0:
|
||||
tag=starting summary="starting"
|
||||
|
||||
Tor starts out in this phase.
|
||||
|
||||
Phase 5:
|
||||
tag=conn_dir summary="Connecting to directory mirror"
|
||||
|
||||
Tor sends this event as soon as Tor has chosen a directory mirror ---
|
||||
one of the authorities if bootstrapping for the first time or after
|
||||
a long downtime, or one of the relays listed in its cached directory
|
||||
information otherwise.
|
||||
|
||||
Tor will stay at this phase until it has successfully established
|
||||
a TCP connection with some directory mirror. Problems in this phase
|
||||
generally happen because Tor doesn't have a network connection, or
|
||||
because the local firewall is dropping SYN packets.
|
||||
|
||||
Phase 10:
|
||||
tag=handshake_dir summary="Finishing handshake with directory mirror"
|
||||
|
||||
This event occurs when Tor establishes a TCP connection with a relay used
|
||||
as a directory mirror (or its https proxy if it's using one). Tor remains
|
||||
in this phase until the TLS handshake with the relay is finished.
|
||||
|
||||
Problems in this phase generally happen because Tor's firewall is
|
||||
doing more sophisticated MITM attacks on it, or doing packet-level
|
||||
keyword recognition of Tor's handshake.
|
||||
|
||||
Phase 15:
|
||||
tag=onehop_create summary="Establishing one-hop circuit for dir info"
|
||||
|
||||
Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
|
||||
to establish a one-hop circuit for retrieving directory information.
|
||||
It will remain in this phase until it receives the CREATED_FAST cell
|
||||
back, indicating that the circuit is ready.
|
||||
|
||||
Phase 20:
|
||||
tag=requesting_status summary="Asking for networkstatus consensus"
|
||||
|
||||
Once we've finished our one-hop circuit, we will start a new stream
|
||||
for fetching the networkstatus consensus. We'll stay in this phase
|
||||
until we get the 'connected' relay cell back, indicating that we've
|
||||
established a directory connection.
|
||||
|
||||
Phase 25:
|
||||
tag=loading_status summary="Loading networkstatus consensus"
|
||||
|
||||
Once we've established a directory connection, we will start fetching
|
||||
the networkstatus consensus document. This could take a while; this
|
||||
phase is a good opportunity for using the "progress" keyword to indicate
|
||||
partial progress.
|
||||
|
||||
This phase could stall if the directory mirror we picked doesn't
|
||||
have a copy of the networkstatus consensus so we have to ask another,
|
||||
or it does give us a copy but we don't find it valid.
|
||||
|
||||
Phase 40:
|
||||
tag=loading_keys summary="Loading authority key certs"
|
||||
|
||||
Sometimes when we've finished loading the networkstatus consensus,
|
||||
we find that we don't have all the authority key certificates for the
|
||||
keys that signed the consensus. At that point we put the consensus we
|
||||
fetched on hold and fetch the keys so we can verify the signatures.
|
||||
|
||||
Phase 45:
|
||||
tag=requesting_descriptors summary="Asking for relay descriptors"
|
||||
|
||||
Once we have a valid networkstatus consensus and we've checked all
|
||||
its signatures, we start asking for relay descriptors. We stay in this
|
||||
phase until we have received a 'connected' relay cell in response to
|
||||
a request for descriptors.
|
||||
|
||||
Phase 50:
|
||||
tag=loading_descriptors summary="Loading relay descriptors"
|
||||
|
||||
We will ask for relay descriptors from several different locations,
|
||||
so this step will probably make up the bulk of the bootstrapping,
|
||||
especially for users with slow connections. We stay in this phase until
|
||||
we have descriptors for at least 1/4 of the usable relays listed in
|
||||
the networkstatus consensus. This phase is also a good opportunity to
|
||||
use the "progress" keyword to indicate partial steps.
|
||||
|
||||
Phase 80:
|
||||
tag=conn_or summary="Connecting to entry guard"
|
||||
|
||||
Once we have a valid consensus and enough relay descriptors, we choose
|
||||
some entry guards and start trying to build some circuits. This step
|
||||
is similar to the "conn_dir" phase above; the only difference is
|
||||
the context.
|
||||
|
||||
If a Tor starts with enough recent cached directory information,
|
||||
its first bootstrap status event will be for the conn_or phase.
|
||||
|
||||
Phase 85:
|
||||
tag=handshake_or summary="Finishing handshake with entry guard"
|
||||
|
||||
This phase is similar to the "handshake_dir" phase, but it gets reached
|
||||
if we finish a TCP connection to a Tor relay and we have already reached
|
||||
the "conn_or" phase. We'll stay in this phase until we complete a TLS
|
||||
handshake with a Tor relay.
|
||||
|
||||
Phase 90:
|
||||
tag=circuit_create summary="Establishing circuits"
|
||||
|
||||
Once we've finished our TLS handshake with an entry guard, we will
|
||||
set about trying to make some 3-hop circuits in case we need them soon.
|
||||
|
||||
Phase 100:
|
||||
tag=done summary="Done"
|
||||
|
||||
A full 3-hop circuit has been established. Tor is ready to handle
|
||||
application connections now.
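Since only the "starting" and "done" tags are guaranteed to persist, a controller might take the percentage from the event itself and use tags only to pick optional custom text. The sketch below assumes the event has already been parsed into a dictionary with PROGRESS, TAG and SUMMARY keys; the CUSTOM_TEXT table and the function name are hypothetical.

```python
# Hypothetical controller-side strings, keyed by the two guaranteed tags.
CUSTOM_TEXT = {
    "starting": "Starting Tor",
    "done": "Connected to the Tor network",
}

def progress_display(event):
    """Return (percent, text) to show for one parsed bootstrap event."""
    # Use the reported percentage; do not derive it from the tag.
    percent = max(0, min(100, int(event["PROGRESS"])))
    tag = event.get("TAG")
    # Unknown or missing tags fall back to Tor's own summary string.
    text = CUSTOM_TEXT.get(tag, event.get("SUMMARY", ""))
    return percent, text
```

Because future versions may revisit earlier stages, the sketch does not assume that percentages only increase.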
|
||||
|
||||
4. Bootstrap problem events.
|
||||
|
||||
When an OR Conn fails, we send a "bootstrap problem" status event, which
|
||||
is like the standard bootstrap status event except with severity warn.
|
||||
We include the same progress, tag, and summary values as we would for
|
||||
a normal bootstrap event, but we also include "warning", "reason",
|
||||
"count", and "recommendation" key/value combos.
|
||||
|
||||
The "reason" values are long-term-stable controller-facing tags to
|
||||
identify particular issues in a bootstrapping step. The warning
|
||||
strings, on the other hand, are human-readable. Controllers SHOULD
|
||||
NOT rely on the format of any warning string. Currently the possible
|
||||
values for "recommendation" are either "ignore" or "warn" -- if ignore,
|
||||
the controller can accumulate the string in a pile of problems to show
|
||||
the user if the user asks; if warn, the controller should alert the
|
||||
user that Tor is pretty sure there's a bootstrapping problem.
|
||||
|
||||
Currently Tor uses recommendation=ignore for the first nine bootstrap
|
||||
problem reports for a given phase, and then uses recommendation=warn
|
||||
for subsequent problems at that phase. Hopefully this is a good
|
||||
balance between tolerating occasional errors and reporting serious
|
||||
problems quickly.
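The current rule and one possible controller reaction can be sketched as below. The threshold constant is an assumption that reads "the first nine reports" as counts 1 through 9; the class and method names are illustrative only.

```python
PROBLEM_THRESHOLD = 10  # assumption: reports 1..9 are "ignore", 10 and up "warn"

def recommendation_for(count):
    """Recommendation Tor would attach to the count-th problem at a phase."""
    return "warn" if count >= PROBLEM_THRESHOLD else "ignore"

class ProblemLog:
    """Collects warning strings; tells the caller when to alert the user."""

    def __init__(self):
        self.pending = []

    def handle(self, event):
        # Always remember the human-readable warning for later display.
        self.pending.append(event.get("WARNING", ""))
        # Branch on the stable keyword, never on the warning text.
        return event.get("RECOMMENDATION") == "warn"
```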
|
||||
|
||||
5. Suggested controller behavior.
|
||||
|
||||
Controllers should start out with a yellow onion or the equivalent
|
||||
("starting"), and then watch for either a bootstrap status event
|
||||
(meaning the Tor they're using is sufficiently new to produce them,
|
||||
and they should load up the progress bar or whatever they plan to use
|
||||
to indicate progress) or a circuit_established status event (meaning
|
||||
bootstrapping is finished).
|
||||
|
||||
In addition to a progress bar in the display, controllers should also
|
||||
have some way to indicate progress even when no controller window is
|
||||
open. For example, folks using Tor Browser Bundle in hostile Internet
|
||||
cafes don't want a big splashy screen up. One way to let the user keep
|
||||
informed of progress in a more subtle way is to change the task tray
|
||||
icon and/or tooltip string as more bootstrap events come in.
|
||||
|
||||
Controllers should also have some mechanism to alert their user when
|
||||
bootstrapping problems are reported. Perhaps we should gather a set of
|
||||
help texts and the controller can send the user to the right anchor in a
|
||||
"bootstrapping problems" page in the controller's help subsystem?
|
||||
|
||||
6. Getting up to speed when the controller connects.
|
||||
|
||||
There's a new "GETINFO /status/bootstrap-phase" option, which returns
|
||||
the most recent bootstrap phase status event sent. Specifically,
|
||||
it returns a string starting with either "NOTICE BOOTSTRAP ..." or
|
||||
"WARN BOOTSTRAP ...".
|
||||
|
||||
Controllers should use this getinfo when they connect or attach to
|
||||
Tor to learn its current state.
|
||||
|
@ -1,51 +0,0 @@
|
||||
Filename: 138-remove-down-routers-from-consensus.txt
|
||||
Title: Remove routers that are not Running from consensus documents
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Peter Palfrader
|
||||
Created: 11-Jun-2008
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.2-alpha
|
||||
|
||||
1. Overview.
|
||||
|
||||
Tor directory authorities vote hourly and agree on a consensus document
|
||||
which lists all the routers on the network together with some of their
|
||||
basic properties, such as whether a router is an exit node, whether it is
|
||||
stable or whether it is a version 2 directory mirror.
|
||||
|
||||
One of the properties given with each router is the 'Running' flag.
|
||||
Clients do not use routers that are not listed as running.
|
||||
|
||||
This proposal suggests that routers without the Running flag are not
|
||||
listed at all.
|
||||
|
||||
2. Current status
|
||||
|
||||
At a typical bootstrap a client downloads a 140KB consensus, about
|
||||
10KB of certificates to verify that consensus, and about 1.6MB of
|
||||
server descriptors, about 1/4 of which it requires before it will
|
||||
start building circuits.
|
||||
|
||||
Another proposal deals with how to get that huge 1.6MB fraction to
|
||||
effectively zero (by downloading only individual descriptors, on
|
||||
demand). Should that get successfully implemented that will leave the
|
||||
140KB compressed consensus as a large fraction of what a client needs
|
||||
to get in order to work.
|
||||
|
||||
About one third of the routers listed in a consensus are not running
|
||||
and will therefore never be used by clients who use this consensus.
|
||||
Not listing those routers will save about 30% to 40% in size.
|
||||
|
||||
3. Proposed change
|
||||
|
||||
Authority directory servers produce vote documents that include all
|
||||
the servers they know about, running or not, like they currently
|
||||
do. In addition these vote documents also state that the authority
|
||||
supports a new consensus forming method (method number 4).
|
||||
|
||||
If more than two thirds of votes that an authority has received claim
|
||||
they support method 4 then this new method will be used: The
|
||||
consensus document is formed like before but a new last step removes
|
||||
all routers from the listing that are not marked as Running.
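The rule can be sketched as follows. The data layout (each router as a dictionary with a "flags" set) and the function names are assumptions made for illustration; they are not taken from the Tor source.

```python
def method_4_in_use(votes_supporting, votes_received):
    """True if more than two thirds of the received votes support method 4."""
    # Integer arithmetic avoids rounding problems with 2/3.
    return 3 * votes_supporting > 2 * votes_received

def final_router_list(routers, votes_supporting, votes_received):
    """Apply the new last step: drop routers that are not Running."""
    if not method_4_in_use(votes_supporting, votes_received):
        return list(routers)
    return [r for r in routers if "Running" in r["flags"]]
```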
|
||||
|
@ -1,94 +0,0 @@
|
||||
Filename: 139-conditional-consensus-download.txt
|
||||
Title: Download consensus documents only when it will be trusted
|
||||
Author: Peter Palfrader
|
||||
Created: 2008-04-13
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.x
|
||||
|
||||
Overview:
|
||||
|
||||
Servers only provide consensus documents to clients when it is known that
|
||||
the client will trust them.
|
||||
|
||||
Motivation:
|
||||
|
||||
When clients[1] want a new network status consensus they request it
|
||||
from a Tor server using the URL path /tor/status-vote/current/consensus.
|
||||
Then after downloading the client checks if this consensus can be
|
||||
trusted. Whether the client trusts the consensus depends on the
|
||||
authorities that the client trusts and how many of those
|
||||
authorities signed the consensus document.
|
||||
|
||||
If the client cannot trust the consensus document it is disregarded
|
||||
and a new download is tried at a later time. Several hundred
|
||||
kilobytes of server bandwidth were wasted by this single client's
|
||||
request.
|
||||
|
||||
With hundreds of thousands of clients this will have undesirable
|
||||
consequences when the list of authorities has changed so much that a
|
||||
large number of established clients no longer can trust any consensus
|
||||
document formed.
|
||||
|
||||
Objective:
|
||||
|
||||
The objective of this proposal is to make clients not download
|
||||
consensuses they will not trust.
|
||||
|
||||
Proposal:
|
||||
|
||||
The list of authorities that are trusted by a client is encoded in
|
||||
the URL they send to the directory server when requesting a consensus
|
||||
document.
|
||||
|
||||
The directory server then only sends back the consensus when more than
|
||||
half of the authorities listed in the request have signed the
|
||||
consensus. If it is known that the consensus will not be trusted
|
||||
a 404 error code is sent back to the client.
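The server-side decision might look like this. It is a sketch under assumptions: fingerprints are compared as full hex strings, case-insensitively, and the function name is hypothetical.

```python
def consensus_status(requested_fprs, signer_fprs):
    """Return the HTTP status for a conditional consensus request.

    requested_fprs: fingerprints listed in the request URL.
    signer_fprs: fingerprints of authorities that signed our consensus.
    """
    requested = {fp.upper() for fp in requested_fprs}
    signers = {fp.upper() for fp in signer_fprs}
    signed = len(requested & signers)
    # "More than half" is a strict majority of the listed authorities.
    return 200 if 2 * signed > len(requested) else 404
```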
|
||||
|
||||
This proposal does not require directory caches to keep more than one
|
||||
consensus document. This proposal also does not require authorities
|
||||
to verify the signature on the consensus document of authorities they
|
||||
do not recognize.
|
||||
|
||||
The new URL scheme to download a consensus is
|
||||
/tor/status-vote/current/consensus/<F> where F is a list of
|
||||
fingerprints, sorted in ascending order, and concatenated using a +
|
||||
sign.
|
||||
|
||||
Fingerprints are uppercase hexadecimal encodings of the authority
|
||||
identity key's digest. Servers should also accept requests that
|
||||
use lower case or mixed case hexadecimal encodings.
|
||||
|
||||
A .z URL for compressed versions of the consensus will be provided
|
||||
similarly to existing resources and is the URL that usually should
|
||||
be used by clients.
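A client could build the request path as sketched below. The function name is illustrative, and placing the ".z" suffix after the fingerprint list is an assumption based on how existing resources are named.

```python
BASE = "/tor/status-vote/current/consensus/"

def consensus_path(fingerprints, compressed=True):
    """Build the conditional-download path for a set of trusted authorities."""
    # Uppercase hex, sorted ascending, joined with '+'.
    joined = "+".join(sorted(fp.upper() for fp in fingerprints))
    path = BASE + joined
    return path + ".z" if compressed else path
```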
|
||||
|
||||
Migration:
|
||||
|
||||
The old location of the consensus should continue to work
|
||||
indefinitely. Not only is it used by old clients, but it is a useful
|
||||
resource for automated tools that do not particularly care which
|
||||
authorities have signed the consensus.
|
||||
|
||||
Authorities that are known to the client a priori by being shipped
|
||||
with the Tor code are assumed to handle this format.
|
||||
|
||||
When downloading a consensus document from caches that do not support this
|
||||
new format, clients fall back to the old download location.
|
||||
|
||||
Caches support the new format starting with Tor version 0.2.1.1-alpha.
|
||||
|
||||
Anonymity Implications:
|
||||
|
||||
By supplying the list of authorities a client trusts to the directory
|
||||
server we leak information (such as the likely version of the Tor client) to the
|
||||
directory server. In the current system we also leak that we are
|
||||
very old - by re-downloading the consensus over and over again, but
|
||||
only when we are so old that we no longer can trust the consensus.
|
||||
|
||||
|
||||
|
||||
Footnotes:
|
||||
1. For the purpose of this proposal a client can be any Tor instance
|
||||
that downloads a consensus document. This includes relays,
|
||||
directory caches as well as end users.
|
@ -1,149 +0,0 @@
|
||||
Filename: 140-consensus-diffs.txt
|
||||
Title: Provide diffs between consensuses
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Peter Palfrader
|
||||
Created: 13-Jun-2008
|
||||
Status: Accepted
|
||||
Target: 0.2.2.x
|
||||
|
||||
1. Overview.
|
||||
|
||||
Tor clients and servers need a list of which relays are on the
|
||||
network. This list, the consensus, is created by authorities
|
||||
hourly and clients fetch a copy of it, with some delay, hourly.
|
||||
|
||||
This proposal suggests that clients download diffs of consensuses
|
||||
once they have a consensus instead of hourly downloading a full
|
||||
consensus.
|
||||
|
||||
2. Numbers
|
||||
|
||||
After implementing proposal 138 which removes nodes that are not
|
||||
running from the list a consensus document is about 92 kilobytes
|
||||
in size after compression.
|
||||
|
||||
The diff between two consecutive consensuses, in ed format, is on
|
||||
average 13 kilobytes compressed.
|
||||
|
||||
3. Proposal
|
||||
|
||||
3.1 Clients
|
||||
|
||||
If a client has a consensus that is recent enough it SHOULD
|
||||
try to download a diff to get the latest consensus rather than
|
||||
fetching a full one.
|
||||
|
||||
[XXX: what is recent enough?
|
||||
time delta in hours / size of compressed diff
|
||||
0 20
|
||||
1 9650
|
||||
2 17011
|
||||
3 23150
|
||||
4 29813
|
||||
5 36079
|
||||
6 39455
|
||||
7 43903
|
||||
8 48907
|
||||
9 54549
|
||||
10 60057
|
||||
11 67810
|
||||
12 71171
|
||||
13 73863
|
||||
14 76048
|
||||
15 80031
|
||||
16 84686
|
||||
17 89862
|
||||
18 94760
|
||||
19 94868
|
||||
20 94223
|
||||
21 93921
|
||||
22 92144
|
||||
23 90228
|
||||
[ size of gzip compressed "diff -e" between the consensus on
|
||||
2008-06-01-00:00:00 and the following consensuses that day.
|
||||
Consensuses have been modified to exclude down routers per
|
||||
proposal 138. ]
|
||||
|
||||
Data suggests that for the first few hours diffs are very useful,
|
||||
saving about 60% for the first three hours, 30% for the first 10,
|
||||
and almost nothing once we are past 16 hours.
|
||||
]
|
||||
|
||||
3.2 Servers
|
||||
|
||||
Directory authorities and servers need to keep up to X [XXX: depends
|
||||
on how long clients try to download diffs per above] old consensus
|
||||
documents so they can build diffs. They should offer a diff to the
|
||||
most recent consensus at the URL
|
||||
|
||||
http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
|
||||
|
||||
where hash is the full digest of the consensus the client currently
|
||||
has, and FPRLIST is a list of (abbreviated) fingerprints of
|
||||
authorities the client trusts.
|
||||
|
||||
Servers will only return a consensus if more than half of the requested
|
||||
authorities have signed the document, otherwise a 404 error will be sent
|
||||
back. The fingerprints can be shortened to a length of any multiple of
|
||||
two, using only the leftmost part of the encoded fingerprint. Tor uses
|
||||
3 bytes (6 hex characters) of the fingerprint. (This is just like the
|
||||
conditional consensus downloads that Tor supports starting with
|
||||
0.2.1.1-alpha.)
|
||||
|
||||
If a server cannot offer a diff from the consensus identified by the
|
||||
hash but has a current consensus it MUST return the full consensus.
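Put together, a server's choice of reply could be sketched as follows. The storage layout (a dictionary of prepared diffs keyed by the old consensus digest) and all names are assumptions; the open question of what to send when the client is already current is left out.

```python
def choose_reply(client_hash, fpr_prefixes, signer_fprs, diffs_by_hash, full_consensus):
    """Return (status, body) for a consensus diff request."""
    prefixes = [p.upper() for p in fpr_prefixes]
    signers = [s.upper() for s in signer_fprs]
    # An abbreviated fingerprint matches any signer it is a prefix of.
    signed = sum(1 for p in prefixes if any(s.startswith(p) for s in signers))
    if 2 * signed <= len(prefixes):
        return 404, None
    diff = diffs_by_hash.get(client_hash)
    if diff is not None:
        return 200, diff
    # No diff from that consensus: the full consensus MUST be returned.
    return 200, full_consensus
```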
|
||||
|
||||
[XXX: what should we do when the client already has the latest
|
||||
consensus? I can think of the following options:
|
||||
- send back 3xx not modified
|
||||
- send back 200 ok and an empty diff
|
||||
- send back 404 nothing newer here.
|
||||
|
||||
I currently lean towards the empty diff.]
|
||||
|
||||
4. Diff Format
|
||||
|
||||
Diffs start with the token "network-status-diff-version" followed by a
|
||||
space and the version number, currently "1".
|
||||
|
||||
If a document does not start with network-status-diff it is assumed
|
||||
to be a full consensus download and would therefore currently start
|
||||
with "network-status-version 3".
|
||||
|
||||
Following the network-status-diff header line is a diff, or patch, in
|
||||
limited ed format. We choose this format because it is easy to create
|
||||
and process with standard tools (patch, diff -e, ed). This will help
|
||||
us in developing and testing this proposal and it should make future
|
||||
debugging easier.
|
||||
|
||||
[ If at one point in the future we decide that the space benefits from
|
||||
a custom diff format outweigh these benefits we can always
|
||||
introduce a new diff format and offer it at for instance
|
||||
../diff2/... ]
|
||||
|
||||
We support the following ed commands, each on a line by itself:
|
||||
- "<n1>d" Delete line n1
|
||||
- "<n1>,<n2>d" Delete lines n1 through n2, inclusive
|
||||
- "<n1>c" Replace line n1 with the following block
|
||||
- "<n1>,<n2>c" Replace lines n1 through n2, inclusive, with the
|
||||
following block.
|
||||
- "<n1>a" Append the following block after line n1.
|
||||
- "a" Append the following block after the current line.
|
||||
- "s/.//" Remove the first character in the current line.
|
||||
|
||||
Note that line numbers always apply to the file after all previous
|
||||
commands have already been applied.
|
||||
|
||||
The "current line" is either the first line of the file, if this is
|
||||
the first command, the last line of a block we added in an append or
|
||||
change command, or the line immediately following a set of lines we just
|
||||
deleted (or the last line of the file if there are no lines after
|
||||
that).
|
||||
|
||||
The replace and append commands take blocks. These blocks are simply
|
||||
appended to the diff after the line with the command. A line with
|
||||
just a period (".") ends the block (and is not part of the lines
|
||||
to add). Note that it is impossible to insert a line with just
|
||||
a single dot. Recommended procedure is to insert a line with
|
||||
two dots, then remove the first character of that line using s/.//.
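To make the command set concrete, here is a minimal sketch of a client applying such a diff to a consensus held as a list of lines. It is not the Tor implementation; the function name is hypothetical, and it rejects anything outside the commands listed above. It also accepts "0a", which "diff -e" emits to insert before the first line.

```python
import re

HEADER = "network-status-diff-version 1"
COMMAND = re.compile(r"^(?:(\d+)(?:,(\d+))?)?([acd])$")

def apply_consensus_diff(old_lines, diff_lines):
    """Apply a limited-ed diff (header line first) and return the new lines."""
    if not diff_lines or diff_lines[0] != HEADER:
        raise ValueError("not a version 1 consensus diff")
    lines = list(old_lines)
    current = 1   # 1-based "current line"; the first line before any command
    i = 1         # position in diff_lines, just past the header
    while i < len(diff_lines):
        command = diff_lines[i]
        i += 1
        if command == "s/.//":
            # Remove the first character of the current line.
            lines[current - 1] = lines[current - 1][1:]
            continue
        match = COMMAND.match(command)
        if match is None:
            raise ValueError("unsupported command: " + command)
        first, last, op = match.groups()
        if first is None and op != "a":
            raise ValueError("only 'a' may omit its line number")
        start = int(first) if first is not None else current
        end = int(last) if last is not None else start
        block = []
        if op in "ac":
            # Collect the block up to the line holding a single period.
            while i < len(diff_lines) and diff_lines[i] != ".":
                block.append(diff_lines[i])
                i += 1
            i += 1  # skip the terminating "."
        if op == "d":
            del lines[start - 1:end]
            # Line after the deleted range, or the last line of the file.
            current = min(start, len(lines))
        elif op == "c":
            lines[start - 1:end] = block
            current = start - 1 + len(block)
        else:
            lines[start:start] = block
            current = start + len(block)
    return lines
```

Line numbers are interpreted against the file as modified by earlier commands, which is why "diff -e" output lists its edits from the bottom of the file upward.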
|
@ -1,325 +0,0 @@
|
||||
Filename: 141-jit-sd-downloads.txt
|
||||
Title: Download server descriptors on demand
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Peter Palfrader
|
||||
Created: 15-Jun-2008
|
||||
Status: Draft
|
||||
|
||||
1. Overview
|
||||
|
||||
Downloading all server descriptors is the most expensive part
|
||||
of bootstrapping a Tor client. These server descriptors currently
|
||||
amount to about 1.5 Megabytes of data, and this size will grow
|
||||
linearly with network size.
|
||||
|
||||
Fetching all these server descriptors takes a long while for people
|
||||
behind slow network connections. It is also a considerable load on
|
||||
our network of directory mirrors.
|
||||
|
||||
This document describes proposed changes to the Tor network and
|
||||
directory protocol so that clients will no longer need to download
|
||||
all server descriptors.
|
||||
|
||||
These changes consist of moving load balancing information into
|
||||
network status documents, implementing a means to download server
|
||||
descriptors on demand in an anonymity-preserving way, and dealing
|
||||
with exit node selection.
|
||||
|
||||
2. What is in a server descriptor
|
||||
|
||||
When a Tor client starts the first thing it will try to get is a
|
||||
current network status document: a consensus signed by a majority
|
||||
of directory authorities. This document is currently about 100
|
||||
Kilobytes in size, though it will grow linearly with network size.
|
||||
This document lists all servers currently running on the network.
|
||||
The Tor client will then try to get a server descriptor for each
|
||||
of the running servers. All server descriptors currently amount
|
||||
to about 1.5 Megabytes of downloads.
|
||||
|
||||
A Tor client learns several things about a server from its descriptor.
|
||||
Some of these it already learned from the network status document
|
||||
published by the authorities, but the server descriptor contains it
|
||||
again in a single statement signed by the server itself, not just by
|
||||
the directory authorities.
|
||||
|
||||
Tor clients use the information from server descriptors for
|
||||
different purposes, which are considered in the following sections.
|
||||
|
||||
#three ways: One, to determine if a server will be able to handle
|
||||
#this client's request; two, to actually communicate or use the server;
|
||||
#three, for load balancing decisions.
|
||||
#
|
||||
#These three points are considered in the following subsections.
|
||||
|
||||
2.1 Load balancing
|
||||
|
||||
The Tor load balancing mechanism is quite complex in its details, but
|
||||
it has a simple goal: The more traffic a server can handle the more
|
||||
traffic it should get. That means the more traffic a server can
|
||||
handle the more likely a client will use it.
|
||||
|
||||
For this purpose each server descriptor has bandwidth information
|
||||
which tries to convey a server's capacity to clients.
|
||||
|
||||
Currently we weigh servers differently for different purposes. There
|
||||
is a weight for when we use a server as a guard node (our entry to the
|
||||
Tor network), there is one weight we assign servers for exit duties,
|
||||
and a third for when we need intermediate (middle) nodes.
|
||||
|
||||
2.2 Exit information
|
||||
|
||||
When a Tor client wants to exit to some resource on the internet it will
|
||||
build a circuit to an exit node that allows access to that resource's
|
||||
IP address and TCP Port.
|
||||
|
||||
When building that circuit the client can make sure that the circuit
|
||||
ends at a server that will be able to fulfill the request because the
|
||||
client already learned of all the servers' exit policies from their
|
||||
descriptors.
|
||||
|
||||
2.3 Capability information
|
||||
|
||||
Server descriptors contain information about the specific version of
|
||||
the Tor protocol they understand [proposal 105].
|
||||
|
||||
Furthermore the server descriptor also contains the exact version of
|
||||
the Tor software that the server is running and some decisions are
|
||||
made based on the server version number (for instance a Tor client
|
||||
will only make conditional consensus requests [proposal 139] when
|
||||
talking to Tor servers version 0.2.1.1-alpha or later).
|
||||
|
||||
2.4 Contact/key information
|
||||
|
||||
A server descriptor lists a server's IP address and TCP ports on which
|
||||
it accepts onion and directory connections. Furthermore it contains
|
||||
the onion key (a short lived RSA key to which clients encrypt CREATE
|
||||
cells).
|
||||
|
||||
2.5 Identity information
|
||||
|
||||
A Tor client learns the digest of a server's key from the network
|
||||
status document. Once it has a server descriptor this descriptor
|
||||
contains the full RSA identity key of the server. Clients verify
|
||||
that 1) the digest of the identity key matches the expected digest
|
||||
it got from the consensus, and 2) that the signature on the descriptor
|
||||
from that key is valid.
|
||||
|
||||
|
||||
3. No longer require clients to have copies of all SDs
|
||||
|
||||
3.1 Load balancing info in consensus documents
|
||||
|
||||
One of the reasons why clients download all server descriptors is for
|
||||
doing proper load balancing as described in 2.1. In order for
|
||||
clients to not require all server descriptors this information will
|
||||
have to move into the network status document.
|
||||
|
||||
Consensus documents will have a new line per router similar
|
||||
to the "r", "s", and "v" lines that already exist. This line
|
||||
will convey weight information to clients.
|
||||
|
||||
"w Bandwidth=193"
|
||||
|
||||
The bandwidth number is the lesser of observed bandwidth and bandwidth
|
||||
rate limit from the server descriptor that the "r" line referenced by
|
||||
digest (1st and 3rd field of the bandwidth line in the descriptor).
|
||||
It is given in kilobytes per second so the byte value in the
|
||||
descriptor has to be divided by 1024 (and is then truncated, i.e.
|
||||
rounded down).
|
||||
|
||||
Authorities will cap the bandwidth number at some arbitrary value,
|
||||
currently 10MB/sec. If a router claims a larger bandwidth an
|
||||
authority's vote will still only show Bandwidth=10240.
|
||||
|
||||
The consensus value for bandwidth is the median of all bandwidth
|
||||
numbers given in votes. In case of an even number of votes we use
|
||||
the lower median. (Using this procedure allows us to change the
|
||||
cap value more easily.)
|
||||
|
||||
Clients should believe the bandwidth as presented in the consensus,
|
||||
not capping it again.
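The arithmetic in this section can be sketched as follows. Function names are illustrative; the inputs are the rate and observed values (1st and 3rd fields) of a descriptor's bandwidth line, in bytes per second.

```python
from statistics import median_low

BANDWIDTH_CAP_KB = 10 * 1024  # the current 10MB/sec cap, in kilobytes

def vote_bandwidth(rate_bytes, observed_bytes):
    """Bandwidth value an authority would put in its vote's "w" line."""
    # Lesser of the two values, converted to kilobytes and rounded down.
    kilobytes = min(rate_bytes, observed_bytes) // 1024
    return min(kilobytes, BANDWIDTH_CAP_KB)

def consensus_bandwidth(vote_values):
    """Median of the voted values; the lower median for an even count."""
    return median_low(vote_values)
```

For example, a rate limit of 5242880 and an observed bandwidth of 197632 bytes per second yield "w Bandwidth=193".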
|
||||
|
||||
3.2 Fetching descriptors on demand
|
||||
|
||||
As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
|
||||
and the onion key for a server.
|
||||
|
||||
A client already knows the IP address and the ports from the consensus
|
||||
documents, but without the onion key it will not be able to send
|
||||
CREATE/EXTEND cells for that server. Since the client needs the onion
|
||||
key it needs the descriptor.
|
||||
|
||||
If a client only downloaded a few descriptors in an observable manner
|
||||
then that would leak which nodes it was going to use.
|
||||
|
||||
This proposal suggests the following:
|
||||
|
||||
1) when connecting to a guard node for which the client does not
|
||||
yet have a cached descriptor it requests the descriptor it
|
||||
expects by hash. (The consensus document that the client holds
|
||||
has a hash for the descriptor of this server. We want exactly
|
||||
that descriptor, not a different one.)
|
||||
|
||||
It does that by sending a RELAY_REQUEST_SD cell.
|
||||
|
||||
A client MAY cache the descriptor of the guard node so that it does
|
||||
not need to request it every single time it contacts the guard.
|
||||
|
||||
2) when a client wants to extend a circuit that currently ends in
|
||||
server B to a new next server C, the client will send a
|
||||
RELAY_REQUEST_SD cell to server B. This cell contains in its
|
||||
payload the hash of a server descriptor the client would like
|
||||
to obtain (C's server descriptor). The server sends back the
|
||||
descriptor and the client can now form a valid EXTEND/CREATE cell
|
||||
encrypted to C's onion key.
|
||||
|
||||
Clients MUST NOT cache such descriptors. If they did they might
|
||||
leak that they already extended to that server at least once
|
||||
before.
|
||||
|
||||
Replies to RELAY_REQUEST_SD requests need to be padded to some
|
||||
constant upper limit in order to conceal a client's destination
|
||||
from anybody who might be counting cells/bytes.
|
||||
|
||||
RELAY_REQUEST_SD cells contain the following information:
|
||||
- hash of the server descriptor requested
|
||||
- hash of the identity digest of the server for which we want the SD
|
||||
- IP address and OR-port of the server for which we want the SD
|
||||
- padding factor - the number of cells we want the answer
|
||||
padded to.
|
||||
[XXX this just occurred to me and it might be smart. or it might
|
||||
be stupid. clients would learn the padding factor they want
|
||||
to use from the consensus document. This allows us to grow
|
||||
the replies later on should SDs become larger.]
|
||||
[XXX: figure out a decent padding size]
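Padding a reply to a fixed number of cells might be sketched like this. Everything here is an assumption made for illustration: the 498-byte relay payload size, zero-byte padding, and the function name are not specified by this proposal, and a real encoding would also need a length field so the client can strip the padding.

```python
CELL_PAYLOAD = 498  # assumed usable bytes per relay cell; not set by this proposal

def pad_reply(descriptor, padding_cells, cell_payload=CELL_PAYLOAD):
    """Split a descriptor into exactly padding_cells equal-sized payloads."""
    total = padding_cells * cell_payload
    if len(descriptor) > total:
        raise ValueError("descriptor does not fit in the padded reply")
    # Every reply has the same size, whatever the descriptor length.
    padded = descriptor + b"\0" * (total - len(descriptor))
    return [padded[i:i + cell_payload] for i in range(0, total, cell_payload)]
```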
|
||||
|
||||
3.3 Protocol versions
|
||||
|
||||
Server descriptors contain optional information of supported
|
||||
link-level and circuit-level protocols in the form of
|
||||
"opt protocols Link 1 2 Circuit 1". These are not currently needed
|
||||
and will probably eventually move into the "v" (version) line in
|
||||
the consensus. This proposal does not deal with them.
|
||||
|
||||
Similarly a server descriptor contains the version number of
|
||||
a Tor node. This information is already present in the consensus
|
||||
and is thus available to all clients immediately.
|
||||
|
||||
3.4 Exit selection
|
||||
|
||||
Currently finding an appropriate exit node for a user's request is
|
||||
easy for a client because it has complete knowledge of all the exit
|
||||
policies of all servers on the network.
|
||||
|
||||
The consensus document will once again be extended to contain the
|
||||
information required by clients. This information will be a summary
|
||||
of each node's exit policy. The exit policy summary will only contain
|
||||
the list of ports to which a node exits to most destination IP
|
||||
addresses.
|
||||
|
||||
A summary should claim a router exits to a specific TCP port if,
|
||||
ignoring private IP addresses, the exit policy indicates that the
|
||||
router would exit to this port to most IP addresses (all but at most two /8
|
||||
netblocks, or one /8 and a couple of /12s or any other combination).
|
||||
The exact algorithm used is this: Going through all exit policy items
|
||||
- ignore any accept that is not for all IP addresses ("*"),
|
||||
- ignore rejects for these netblocks (exactly, no subnetting):
|
||||
0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8,
|
||||
and 172.16.0.0/12,
|
||||
- for each reject count the number of IP addresses rejected against
|
||||
the affected ports,
|
||||
- once we hit an accept for all IP addresses ("*") add the ports in
|
||||
that policy item to the list of accepted ports, if they don't have
|
||||
more than 2^25 IP addresses (that's two /8 networks) counted
|
||||
against them (i.e. if the router exits to a port to everywhere but
|
||||
at most two /8 networks).
|
||||
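The steps above can be sketched in Python as follows. This is a minimal
illustration, not the authorities' actual code: the tuple layout
(action, address, first port, last port), the function name accepted_ports,
and the treatment of "reject *" as 2^32 rejected addresses are assumptions
made for the example.

```python
import ipaddress

# Netblocks whose rejects are ignored (exact match only, no subnetting).
PRIVATE_NETS = {ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
    "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12")}
MAX_REJECTED = 2 ** 25  # two /8 networks


def accepted_ports(policy):
    """Return the sorted ports a summary would list as accepted.

    policy is an ordered list of (action, address, first_port, last_port)
    tuples (hypothetical layout); address is "*" or a CIDR string.
    """
    rejected = [0] * 65536  # rejected-address count, indexed by port
    accepted = set()
    for action, addr, lo, hi in policy:
        if action == "accept":
            if addr != "*":
                continue  # ignore accepts that are not for all addresses
            for port in range(lo, hi + 1):
                if rejected[port] <= MAX_REJECTED:
                    accepted.add(port)
        else:
            if addr == "*":
                count = 2 ** 32  # assumption: "*" covers all of IPv4
            else:
                net = ipaddress.ip_network(addr)
                if net in PRIVATE_NETS:
                    continue  # ignore rejects for private netblocks
                count = net.num_addresses
            for port in range(lo, hi + 1):
                rejected[port] += count
    return sorted(accepted)
```

A port with exactly two /8 networks rejected still counts as accepted,
because the rule only excludes ports with more than 2^25 addresses counted
against them.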
|
||||
An exit policy summary will be included in votes and consensus as a
|
||||
new line attached to each exit node. The line will have the format
|
||||
"p" <space> "accept"|"reject" <portlist>
|
||||
where portlist is a comma separated list of single port numbers or
|
||||
portranges (e.g. "22,80-88,1024-6000,6667").
|
||||
|
||||
Whether the summary shows the list of accepted ports or the list of
|
||||
rejected ports depends on which list is shorter (has a shorter string
|
||||
representation). In case of ties we choose the list of accepted
|
||||
ports. As an exception to this rule an allow-all policy is
|
||||
represented as "accept 1-65535" instead of "reject " and a reject-all
|
||||
policy is similarly given as "reject 1-65535".
|
||||
|
||||
Summary items are compressed, that is instead of "80-88,89-100" there
|
||||
only is a single item of "80-100", similarly instead of "20,21" a
|
||||
summary will say "20-21".
|
||||
|
||||
Port lists are sorted in ascending order.
|
||||
|
||||
The maximum allowed length of a policy summary (including the "accept "
|
||||
or "reject ") is 1000 characters. If a summary exceeds that length we
|
||||
use an accept-style summary and list as much of the port list as is
|
||||
possible within these 1000 bytes.
|
||||
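As an illustration of the formatting rules in this section, the sketch
below compresses a set of accepted ports into ranges and picks the shorter
of the accept and reject representations. The function names compress and
summary_line are made up for this example, and the 1000-character cap is
left out for brevity.

```python
def compress(ports):
    """Return ports as a sorted, range-compressed string like "22,80-88"."""
    items = []
    start = prev = None
    for port in sorted(set(ports)):
        if prev is not None and port == prev + 1:
            prev = port  # extend the current run
            continue
        if start is not None:
            items.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = port
    if start is not None:
        items.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(items)


def summary_line(accepted):
    """Build a "p" line from the accepted ports (length cap not handled)."""
    accepted = set(accepted)
    rejected = set(range(1, 65536)) - accepted
    if not rejected:
        return "p accept 1-65535"  # allow-all exception
    if not accepted:
        return "p reject 1-65535"  # reject-all exception
    acc, rej = compress(accepted), compress(rejected)
    # Ties go to the list of accepted ports.
    return "p accept " + acc if len(acc) <= len(rej) else "p reject " + rej
```

For example, a router that accepts everything except port 25 would get
"p reject 25", since that string is shorter than the accept list.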
|
||||
3.4.1 Consensus selection
|
||||
|
||||
When building a consensus, authorities have to agree on a digest of
|
||||
the server descriptor to list in the router line for each router.
|
||||
This is documented in dir-spec section 3.4.
|
||||
|
||||
All authorities that listed that agreed upon descriptor digest in
|
||||
their vote should also list the same exit policy summary - or list
|
||||
none at all if the authority has not been upgraded to list that
|
||||
information in their vote.
|
||||
|
||||
If we have votes with matching server descriptor digest of which at
|
||||
least one of them has an exit policy then we distinguish between two cases:
|
||||
a) all authorities agree (or abstained) on the policy summary, and we
|
||||
use the exit policy summary that they all listed in their vote,
|
||||
b) something went wrong (or some authority is playing foul) and we
|
||||
have different policy summaries. In that case we pick the one
|
||||
that is most commonly listed in votes with the matching
|
||||
descriptor. We break ties in favour of the lexicographically larger
|
||||
vote.
|
||||
|
||||
If none of the votes with a matching server descriptor digest has
|
||||
an exit policy summary we use the most commonly listed one in all
|
||||
votes, breaking ties like in case b above.
|
||||
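The selection rule can be sketched as below. This is only an illustration:
the function name pick_summary is hypothetical, abstaining votes are
represented as None, and the tie-break compares the summary strings
themselves, which is one reading of "lexicographically larger vote".

```python
from collections import Counter


def pick_summary(summaries):
    """Pick the policy summary for the consensus from the given votes.

    summaries holds one summary string per vote, or None where the
    authority listed no summary.
    """
    counts = Counter(s for s in summaries if s is not None)
    if not counts:
        return None
    # Most commonly listed wins; ties go to the larger string.
    return max(counts, key=lambda s: (counts[s], s))
```

When all non-abstaining authorities agree (case a), the counter holds a
single entry and that entry is returned unchanged.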
|
||||
3.4.2 Client behaviour
|
||||
|
||||
When choosing an exit node for a specific request a Tor client will
|
||||
choose from the list of nodes that exit to the requested port as given
|
||||
by the consensus document. If a client has additional knowledge (like
|
||||
cached full descriptors) that indicates the so chosen exit node will
|
||||
reject the request then it MAY use that knowledge (or not include such
|
||||
nodes in the selection to begin with). However, clients MUST NOT use
|
||||
nodes that do not list the port as accepted in the summary (but for
|
||||
which they know that the node would exit to that address from other
|
||||
sources, like a cached descriptor).
|
||||
|
||||
An exception to this is exit enclave behaviour: A client MAY use the
|
||||
node at a specific IP address to exit to any port on the same address
|
||||
even if that node is not listed as exiting to the port in the summary.
|
||||
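A client-side check against the summary could look like the following
sketch. The helper names parse_p_line and summary_allows are assumptions
for illustration; the line format is the "p" line described in 3.4.

```python
def parse_p_line(line):
    """Split a "p accept|reject <portlist>" line into action and ranges."""
    keyword, action, portlist = line.split(" ", 2)
    if keyword != "p" or action not in ("accept", "reject"):
        raise ValueError("not a policy summary line")
    ranges = []
    for item in portlist.split(","):
        lo, _, hi = item.partition("-")
        ranges.append((int(lo), int(hi or lo)))  # single port: lo == hi
    return action, ranges


def summary_allows(line, port):
    """Return True if the summary lists the port as accepted."""
    action, ranges = parse_p_line(line)
    listed = any(lo <= port <= hi for lo, hi in ranges)
    return listed if action == "accept" else not listed
```

A client would only consider nodes for which summary_allows returns True
for the requested port; the exit enclave exception is handled separately.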
|
||||
4. Migration
|
||||
|
||||
4.1 Consensus document changes.
|
||||
|
||||
The consensus will need to include
|
||||
- bandwidth information (see 3.1)
|
||||
- exit policy summaries (3.4)
|
||||
|
||||
A new consensus method (number TBD) will be chosen for this.
|
||||
|
||||
5. Future possibilities
|
||||
|
||||
This proposal still requires that all servers have the descriptors of
|
||||
every other node in the network in order to answer RELAY_REQUEST_SD
|
||||
cells. These cells are sent when a circuit is extended from ending at
|
||||
node B to a new node C. In that case B would have to answer a
|
||||
RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).
|
||||
|
||||
In order to answer that request B obviously needs a copy of C's server
|
||||
descriptor. The RELAY_REQUEST_SD cell already has all the info that
|
||||
B needs to contact C so it can ask about the descriptor before passing it
|
||||
back to the client.
|
||||
|
@ -1,279 +0,0 @@
|
||||
Filename: 142-combine-intro-and-rend-points.txt
|
||||
Title: Combine Introduction and Rendezvous Points
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Karsten Loesing, Christian Wilms
|
||||
Created: 27-Jun-2008
|
||||
Status: Dead
|
||||
|
||||
Change history:
|
||||
|
||||
27-Jun-2008 Initial proposal for or-dev
|
||||
04-Jul-2008 Give first security property the new name "Responsibility"
|
||||
and change new cell formats according to rendezvous protocol
|
||||
version 3 draft.
|
||||
19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of
|
||||
circuits between multiple clients is not supported by Tor.
|
||||
|
||||
Overview:
|
||||
|
||||
Establishing a connection to a hidden service currently involves two Tor
|
||||
relays, introduction and rendezvous point, and 10 more relays distributed
|
||||
over four circuits to connect to them. The introduction point is
|
||||
established in the mid-term by a hidden service to transfer introduction
|
||||
requests from client to the hidden service. The rendezvous point is set
|
||||
up by the client for a single hidden service request and actually
|
||||
transfers end-to-end encrypted application data between client and hidden
|
||||
service.
|
||||
|
||||
There are some reasons for separating the two roles of introduction and
|
||||
rendezvous point: (1) Responsibility: A relay shall not be made
|
||||
responsible that it relays data for a certain hidden service; in the
|
||||
original design as described in [1] an introduction point relays no
|
||||
application data, and a rendezvous point neither knows the hidden
|
||||
service nor can it decrypt the data. (2) Scalability: The hidden service
|
||||
shall not have to maintain a number of open circuits proportional to the
|
||||
expected number of client requests. (3) Attack resistance: The effect of
|
||||
an attack on the only visible parts of a hidden service, its introduction
|
||||
points, shall be as small as possible.
|
||||
|
||||
However, elimination of a separate rendezvous connection as proposed by
|
||||
Øverlier and Syverson [2] is the most promising approach to improve the
|
||||
delay in connection establishment. From all substeps of connection
|
||||
establishment extending a circuit by only a single hop is responsible for
|
||||
a major part of delay. Reducing on-demand circuit extensions from two to
|
||||
one results in a decrease of mean connection establishment times from 39
|
||||
to 29 seconds [3]. Particularly, eliminating the delay on hidden-service
|
||||
side allows the client to better observe progress of connection
|
||||
establishment, thus allowing it to use smaller timeouts. Proposal 114
|
||||
introduced new introduction keys for introduction points and provides for
|
||||
user authorization data in hidden service descriptors; it will be shown
|
||||
in this proposal that introduction keys in combination with new
|
||||
introduction cookies provide for the first security property
|
||||
responsibility. Further, eliminating the need for a separate introduction
|
||||
connection benefits the overall network load by decreasing the number of
|
||||
circuit extensions. After all, having only one connection between client
|
||||
and hidden service reduces the overall protocol complexity.
|
||||
|
||||
Design:
|
||||
|
||||
1. Hidden Service Configuration
|
||||
|
||||
Hidden services should be able to choose whether they would like to use
|
||||
this protocol. This might be opt-in for 0.2.1.x and opt-out for later
|
||||
major releases.
|
||||
|
||||
2. Contact Point Establishment
|
||||
|
||||
When preparing a hidden service, a Tor client selects a set of relays to
|
||||
act as contact points instead of introduction points. The contact point
|
||||
combines both roles of introduction and rendezvous point as proposed in
|
||||
[2]. The only requirement for a relay to be picked as contact point is
|
||||
its capability of performing this role. This can be determined from the
|
||||
Tor version number that needs to be equal to or higher than the first
|
||||
version that implements this proposal.
|
||||
|
||||
The easiest way to implement establishment of contact points is to
|
||||
introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes
|
||||
version 2 ESTABLISH_INTRO cells as requests to establish a contact point
|
||||
rather than an introduction point.
|
||||
|
||||
V Format byte: set to 255 [1 octet]
|
||||
V Version byte: set to 2 [1 octet]
|
||||
KLEN Key length [2 octets]
|
||||
PK Public introduction key [KLEN octets]
|
||||
HS Hash of session info [20 octets]
|
||||
SIG Signature of above information [variable]
|
||||
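The field layout above can be illustrated with a short packing sketch. The
function name pack_establish_intro_v2 and the sign callback are
hypothetical, and network byte order for KLEN is an assumption; the
proposal itself only fixes the field order and sizes.

```python
import struct


def pack_establish_intro_v2(public_key, session_hash, sign):
    """Serialize the v2 ESTABLISH_INTRO fields in the order listed above.

    sign is a caller-supplied function that returns the signature bytes
    over everything that precedes the SIG field.
    """
    if len(session_hash) != 20:
        raise ValueError("HS must be exactly 20 octets")
    # Format byte 255, version byte 2, then KLEN as a 2-octet integer.
    body = struct.pack("!BBH", 255, 2, len(public_key))
    body += public_key + session_hash
    return body + sign(body)
```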
|
||||
The hidden service does not create a fixed number of contact points, like
|
||||
3 in the current protocol. It uses a minimum of 3 contact points, but
|
||||
increases this number depending on the history of client requests within
|
||||
the last hour. The hidden service also increases this number depending on
|
||||
the frequency of failing contact points in order to defend against
|
||||
attacks on its contact points. When client authorization as described in
|
||||
proposal 121 is used, a hidden service can also use the number of
|
||||
authorized clients as first estimate for the required number of contact
|
||||
points.
|
||||
|
||||
3. Hidden Service Descriptor Creation
|
||||
|
||||
A hidden service needs to issue a fresh introduction cookie for each
|
||||
established introduction point. By requiring clients to use this cookie
|
||||
in a later connection establishment, an introduction point cannot access
|
||||
the hidden service that it works for. Together with the fresh
|
||||
introduction key that was introduced in proposal 114, this reduces
|
||||
responsibility of a contact point for a specific hidden service.
|
||||
|
||||
The v2 hidden service descriptor format contains an
|
||||
"intro-authentication" field that may contain introduction-point specific
|
||||
keys. The hidden service creates a random string, comparable to the
|
||||
rendezvous cookie, and includes it in the descriptor as introduction
|
||||
cookie for auth-type "1". By convention, clients recognize existence of
|
||||
auth-type 1 as possibility to connect to a hidden service via a contact
|
||||
point rather than an introduction point. Older clients that do not
|
||||
understand this new protocol simply ignore that cookie.
|
||||
|
||||
4. Connection Establishment
|
||||
|
||||
When establishing a connection to a hidden service a client learns about
|
||||
the capability of using the new protocol from the hidden service
|
||||
descriptor. It may choose whether to use this new protocol or not,
|
||||
whereas older clients cannot understand the new capability and can only
|
||||
use the current protocol. Clients using version 0.2.1.x should be able to
|
||||
opt-in for using the new protocol, which should change to opt-out for
|
||||
later major releases.
|
||||
|
||||
When using the new capability the client creates a v2 INTRODUCE1 cell
|
||||
that extends an unversioned INTRODUCE1 cell by adding the content of an
|
||||
ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the
|
||||
new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point,
|
||||
because unversioned and versioned INTRODUCE1 cells are indistinguishable:
|
||||
|
||||
Cleartext
|
||||
V Version byte: set to 2 [1 octet]
|
||||
PK_ID Identifier for Bob's PK [20 octets]
|
||||
RC Rendezvous cookie [20 octets]
|
||||
Encrypted to introduction key:
|
||||
VER Version byte: set to 3. [1 octet]
|
||||
AUTHT The auth type that is supported [1 octet]
|
||||
AUTHL Length of auth data [2 octets]
|
||||
AUTHD Auth data [variable]
|
||||
RC Rendezvous cookie [20 octets]
|
||||
g^x Diffie-Hellman data, part 1 [128 octets]
|
||||
|
||||
The cleartext part contains the rendezvous cookie that the contact point
|
||||
remembers just as a rendezvous point would do.
|
||||
|
||||
The encrypted part contains the introduction cookie as auth data for the
|
||||
auth type 1. The rendezvous cookie is contained as before, but there is
|
||||
no further rendezvous point information, as there is no separate
|
||||
rendezvous point.
|
||||
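To make the cell layout concrete, the sketch below assembles a v2
INTRODUCE1 payload with the introduction cookie as auth data for auth type
1. The function name pack_introduce1_v2 and the encrypt callback (standing
in for encryption to the introduction key) are assumptions, as is network
byte order for AUTHL.

```python
import struct


def pack_introduce1_v2(pk_id, rend_cookie, intro_cookie, dh_public, encrypt):
    """Build the cleartext part followed by the encrypted part."""
    for name, value, size in (("PK_ID", pk_id, 20),
                              ("RC", rend_cookie, 20),
                              ("g^x", dh_public, 128)):
        if len(value) != size:
            raise ValueError(f"{name} must be {size} octets")
    # Cleartext: version 2, key identifier, rendezvous cookie.
    cleartext = struct.pack("!B", 2) + pk_id + rend_cookie
    # To be encrypted: version 3, auth type 1, auth length, auth data,
    # rendezvous cookie, first half of the Diffie-Hellman handshake.
    plaintext = struct.pack("!BBH", 3, 1, len(intro_cookie))
    plaintext += intro_cookie + rend_cookie + dh_public
    return cleartext + encrypt(plaintext)
```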
|
||||
5. Rendezvous Establishment
|
||||
|
||||
The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a
|
||||
request to be used in the new protocol. It remembers the contained
|
||||
rendezvous cookie, replies to the client with an INTRODUCE_ACK cell
|
||||
(omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted
|
||||
part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.
|
||||
|
||||
6. Introduction at Hidden Service
|
||||
|
||||
The hidden service recognizes an INTRODUCE2 cell containing an
|
||||
introduction cookie as authorization data. In this case, it does not
|
||||
extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell
|
||||
directly back to its contact point as usual.
|
||||
|
||||
7. Rendezvous at Contact Point
|
||||
|
||||
The contact point processes a RENDEZVOUS1 cell just as a rendezvous point
|
||||
does. The only difference is that the hidden-service-side circuit is not
|
||||
exclusive for the client connection, but shared among multiple client
|
||||
connections.
|
||||
|
||||
[Tor does not allow sharing of a single circuit among multiple client
|
||||
connections easily. We need to think about a smart and efficient way to
|
||||
implement this. Comment by Nick. -KL]
|
||||
|
||||
Security Implications:
|
||||
|
||||
(1) Responsibility
|
||||
|
||||
One of the original reasons for the separation of introduction and
|
||||
rendezvous points is that a relay shall not be made responsible that it
|
||||
relays data for a certain hidden service. In the current design an
|
||||
introduction point relays no application data and a rendezvous point
|
||||
neither knows the hidden service nor can it decrypt the data.
|
||||
|
||||
This property is also fulfilled in this new design. A contact point only
|
||||
learns a fresh introduction key instead of the hidden service key, so
|
||||
that it cannot recognize a hidden service. Further, the introduction
|
||||
cookie, which is unknown to the contact point, prevents it from accessing
|
||||
the hidden service itself. The only way for a contact point to access a
|
||||
hidden service is to look up whether it is contained in the descriptors
|
||||
of known hidden services. A contact point cannot directly be made
|
||||
responsible for which hidden service it is working. In addition to that,
|
||||
it cannot learn the data that it transfers, because all communication
|
||||
between client and hidden service is end-to-end encrypted.
|
||||
|
||||
(2) Scalability
|
||||
|
||||
Another goal of the existing hidden service protocol is that a hidden
|
||||
service does not have to maintain a number of open circuits proportional
|
||||
to the expected number of client requests. The rationale behind this is
|
||||
better scalability.
|
||||
|
||||
The new protocol eliminates the need for a hidden service to extend
|
||||
circuits on demand, which has a positive effect on circuit establishment
|
||||
times and overall network load. The solution presented here to establish
|
||||
a number of contact points proportional to the history of connection
|
||||
requests reduces the number of circuits to a minimum number that fits the
|
||||
hidden service's needs.
|
||||
|
||||
(3) Attack resistance
|
||||
|
||||
The third goal of separating introduction and rendezvous points is to
|
||||
limit the effect of an attack on the only visible parts of a hidden
|
||||
service which are the contact points in this protocol.
|
||||
|
||||
In theory, the new protocol is more vulnerable to this attack. An
|
||||
attacker who can take down a contact point does not only eliminate an
|
||||
access point to the hidden service, but also breaks current client
|
||||
connections to the hidden service using that contact point.
|
||||
|
||||
Øverlier and Syverson proposed the concept of valet nodes as additional
|
||||
safeguard for introduction/contact points [4]. Unfortunately, this
|
||||
increases hidden service protocol complexity conceptually and from an
|
||||
implementation point of view. Therefore, it is not included in this
|
||||
proposal.
|
||||
|
||||
However, in practice attacking a contact point (or introduction point) is
|
||||
not as rewarding as it might appear. The cost for a hidden service to set
|
||||
up a new contact point and publish a new hidden service descriptor is
|
||||
minimal compared to the efforts necessary for an attacker to take a Tor
|
||||
relay down. As a countermeasure to further frustrate this attack, the
|
||||
hidden service raises the number of contact points as a function of
|
||||
previous contact point failures.
|
||||
|
||||
Further, the probability of breaking client connections due to attacking
|
||||
a contact point is minimal. It can be assumed that the probability of one
|
||||
of the other five involved relays in a hidden service connection failing
|
||||
or being shut down is higher than that of a successful attack on a
|
||||
contact point.
|
||||
|
||||
(4) Resistance against Locating Attacks
|
||||
|
||||
Clients are no longer able to force a hidden service to create or extend
|
||||
circuits. This further reduces an attacker's capabilities of locating a
|
||||
hidden server as described by Øverlier and Syverson [5].
|
||||
|
||||
Compatibility:
|
||||
|
||||
The presented protocol does not raise compatibility issues with current
|
||||
Tor versions. New relay versions support both the existing and the
|
||||
proposed protocol as introduction/rendezvous/contact points. A contact
|
||||
point acts as introduction point simultaneously. Hidden services and
|
||||
clients can opt-in to use the new protocol which might change to opt-out
|
||||
some time in the future.
|
||||
|
||||
References:
|
||||
|
||||
[1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The
|
||||
Second-Generation Onion Router. In the Proceedings of the 13th USENIX
|
||||
Security Symposium, August 2004.
|
||||
|
||||
[2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity
|
||||
of Tor Circuit Establishment and Hidden Services. In the Proceedings of
|
||||
the Seventh Workshop on Privacy Enhancing Technologies (PET 2007),
|
||||
Ottawa, Canada, June 2007.
|
||||
|
||||
[3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at
|
||||
Better Performance, diploma thesis, June 2008, University of Bamberg.
|
||||
|
||||
[4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden
|
||||
Servers with a Personal Touch. In the Proceedings of the Sixth Workshop
|
||||
on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.
|
||||
|
||||
[5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the
|
||||
Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
|
||||
|
@ -1,196 +0,0 @@
|
||||
Filename: 143-distributed-storage-improvements.txt
|
||||
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Karsten Loesing
|
||||
Created: 28-Jun-2008
|
||||
Status: Open
|
||||
Target: 0.2.1.x
|
||||
|
||||
Change history:
|
||||
|
||||
28-Jun-2008 Initial proposal for or-dev
|
||||
|
||||
Overview:
|
||||
|
||||
An evaluation of the distributed storage for Tor hidden service
|
||||
descriptors and subsequent discussions have brought up a few improvements
|
||||
to proposal 114. All improvements are backwards compatible to the
|
||||
implementation of proposal 114.
|
||||
|
||||
Design:
|
||||
|
||||
1. Report Bad Directory Nodes
|
||||
|
||||
Bad hidden service directory nodes could deny existence of previously
|
||||
stored descriptors. A bad directory node that does this with all stored
|
||||
descriptors causes harm to the distributed storage in general, but
|
||||
replication will cope with this problem in most cases. However, an
|
||||
adversary that attempts to make a specific hidden service unavailable by
|
||||
running relays that become responsible for all of a service's
|
||||
descriptors poses a more serious threat. The distributed storage needs to
|
||||
defend against this attack by detecting and removing bad directory nodes.
|
||||
|
||||
As a countermeasure hidden services try to download their descriptors
|
||||
every hour at random times from the hidden service directories that are
|
||||
responsible for storing it. If a directory node replies with 404 (Not
|
||||
found), the hidden service reports the supposedly bad directory node to
|
||||
a random selection of half of the directory authorities (with version
|
||||
numbers equal to or higher than the first version that implements this
|
||||
proposal). The hidden service posts a complaint message using HTTP 'POST'
|
||||
to a URL "/tor/rendezvous/complain" with the following message format:
|
||||
|
||||
"hidden-service-directory-complaint" identifier NL
|
||||
|
||||
[At start, exactly once]
|
||||
|
||||
The identifier of the hidden service directory node to be
|
||||
investigated.
|
||||
|
||||
"rendezvous-service-descriptor" descriptor NL
|
||||
|
||||
[At end, exactly once]
|
||||
|
||||
The hidden service descriptor that the supposedly bad directory node
|
||||
does not serve.
|
||||
|
||||
The directory authority checks if the descriptor is valid and the hidden
|
||||
service directory responsible for storing it. It waits for a random time
|
||||
of up to 30 minutes before posting the descriptor to the hidden service
|
||||
directory. If the publication is acknowledged, the directory authority
|
||||
waits another random time of up to 30 minutes before attempting to
|
||||
request the descriptor that it has posted. If the directory node replies
|
||||
with 404 (Not found), it will be blacklisted for being a hidden service
|
||||
directory node for the next 48 hours.
|
||||
|
||||
A blacklisted hidden service directory is assigned the new flag BadHSDir
|
||||
instead of the HSDir flag in the vote that a directory authority creates.
|
||||
In a consensus a relay is only assigned a HSDir flag if the majority of
|
||||
votes contains a HSDir flag and no more than one third of votes contains
|
||||
a BadHSDir flag. As a result, clients do not have to learn about the
|
||||
BadHSDir flag. A blacklisted directory node will simply not be assigned
|
||||
the HSDir flag in the consensus.
|
||||
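The flag rule for the consensus can be written down as a small predicate.
This is a sketch under the assumption that each vote is represented as a
set of flag names for the relay; the function name consensus_hsdir is
hypothetical.

```python
def consensus_hsdir(votes):
    """Decide whether a relay gets the HSDir flag in the consensus.

    votes is a list with one set of flags per authority vote.
    """
    total = len(votes)
    hsdir = sum("HSDir" in flags for flags in votes)
    bad = sum("BadHSDir" in flags for flags in votes)
    # A majority must vote HSDir, and at most one third may vote BadHSDir.
    return 2 * hsdir > total and 3 * bad <= total
```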
|
||||
In order to prevent an attacker from setting up new nodes as replacement
|
||||
for blacklisted directory nodes, all directory nodes in the same /24
|
||||
subnet are blacklisted, too. Furthermore, if two or more directory nodes
|
||||
are blacklisted in the same /16 subnet concurrently, all other directory
|
||||
nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
|
||||
most 48 hours.
|
||||
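The subnet rules can be sketched with Python's ipaddress module. The
function name blacklisted is made up for this example, and it assumes that
only nodes that failed the authority's own check count toward the "two or
more in the same /16" threshold; the text leaves that detail open.

```python
import ipaddress
from collections import Counter


def blacklisted(node_ips, failed_ips):
    """Return the directory node IPs that end up blacklisted.

    failed_ips are the nodes that failed the authority's check directly.
    """
    def net(ip, prefix):
        return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

    failed = set(failed_ips)
    bad24 = {net(ip, 24) for ip in failed}
    per16 = Counter(net(ip, 16) for ip in failed)
    bad16 = {n for n, count in per16.items() if count >= 2}
    return {ip for ip in node_ips
            if net(ip, 24) in bad24 or net(ip, 16) in bad16}
```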
|
||||
2. Publish Fewer Replicas
|
||||
|
||||
The evaluation has shown that the probability of a directory node to
|
||||
serve a previously stored descriptor is 85.7% (more precisely, this is
|
||||
the 0.001-quantile of the empirical distribution with the rationale that
|
||||
it holds for 99.9% of all empirical cases). If descriptors are replicated
|
||||
to x directory nodes, the probability of at least one of the replicas to
|
||||
be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
|
||||
overall availability of 99.9%, x = 3.55 replicas need to be stored. From
|
||||
this follows that 4 replicas are sufficient, rather than the currently
|
||||
stored 6 replicas.
|
||||
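The replica count follows from solving 1 - (1 - p)^x >= target for x. A
short check of the arithmetic, with the function name replicas_needed
chosen for this example:

```python
import math


def replicas_needed(p_single, target):
    """Solve 1 - (1 - p_single) ** x = target for x."""
    return math.log(1 - target) / math.log(1 - p_single)


x = replicas_needed(0.857, 0.999)
print(round(x, 2), math.ceil(x))  # prints: 3.55 4
```

With 3 replicas the availability is about 99.7%, which misses the target;
with 4 it is about 99.96%.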
|
||||
Further, the current design stores 2 sets of descriptors on 3 directory
|
||||
nodes with consecutive identities. Originally, this was meant to
|
||||
facilitate replication between directory nodes, which has not been and
|
||||
will not be implemented (the selection criterion of 24 hours uptime does
|
||||
not make it necessary). As a result, storing descriptors on directory
|
||||
nodes with consecutive identities is not required. In fact it should be
|
||||
avoided to enable an attacker to create "black holes" in the identifier
|
||||
ring.
|
||||
|
||||
Hidden services should store their descriptors on 4 non-consecutive
|
||||
directory nodes, and clients should request descriptors from these
|
||||
directory nodes only. For compatibility reasons, hidden services also
|
||||
store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
|
||||
clients will be able to retrieve 4 out of 6 descriptors, but will fail
|
||||
for the remaining 2 descriptors, which is sufficient for reliability. As
|
||||
soon as 0.2.0.x is deprecated, hidden services can stop publishing the
|
||||
additional 2 replicas.
|
||||
|
||||
3. Change Default Value of Being Hidden Service Directory
|
||||
|
||||
The requirements for becoming a hidden service directory node are an open
|
||||
directory port and an uptime of at least 24 hours. The evaluation has
|
||||
shown that there are 300 hidden service directory candidates in the mean,
|
||||
but only 6 of them are configured to act as hidden service directories.
|
||||
This is bad, because those 6 nodes need to serve a large share of all
|
||||
hidden service descriptors. Optimally, there should be hundreds of hidden
|
||||
service directories. Having a large number of 0.2.1.x directory nodes
|
||||
also has a positive effect on 0.2.0.x hidden services and clients.
|
||||
|
||||
Therefore, the new default of HidServDirectoryV2 should be 1, so that a
|
||||
Tor relay that has an open directory port automatically accepts and
|
||||
serves v2 hidden service descriptors. A relay operator can still opt out of
|
||||
running a hidden service directory by changing HidServDirectoryV2 to 0.
|
||||
The additional bandwidth requirements for running a hidden service
|
||||
directory node in addition to being a directory cache are negligible.
|
||||
|
||||
4. Make Descriptors Persistent on Directory Nodes
|
||||
|
||||
Hidden service directories that are restarted by their operators or after
|
||||
a failure will not be selected as hidden service directories within the
|
||||
next 24 hours. However, some clients might still think that these nodes
|
||||
are responsible for certain descriptors, because they work on the basis
|
||||
of network consensuses that are up to three hours old. The directory
|
||||
nodes should be able to serve the previously received descriptors to
|
||||
these clients. Therefore, directory nodes make all received descriptors
|
||||
persistent and load previously received descriptors on startup.
|
||||
|
||||
5. Store and Serve Descriptors Regardless of Responsibility
|
||||
|
||||
Currently, directory nodes only accept descriptors for which they think
|
||||
they are responsible. This may lead to problems when a directory node
|
||||
uses an older or newer network consensus than hidden service or client
|
||||
or when a directory node has been restarted recently. In fact, there are
|
||||
no security issues in storing or serving descriptors for which a
|
||||
directory node thinks it is not responsible. To the contrary, doing so
|
||||
may improve reliability in border cases. As a result, a directory node
|
||||
does not pay attention to responsibility when receiving a publication or
|
||||
fetch request, but stores or serves the requested descriptor. Likewise,
|
||||
the directory node does not remove descriptors when it thinks it is not
|
||||
responsible for them any more.
|
||||
|
||||
6. Avoid Periodic Descriptor Re-Publication
|
||||
|
||||
In the current implementation a hidden service re-publishes its
|
||||
descriptor either when its content changes or an hour elapses. However,
|
||||
the evaluation has shown that failures of hidden service directory nodes,
|
||||
i.e. of nodes that have not failed within the last 24 hours, are very
|
||||
rare. Together with making descriptors persistent on directory nodes,
|
||||
there is no necessity to re-publish descriptors hourly.
|
||||
|
||||
The only two events leading to descriptor re-publication should be a
|
||||
change of the descriptor content and a new directory node becoming
|
||||
responsible for the descriptor. Hidden services should therefore consider
|
||||
re-publication every time they learn about a new network consensus
|
||||
instead of hourly.
|
||||
|
||||
7. Discard Expired Descriptors
|
||||
|
||||
The current implementation lets directory nodes keep a descriptor for two
|
||||
days before discarding it. However, with the v2 design, descriptors are
|
||||
only valid for at most one day. Directory nodes should determine the
|
||||
validity of stored descriptors and discard them one hour after they have
|
||||
expired (to compensate for wrong clocks on clients).
|
||||
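A directory node's cleanup pass might look like the sketch below. The
dictionary layout with a "valid_until" timestamp in seconds and the
function name discard_expired are assumptions for illustration only.

```python
GRACE_SECONDS = 60 * 60  # keep descriptors one hour past expiry


def discard_expired(descriptors, now):
    """Return only the descriptors that are still worth serving.

    descriptors maps a descriptor ID to a dict with a "valid_until"
    timestamp (seconds); now is the current time in the same unit.
    """
    return {desc_id: desc for desc_id, desc in descriptors.items()
            if now < desc["valid_until"] + GRACE_SECONDS}
```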
|
||||
8. Shorten Client-Side Descriptor Fetch History
|
||||
|
||||
When clients try to download a hidden service descriptor, they memorize
|
||||
fetch requests to directory nodes for up to 15 minutes. This allows them
|
||||
to request all replicas of a descriptor to avoid bad or failing directory
|
||||
nodes, but without querying the same directory node twice.
|
||||
|
||||
The downside is that a client that has requested a descriptor without
|
||||
success, will not be able to find a hidden service that has been started
|
||||
during the following 15 minutes after the client's last request.
|
||||
|
||||
This can be improved by shortening the fetch history to only 5 minutes.
|
||||
This time should be sufficient to complete requests for all replicas of a
|
||||
descriptor, but without ending in an infinite request loop.
|
||||
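The shortened fetch history can be sketched as a small bookkeeping class.
The class and method names are hypothetical; entries older than 5 minutes
are forgotten, so a directory node becomes eligible again after that time.

```python
FETCH_HISTORY_SECONDS = 5 * 60


class FetchHistory:
    """Remember recent descriptor requests to avoid asking a node twice."""

    def __init__(self):
        self._last = {}  # (descriptor ID, directory node) -> request time

    def record(self, desc_id, hsdir, now):
        self._last[(desc_id, hsdir)] = now

    def candidates(self, desc_id, hsdirs, now):
        """Return the directory nodes not queried within the history window."""
        self._last = {key: t for key, t in self._last.items()
                      if now - t < FETCH_HISTORY_SECONDS}
        return [d for d in hsdirs if (desc_id, d) not in self._last]
```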
|
||||
Compatibility:
|
||||
|
||||
All proposed improvements are compatible with the currently implemented
|
||||
design as described in proposal 114.
|
||||
|
@ -1,165 +0,0 @@
|
||||
Filename: 144-enforce-distinct-providers.txt
|
||||
Title: Increase the diversity of circuits by detecting nodes belonging to the
|
||||
same provider
|
||||
Author: Mfr
|
||||
Created: 2008-06-15
|
||||
Status: Draft
|
||||
|
||||
Overview:
|
||||
|
||||
Increase network security by reducing the ability of relay operators or
|
||||
ISPs, whether monitoring on their own or under requisition, to observe a
|
||||
large part of Tor traffic and break circuit privacy. This is a way to
|
||||
increase the diversity of circuits without killing network performance.
|
||||
|
||||
Motivation:
|
||||
|
||||
Since Roger and Nick's 2004 publication about diversity [1], very fast
|
||||
Tor relays have become concentrated among half a dozen providers, each
|
||||
controlling the traffic of some dozens of routers [2].
|
||||
|
||||
Similarly, the spread of clonable VMs billed by the hour allows
starting, within a few minutes and at small cost, a set of very
high-speed relays that can attract a large amount of traffic within a
few hours. That traffic can then be analyzed, increasing the
vulnerability of the network.
|
||||
|
||||
Whether they are ISPs or domU providers, these usually hold several
class B IP ranges. The existing EnforceDistinctSubnets restriction,
which automatically excludes nodes in the same class B subnet, is
therefore only partially effective. By contrast, a restriction at the
class A level would be too restrictive.
|
||||
|
||||
Therefore it seems necessary to consider another approach.
|
||||
|
||||
Proposal:
|
||||
|
||||
Add a provider check based on the AS number, which the router adds to
its descriptor, which the Directory Authorities verify, and which is
used like the declarative family field when creating circuits.
|
||||
|
||||
Design:
|
||||
|
||||
Step 1 :
|
||||
|
||||
Add to the router descriptor provider information obtained by request [4]
by the router itself.
|
||||
|
||||
"provider" name NL
|
||||
|
||||
'name' is the AS number of the router, formatted like this:
'ASxxxxxx' where AS is fixed and xxxxxx is the AS number,
left aligned (e.g. AS98304, AS4096, AS1). If the AS number
is missing, the class A network number is used instead:
'ANxxx' where AN is fixed and xxx is the first octet of
the IP (e.g. AN1 for the IP 1.1.1.2). An 'L' value is set
if it's a local network IP.
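
The naming rule above can be sketched as follows. This is only an
illustration under stated assumptions: the function name provider_name is
hypothetical, the AS number is assumed to have been looked up elsewhere
(for example by the DNS request in [4]), only IPv4 is handled, and "local
network IP" is interpreted here as a private, loopback, or link-local
address.

```python
import ipaddress
from typing import Optional


def provider_name(ip: str, as_number: Optional[int]) -> str:
    """Build the value of the "provider" descriptor line for a router.

    `as_number` is None when no AS number could be determined.
    """
    addr = ipaddress.IPv4Address(ip)
    # Assumption: "local network IP" means private, loopback or link-local.
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "L"
    if as_number is not None:
        return f"AS{as_number}"
    # Fall back to the class A network number, i.e. the first octet.
    first_octet = int(addr) >> 24
    return f"AN{first_octet}"
```

For the examples in the text, provider_name("1.1.1.2", None) yields "AN1",
and a router in AS 98304 yields "AS98304".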
|
||||
|
||||
If two ORs list one another in their "provider" entries,
|
||||
then OPs should treat them as a single OR for the purpose
|
||||
of path selection.
|
||||
|
||||
For example, if node A's descriptor contains "provider B",
|
||||
and node B's descriptor contains "provider A", then node A
|
||||
and node B should never be used on the same circuit.
|
||||
|
||||
Add the corresponding config option in torrc
|
||||
|
||||
EnforceDistinctProviders set to 1 by default.
|
||||
Permit building circuits with relays in the same provider
|
||||
if set to 0.
|
||||
Following proposal 135, if TestingTorNetwork is set,
EnforceDistinctProviders needs to be unset.
|
||||
|
||||
Control of the AS numbers by Directory Authorities
|
||||
|
||||
The Directory Authorities check the AS number in each newly uploaded
node descriptor.
|
||||
|
||||
If the node runs an old version, this test is
bypassed.
|
||||
|
||||
If the AS number obtained by request differs from the one in
the descriptor, the router is flagged as non-Valid by the
testing Authority for the voting process.
|
||||
|
||||
Step 2: When a 'significant number of nodes' among valid routers are
generating descriptors with provider information.
|
||||
|
||||
Add functionality for the circuit user to get missing provider
information by DNS request:
|
||||
|
||||
During circuit building, the OP first applies the
family check and EnforceDistinctSubnets directives for
performance. Then, if provider info is needed and
missing from a router descriptor, it tries to get the AS
provider info by DNS request [4]. This information could
be cached. AN (class A number) is never generated
during this process, to prevent DNS blocking problems. If
the DNS request fails, ignore the failure and continue
building the circuit.
|
||||
|
||||
Step 3: When the 'whole majority' of valid Tor clients are performing
the DNS request.
|
||||
|
||||
Older versions are deprecated and marked as non-Valid.
|
||||
|
||||
EnforceDistinctProviders replaces the EnforceDistinctSubnets functionality.
|
||||
|
||||
EnforceDistinctSubnets is removed.
|
||||
|
||||
Functionalities deployed in step 2 are removed.
|
||||
|
||||
Security implications:
|
||||
|
||||
This provider measure will increase the number of provider
addresses that an attacker must use in order to carry out
traffic analysis.
|
||||
|
||||
Compatibility:
|
||||
|
||||
The presented protocol does not raise compatibility issues
|
||||
with current Tor versions. The compatibility is preserved by
|
||||
implementing this functionality in 3 steps, giving time to
|
||||
network users to upgrade clients and routers.
|
||||
|
||||
Performance and scalability notes:
|
||||
|
||||
Requiring a different provider for every router could slightly
reduce performance if the circuit is too long.
|
||||
|
||||
During step 2, getting missing provider information could increase
path building time, so it should have a timeout.
|
||||
|
||||
Possible Attacks/Open Issues/Some thinking required:
|
||||
|
||||
This proposal seems to be compatible with proposal 135 Simplify
|
||||
Configuration of Private Tor Networks.
|
||||
|
||||
This proposal does not address owners of multiple ASes or traffic
monitoring attacks by top-level providers [5].
|
||||
|
||||
Unresolved AS numbers are treated as a Class A network. Perhaps they
should be marked as invalid instead, but there were only five such
items at the last check; see [2].
|
||||
|
||||
Need to define what's a 'significant number of nodes' and
|
||||
'whole majority' ;-)
|
||||
|
||||
References:
|
||||
[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger
|
||||
Dingledine.
|
||||
In the Proceedings of the Workshop on Privacy in the Electronic Society
|
||||
(WPES 2004), Washington, DC, USA, October 2004
|
||||
http://freehaven.net/anonbib/#feamster:wpes2004
|
||||
[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt
|
||||
[3] see Goodell Tor Exit Page
|
||||
http://cassandra.eecs.harvard.edu/cgi-bin/exit.py
|
||||
[4] see the great IP to ASN DNS Tool
|
||||
http://www.team-cymru.org/Services/ip-to-asn.html
|
||||
[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by
|
||||
Steven J. Murdoch and Piotr Zielinski.
|
||||
In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies
|
||||
|
||||
(PET 2007), Ottawa, Canada, June 2007.
|
||||
http://freehaven.net/anonbib/#murdoch-pet2007
|
||||
[5] http://bugs.noreply.org/flyspray/index.php?do=details&id=690
|
@ -1,41 +0,0 @@
|
||||
Filename: 145-newguard-flag.txt
|
||||
Title: Separate "suitable as a guard" from "suitable as a new guard"
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 1-Jul-2008
|
||||
Status: Open
|
||||
Target: 0.2.1.x
|
||||
|
||||
[This could be obsoleted by proposal 141, which could replace NewGuard
|
||||
with a Guard weight.]
|
||||
|
||||
Overview
|
||||
|
||||
Right now, Tor has one flag that clients use both to tell which
|
||||
nodes should be kept as guards, and which nodes should be picked
|
||||
when choosing new guards. This proposal separates this flag into
|
||||
two.
|
||||
|
||||
Motivation
|
||||
|
||||
Balancing clients among guards is not done well by our current
|
||||
algorithm. When a new guard appears, it is chosen by clients
|
||||
looking for a new guard with the same probability as all existing
|
||||
guards... but new guards are likelier to be under capacity, whereas
|
||||
old guards are likelier to be under more use.
|
||||
|
||||
Implementation
|
||||
|
||||
We add a new flag, NewGuard. Clients will change so that when they
|
||||
are choosing new guards, they only consider nodes with the NewGuard
|
||||
flag set.
|
||||
|
||||
For now, authorities will always set NewGuard if they are setting
|
||||
the Guard flag. Later, it will be easy to migrate authorities to
|
||||
set NewGuard for underused guards.
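
The client-side split between the two flags might look like the sketch
below. It is an illustration only: the function names and the
representation of a consensus as a mapping from router identity to a set of
flag names are assumptions, not part of this proposal.

```python
from typing import Dict, List, Set


def guards_to_keep(current_guards: List[str],
                   flags: Dict[str, Set[str]]) -> List[str]:
    """Existing guards stay usable as long as they still have Guard."""
    return [g for g in current_guards if "Guard" in flags.get(g, set())]


def new_guard_candidates(current_guards: List[str],
                         flags: Dict[str, Set[str]]) -> List[str]:
    """Only routers with NewGuard are considered when picking a new guard."""
    return sorted(router for router, router_flags in flags.items()
                  if "NewGuard" in router_flags
                  and router not in current_guards)
```

A router that has Guard but not NewGuard is therefore kept by clients that
already use it, but is never handed to additional clients.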
|
||||
|
||||
Alternatives
|
||||
|
||||
We might instead have authorities list weights with which nodes
|
||||
should be picked as guards.
|
@ -1,86 +0,0 @@
|
||||
Filename: 146-long-term-stability.txt
|
||||
Title: Add new flag to reflect long-term stability
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 19-Jun-2008
|
||||
Status: Open
|
||||
Target: 0.2.1.x
|
||||
|
||||
Overview
|
||||
|
||||
This document proposes a new flag to indicate that a router has
|
||||
existed at the same address for a long time, describes how to
|
||||
implement it, and explains what it's good for.
|
||||
|
||||
Motivation
|
||||
|
||||
Tor has had three notions of "stability" for servers. Older
|
||||
directory protocols based a server's stability on its
|
||||
(self-reported) uptime: a server that had been running for a day was
|
||||
more stable than a server that had been running for five minutes,
|
||||
regardless of their past history. Current directory protocols track
|
||||
weighted mean time between failure (WMTBF) and weighted fractional
|
||||
uptime (WFU). WFU is computed as the fraction of time for which the
|
||||
server is running, with measurements weighted to exponentially
|
||||
decay such that old days count less. WMTBF is computed as the
|
||||
average length of intervals for which the server runs between
|
||||
downtime, with old intervals weighted to count less.
|
||||
|
||||
WMTBF is useful in answering the question: "If a server is running
|
||||
now, how long is it likely to stay running?" This makes it a good
|
||||
choice for picking servers for streams that need to be long-lived.
|
||||
WFU is useful in answering the question: "If I try connecting to
|
||||
this server at an arbitrary time, is it likely to be running?" This
|
||||
makes it an important factor for picking guard nodes, since we want
|
||||
guard nodes to be usually-up.
|
||||
|
||||
There are other questions that clients want to answer, however, for
|
||||
which the current flags aren't very useful. The one that this
|
||||
proposal addresses is,
|
||||
|
||||
"If I found this server in an old consensus, is it likely to
|
||||
still be running at the same address?"
|
||||
|
||||
This one is useful when we're trying to find directory mirrors in a
|
||||
fallback-consensus file. This property is equivalent to,
|
||||
|
||||
"If I find this server in a current consensus, how long is it
|
||||
likely to exist on the network?"
|
||||
|
||||
This one is useful if we're trying to pick introduction points or
|
||||
something and care more about churn rate than about whether every IP
|
||||
will be up all the time.
|
||||
|
||||
Implementation:
|
||||
|
||||
I propose we add a new flag, called "Longterm." Authorities should
|
||||
set this flag for routers if their Longevity is in the upper
|
||||
quartile of all routers. A router's Longevity is computed as the
|
||||
total number of days in the last year or so[*] for which the router has
|
||||
been Running at least once at its current IP:orport pair.
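
A minimal sketch of the flag assignment follows. It assumes the Longevity
of each router has already been counted in days, and it picks one possible
reading of "upper quartile" (the value at the 75% position of the sorted
list, with ties included). The function name longterm_routers is
hypothetical.

```python
from typing import Dict, Set


def longterm_routers(longevity_days: Dict[str, int]) -> Set[str]:
    """Return the routers whose Longevity is in the upper quartile.

    `longevity_days` maps a router identity to the number of days in the
    observation window on which it was Running at its current IP:orport.
    """
    if not longevity_days:
        return set()
    ranked = sorted(longevity_days.values())
    # Assumption: the threshold is the value three quarters of the way
    # up the sorted list; routers tied with it also get the flag.
    threshold = ranked[(3 * len(ranked)) // 4]
    return {router for router, days in longevity_days.items()
            if days >= threshold}
```

An authority operator's manual override, as described below, would simply
remove routers from the returned set.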
|
||||
|
||||
Clients should use directory servers from a fallback-consensus only
|
||||
if they have the Longterm flag set.
|
||||
|
||||
Authority ops should be able to mark particular routers as not
|
||||
Longterm, regardless of history. (For instance, it makes sense to
|
||||
remove the Longterm flag from a router whose op says that it will
|
||||
need to shut down in a month.)
|
||||
|
||||
[*] This is deliberately vague, to permit efficient implementations.
|
||||
|
||||
Compatibility and migration issues:
|
||||
|
||||
The voting protocol already acts gracefully when new flags are
|
||||
added, so no change to the voting protocol is needed.
|
||||
|
||||
Tor won't have collected this data, however. It might be desirable
|
||||
to bootstrap it from historical consensuses. Alternatively, we can
|
||||
just let the algorithm run for a month or two.
|
||||
|
||||
Issues and future possibilities:
|
||||
|
||||
Longterm is a really awkward name.
|
||||
|
||||
|
@ -1,60 +0,0 @@
|
||||
Filename: 147-prevoting-opinions.txt
|
||||
Title: Eliminate the need for v2 directories in generating v3 directories
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 2-Jul-2008
|
||||
Status: Accepted
|
||||
Target: 0.2.1.x
|
||||
|
||||
Overview
|
||||
|
||||
We propose a new v3 vote document type to replace the role of v2
|
||||
networkstatus information in generating v3 consensuses.
|
||||
|
||||
Motivation
|
||||
|
||||
When authorities vote on which descriptors are to be listed in the
|
||||
next consensus, it helps if they all know about the same descriptors
|
||||
as one another. But a hostile, confused, or out-of-date server may
|
||||
upload a descriptor to only some authorities. In the current v3
|
||||
directory design, the authorities don't have a good way to tell one
|
||||
another about the new descriptor until they exchange votes... but by
|
||||
the time this happens, they are already committed to their votes,
|
||||
and they can't add anybody they learn about from other authorities
|
||||
until the next voting cycle. That's no good!
|
||||
|
||||
The current Tor implementation avoids this problem by having
|
||||
authorities also look at v2 networkstatus documents, but we'd like
|
||||
in the long term to eliminate these, once 0.1.2.x is obsolete.
|
||||
|
||||
Design:
|
||||
|
||||
We add a new value for vote-status in v3 consensus documents in
|
||||
addition to "consensus" and "vote": "opinion". Authorities generate
|
||||
and sign an opinion document as if they were generating a vote,
|
||||
except that they generate opinions earlier than they generate votes.
|
||||
|
||||
Authorities don't need to generate more than one opinion document
|
||||
per voting interval, but may. They should send it to the other
|
||||
authorities they know about, at the regular vote upload URL, before
|
||||
the authorities begin voting, so that enough time remains for the
|
||||
authorities to fetch new descriptors.
|
||||
|
||||
Additionally, authorities make their opinions available at
|
||||
http://<hostname>/tor/status-vote/next/opinion.z
|
||||
and download opinions from authorities they haven't heard from in a
|
||||
while.
|
||||
|
||||
Authorities MAY generate opinions on demand.
|
||||
|
||||
Upon receiving an opinion document, authorities scan it for any
|
||||
descriptors that:
|
||||
- They might accept.
|
||||
- Are for routers they don't know about, or are published more
|
||||
recently than any descriptor they have for that router.
|
||||
Authorities then begin downloading such descriptors from authorities
|
||||
that claim to have them.
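
The scan of an opinion document could be sketched as below. This is an
illustration under assumptions: DescriptorRef, descriptors_to_fetch, and
the might_accept callback are hypothetical names, and an opinion is
modelled as a plain list of (router, digest, publication time) entries
rather than the real document format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class DescriptorRef:
    router_id: str
    digest: str
    published: int  # publication time, seconds since the epoch


def descriptors_to_fetch(
        opinion: Iterable[DescriptorRef],
        known: Dict[str, DescriptorRef],
        might_accept: Callable[[DescriptorRef], bool]) -> List[DescriptorRef]:
    """Pick the descriptors from an opinion that are worth downloading."""
    wanted = []
    for ref in opinion:
        if not might_accept(ref):
            continue
        current = known.get(ref.router_id)
        # Fetch if the router is unknown or the listed descriptor is newer.
        if current is None or ref.published > current.published:
            wanted.append(ref)
    return wanted
```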
|
||||
|
||||
Authorities MAY cache opinion documents, but don't need to.
|
||||
|
@ -1,59 +0,0 @@
|
||||
Filename: 148-uniform-client-end-reason.txt
|
||||
Title: Stream end reasons from the client side should be uniform
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 2-Jul-2008
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.9-alpha
|
||||
|
||||
Overview
|
||||
|
||||
When a stream closes before it's finished, the end relay cell that's
|
||||
sent includes an "end stream reason" to tell the other end why it
|
||||
closed. It's useful for the exit relay to send a reason to the client,
|
||||
so the client can choose a different circuit, inform the user, etc. But
|
||||
there's no reason to include it from the client to the exit relay,
|
||||
and in some cases it can even harm anonymity.
|
||||
|
||||
We should pick a single reason for the client-to-exit-relay direction
|
||||
and always just send that.
|
||||
|
||||
Motivation
|
||||
|
||||
Back when I first deployed the Tor network, it was useful to have
|
||||
the Tor relays learn why a stream closed, so I could debug both ends
|
||||
of the stream at once. Now that streams have worked for many years,
|
||||
there's no need to continue telling the exit relay whether the client
|
||||
gave up on a stream because of "timeout" or "misc" or what.
|
||||
|
||||
Then in Tor 0.2.0.28-rc, I fixed this bug:
|
||||
- Fix a bug where, when we were choosing the 'end stream reason' to
|
||||
put in our relay end cell that we send to the exit relay, Tor
|
||||
clients on Windows were sometimes sending the wrong 'reason'. The
|
||||
anonymity problem is that exit relays may be able to guess whether
|
||||
the client is running Windows, thus helping partition the anonymity
|
||||
set. Down the road we should stop sending reasons to exit relays,
|
||||
or otherwise prevent future versions of this bug.
|
||||
|
||||
It turned out that non-Windows clients were choosing their reason
|
||||
correctly, whereas Windows clients were potentially looking at errno
|
||||
wrong and so always choosing 'misc'.
|
||||
|
||||
I fixed that particular bug, but I think we should prevent future
|
||||
versions of the bug too.
|
||||
|
||||
(We already fixed it so *circuit* end reasons don't get sent from
|
||||
the client to the exit relay. But we appear to have skipped over
|
||||
stream end reasons thus far.)
|
||||
|
||||
Design:
|
||||
|
||||
One option would be to no longer include any 'reason' field in end
|
||||
relay cells. But that would introduce a partitioning attack ("users
|
||||
running the old version" vs "users running the new version").
|
||||
|
||||
Instead I suggest that clients all switch to sending the "misc" reason,
|
||||
like most of the Windows clients currently do and like the non-Windows
|
||||
clients already do sometimes.
|
||||
|
@ -1,44 +0,0 @@
|
||||
Filename: 149-using-netinfo-data.txt
|
||||
Title: Using data from NETINFO cells
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 2-Jul-2008
|
||||
Status: Open
|
||||
Target: 0.2.1.x
|
||||
|
||||
Overview
|
||||
|
||||
Current Tor versions send signed IP and timestamp information in
|
||||
NETINFO cells, but don't use them to their fullest. This proposal
|
||||
describes how they should start using this info in 0.2.1.x.
|
||||
|
||||
Motivation
|
||||
|
||||
Our directory system relies on clients and routers having
|
||||
reasonably accurate clocks to detect replayed directory info, and
|
||||
to set accurate timestamps on directory info they publish
|
||||
themselves. NETINFO cells contain timestamps.
|
||||
|
||||
Also, the directory system relies on routers having a reasonable
|
||||
idea of their own IP addresses, so they can publish correct
|
||||
descriptors. This is also in NETINFO cells.
|
||||
|
||||
Learning the time and IP
|
||||
|
||||
We need to think about attackers here. Just because a router tells
|
||||
us that we have a given IP or a given clock skew doesn't mean that
|
||||
it's true. We believe this information only if we've heard it from
|
||||
a majority of the routers we've connected to recently, including at
|
||||
least 3 routers. Routers only believe this information if the
|
||||
majority includes at least one authority.
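
The acceptance rule above can be sketched as follows. This is only an
illustration: the function name believed_value and the report format are
assumptions, and values are compared for exact equality, which suits IP
addresses; clock skew reports would need to be grouped into ranges first.

```python
from collections import Counter
from typing import List, Optional, Tuple


def believed_value(reports: List[Tuple[str, bool]],
                   we_are_router: bool) -> Optional[str]:
    """Decide which reported value, if any, to believe.

    `reports` holds one (claimed_value, reporter_is_authority) pair per
    router we have connected to recently.
    """
    if not reports:
        return None
    counts = Counter(value for value, _ in reports)
    value, votes = counts.most_common(1)[0]
    # Need a strict majority that contains at least 3 routers.
    if votes < 3 or votes * 2 <= len(reports):
        return None
    # Routers additionally require an authority within that majority.
    if we_are_router and not any(is_auth for v, is_auth in reports
                                 if v == value):
        return None
    return value
```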
|
||||
|
||||
Avoiding MITM attacks
|
||||
|
||||
Current Tors use the IP addresses published in the other router's
|
||||
NETINFO cells to see whether the connection is "canonical". Right
|
||||
now, we prefer to extend circuits over "canonical" connections. In
|
||||
0.2.1.x, we should refuse to extend circuits over non-canonical
|
||||
connections without first trying to build a canonical one.
|
||||
|
||||
|
@ -1,48 +0,0 @@
|
||||
Filename: 150-exclude-exit-nodes.txt
|
||||
Title: Exclude Exit Nodes from a circuit
|
||||
Version: $Revision$
|
||||
Author: Mfr
|
||||
Created: 2008-06-15
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.3-alpha
|
||||
|
||||
Overview
|
||||
|
||||
Right now, Tor users can manually exclude a node from all positions
|
||||
in their circuits created using the directive ExcludeNodes.
|
||||
This proposal makes this exclusion less restrictive, allowing users to
|
||||
exclude a node only from the exit part of a circuit.
|
||||
|
||||
Motivation
|
||||
|
||||
This feature would help the integration into Vidalia (tor exit
|
||||
branch) or other tools, of features to exclude a country for exit
|
||||
without reducing circuits possibilities, and privacy. This feature
|
||||
could help people from a country where many sites are blocked to
|
||||
exclude this country for browsing, giving them a more stable
|
||||
navigation. It could also add the possibility for the user to
|
||||
exclude a currently used exit node.
|
||||
|
||||
Implementation
|
||||
|
||||
ExcludeExitNodes is similar to ExcludeNodes except it's only
|
||||
the exit node which is excluded for circuit build.
|
||||
|
||||
Tor doesn't warn if a node from this list is not an exit node.
|
||||
|
||||
Security implications:
|
||||
|
||||
This also opens possibilities for future reporting of bad exits by users.
|
||||
|
||||
Risks:
|
||||
|
||||
Use of this option can make users partitionable under certain attack
|
||||
assumptions. However, ExitNodes already creates this possibility,
|
||||
so there isn't much increased risk in ExcludeExitNodes.
|
||||
|
||||
We should still encourage people who exclude an exit node because
|
||||
of bad behavior to report it instead of just adding it to their
|
||||
ExcludeExitNodes list. It would be unfortunate if we didn't find out
|
||||
about broken exits because of this option. This issue can probably
|
||||
be addressed sufficiently with documentation.
|
||||
|
@ -1,147 +0,0 @@
|
||||
Filename: 151-path-selection-improvements.txt
|
||||
Title: Improving Tor Path Selection
|
||||
Version:
|
||||
Last-Modified:
|
||||
Author: Fallon Chen, Mike Perry
|
||||
Created: 5-Jul-2008
|
||||
Status: Draft
|
||||
|
||||
Overview
|
||||
|
||||
The performance of paths selected can be improved by adjusting the
|
||||
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
|
||||
describes a method of tracking buildtime statistics at the client, and
|
||||
using those statistics to adjust the CircuitBuildTimeout.
|
||||
|
||||
Motivation
|
||||
|
||||
Tor's performance can be improved by excluding those circuits that
|
||||
have long buildtimes (and by extension, high latency). For those Tor
|
||||
users who require better performance and have lower requirements for
|
||||
anonymity, this would be a very useful option to have.
|
||||
|
||||
Implementation
|
||||
|
||||
Storing Build Times
|
||||
|
||||
Circuit build times will be stored in the circular array
|
||||
'circuit_build_times' consisting of uint16_t elements as milliseconds.
|
||||
The total size of this array will be based on the number of circuits
|
||||
it takes to converge on a good fit of the long term distribution of
|
||||
the circuit builds for a fixed link. We do not want this value to be
|
||||
too large, because it will make it difficult for clients to adapt to
|
||||
moving between different links.
|
||||
|
||||
From our initial observations, this value appears to be on the order
|
||||
of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
|
||||
The exact value for this #define will be determined by performing
|
||||
goodness of fit tests using measurements obtained from the shufflebt.py
|
||||
script from TorFlow.
|
||||
|
||||
Long Term Storage
|
||||
|
||||
The long-term storage representation will be implemented by storing a
|
||||
histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
|
||||
writing out the statistics to disk. The format of this histogram on disk
|
||||
is yet to be finalized, but it will likely be of the format
|
||||
'CircuitBuildTimeBin <bin> <count>', with the total specified as
|
||||
'TotalBuildTimes <total>'
|
||||
Example:
|
||||
|
||||
TotalBuildTimes 100
|
||||
CircuitBuildTimeBin 1 50
|
||||
CircuitBuildTimeBin 2 25
|
||||
CircuitBuildTimeBin 3 13
|
||||
...
|
||||
|
||||
Reading the histogram in will entail multiplying each bin by the
|
||||
BUILDTIME_BIN_WIDTH and then inserting <count> values into the
|
||||
circuit_build_times array each with the value of
|
||||
<bin>*BUILDTIME_BIN_WIDTH. In order to evenly distribute the
|
||||
values in the circular array, a form of index skipping must
|
||||
be employed. Values from bin #N with bin count C and total T
|
||||
will occupy indexes specified by N+((T/C)*k)-1, where k is the
|
||||
set of integers ranging from 0 to C-1.
|
||||
|
||||
For example, this would mean that the values from bin 1 would
|
||||
occupy indexes 1+(100/50)*k-1, or 0, 2, 4, 6, 8, 10 and so on.
|
||||
The values for bin 2 would occupy positions 1, 5, 9, 13. Collisions
|
||||
will be inserted at the first empty position in the array greater
|
||||
than the selected index (which may require looping around the
|
||||
array back to index 0).
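
The index-skipping scheme above can be sketched as follows. This is an
illustration only: the function name load_build_times is hypothetical, the
histogram is passed in as an already-parsed mapping from bin number to
count, T/C is taken to be integer division, and the array is sized to the
total purely for the example.

```python
from typing import Dict, List, Optional

BUILDTIME_BIN_WIDTH = 50  # milliseconds per histogram bin


def load_build_times(bins: Dict[int, int],
                     total: int) -> List[Optional[int]]:
    """Spread histogram counts evenly over a circular array of build times."""
    if sum(bins.values()) > total:
        raise ValueError("histogram holds more values than the array")
    times: List[Optional[int]] = [None] * total
    for n, count in sorted(bins.items()):
        for k in range(count):
            # Index N + (T/C)*k - 1, using integer division (assumption).
            index = (n + (total // count) * k - 1) % total
            # On collision, take the next empty slot, wrapping to index 0.
            while times[index] is not None:
                index = (index + 1) % total
            times[index] = n * BUILDTIME_BIN_WIDTH
    return times
```

With the example histogram (total 100, bins 1, 2 and 3 holding 50, 25 and
13 values), bin 1 lands on the even indexes and bin 2 on indexes 1, 5, 9
and so on, as described above; the first value of bin 3 collides at index 2
and moves to index 3.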
|
||||
|
||||
Learning the CircuitBuildTimeout
|
||||
|
||||
Based on studies of build times, we found that the distribution of
|
||||
circuit buildtimes appears to be a Pareto distribution.
|
||||
|
||||
We will calculate the parameters for a Pareto distribution
|
||||
fitting the data using the estimators at
|
||||
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
|
||||
|
||||
The timeout itself will be calculated by solving the CDF for
|
||||
a percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
|
||||
represents the percentage of paths the Tor client will accept out of
|
||||
the total number of paths. We have not yet determined a good
|
||||
cutoff for this mathematically, but 85% seems a good choice for now.
|
||||
|
||||
From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
|
||||
the calculation we need is Xm/pow(1.0 - BUILDTIME_PERCENT_CUTOFF/100.0, 1.0/k).
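
Solving the Pareto CDF F(x) = 1 - (Xm/x)^k for the cutoff quantile q gives
x = Xm / (1 - q)^(1/k). The sketch below combines that with the standard
maximum-likelihood estimators (Xm is the smallest sample, k is n divided by
the sum of ln(x_i/Xm)). The function names are hypothetical and the code is
an illustration, not the implementation planned for Tor.

```python
import math
from typing import Sequence, Tuple

BUILDTIME_PERCENT_CUTOFF = 85.0


def fit_pareto(build_times_ms: Sequence[float]) -> Tuple[float, float]:
    """Estimate the Pareto parameters (Xm, k) by maximum likelihood."""
    xm = min(build_times_ms)
    log_sum = sum(math.log(t / xm) for t in build_times_ms)
    if log_sum == 0.0:
        raise ValueError("all samples are equal; k is undefined")
    return xm, len(build_times_ms) / log_sum


def timeout_for_cutoff(xm: float, k: float,
                       percent_cutoff: float = BUILDTIME_PERCENT_CUTOFF
                       ) -> float:
    """Build time below which `percent_cutoff` percent of circuits fall."""
    q = percent_cutoff / 100.0
    return xm / (1.0 - q) ** (1.0 / k)
```

For instance, with Xm = 100 ms and k = 1, a 50% cutoff gives a timeout of
200 ms, since half of the probability mass lies below twice the minimum.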
|
||||
|
||||
Testing
|
||||
|
||||
After circuit build times, storage, and learning are implemented,
|
||||
the resulting histogram should be checked for consistency by
|
||||
verifying it persists across successive Tor invocations where
|
||||
no circuits are built. In addition, we can also use the existing
|
||||
buildtime scripts to record build times, and verify that the histogram
|
||||
the Python script produces matches that which is output to the state file in Tor,
|
||||
and verify that the Pareto parameters and cutoff points also match.
|
||||
|
||||
Soft timeout vs Hard Timeout
|
||||
|
||||
At some point, it may be desirable to change the cutoff from a
|
||||
single hard cutoff that destroys the circuit to a soft cutoff and
|
||||
a hard cutoff, where the soft cutoff merely triggers the building
|
||||
of a new circuit, and the hard cutoff triggers destruction of the
|
||||
circuit.
|
||||
|
||||
Good values for hard and soft cutoffs seem to be 85% and 65%
|
||||
respectively, but we should eventually justify this with observation.
|
||||
|
||||
When to Begin Calculation
|
||||
|
||||
The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before
|
||||
changing the CircuitBuildTimeout will be tunable via a #define. From
|
||||
our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be
|
||||
on the order of 100.
|
||||
|
||||
Dealing with Timeouts
|
||||
|
||||
Timeouts should be counted as the expectation of the region of
|
||||
the Pareto distribution beyond the cutoff. The proposal will
|
||||
be updated with this value soon.
|
||||
|
||||
Also, in the event of network failure, the observation mechanism
|
||||
should stop collecting timeout data.
|
||||
|
||||
Client Hints
|
||||
|
||||
Some research still needs to be done to provide initial values
|
||||
for CircuitBuildTimeout based on values learned from modem
|
||||
users, DSL users, Cable Modem users, and dedicated links. A
|
||||
radiobutton in Vidalia should eventually be provided that
|
||||
sets CircuitBuildTimeout to one of these values and also
|
||||
provide the option of purging all learned data, should any exist.
|
||||
|
||||
These values can either be published in the directory, or
|
||||
shipped hardcoded for a particular Tor version.
|
||||
|
||||
Issues
|
||||
|
||||
Impact on anonymity
|
||||
|
||||
Since this follows a Pareto distribution, large reductions on the
|
||||
timeout can be achieved without cutting off a great number of the
|
||||
total paths. This will eliminate a great deal of the performance
|
||||
variation of Tor usage.
|
@ -1,64 +0,0 @@
|
||||
Filename: 152-single-hop-circuits.txt
|
||||
Title: Optionally allow exit from single-hop circuits
|
||||
Version:
|
||||
Last-Modified:
|
||||
Author: Geoff Goodell
|
||||
Created: 13-Jul-2008
|
||||
Status: Closed
|
||||
Implemented-In: 0.2.1.6-alpha
|
||||
|
||||
Overview
|
||||
|
||||
Provide a special configuration option that adds a line to descriptors
|
||||
indicating that a router can be used as an exit for one-hop circuits,
|
||||
and allow clients to attach streams to one-hop circuits provided
|
||||
that the descriptor for the router in the circuit includes this
|
||||
configuration option.
|
||||
|
||||
Motivation
|
||||
|
||||
At some point, code was added to restrict the attachment of streams
|
||||
to one-hop circuits.
|
||||
|
||||
The idea seems to be that we can use the cost of forking and
|
||||
maintaining a patch as a lever to prevent people from writing
|
||||
controllers that jeopardize the operational security of routers
|
||||
and the anonymity properties of the Tor network by creating and
|
||||
using one-hop circuits rather than the standard three-hop circuits.
|
||||
It may be, for example, that some users do not actually seek true
|
||||
anonymity but simply reachability through network perspectives
|
||||
afforded by the Tor network, and since anonymity is stronger in
|
||||
numbers, forcing users to contribute to anonymity and decrease the
|
||||
risk to server operators by using full-length paths may be reasonable.
|
||||
|
||||
As presently implemented, the sweeping restriction of one-hop circuits
|
||||
for all routers limits the usefulness of Tor as a general-purpose
|
||||
technology for building circuits. In particular, we should allow
|
||||
for controllers, such as Blossom, that create and use single-hop
|
||||
circuits involving routers that are not part of the Tor network.
|
||||
|
||||
Design
|
||||
|
||||
Introduce a configuration option for Tor servers that, when set,
|
||||
indicates that a router is willing to provide exit from one-hop
|
||||
circuits. Routers with this policy will not require that a circuit
|
||||
has at least two hops when it is used as an exit.
|
||||
|
||||
In addition, routers for which this configuration option
|
||||
has been set will have a line in their descriptors, "opt
|
||||
exit-from-single-hop-circuits". Clients will keep track of which
|
||||
routers have this option and allow streams to be attached to
|
||||
single-hop circuits that include such routers.
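
The client-side check described above might be sketched like this. It is an
illustration under assumptions: the function names are hypothetical, a
circuit is modelled as a list of router identities, and descriptors are
passed in as plain text keyed by router identity.

```python
from typing import Dict, List, Set

SINGLE_HOP_LINE = "opt exit-from-single-hop-circuits"


def single_hop_exits(descriptors: Dict[str, str]) -> Set[str]:
    """Routers whose descriptor carries the single-hop exit line."""
    return {router for router, text in descriptors.items()
            if any(line.strip() == SINGLE_HOP_LINE
                   for line in text.splitlines())}


def may_attach_stream(path: List[str], allowed: Set[str]) -> bool:
    """Allow one-hop circuits only through routers that opted in."""
    if not path:
        return False
    if len(path) >= 2:
        return True
    return path[0] in allowed
```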
|
||||
|
||||
Security Considerations
|
||||
|
||||
This approach seems to eliminate the worry about operational router
|
||||
security, since server operators will not set the configuration
|
||||
option unless they are willing to take on such risk.
|
||||
|
||||
To reduce the impact on anonymity of the network resulting
|
||||
from including such "risky" routers in regular Tor path
|
||||
selection, clients may systematically exclude routers with "opt
|
||||
exit-from-single-hop-circuits" when choosing random paths through
|
||||
the Tor network.
|
||||
|
@ -1,177 +0,0 @@
|
||||
Filename: 153-automatic-software-update-protocol.txt
|
||||
Title: Automatic software update protocol
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Jacob Appelbaum
|
||||
Created: 14-July-2008
|
||||
Status: Superseded
|
||||
|
||||
[Superseded by thandy-spec.txt]
|
||||
|
||||
|
||||
Automatic Software Update Protocol Proposal
|
||||
|
||||
0.0 Introduction
|
||||
|
||||
The Tor project and its users require a robust method to update shipped
|
||||
software bundles. The software bundles often include Vidalia, Privoxy, Polipo,
Torbutton and of course Tor itself. It is not inconceivable that an update
|
||||
could include all of the Tor Browser Bundle. It seems reasonable to make this
|
||||
a standalone program that can be called in shell scripts, cronjobs or by
|
||||
various Tor controllers.
|
||||
|
||||
0.1 Minimal Tasks To Implement Automatic Updating
|
||||
|
||||
At the most minimal, an update must be able to do the following:
|
||||
|
||||
0 - Detect the current Tor version, note the working status of Tor.
|
||||
1 - Detect the latest Tor version.
|
||||
2 - Fetch the latest version in the form of a platform specific package(s).
|
||||
3 - Verify the itegrity of the downloaded package(s).
|
||||
4 - Install the verified package(s).
|
||||
5 - Test that the new package(s) works properly.
|
||||
|
||||
0.2 Specific Enumeration Of Minimal Tasks
|
||||
|
||||
To implement requirement 0, we need to detect the current Tor version of both
|
||||
the updater and the current running Tor. The update program itself should be
|
||||
versioned internally. This requirement should also test connecting through Tor
|
||||
itself and note if such connections are possible.
|
||||
|
||||
To implement requirement 1, we need to learn the consensus from the directory
authorities or fall back to a known good URL with cryptographically signed
|
||||
content.
|
||||
|
||||
To implement requirement 2, we need to download Tor - hopefully over Tor.
|
||||
|
||||
To implement requirement 3, we need to verify the package signature.
|
||||
|
||||
To implement requirement 4, we need to use a platform specific method of
|
||||
installation. The Tor controller performing the update performs these platform
|
||||
specific methods.
|
||||
|
||||
To implement requirement 5, we need to be able to extend circuits and reach
|
||||
the internet through Tor.
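The six requirements above form a strict sequence: a failure at any step
should prevent the later steps from running. A minimal sketch of that control
flow follows. It assumes each requirement is supplied as a callable that
returns True on success; the run_update helper and the step names are
hypothetical and are not part of this proposal.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_update(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the steps in order and stop at the first one that fails.

    Returns (overall success, names of the steps that completed).
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return False, completed
        completed.append(name)
    return True, completed


if __name__ == "__main__":
    # Stub actions stand in for the real work of requirements 0 through 5.
    demo = [
        ("detect-current", lambda: True),
        ("detect-latest", lambda: True),
        ("fetch", lambda: True),
        ("verify", lambda: False),   # simulate a bad signature
        ("install", lambda: True),
        ("test", lambda: True),
    ]
    print(run_update(demo))
```

Because verification precedes installation in the list, a package that fails
requirement 3 is never handed to the platform specific installer.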
|
||||
|
||||
0.x Implementation Goals
|
||||
|
||||
The update system will be cross platform and rely on as little external code
|
||||
as possible. Any external code the update system uses must be updated by the update
|
||||
system itself. It will consist only of free software and will not rely on any
|
||||
non-free components until the actual installation phase. If a package manager
|
||||
is in use, it will be platform specific and thus only invoked by the update
|
||||
system implementing the update protocol.
|
||||
|
||||
The update system itself will attempt to perform update related network
|
||||
activity over Tor. Possibly it will attempt to use a hidden service first.
|
||||
It will attempt to use novel and not so novel caching
|
||||
when possible. It will always verify cryptographic signatures before any
|
||||
remotely fetched code is executed. In the event of an unusable Tor system,
|
||||
it will be able to attempt to fetch updates without Tor. This should be user
|
||||
configurable: some users will be unwilling to update without the protection of
|
||||
using Tor - others will simply be unable because of blocking of the main Tor
|
||||
website.
|
||||
|
||||
The update system will track current version numbers of Tor and supporting
|
||||
software. The update system will also track known working versions to assist
|
||||
with automatic The update system itself will be a standalone library. It will be
|
||||
strongly versioned internally to match the Tor bundle it was shipped with. The
|
||||
update system will keep track of the given platform, cpu architecture, lsb_release,
|
||||
package management functionality and any other platform specific metadata.
|
||||
|
||||
We have referenced two popular automatic update systems. Though neither fits
|
||||
our needs, both are useful as an idea of what others are doing in the same
|
||||
area.
|
||||
|
||||
The first is sparkle[0] but it is sadly only available for Cocoa
|
||||
environments and is written in Objective C. This doesn't meet our requirements
|
||||
because it is directly tied into the private Apple framework.
|
||||
|
||||
The second is the Mozilla Automatic Update System[1]. It is possibly useful
|
||||
as an idea of how other free software projects automatically update. It is
|
||||
however not useful in its currently documented form.
|
||||
|
||||
|
||||
[0] http://sparkle.andymatuschak.org/documentation/
|
||||
[1] http://wiki.mozilla.org/AUS:Manual
|
||||
|
||||
0.x Previous methods of Tor and related software update
|
||||
|
||||
Previously, Tor users updated their Tor related software by hand. There has
|
||||
been no fully automatic method for any user to update. In addition, there
|
||||
hasn't been any specific way to find out the most current stable version of Tor
|
||||
or related software as voted on by the directory authority consensus.
|
||||
|
||||
0.x Changes to the directory specification
|
||||
|
||||
We will want to supplement client-versions and server-versions in the
|
||||
consensus voting with another version identifier known as
'auto-update-versions'. This will keep track of the current consensus of
|
||||
specific versions that are best per platform and per architecture. It should
|
||||
be noted that while the Mac OS X universal binary may be the best for x86
|
||||
processors with Tiger, it may not be the best for PPC users on Panther. This
|
||||
goes for all of the package updates. We want to prevent updates that cause Tor
|
||||
to break even if the updating program can recover gracefully.
|
||||
|
||||
x.x Assumptions About Operating System Package Management
|
||||
|
||||
It is assumed that users will use their package manager unless they are on
|
||||
Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows
|
||||
users will have integration with the normal "add/remove program" functionality
|
||||
that said users would expect.
|
||||
|
||||
x.x Package Update System Failure Modes
|
||||
|
||||
The package update will try to ensure that a user always has a working Tor at
|
||||
the very least. It will keep state to remember versions of Tor that were able
|
||||
to bootstrap properly and reach the rest of the Tor network. It will also keep
|
||||
note of which versions broke. It will select the best Tor that works for the
|
||||
user. It will also allow for anonymized bug reporting on the packages
|
||||
available and tested by the auto-update system.
|
||||
|
||||
x.x Package Signature Verification
|
||||
|
||||
The update system will be aware of replay attacks against the update signature
|
||||
system itself. It will not allow package update signatures that are radically
|
||||
out of date. It will be a multi-key system to prevent any single party from
|
||||
forging an update. The key will be updated regularly. This is like authority
|
||||
key (see proposal 103) usage.
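The two checks described above, rejecting stale signatures and requiring more
than one signer, can be sketched as follows. This is only an illustration: the
30-day age limit, the threshold of two keys, and the SigResult type are
assumptions, and the cryptographic verification itself is assumed to have been
done elsewhere and is represented here by a boolean.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Set


class SigResult(NamedTuple):
    key_id: str          # identifier of the signing key (hypothetical)
    signed_at: datetime  # timestamp carried in the signature
    valid: bool          # outcome of the cryptographic check, done elsewhere


def accept_update(
    sigs: Iterable[SigResult],
    trusted_keys: Set[str],
    now: datetime,
    max_age: timedelta = timedelta(days=30),  # assumed limit, not from the proposal
    threshold: int = 2,                       # assumed number of distinct signers
) -> bool:
    """Accept only if enough distinct trusted keys produced fresh, valid signatures."""
    fresh_signers = {
        s.key_id
        for s in sigs
        if s.valid
        and s.key_id in trusted_keys
        # Reject signatures that are too old (replay) or dated in the future.
        and timedelta(0) <= now - s.signed_at <= max_age
    }
    return len(fresh_signers) >= threshold
```

Counting distinct key identifiers, rather than signatures, keeps one party
from meeting the threshold by signing the same update twice.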
|
||||
|
||||
x.x Package Caching
|
||||
|
||||
The update system will iterate over different update methods. Whichever method
|
||||
is picked will have caching functionality. Each Tor server itself should be
|
||||
able to serve cached update files. This will be an option that friendly server
|
||||
administrators can turn on should they wish to support caching. In addition,
|
||||
it is possible to cache the full contents of a package in an
|
||||
authoritative DNS zone. Users can then query the DNS zone for their package.
|
||||
If we wish to further distribute the update load, we can also offer packages
|
||||
with encrypted bittorrent. Clients who wish to share the updates but do not
|
||||
wish to be a server can help distribute Tor updates. This can be tied together
|
||||
with the DNS caching[2][3] if needed.
|
||||
|
||||
[2] http://www.netrogenic.com/dnstorrent/
|
||||
[3] http://www.doxpara.com/ozymandns_src_0.1.tgz
|
||||
|
||||
x.x Helping Our Users Spread Tor
|
||||
|
||||
There should be a way for a user to participate in the package caching as
|
||||
described in section x.x. This option should be presented by the Tor
|
||||
controller.
|
||||
|
||||
x.x Simple HTTP Proxy To The Tor Project Website
|
||||
|
||||
It has been suggested that we should provide a simple proxy that allows a user
|
||||
to visit the main Tor website to download packages. This was part of a
|
||||
previous proposal and has not been closely examined.
|
||||
|
||||
x.x Package Installation
|
||||
|
||||
Platform specific methods for proper package installation will be left to the
|
||||
controller that is calling for an update. Each platform is different; the
|
||||
installation options and user interface will be specific to the controller in
|
||||
question.
|
||||
|
||||
x.x Other Things
|
||||
|
||||
Other things should be added to this proposal. What are they?
|
@ -1,379 +0,0 @@
|
||||
Filename: 154-automatic-updates.txt
|
||||
Title: Automatic Software Update Protocol
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Matt Edman
|
||||
Created: 30-July-2008
|
||||
Status: Superseded
|
||||
Target: 0.2.1.x
|
||||
|
||||
Superseded by thandy-spec.txt
|
||||
|
||||
Scope
|
||||
|
||||
This proposal specifies the method by which an automatic update client can
|
||||
determine the most recent recommended Tor installation package for the
|
||||
user's platform, download the package, and then verify that the package was
|
||||
downloaded successfully. While this proposal focuses on only the Tor
|
||||
software, the protocol defined is sufficiently extensible such that other
|
||||
components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
|
||||
managed and updated by the automatic update client as well.
|
||||
|
||||
The initial target platform for the automatic update framework is Windows,
|
||||
given that's the platform used by a majority of our users and that it lacks
|
||||
a sane package management system that many Linux distributions already have.
|
||||
Our second target platform will be Mac OS X, and so the protocol will be
|
||||
designed with this near-future direction in mind.
|
||||
|
||||
Other client-side aspects of the automatic update process, such as user
|
||||
interaction, the interface presented, and actual package installation
|
||||
procedure, are outside the scope of this proposal.
|
||||
|
||||
|
||||
Motivation
|
||||
|
||||
Tor releases new versions frequently, often with important security,
|
||||
anonymity, and stability fixes. Thus, it is important for users to be able
|
||||
to promptly recognize when new versions are available and to easily
|
||||
download, authenticate, and install updated Tor and Tor-related software
|
||||
packages.
|
||||
|
||||
Tor's control protocol [2] provides a method by which controllers can
|
||||
identify when the user's Tor software is obsolete or otherwise no longer
|
||||
recommended. Currently, however, no mechanism exists for clients to
|
||||
automatically download and install updated Tor and Tor-related software for
|
||||
the user.
|
||||
|
||||
|
||||
Design Overview
|
||||
|
||||
The core of the automatic update framework is a well-defined file called a
|
||||
"recommended-packages" file. The recommended-packages file is accessible via
|
||||
HTTP[S] at one or more well-defined URLs. An example recommended-packages
|
||||
URL may be:
|
||||
|
||||
https://updates.torproject.org/recommended-packages
|
||||
|
||||
The recommended-packages document is formatted according to Section 1.2
|
||||
below and specifies the most recent recommended installation package
|
||||
versions for Tor or Tor-related software, as well as URLs at which the
|
||||
packages and their signatures can be downloaded.
|
||||
|
||||
An automatic update client process runs on the Tor user's computer and
|
||||
periodically retrieves the recommended-packages file according to the method
|
||||
described in Section 2.0. As described further in Section 1.2, the
|
||||
recommended-packages file is signed and can be verified by the automatic
|
||||
update client with one or more public keys included in the client software.
|
||||
Since it is signed, the recommended-packages file can be mirrored by
|
||||
multiple hosts (e.g., Tor directory authorities), whose URLs are included in
|
||||
the automatic update client's configuration.
|
||||
|
||||
After retrieving and verifying the recommended-packages file, the automatic
|
||||
update client compares the versions of the recommended software packages
|
||||
listed in the file with those currently installed on the end-user's
|
||||
computer. If one or more of the installed packages is determined to be out
|
||||
of date, an updated package and its signature will be downloaded from one of
|
||||
the package URLs listed in the recommended-packages file as described in
|
||||
Section 2.2.
|
||||
|
||||
The automatic update system uses a multilevel signing key scheme for package
|
||||
signatures. There are a small number of entities we call "packaging
|
||||
authorities" that each have their own signing key. A packaging authority is
|
||||
responsible for signing and publishing the recommended-packages file.
|
||||
Additionally, each individual packager responsible for producing an
|
||||
installation package for one or more platforms has their own signing key.
|
||||
Every packager's signing key must be signed by at least one of the packaging
|
||||
authority keys.
|
||||
|
||||
|
||||
Specification
|
||||
|
||||
1. recommended-packages Specification
|
||||
|
||||
In this section we formally specify the format of the published
|
||||
recommended-packages file.
|
||||
|
||||
1.1. Document Meta-format
|
||||
|
||||
The recommended-packages document follows the lightweight extensible
|
||||
information format defined in Tor's directory protocol specification [1]. In
|
||||
the interest of self-containment, we have reproduced the relevant portions
|
||||
of that format's specification in this Section. (Credits to Nick Mathewson
|
||||
for much of the original format definition language.)
|
||||
|
||||
The highest level object is a Document, which consists of one or more
|
||||
Items. Every Item begins with a KeywordLine, followed by zero or more
|
||||
Objects. A KeywordLine begins with a Keyword, optionally followed by
|
||||
whitespace and more non-newline characters, and ends with a newline. A
|
||||
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
|
||||
An Object is a block of encoded data in pseudo-Open-PGP-style
|
||||
armor. (cf. RFC 2440)
|
||||
|
||||
More formally:
|
||||
|
||||
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
|
||||
|
||||
The BeginLine and EndLine of an Object must use the same keyword.
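The grammar above is small enough to parse line by line. The following is a
rough sketch of such a parser, not a reference implementation. The Item type
and the parse_document name are hypothetical. The sketch does not check that
arguments are printing ASCII, and it joins the lines of an Object body
without validating the base-64 encoding.

```python
import re
from typing import List, NamedTuple, Tuple

# KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
KEYWORD_LINE = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.+))?$")
# BeginLine ::= "-----BEGIN " Keyword "-----" NL
BEGIN_LINE = re.compile(r"^-----BEGIN ([A-Za-z0-9-]+)-----$")


class Item(NamedTuple):
    keyword: str
    arguments: str                  # empty string when the line has no arguments
    objects: List[Tuple[str, str]]  # (object keyword, concatenated body lines)


def parse_document(text: str) -> List[Item]:
    lines = text.split("\n")
    items: List[Item] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # Document ::= (Item | NL)+ allows blank lines
        match = KEYWORD_LINE.match(line)
        if match is None:
            raise ValueError(f"malformed keyword line: {line!r}")
        objects: List[Tuple[str, str]] = []
        # Item ::= KeywordLine Object*  (consume any Objects that follow)
        while i < len(lines):
            begin = BEGIN_LINE.match(lines[i])
            if begin is None:
                break
            # The EndLine must use the same keyword as the BeginLine.
            end_line = f"-----END {begin.group(1)}-----"
            try:
                end = lines.index(end_line, i + 1)
            except ValueError:
                raise ValueError(f"unterminated object {begin.group(1)!r}") from None
            objects.append((begin.group(1), "".join(lines[i + 1:end])))
            i = end + 1
        items.append(Item(match.group(1), match.group(2) or "", objects))
    return items
```

Because '-' is a valid KeywordChar, a BeginLine would also match the
KeywordLine pattern. The sketch resolves this by testing for a BeginLine
first whenever it has just read a KeywordLine.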
|
||||
|
||||
In our Document description below, we also tag Items with a multiplicity in
|
||||
brackets. Possible tags are:
|
||||
|
||||
"At start, exactly once": These items MUST occur in every instance of the
|
||||
document type, and MUST appear exactly once, and MUST be the first item in
|
||||
their documents.
|
||||
|
||||
"Exactly once": These items MUST occur exactly one time in every
|
||||
instance of the document type.
|
||||
|
||||
"Once or more": These items MUST occur at least once in any instance
|
||||
of the document type, and MAY occur more than once.
|
||||
|
||||
"At end, exactly once": These items MUST occur in every instance of
|
||||
the document type, and MUST appear exactly once, and MUST be the
|
||||
last item in their documents.
|
||||
|
||||
1.2. recommended-packages Document Format
|
||||
|
||||
When interpreting a recommended-packages Document, software MUST ignore
|
||||
any KeywordLine that starts with a keyword it doesn't recognize; future
|
||||
implementations MUST NOT require current automatic update clients to
|
||||
understand any KeywordLine not currently described.
|
||||
|
||||
In lines that take multiple arguments, extra arguments SHOULD be
|
||||
accepted and ignored.
|
||||
|
||||
The currently defined Items contained in a recommended-packages document
|
||||
are:
|
||||
|
||||
"recommended-packages-format" SP number NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
This Item specifies the version of the recommended-packages format that
|
||||
is contained in the subsequent document. The version defined in this
|
||||
proposal is version "1". Subsequent iterations of this protocol MUST
|
||||
increment this value if they introduce incompatible changes to the
|
||||
document format and MAY increment this value if they only introduce
|
||||
additional Keywords.
|
||||
|
||||
"published" SP YYYY-MM-DD SP HH:MM:SS NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The time, in GMT, when this recommended-packages document was generated.
|
||||
Automatic update clients SHOULD ignore Documents over 60 days old.
|
||||
|
||||
"tor-stable-win32-version" SP TorVersion NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
This keyword specifies the latest recommended release of Tor's "stable"
|
||||
branch for the Windows platform that has an installation package
|
||||
available. Note that this version does not necessarily correspond to the
|
||||
most recently tagged stable Tor version, since that version may not yet
|
||||
have an installer package available, or may have known issues on
|
||||
Windows.
|
||||
|
||||
The TorVersion field is formatted according to Section 2 of Tor's
|
||||
version specification [3].
|
||||
|
||||
"tor-stable-win32-package" SP Url NL
|
||||
|
||||
[Once or more]
|
||||
|
||||
This Item specifies the location from which the most recent
|
||||
recommended Windows installation package for Tor's stable branch can be
|
||||
downloaded.
|
||||
|
||||
When this Item appears multiple times within the Document, automatic
|
||||
update clients SHOULD select randomly from the available package
|
||||
mirrors.
|
||||
|
||||
"tor-dev-win32-version" SP TorVersion NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
This Item specifies the latest recommended release of Tor's
|
||||
"development" branch for the Windows platform that has an installation
|
||||
package available. The same caveats from the description of
|
||||
"tor-stable-win32-version" also apply to this keyword.
|
||||
|
||||
The TorVersion field is formatted according to Section 2 of Tor's
|
||||
version specification [3].
|
||||
|
||||
"tor-dev-win32-package" SP Url NL
|
||||
|
||||
[Once or more]
|
||||
|
||||
This Item specifies the location from which the most recent recommended
|
||||
Windows installation package and its signature for Tor's development
|
||||
branch can be downloaded.
|
||||
|
||||
When this Keyword appears multiple times within the Document, automatic
|
||||
update clients SHOULD select randomly from the available package
|
||||
mirrors.
|
||||
|
||||
"signature" NL SIGNATURE NL
|
||||
|
||||
[At end, exactly once]
|
||||
|
||||
The "SIGNATURE" Object contains a PGP signature (using a packaging
|
||||
authority signing key) of the entire document, taken from the beginning
|
||||
of the "recommended-packages-format" keyword, through the newline after
|
||||
the "signature" Keyword.
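Two of the rules in this section are mechanical enough to sketch: the span of
text covered by the signature, and the 60-day limit on the "published" time.
The helper names below are hypothetical, and the sketch assumes the document
uses bare "\n" newlines and that the "signature" keyword line carries no
arguments.

```python
from datetime import datetime, timedelta, timezone

FORMAT_KEYWORD = "recommended-packages-format"
SIGNATURE_LINE = "\nsignature\n"
MAX_AGE = timedelta(days=60)


def signed_portion(document: str) -> str:
    """Return the text from the format keyword through the newline
    that follows the "signature" keyword."""
    if document.startswith(FORMAT_KEYWORD):
        start = 0
    else:
        pos = document.find("\n" + FORMAT_KEYWORD)
        if pos < 0:
            raise ValueError("no recommended-packages-format line")
        start = pos + 1  # skip the newline that precedes the keyword
    end = document.find(SIGNATURE_LINE, start)
    if end < 0:
        raise ValueError("no signature line")
    return document[start:end + len(SIGNATURE_LINE)]


def parse_published(argument: str) -> datetime:
    """Parse the "YYYY-MM-DD HH:MM:SS" argument; the time is in GMT."""
    parsed = datetime.strptime(argument, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_too_old(published: datetime, now: datetime) -> bool:
    """True when the document is over 60 days old and SHOULD be ignored."""
    return now - published > MAX_AGE
```

A client would pass the result of signed_portion, together with the decoded
SIGNATURE Object, to whatever PGP verification routine it uses.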
|
||||
|
||||
|
||||
2. Automatic Update Client Behavior
|
||||
|
||||
The client-side component of the automatic update framework is an
|
||||
application that runs on the end-user's machine. It is responsible for
|
||||
fetching and verifying a recommended-packages document, as well as
|
||||
downloading, verifying, and subsequently installing any necessary updated
|
||||
software packages.
|
||||
|
||||
2.1. Download and verify a recommended-packages document
|
||||
|
||||
The first step in the automatic update process is for the client to download
|
||||
a copy of the recommended-packages file. The automatic update client
|
||||
contains a (hardcoded and/or user-configurable) list of URLs from which it
|
||||
will attempt to retrieve a recommended-packages file.
|
||||
|
||||
Connections to each of the recommended-packages URLs SHOULD be attempted in
|
||||
the following order:
|
||||
|
||||
1) HTTPS over Tor
2) HTTP over Tor
3) Direct HTTPS
4) Direct HTTP
|
||||
|
||||
If the client fails to retrieve a recommended-packages document via any of
|
||||
the above connection methods from any of the configured URLs, the client
|
||||
SHOULD retry its download attempts following an exponential back-off
|
||||
algorithm. After the first failed attempt, the client SHOULD delay one hour
|
||||
before attempting again, up to a maximum of 24 hours delay between retry
|
||||
attempts.
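The connection order and the retry schedule can be sketched as below. The
text fixes only the first delay (one hour) and the ceiling (24 hours); the
doubling factor used here is an assumption, as is the choice to try every
method for one URL before moving to the next URL. The fetch callable is a
hypothetical stand-in for the real transport and returns None on failure.

```python
from datetime import timedelta
from itertools import islice
from typing import Callable, Iterator, List, Optional

# Connection methods, in the order given in this section.
METHODS = ["HTTPS over Tor", "HTTP over Tor", "Direct HTTPS", "Direct HTTP"]


def fetch_once(
    urls: List[str],
    fetch: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Try each URL with each method in order; return the first document."""
    for url in urls:
        for method in METHODS:
            document = fetch(url, method)
            if document is not None:
                return document
    return None


def retry_delays(
    initial: timedelta = timedelta(hours=1),
    cap: timedelta = timedelta(hours=24),
) -> Iterator[timedelta]:
    """Yield delays that double after each failed round, up to the cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)


if __name__ == "__main__":
    # Delays before the first seven retry rounds, in hours.
    print([d.total_seconds() / 3600 for d in islice(retry_delays(), 7)])
```

With doubling, the delays in hours run 1, 2, 4, 8, 16, 24, 24, so the client
reaches the 24-hour ceiling after the fifth failed round.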
|
||||
|
||||
After successfully downloading a recommended-packages file, the automatic
|
||||
update client will verify the signature using one of the public keys
|
||||
distributed with the client software. If more than one recommended-packages
|
||||
file is downloaded and verified, the file with the most recent "published"
|
||||
date that is verified will be retained and the rest discarded.
|
||||
|
||||
2.2. Download and verify the updated packages
|
||||
|
||||
The automatic update client next compares the latest recommended package
|
||||
version from the recommended-packages document with the currently installed
|
||||
Tor version. If the user currently has installed a Tor version from Tor's
"development" branch, then the version specified in the "tor-dev-*-version" Item
is used for comparison. Similarly, if the user currently has installed a Tor
version from Tor's "stable" branch, then the version specified in the
"tor-stable-*-version" Item is used for comparison. Version comparisons are
|
||||
done according to Tor's version specification [3].
|
||||
|
||||
If the automatic update client determines an installation package newer than
|
||||
the user's currently installed version is available, it will attempt to
|
||||
download a package appropriate for the user's platform and Tor branch from a
|
||||
URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
|
||||
mirror for the selected package is available, a mirror will be chosen at
|
||||
random from all those available.
|
||||
|
||||
The automatic update client must also download a ".asc" signature file for
|
||||
the retrieved package. The URL for the package signature is the same as that
|
||||
for the package itself, except with the extension ".asc" appended to the
|
||||
package URL.
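Mirror selection and the signature URL rule are simple enough to show
directly. This is a sketch with hypothetical helper names and example URLs;
the rng parameter exists only so that a caller can supply a different random
source.

```python
import random
from typing import Sequence


def choose_mirror(package_urls: Sequence[str], rng=random) -> str:
    """Pick one package URL uniformly at random from the listed mirrors."""
    return rng.choice(package_urls)


def signature_url(package_url: str) -> str:
    """The signature lives at the package URL with ".asc" appended."""
    return package_url + ".asc"


if __name__ == "__main__":
    mirrors = [
        "https://mirror-a.example/tor-installer.exe",  # hypothetical URLs
        "https://mirror-b.example/tor-installer.exe",
    ]
    chosen = choose_mirror(mirrors)
    print(chosen, signature_url(chosen))
```

Note that the ".asc" suffix is appended to the full package URL, so the
package's own extension is kept rather than replaced.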
|
||||
|
||||
Connections to download the updated package and its signature SHOULD be
|
||||
attempted in the same order described in Section 2.1.
|
||||
|
||||
After completing the steps described in Sections 2.1 and 2.2, the automatic
|
||||
update client will have downloaded and verified a copy of the latest Tor
|
||||
installation package. It can then take whatever subsequent platform-specific
|
||||
steps are necessary to install the downloaded software updates.
|
||||
|
||||
2.3. Periodic checking for updates
|
||||
|
||||
The automatic update client SHOULD maintain a local state file in which it
|
||||
records (at a minimum) the timestamp at which it last retrieved a
|
||||
recommended-packages file and the timestamp at which the client last
|
||||
successfully downloaded and installed a software update.
|
||||
|
||||
Automatic update clients SHOULD check for an updated recommended-packages
|
||||
document at most once per day but at least once every 30 days.
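The two bounds above can be read as a lower and an upper limit on the time
since the last retrieval recorded in the state file. A minimal sketch, with
hypothetical function names and with None standing for "never checked":

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_INTERVAL = timedelta(days=1)   # check at most once per day
MAX_INTERVAL = timedelta(days=30)  # check at least once every 30 days


def may_check(last_check: Optional[datetime], now: datetime) -> bool:
    """True when a new check would not violate the once-per-day limit."""
    return last_check is None or now - last_check >= MIN_INTERVAL


def check_overdue(last_check: Optional[datetime], now: datetime) -> bool:
    """True when 30 days have passed and a check should not be put off further."""
    return last_check is None or now - last_check >= MAX_INTERVAL
```

Between the two limits, when to check is left to the client.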
|
||||
|
||||
|
||||
3. Future Extensions
|
||||
|
||||
There are several possible areas for future extensions of this framework.
|
||||
The extensions below are merely suggestions and should be the subject of
|
||||
their own proposal before being implemented.
|
||||
|
||||
3.1. Additional Software Updates
|
||||
|
||||
There are several software packages often included in Tor bundles besides
|
||||
Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
|
||||
download locations of updated installation packages for these bundle
|
||||
components can be easily added to the recommended-packages document
|
||||
specification above.
|
||||
|
||||
3.2. Including ChangeLog Information
|
||||
|
||||
It may be useful for automatic update clients to be able to display for
|
||||
users a summary of the changes made in the latest Tor or Tor-related
|
||||
software release, before the user chooses to install the update. In the
|
||||
future, we can add keywords to the specification in Section 1.2 that specify
|
||||
the location of a ChangeLog file for the latest recommended package
|
||||
versions. It may also be desirable to allow localized ChangeLog information,
|
||||
so that the automatic update client can fetch release notes in the
|
||||
end-user's preferred language.
|
||||
|
||||
3.3. Weighted Package Mirror Selection
|
||||
|
||||
We defined in Section 1.2 a method by which automatic update clients can
|
||||
select from multiple available package mirrors. We may want to add a Weight
|
||||
argument to the "*-package" Items that allows the recommended-packages file
|
||||
to suggest to clients the probability with which a package mirror should be
|
||||
chosen. This will allow clients to more appropriately distribute package
|
||||
downloads across available mirrors proportional to their approximate
|
||||
bandwidth.
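As an illustration of what such an extension might look like on the client
side, the sketch below picks a mirror with probability proportional to its
weight. The (url, weight) pair representation is an assumption; this proposal
does not define the syntax of a Weight argument.

```python
import random
from typing import Sequence, Tuple


def choose_weighted_mirror(
    mirrors: Sequence[Tuple[str, float]],
    rng=random,
) -> str:
    """Pick a mirror URL with probability proportional to its weight."""
    urls = [url for url, _ in mirrors]
    weights = [weight for _, weight in mirrors]
    # random.choices performs weighted sampling with replacement; k=1 draws once.
    return rng.choices(urls, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical mirrors: the first should receive about 3/4 of downloads.
    print(choose_weighted_mirror([
        ("https://mirror-a.example/tor.exe", 3.0),
        ("https://mirror-b.example/tor.exe", 1.0),
    ]))
```

Giving every mirror the same weight reduces this to the uniform random
selection defined in Section 1.2.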
|
||||
|
||||
|
||||
Implementation
|
||||
|
||||
Implementation of this proposal will consist of two separate components.
|
||||
|
||||
The first component is a small "au-publish" tool that takes as input a
|
||||
configuration file specifying the information described in Section 1.2 and a
|
||||
private key. The tool is run by a "packaging authority" (someone responsible
|
||||
for publishing updated installation packages), who will be prompted to enter
|
||||
the passphrase for the private key used to sign the recommended-packages
|
||||
document. The output of the tool is a document formatted according to
|
||||
Section 1.2, with a signature appended at the end. The resulting document
|
||||
can then be published to any of the update mirrors.
|
||||
|
||||
The second component is an "au-client" tool that is run on the end-user's
|
||||
machine. It periodically checks for updated installation packages according
|
||||
to Section 2 and fetches the packages if necessary. The public keys used
|
||||
to sign the recommended-packages file and any of the published packages are
|
||||
included in the "au-client" tool.
|
||||
|
||||
|
||||
References
|
||||
|
||||
[1] Tor directory protocol (version 3),
|
||||
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt
|
||||
|
||||
[2] Tor control protocol (version 2),
|
||||
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt
|
||||
|
||||
[3] Tor version specification,
|
||||
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt
|
||||
|
@ -1,122 +0,0 @@
|
||||
Filename: 155-four-hidden-service-improvements.txt
|
||||
Title: Four Improvements of Hidden Service Performance
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Karsten Loesing, Christian Wilms
|
||||
Created: 25-Sep-2008
|
||||
Status: Finished
|
||||
Implemented-In: 0.2.1.x
|
||||
|
||||
Change history:
|
||||
|
||||
25-Sep-2008 Initial proposal for or-dev
|
||||
|
||||
Overview:
|
||||
|
||||
A performance analysis of hidden services [1] has brought up a few
|
||||
possible design changes to reduce advertisement time of a hidden service
|
||||
in the network as well as connection establishment time. Some of these
|
||||
design changes have side-effects on anonymity or overall network load
|
||||
which had to be weighed up against individual performance gains. A
|
||||
discussion of seven possible design changes [2] has led to a selection
|
||||
of four changes [3] that are proposed to be implemented here.
|
||||
|
||||
Design:
|
||||
|
||||
1. Shorter Circuit Extension Timeout
|
||||
|
||||
When establishing a connection to a hidden service a client cannibalizes
|
||||
an existing circuit and extends it by one hop to one of the service's
|
||||
introduction points. In most cases this can be accomplished within a few
|
||||
seconds. Therefore, the current timeout of 60 seconds for extending a
|
||||
circuit is far too high.
|
||||
|
||||
Assuming that the timeout would be reduced to a lower value, for example
|
||||
30 seconds, a second (or third) attempt to cannibalize and extend would
|
||||
be started earlier. With the current timeout of 60 seconds, 93.42% of all
|
||||
circuits can be established, whereas this fraction would have been only
|
||||
0.87 percentage points smaller at 92.55% with a timeout of 30 seconds.
|
||||
|
||||
For a timeout of 30 seconds the performance gain would be approximately 2
|
||||
seconds in the mean as opposed to the current timeout of 60 seconds. At
|
||||
the same time a smaller timeout leads to discarding an increasing number
|
||||
of circuits that might have been completed within the current timeout of
|
||||
60 seconds.
|
||||
|
||||
Measurements with simulated low-bandwidth connectivity have shown that
|
||||
there is no significant effect of client connectivity on circuit
|
||||
extension times. The reason for this might be that extension messages are
|
||||
small and thereby independent of the client bandwidth. Further, the
|
||||
connection between client and entry node only constitutes a single hop of
|
||||
a circuit, so that its influence on the whole circuit is limited.
|
||||
|
||||
The exact value of the new timeout does not necessarily have to be 30
|
||||
seconds, but might also depend on the results of circuit build timeout
|
||||
measurements as described in proposal 151.
|
||||
|
||||
2. Parallel Connections to Introduction Points
|
||||
|
||||
An additional approach to accelerate extension of introduction circuits
|
||||
is to extend a second circuit in parallel to a different introduction
|
||||
point. Such parallel extension attempts should be started after a short
|
||||
delay of, e.g., 15 seconds in order to prevent unnecessary circuit
|
||||
extensions and thereby save network resources. Whichever circuit
|
||||
extension succeeds first is used for introduction, while the other
|
||||
attempt is aborted.
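The delayed parallel attempt described above can be sketched with asyncio.
This is not Tor's implementation: primary and backup are hypothetical
coroutine functions standing in for circuit extension to two different
introduction points, and error handling is left out (a failed attempt simply
raises).

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def race_with_delayed_backup(
    primary: Callable[[], Awaitable[T]],
    backup: Callable[[], Awaitable[T]],
    delay: float,
) -> T:
    """Start primary; if it has not finished after `delay` seconds, start
    backup as well. Return whichever finishes first and cancel the other."""
    first = asyncio.ensure_future(primary())
    done, _ = await asyncio.wait({first}, timeout=delay)
    if done:
        # Primary finished within the delay: the backup is never launched.
        return first.result()
    second = asyncio.ensure_future(backup())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # abort the slower attempt
    return next(iter(done)).result()
```

With a delay of 15 seconds, as suggested above, the second circuit is built
only for the attempts that are already slower than usual.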
|
||||
|
||||
An evaluation has been performed for the more resource-intensive approach
|
||||
of starting two parallel circuits immediately instead of waiting for a
|
||||
short delay. The result was a reduction of connection establishment times
|
||||
from 27.4 seconds in the original protocol to 22.5 seconds.
|
||||
|
||||
While the effect of the proposed approach of delayed parallelization on
|
||||
mean connection establishment times is expected to be smaller,
|
||||
variability of connection attempt times can be reduced significantly.
|
||||
|
||||
3. Increase Count of Internal Circuits
|
||||
|
||||
Hidden services need to create or cannibalize and extend a circuit to a
|
||||
rendezvous point for every client request. Really popular hidden services
|
||||
require more than two internal circuits in the pool to answer multiple
|
||||
client requests at the same time. This scenario was not yet analyzed, but
|
||||
will probably exhibit worse performance than measured in the previous
|
||||
analysis. The number of preemptively built internal circuits should be a
|
||||
function of connection requests in the past to adapt to changing needs.
|
||||
Furthermore, an increased number of internal circuits on client side
|
||||
would allow clients to establish connections to more than one hidden
|
||||
service at a time.
|
||||
|
||||
Under the assumption that a popular hidden service cannot make use of
|
||||
cannibalization for connecting to rendezvous points, the circuit creation
|
||||
time needs to be added to the current results. In the mean, the
|
||||
connection establishment time to a popular hidden service would increase
|
||||
by 4.7 seconds.
|
||||
|
||||
4. Build More Introduction Circuits
|
||||
|
||||
When establishing introduction points, a hidden service should launch 5
|
||||
instead of 3 introduction circuits at the same time and use only the
|
||||
first 3 that could be established. The remaining two circuits could still
|
||||
be used for other purposes afterwards.
|
||||
|
||||
The effect has been simulated using previously measured data, too.
|
||||
Therefore, circuit establishment times were derived from log files and
|
||||
written to an array. Afterwards, a simulation with 10,000 runs was
|
||||
performed picking 5 (4, 6) random values and using the 3 lowest values in
|
||||
contrast to picking only 3 values at random. The result is that the mean
|
||||
time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of
|
||||
the 3-out-of-5 approach is 4.4 seconds.
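
The simulation described above can be sketched as follows. This is a minimal
illustration, not the code that produced the figures in this proposal: the
function names, the `measured` array, and the use of rand() are assumptions,
and "using the 3 lowest values" is read here as "the service is ready when the
third-fastest circuit has been established".

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LAUNCHED 16

/* Comparison callback for qsort() on doubles. */
static int cmp_double(const void *a, const void *b)
{
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Return the time at which the `needed`-th fastest of `n` circuits with
 * establishment times `times` is ready. */
double time_until_ready(const double *times, size_t n, size_t needed)
{
  double sorted[MAX_LAUNCHED];
  size_t i;
  assert(n <= MAX_LAUNCHED && needed >= 1 && needed <= n);
  for (i = 0; i < n; i++)
    sorted[i] = times[i];
  qsort(sorted, n, sizeof(double), cmp_double);
  return sorted[needed - 1];
}

/* Run `runs` trials. Each trial draws `launched` random samples from the
 * measured establishment times and records when `needed` circuits are
 * ready. Returns the mean over all trials. */
double simulate_mean(const double *measured, size_t n_measured,
                     size_t launched, size_t needed,
                     unsigned runs, unsigned seed)
{
  double picks[MAX_LAUNCHED], sum = 0.0;
  unsigned run;
  size_t i;
  assert(n_measured > 0 && runs > 0 && launched <= MAX_LAUNCHED);
  srand(seed);
  for (run = 0; run < runs; run++) {
    for (i = 0; i < launched; i++)
      picks[i] = measured[(size_t)rand() % n_measured];
    sum += time_until_ready(picks, launched, needed);
  }
  return sum / runs;
}
```

Calling simulate_mean() once with launched=3 and once with launched=5 (both
with needed=3) over the same measured data gives the two means that the text
compares.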
|
||||
|
||||
The effect on network load is minimal, because the hidden service can
|
||||
reuse the slower internal circuits for other purposes, e.g., rendezvous
|
||||
circuits. The only change is that a hidden service starts establishing
|
||||
more circuits at once instead of establishing them one after another.
|
||||
|
||||
References:
|
||||
|
||||
[1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf
|
||||
|
||||
[2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf
|
||||
|
||||
[3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf
|
||||
|
@ -1,529 +0,0 @@
|
||||
Filename: 156-tracking-blocked-ports.txt
|
||||
Title: Tracking blocked ports on the client side
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Robert Hogan
|
||||
Created: 14-Oct-2008
|
||||
Status: Open
|
||||
Target: 0.2.?
|
||||
|
||||
Motivation:
|
||||
Tor clients that are behind extremely restrictive firewalls can end up
|
||||
waiting a while for their first successful OR connection to a node on the
|
||||
network. Worse, the more restrictive their firewall the more susceptible
|
||||
they are to an attacker guessing their entry nodes. Tor routers that
|
||||
are behind extremely restrictive firewalls can only offer a limited,
|
||||
'partitioned' service to other routers and clients on the network. Exit
|
||||
nodes behind extremely restrictive firewalls may advertise ports that they
|
||||
are actually not able to connect to, wasting network resources in circuit
|
||||
constructions that are doomed to fail at the last hop on first use.
|
||||
|
||||
Proposal:
|
||||
|
||||
When a client attempts to connect to an entry guard it should avoid
|
||||
further attempts on ports that fail once until it has connected to at
|
||||
least one entry guard successfully. (Maybe it should wait for more than
|
||||
one failure to reduce the skew on the first node selection.) Thereafter
|
||||
it should select entry guards regardless of port and warn the user if
|
||||
it observes that the number of failed connections to a given port reaches
|
||||
a multiple of 5, counted either from the start or since the last success.
|
||||
|
||||
Tor should warn the operators of exit, middleman and entry nodes if it
|
||||
observes that connections to a given port have failed a multiple of 5
|
||||
times without success or since the last success. If attempts on a port
|
||||
fail 20 or more times without success or since the last success, Tor should add the port
|
||||
to a 'blocked-ports' entry in its descriptor's extra-info. Some thought
|
||||
needs to be given to what the authorities might do with this information.
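
The counting rule in the two paragraphs above can be sketched as below. This
is only an illustration of the thresholds named in the proposal text (5 and
20); the type and function names are hypothetical. The attached patch differs
from it: it counts unique routers per port and uses its own constants.

```c
#include <assert.h>

#define WARN_INTERVAL 5      /* warn on every multiple of 5 failures */
#define REPORT_THRESHOLD 20  /* list the port as blocked at 20 or more */

/* Failures on one port with no success, or since the last success. */
typedef struct {
  unsigned failures;
} port_track_t;

/* Record one failed connection attempt on this port. */
void port_note_failure(port_track_t *p)
{
  assert(p);
  p->failures++;
}

/* Record a successful connection: the failure run starts over. */
void port_note_success(port_track_t *p)
{
  assert(p);
  p->failures = 0;
}

/* Should the user or operator be warned after the latest failure? */
int port_should_warn(const port_track_t *p)
{
  assert(p);
  return p->failures > 0 && p->failures % WARN_INTERVAL == 0;
}

/* Should the port be listed in the 'blocked-ports' extra-info entry? */
int port_is_reported_blocked(const port_track_t *p)
{
  assert(p);
  return p->failures >= REPORT_THRESHOLD;
}
```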
|
||||
|
||||
Related TODO item:
|
||||
"- Automatically determine what ports are reachable and start using
|
||||
those, if circuits aren't working and it's a pattern we
|
||||
recognize ("port 443 worked once and port 9001 keeps not
|
||||
working")."
|
||||
|
||||
|
||||
I've had a go at implementing all of this in the attached.
|
||||
|
||||
Addendum:
|
||||
Just a note on the patch, storing the digest of each router that uses the port
|
||||
is a bit of a memory hog, and its only real purpose is to provide a count of
|
||||
routers using that port when warning the user. That could be achieved when
|
||||
warning the user by iterating through the routerlist instead.
|
||||
|
||||
Index: src/or/connection_or.c
|
||||
===================================================================
|
||||
--- src/or/connection_or.c (revision 17104)
|
||||
+++ src/or/connection_or.c (working copy)
|
||||
@@ -502,6 +502,9 @@
|
||||
connection_or_connect_failed(or_connection_t *conn,
|
||||
int reason, const char *msg)
|
||||
{
|
||||
+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
|
||||
+ (reason == END_OR_CONN_REASON_REFUSED))
|
||||
+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
|
||||
control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
|
||||
if (!authdir_mode_tests_reachability(get_options()))
|
||||
control_event_bootstrap_problem(msg, reason);
|
||||
@@ -580,6 +583,7 @@
|
||||
/* already marked for close */
|
||||
return NULL;
|
||||
}
|
||||
+
|
||||
return conn;
|
||||
}
|
||||
|
||||
@@ -909,6 +913,7 @@
|
||||
control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
|
||||
|
||||
if (started_here) {
|
||||
+ or_port_hist_success(TO_CONN(conn)->port);
|
||||
rep_hist_note_connect_succeeded(conn->identity_digest, now);
|
||||
if (entry_guard_register_connect_status(conn->identity_digest,
|
||||
1, now) < 0) {
|
||||
Index: src/or/rephist.c
|
||||
===================================================================
|
||||
--- src/or/rephist.c (revision 17104)
|
||||
+++ src/or/rephist.c (working copy)
|
||||
@@ -18,6 +18,7 @@
|
||||
static void bw_arrays_init(void);
|
||||
static void predicted_ports_init(void);
|
||||
static void hs_usage_init(void);
|
||||
+static void or_port_hist_init(void);
|
||||
|
||||
/** Total number of bytes currently allocated in fields used by rephist.c. */
|
||||
uint64_t rephist_total_alloc=0;
|
||||
@@ -89,6 +90,25 @@
|
||||
digestmap_t *link_history_map;
|
||||
} or_history_t;
|
||||
|
||||
+/** or_port_hist_t contains our router/client's knowledge of
|
||||
+ all OR ports offered on the network, and how many servers with each port we
|
||||
+ have succeeded or failed to connect to. */
|
||||
+typedef struct {
|
||||
+ /** The port this entry is tracking. */
|
||||
+ uint16_t or_port;
|
||||
+ /** Have we ever connected to this port on another OR? */
|
||||
+ unsigned int success:1;
|
||||
+ /** The ORs using this port. */
|
||||
+ digestmap_t *ids;
|
||||
+ /** The ORs using this port we have failed to connect to. */
|
||||
+ digestmap_t *failure_ids;
|
||||
+ /** Are we excluding ORs with this port during entry selection?*/
|
||||
+ unsigned int excluded;
|
||||
+} or_port_hist_t;
|
||||
+
|
||||
+static unsigned int still_searching = 0;
|
||||
+static smartlist_t *or_port_hists;
|
||||
+
|
||||
/** When did we last multiply all routers' weighted_run_length and
|
||||
* total_run_weights by STABILITY_ALPHA? */
|
||||
static time_t stability_last_downrated = 0;
|
||||
@@ -164,6 +184,16 @@
|
||||
tor_free(hist);
|
||||
}
|
||||
|
||||
+/** Helper: free storage held by a single OR port history entry. */
|
||||
+static void
|
||||
+or_port_hist_free(or_port_hist_t *p)
|
||||
+{
|
||||
+ tor_assert(p);
|
||||
+ digestmap_free(p->ids,NULL);
|
||||
+ digestmap_free(p->failure_ids,NULL);
|
||||
+ tor_free(p);
|
||||
+}
|
||||
+
|
||||
/** Update an or_history_t object <b>hist</b> so that its uptime/downtime
|
||||
* count is up-to-date as of <b>when</b>.
|
||||
*/
|
||||
@@ -1639,7 +1669,7 @@
|
||||
tmp_time = smartlist_get(predicted_ports_times, i);
|
||||
if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
|
||||
tmp_port = smartlist_get(predicted_ports_list, i);
|
||||
- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
|
||||
+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
|
||||
smartlist_del(predicted_ports_list, i);
|
||||
smartlist_del(predicted_ports_times, i);
|
||||
rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
|
||||
@@ -1821,6 +1851,12 @@
|
||||
tor_free(last_stability_doc);
|
||||
built_last_stability_doc_at = 0;
|
||||
predicted_ports_free();
|
||||
+ if (or_port_hists) {
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
|
||||
+ or_port_hist_free(p));
|
||||
+ smartlist_free(or_port_hists);
|
||||
+ or_port_hists = NULL;
|
||||
+ }
|
||||
}
|
||||
|
||||
/****************** hidden service usage statistics ******************/
|
||||
@@ -2356,3 +2392,225 @@
|
||||
tor_free(fname);
|
||||
}
|
||||
|
||||
+/** Create a new entry in the port tracking cache for the or_port in
|
||||
+ * <b>ri</b>. */
|
||||
+void
|
||||
+or_port_hist_new(const routerinfo_t *ri)
|
||||
+{
|
||||
+ or_port_hist_t *result;
|
||||
+ const char *id=ri->cache_info.identity_digest;
|
||||
+
|
||||
+ if (!or_port_hists)
|
||||
+ or_port_hist_init();
|
||||
+
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ /* Cope with routers that change their advertised OR port or are
|
||||
+ dropped from the networkstatus. We don't discard the failures of
|
||||
+ dropped routers because they are still valid when counting
|
||||
+ consecutive failures on a port.*/
|
||||
+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
|
||||
+ digestmap_remove(tp->ids, id);
|
||||
+ }
|
||||
+ if (tp->or_port == ri->or_port) {
|
||||
+ if (!(digestmap_get(tp->ids, id)))
|
||||
+ digestmap_set(tp->ids, id, (void*)1);
|
||||
+ return;
|
||||
+ }
|
||||
+ });
|
||||
+
|
||||
+ result = tor_malloc_zero(sizeof(or_port_hist_t));
|
||||
+ result->or_port=ri->or_port;
|
||||
+ result->success=0;
|
||||
+ result->ids=digestmap_new();
|
||||
+ digestmap_set(result->ids, id, (void*)1);
|
||||
+ result->failure_ids=digestmap_new();
|
||||
+ result->excluded=0;
|
||||
+ smartlist_add(or_port_hists, result);
|
||||
+}
|
||||
+
|
||||
+/** Create the port tracking cache. */
|
||||
+/*XXX: need to call this when we rebuild/update our network status */
|
||||
+static void
|
||||
+or_port_hist_init(void)
|
||||
+{
|
||||
+ routerlist_t *rl = router_get_routerlist();
|
||||
+
|
||||
+ if (!or_port_hists)
|
||||
+ or_port_hists=smartlist_create();
|
||||
+
|
||||
+ if (rl && rl->routers) {
|
||||
+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
|
||||
+ {
|
||||
+ or_port_hist_new(ri);
|
||||
+ });
|
||||
+ }
|
||||
+}
|
||||
+
|
||||
+#define NOT_BLOCKED 0
|
||||
+#define FAILURES_OBSERVED 1
|
||||
+#define POSSIBLY_BLOCKED 5
|
||||
+#define PROBABLY_BLOCKED 10
|
||||
+/** Return the list of blocked ports for our router's extra-info.*/
|
||||
+char *
|
||||
+or_port_hist_get_blocked_ports(void)
|
||||
+{
|
||||
+ char blocked_ports[2048];
|
||||
+ char *bp;
|
||||
+
|
||||
+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
|
||||
+ tor_snprintf(blocked_ports+strlen(blocked_ports),
|
||||
+ sizeof(blocked_ports)-strlen(blocked_ports)," %u,",tp->or_port);
|
||||
+ });
|
||||
+ if (strlen(blocked_ports) == 13)
|
||||
+ return NULL;
|
||||
+ bp=tor_strdup(blocked_ports);
|
||||
+ bp[strlen(bp)-1]='\n';
|
||||
+ bp[strlen(bp)]='\0';
|
||||
+ return bp;
|
||||
+}
|
||||
+
|
||||
+/** Revert to client-only mode if we have seen too many failures on a port or
|
||||
+ * range of ports.*/
|
||||
+static void
|
||||
+or_port_hist_report_block(unsigned int min_severity)
|
||||
+{
|
||||
+ or_options_t *options=get_options();
|
||||
+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
|
||||
+ char port[1024];
|
||||
+
|
||||
+ memset(failures_observed,0,sizeof(failures_observed));
|
||||
+ memset(possibly_blocked,0,sizeof(possibly_blocked));
|
||||
+ memset(probably_blocked,0,sizeof(probably_blocked));
|
||||
+
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ unsigned int failures = digestmap_size(tp->failure_ids);
|
||||
+ if (failures >= min_severity) {
|
||||
+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
|
||||
+ " network)",tp->or_port,failures,
|
||||
+ (!tp->success)?"and no successes": "since last success",
|
||||
+ digestmap_size(tp->ids));
|
||||
+ if (failures >= PROBABLY_BLOCKED) {
|
||||
+ strlcat(probably_blocked, port, sizeof(probably_blocked));
|
||||
+ } else if (failures >= POSSIBLY_BLOCKED)
|
||||
+ strlcat(possibly_blocked, port, sizeof(possibly_blocked));
|
||||
+ else if (failures >= FAILURES_OBSERVED)
|
||||
+ strlcat(failures_observed, port, sizeof(failures_observed));
|
||||
+ }
|
||||
+ });
|
||||
+
|
||||
+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
|
||||
+ server_mode(options) &&
|
||||
+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
|
||||
+ "You should consider disabling your Tor server.":"",
|
||||
+ (min_severity==FAILURES_OBSERVED)?
|
||||
+ "Tor appears to be blocked from connecting to a range of ports "
|
||||
+ "with the result that it cannot connect to one tenth of the Tor "
|
||||
+ "network. ":"",
|
||||
+ strlen(failures_observed)?
|
||||
+ "Tor has observed failures on the following ports: ":"",
|
||||
+ failures_observed,
|
||||
+ strlen(possibly_blocked)?
|
||||
+ "Tor is possibly blocked on the following ports: ":"",
|
||||
+ possibly_blocked,
|
||||
+ strlen(probably_blocked)?
|
||||
+ "Tor is almost certainly blocked on the following ports: ":"",
|
||||
+ probably_blocked);
|
||||
+
|
||||
+}
|
||||
+
|
||||
+/** Record a successful connection to an OR listening on
|
||||
+ * <b>or_port</b>. */
|
||||
+void
|
||||
+or_port_hist_success(uint16_t or_port)
|
||||
+{
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ if (tp->or_port != or_port)
|
||||
+ continue;
|
||||
+ /*Reset our failure stats so we can notice if this port ever gets
|
||||
+ blocked again.*/
|
||||
+ tp->success=1;
|
||||
+ if (digestmap_size(tp->failure_ids)) {
|
||||
+ digestmap_free(tp->failure_ids,NULL);
|
||||
+ tp->failure_ids=digestmap_new();
|
||||
+ }
|
||||
+ if (still_searching) {
|
||||
+ still_searching=0;
|
||||
+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
|
||||
+ }
|
||||
+ return;
|
||||
+ });
|
||||
+}
|
||||
+/** Record the failure of our connection to <b>digest</b>'s
|
||||
+ * OR port. Warn, exclude the port from future entry guard selection, or
|
||||
+ * add port to blocked-ports in our server's extra-info as appropriate. */
|
||||
+void
|
||||
+or_port_hist_failure(const char *digest, uint16_t or_port)
|
||||
+{
|
||||
+ int total_failures=0, ports_excluded=0, report_block=0;
|
||||
+ int total_routers=smartlist_len(router_get_routerlist()->routers);
|
||||
+
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ ports_excluded += tp->excluded;
|
||||
+ total_failures+=digestmap_size(tp->failure_ids);
|
||||
+ if (tp->or_port != or_port)
|
||||
+ continue;
|
||||
+ /* We're only interested in unique failures */
|
||||
+ if (digestmap_get(tp->failure_ids, digest))
|
||||
+ return;
|
||||
+
|
||||
+ total_failures++;
|
||||
+ digestmap_set(tp->failure_ids, digest, (void*)1);
|
||||
+ if (still_searching && !tp->success) {
|
||||
+ tp->excluded=1;
|
||||
+ ports_excluded++;
|
||||
+ }
|
||||
+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
|
||||
+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
|
||||
+ report_block=POSSIBLY_BLOCKED;
|
||||
+ });
|
||||
+
|
||||
+ if (total_failures >= (int)(total_routers/10))
|
||||
+ or_port_hist_report_block(FAILURES_OBSERVED);
|
||||
+ else if (report_block)
|
||||
+ or_port_hist_report_block(report_block);
|
||||
+
|
||||
+ if (ports_excluded >= smartlist_len(or_port_hists)) {
|
||||
+ log_warn(LD_HIST,"During entry node selection Tor tried every port "
|
||||
+ "offered on the network on at least one server "
|
||||
+ "and didn't manage a single "
|
||||
+ "successful connection. This suggests you are behind an "
|
||||
+ "extremely restrictive firewall. Tor will keep trying to find "
|
||||
+ "a reachable entry node.");
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
|
||||
+ }
|
||||
+}
|
||||
+
|
||||
+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
|
||||
+void
|
||||
+or_port_hist_exclude(routerset_t *rt)
|
||||
+{
|
||||
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
|
||||
+ {
|
||||
+ char portpolicy[9];
|
||||
+ if (tp->excluded) {
|
||||
+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
|
||||
+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
|
||||
+ "from entry guard selection.", tp->or_port);
|
||||
+ routerset_parse(rt, portpolicy, "Ports");
|
||||
+ }
|
||||
+ });
|
||||
+}
|
||||
+
|
||||
+/** Allow the exclusion of ports during our search for an entry node. */
|
||||
+void
|
||||
+or_port_hist_search_again(void)
|
||||
+{
|
||||
+ still_searching=1;
|
||||
+}
|
||||
Index: src/or/or.h
|
||||
===================================================================
|
||||
--- src/or/or.h (revision 17104)
|
||||
+++ src/or/or.h (working copy)
|
||||
@@ -3864,6 +3864,13 @@
|
||||
int any_predicted_circuits(time_t now);
|
||||
int rep_hist_circbuilding_dormant(time_t now);
|
||||
|
||||
+void or_port_hist_failure(const char *digest, uint16_t or_port);
|
||||
+void or_port_hist_success(uint16_t or_port);
|
||||
+void or_port_hist_new(const routerinfo_t *ri);
|
||||
+void or_port_hist_exclude(routerset_t *rt);
|
||||
+void or_port_hist_search_again(void);
|
||||
+char *or_port_hist_get_blocked_ports(void);
|
||||
+
|
||||
/** Possible public/private key operations in Tor: used to keep track of where
|
||||
* we're spending our time. */
|
||||
typedef enum {
|
||||
Index: src/or/routerparse.c
|
||||
===================================================================
|
||||
--- src/or/routerparse.c (revision 17104)
|
||||
+++ src/or/routerparse.c (working copy)
|
||||
@@ -1401,6 +1401,8 @@
|
||||
goto err;
|
||||
}
|
||||
|
||||
+ or_port_hist_new(router);
|
||||
+
|
||||
if (!router->platform) {
|
||||
router->platform = tor_strdup("<unknown>");
|
||||
}
|
||||
Index: src/or/router.c
|
||||
===================================================================
|
||||
--- src/or/router.c (revision 17104)
|
||||
+++ src/or/router.c (working copy)
|
||||
@@ -1818,6 +1818,7 @@
|
||||
char published[ISO_TIME_LEN+1];
|
||||
char digest[DIGEST_LEN];
|
||||
char *bandwidth_usage;
|
||||
+ char *blocked_ports;
|
||||
int result;
|
||||
size_t len;
|
||||
|
||||
@@ -1825,7 +1826,6 @@
|
||||
extrainfo->cache_info.identity_digest, DIGEST_LEN);
|
||||
format_iso_time(published, extrainfo->cache_info.published_on);
|
||||
bandwidth_usage = rep_hist_get_bandwidth_lines(1);
|
||||
-
|
||||
result = tor_snprintf(s, maxlen,
|
||||
"extra-info %s %s\n"
|
||||
"published %s\n%s",
|
||||
@@ -1835,6 +1835,16 @@
|
||||
if (result<0)
|
||||
return -1;
|
||||
|
||||
+ blocked_ports = or_port_hist_get_blocked_ports();
|
||||
+ if (blocked_ports) {
|
||||
+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
|
||||
+ "%s",
|
||||
+ blocked_ports);
|
||||
+ tor_free(blocked_ports);
|
||||
+ if (result<0)
|
||||
+ return -1;
|
||||
+ }
|
||||
+
|
||||
if (should_record_bridge_info(options)) {
|
||||
static time_t last_purged_at = 0;
|
||||
char *geoip_summary;
|
||||
Index: src/or/circuitbuild.c
|
||||
===================================================================
|
||||
--- src/or/circuitbuild.c (revision 17104)
|
||||
+++ src/or/circuitbuild.c (working copy)
|
||||
@@ -62,6 +62,7 @@
|
||||
|
||||
static void entry_guards_changed(void);
|
||||
static time_t start_of_month(time_t when);
|
||||
+static int num_live_entry_guards(void);
|
||||
|
||||
/** Iterate over values of circ_id, starting from conn-\>next_circ_id,
|
||||
* and with the high bit specified by conn-\>circ_id_type, until we get
|
||||
@@ -1627,12 +1628,14 @@
|
||||
smartlist_t *excluded;
|
||||
or_options_t *options = get_options();
|
||||
router_crn_flags_t flags = 0;
|
||||
+ routerset_t *_ExcludeNodes;
|
||||
|
||||
if (state && options->UseEntryGuards &&
|
||||
(purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
|
||||
return choose_random_entry(state);
|
||||
}
|
||||
|
||||
+ _ExcludeNodes = routerset_new();
|
||||
excluded = smartlist_create();
|
||||
|
||||
if (state && (r = build_state_get_exit_router(state))) {
|
||||
@@ -1670,12 +1673,18 @@
|
||||
if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
|
||||
flags |= CRN_ALLOW_INVALID;
|
||||
|
||||
+ if (options->ExcludeNodes)
|
||||
+ routerset_union(_ExcludeNodes,options->ExcludeNodes);
|
||||
+
|
||||
+ or_port_hist_exclude(_ExcludeNodes);
|
||||
+
|
||||
choice = router_choose_random_node(
|
||||
NULL,
|
||||
excluded,
|
||||
- options->ExcludeNodes,
|
||||
+ _ExcludeNodes,
|
||||
flags);
|
||||
smartlist_free(excluded);
|
||||
+ routerset_free(_ExcludeNodes);
|
||||
return choice;
|
||||
}
|
||||
|
||||
@@ -2727,6 +2736,7 @@
|
||||
entry_guards_update_state(or_state_t *state)
|
||||
{
|
||||
config_line_t **next, *line;
|
||||
+ unsigned int have_reachable_entry=0;
|
||||
if (! entry_guards_dirty)
|
||||
return;
|
||||
|
||||
@@ -2740,6 +2750,7 @@
|
||||
char dbuf[HEX_DIGEST_LEN+1];
|
||||
if (!e->made_contact)
|
||||
continue; /* don't write this one to disk */
|
||||
+ have_reachable_entry=1;
|
||||
*next = line = tor_malloc_zero(sizeof(config_line_t));
|
||||
line->key = tor_strdup("EntryGuard");
|
||||
line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
|
||||
@@ -2785,6 +2796,11 @@
|
||||
if (!get_options()->AvoidDiskWrites)
|
||||
or_state_mark_dirty(get_or_state(), 0);
|
||||
entry_guards_dirty = 0;
|
||||
+
|
||||
+ /* XXX: Is this the place to decide that we no longer have any reachable
|
||||
+ guards? */
|
||||
+ if (!have_reachable_entry)
|
||||
+ or_port_hist_search_again();
|
||||
}
|
||||
|
||||
/** If <b>question</b> is the string "entry-guards", then dump
|
||||
|
@ -1,104 +0,0 @@
|
||||
Filename: 157-specific-cert-download.txt
|
||||
Title: Make certificate downloads specific
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 2-Dec-2008
|
||||
Status: Accepted
|
||||
Target: 0.2.1.x
|
||||
|
||||
History:
|
||||
|
||||
2008 Dec 2, 22:34
|
||||
Changed name of cross certification field to match the other authority
|
||||
certificate fields.
|
||||
|
||||
Status:
|
||||
|
||||
As of 0.2.1.9-alpha:
|
||||
Cross-certification is implemented for new certificates, but not yet
|
||||
required. Directories support the tor/keys/fp-sk urls.
|
||||
|
||||
Overview:
|
||||
|
||||
Tor's directory specification gives two ways to download a certificate:
|
||||
by its identity fingerprint, or by the digest of its signing key. Both
|
||||
are error-prone. We propose a new download mechanism to make sure that
|
||||
clients get the certificates they want.
|
||||
|
||||
Motivation:
|
||||
|
||||
When a client wants a certificate to verify a consensus, it has two choices
|
||||
currently:
|
||||
- Download by identity key fingerprint. In this case, the client risks
|
||||
getting a certificate for the same authority, but with a different
|
||||
signing key than the one used to sign the consensus.
|
||||
|
||||
- Download by signing key fingerprint. In this case, the client risks
|
||||
getting a forged certificate that contains the right signing key
|
||||
signed with the wrong identity key. (Since caches are willing to
|
||||
cache certs from authorities they do not themselves recognize, the
|
||||
attacker wouldn't need to compromise an authority's key to do this.)
|
||||
|
||||
Current solution:
|
||||
|
||||
Clients fetch by identity keys, and re-fetch with backoff if they don't get
|
||||
certs with the signing key they want.
|
||||
|
||||
Proposed solution:
|
||||
|
||||
Phase 1: Add a URL type for clients to download certs by identity _and_
|
||||
signing key fingerprint. Unless both fields match, the client doesn't
|
||||
accept the certificate(s). Clients begin using this method when their
|
||||
randomly chosen directory cache supports it.
|
||||
|
||||
Phase 1A: Simultaneously, add a cross-certification element to
|
||||
certificates.
|
||||
|
||||
Phase 2: Once many directory caches support phase 1, clients should prefer
|
||||
to fetch certificates using that protocol when available.
|
||||
|
||||
Phase 2A: Once all authorities are generating cross-certified certificates
|
||||
as in phase 1A, require cross-certification.
|
||||
|
||||
Specification additions:
|
||||
|
||||
The key certificate whose identity key fingerprint is <F> and whose signing
|
||||
key fingerprint is <S> should be available at:
|
||||
|
||||
http://<hostname>/tor/keys/fp-sk/<F>-<S>.z
|
||||
|
||||
As usual, clients may request multiple certificates using:
|
||||
|
||||
http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z
|
||||
|
||||
Clients SHOULD use this format whenever they know both key fingerprints for
|
||||
a desired certificate.
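
As an illustration of the URL format above, a client-side helper might look
like the following sketch. The names cert_id_t and build_fp_sk_url are
hypothetical and are not part of Tor's code; fingerprints are assumed to be
passed in already hex-encoded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One certificate to request: identity and signing key fingerprints. */
typedef struct {
  const char *id_fp;  /* <F>, hex-encoded */
  const char *sk_fp;  /* <S>, hex-encoded */
} cert_id_t;

/* Write "http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z" for the
 * `n` entries in `certs` into `out`. Returns 0 on success, or -1 if `n`
 * is zero or the result does not fit in `outlen` bytes. */
int build_fp_sk_url(char *out, size_t outlen, const char *hostname,
                    const cert_id_t *certs, size_t n)
{
  size_t used, i;
  int r;
  assert(out && hostname && certs);
  if (n == 0)
    return -1;
  r = snprintf(out, outlen, "http://%s/tor/keys/fp-sk/", hostname);
  if (r < 0 || (size_t)r >= outlen)
    return -1;
  used = (size_t)r;
  for (i = 0; i < n; i++) {
    /* Pairs are joined with '+'; the two halves of a pair with '-'. */
    r = snprintf(out + used, outlen - used, "%s%s-%s",
                 i ? "+" : "", certs[i].id_fp, certs[i].sk_fp);
    if (r < 0 || (size_t)r >= outlen - used)
      return -1;
    used += (size_t)r;
  }
  r = snprintf(out + used, outlen - used, ".z");
  if (r < 0 || (size_t)r >= outlen - used)
    return -1;
  return 0;
}
```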
|
||||
|
||||
|
||||
Certificates SHOULD contain the following field (at most once):
|
||||
|
||||
"dir-key-crosscert" NL CrossSignature NL
|
||||
|
||||
where CrossSignature is a signature, made using the certificate's signing
|
||||
key, of the digest of the PKCS1-padded hash of the certificate's identity
|
||||
key. For backward compatibility with broken versions of the parser, we
|
||||
wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and
|
||||
-----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow
|
||||
the "ID " portion to be omitted, however.
|
||||
|
||||
When encountering a certificate with a dir-key-crosscert entry,
|
||||
implementations MUST verify that the signature is a correct signature of
|
||||
the hash of the identity key using the signing key.
|
||||
|
||||
(In a future version of this specification, dir-key-crosscert entries will
|
||||
be required.)
|
||||
|
||||
Why cross-certify too?
|
||||
|
||||
Cross-certification protects clients who haven't updated yet, by reducing
|
||||
the number of caches that are willing to hold and serve bogus certificates.
|
||||
|
||||
References:
|
||||
|
||||
This is related to part 2 of bug 854.
|
@ -1,207 +0,0 @@
|
||||
Filename: 158-microdescriptors.txt
|
||||
Title: Clients download consensus + microdescriptors
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Roger Dingledine
|
||||
Created: 17-Jan-2009
|
||||
Status: Open
|
||||
|
||||
1. Overview
|
||||
|
||||
This proposal replaces section 3.2 of proposal 141, which was
|
||||
called "Fetching descriptors on demand". Rather than modifying the
|
||||
circuit-building protocol to fetch a server descriptor inline at each
|
||||
circuit extend, we instead put all of the information that clients need
|
||||
either into the consensus itself, or into a new set of data about each
|
||||
relay called a microdescriptor. The microdescriptor is a direct
|
||||
transform from the relay descriptor, so relays don't even need to know
|
||||
this is happening.
|
||||
|
||||
Descriptor elements that are small and frequently changing should go
|
||||
in the consensus itself, and descriptor elements that are small and
|
||||
relatively static should go in the microdescriptor. If we ever end up
|
||||
with descriptor elements that aren't small yet clients need to know
|
||||
them, we'll need to resume considering some design like the one in
|
||||
proposal 141.
|
||||
|
||||
2. Motivation
|
||||
|
||||
See
|
||||
http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
|
||||
http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
|
||||
http://archives.seul.org/or/dev/Nov-2008/msg00007.html
|
||||
for a discussion of the options and why this is currently the best
|
||||
approach.
|
||||
|
||||
3. Design
|
||||
|
||||
There are three pieces to the proposal. First, authorities will list in
|
||||
their votes (and thus in the consensus) what relay descriptor elements
|
||||
are included in the microdescriptor, and also list the expected hash
|
||||
of the microdescriptor for each relay. Second, directory mirrors will serve
|
||||
microdescriptors. Third, clients will ask for them and cache them.
|
||||
|
||||
3.1. Consensus changes
|
||||
|
||||
V3 votes should include a new line:
|
||||
microdescriptor-elements bar baz foo
|
||||
listing each descriptor element (sorted alphabetically) that the authority
|
||||
included when it calculated its expected microdescriptor hashes.
|
||||
|
||||
We also need to include the hash of each expected microdescriptor in
|
||||
the routerstatus section. I suggest a new "m" line for each stanza,
|
||||
with the base64 of the hash of the elements that the authority voted
|
||||
for above.
|
||||
|
||||
The consensus microdescriptor-elements and "m" lines are then computed
|
||||
as described in Section 3.1.2 below.
|
||||
|
||||
I believe that means we need a new consensus-method "6" that knows
|
||||
how to compute the microdescriptor-elements and add "m" lines.
|
||||
|
||||
3.1.1. Descriptor elements to include for now
|
||||
|
||||
To start, the element list that authorities suggest should be
|
||||
family onion-key
|
||||
|
||||
(Note that the or-dev posts above only mention onion-key, but if
|
||||
we don't also include family then clients will never learn it. It
|
||||
seemed like it should be relatively static, so putting it in the
|
||||
microdescriptor is smarter than trying to fit it into the consensus.)
|
||||
|
||||
We could imagine a config option "family,onion-key" so authorities
|
||||
could change their voted preferences without needing to upgrade.
|
||||
|
||||
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
|
||||
|
||||
One approach is for the consensus microdescriptor-elements line to
|
||||
include every element listed by a majority of authorities, sorted. The
|
||||
problem here is that it will no longer be deterministic what the correct
|
||||
hash for the "m" line should be. We could imagine telling the authority
|
||||
to go look in its descriptor and produce the right hash itself, but
|
||||
we don't want consensus calculation to be based on external data like
|
||||
that. (Plus, the authority may not have the descriptor that everybody
|
||||
else voted to use.)
|
||||
|
||||
The better approach is to take the exact set that has the most votes
|
||||
(breaking ties by the set that has the most elements, and breaking
|
||||
ties after that by whichever is alphabetically first). That will
|
||||
increase the odds that we actually get a microdescriptor hash that
|
||||
is both a) for the descriptor we're putting in the consensus, and b)
|
||||
over the elements that we're declaring it should be for.
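
The tie-breaking rule for the "better approach" can be sketched as below. It
is a minimal illustration under stated assumptions: each vote is given as the
text of its microdescriptor-elements line (elements already sorted and
separated by single spaces), "alphabetically first" is taken to mean strcmp()
order, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the space-separated elements in one voted set. */
static size_t count_elements(const char *set)
{
  size_t n = 0;
  int in_word = 0;
  for (; *set; set++) {
    if (*set == ' ')
      in_word = 0;
    else if (!in_word) {
      in_word = 1;
      n++;
    }
  }
  return n;
}

/* Count how many of the `n` votes list exactly the set `set`. */
static size_t count_votes(const char *votes[], size_t n, const char *set)
{
  size_t i, c = 0;
  for (i = 0; i < n; i++)
    if (!strcmp(votes[i], set))
      c++;
  return c;
}

/* Return the exact set with the most votes. Ties go to the set with the
 * most elements, then to the alphabetically first set. Returns NULL if
 * there are no votes. */
const char *choose_element_set(const char *votes[], size_t n)
{
  const char *best = NULL;
  size_t best_votes = 0, best_elems = 0, i;
  assert(votes || n == 0);
  for (i = 0; i < n; i++) {
    size_t v = count_votes(votes, n, votes[i]);
    size_t e = count_elements(votes[i]);
    if (!best || v > best_votes ||
        (v == best_votes &&
         (e > best_elems ||
          (e == best_elems && strcmp(votes[i], best) < 0)))) {
      best = votes[i];
      best_votes = v;
      best_elems = e;
    }
  }
  return best;
}
```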
|
||||
|
||||
Then the "m" line for a given relay is the one that gets the most votes
|
||||
from authorities that both a) voted for the microdescriptor-elements
|
||||
line we're using, and b) voted for the descriptor we're using.
|
||||
|
||||
(If there's a tie, use the smaller hash. But really, if there are
|
||||
multiple such votes and they differ about a microdescriptor, we caught
|
||||
one of them lying or being buggy. We should log it to track down why.)
|
||||
|
||||
If there are no such votes, then we leave out the "m" line for that
|
||||
relay. That means clients should avoid it for this time period. (As
|
||||
an extension it could instead mean that clients should fetch the
|
||||
descriptor and figure out its microdescriptor themselves. But let's
|
||||
not get ahead of ourselves.)
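
The "m" line selection for a single relay can be sketched in the same way.
The m_vote_t type and the `eligible` flag are hypothetical: the flag stands
for "this authority voted for both the chosen microdescriptor-elements line
and the chosen descriptor". "Smaller hash" is assumed here to mean strcmp()
order on the base64 text, which the proposal does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One authority's vote about one relay. */
typedef struct {
  const char *m_hash;  /* base64 microdescriptor hash from its "m" line */
  int eligible;        /* nonzero if conditions a) and b) both hold */
} m_vote_t;

/* Return the hash with the most eligible votes, breaking ties in favor
 * of the smaller hash. Returns NULL if no vote is eligible, in which
 * case the consensus carries no "m" line for this relay. */
const char *choose_m_line(const m_vote_t *votes, size_t n)
{
  const char *best = NULL;
  size_t best_count = 0, i, j;
  assert(votes || n == 0);
  for (i = 0; i < n; i++) {
    size_t count = 0;
    if (!votes[i].eligible)
      continue;
    for (j = 0; j < n; j++)
      if (votes[j].eligible && !strcmp(votes[j].m_hash, votes[i].m_hash))
        count++;
    if (!best || count > best_count ||
        (count == best_count && strcmp(votes[i].m_hash, best) < 0)) {
      best = votes[i].m_hash;
      best_count = count;
    }
  }
  return best;
}
```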
|
||||
|
||||
It would be nice to have a more foolproof way to agree on what
|
||||
microdescriptor hash each authority should vote for, so we can avoid
|
||||
missing "m" lines. Just switching to a new consensus-method each time
|
||||
we change the set of microdescriptor-elements won't help though, since
|
||||
each authority will still have to decide what hash to vote for before
|
||||
knowing what consensus-method will be used.
|
||||
|
||||
Here's one way we could do it. Each vote / consensus includes
|
||||
the microdescriptor-elements that were used to compute the hashes,
|
||||
and also a preferred-microdescriptor-elements set. If an authority
|
||||
has a consensus from the previous period, then it should use the
|
||||
consensus preferred-microdescriptor-elements when computing its votes
|
||||
for microdescriptor-elements and the appropriate hashes in the upcoming
|
||||
period. (If it has no previous consensus, then it just writes its
|
||||
own preferences in both lines.)
|
||||
|
||||
3.2. Directory mirrors serve microdescriptors
|
||||
|
||||
Directory mirrors should then read the microdescriptor-elements line
|
||||
from the consensus, and learn how to answer requests. (Directory mirrors
|
||||
continue to serve normal relay descriptors too, a) to serve old clients
|
||||
and b) to be able to construct microdescriptors on the fly.)
|
||||
|
||||
The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
|
||||
http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
|
||||
|
||||
All the microdescriptors from the current consensus should also be
|
||||
available at:
|
||||
http://<hostname>/tor/micro/all.z
|
||||
so a client that's bootstrapping doesn't need to send a 70KB URL just
|
||||
to name every microdescriptor it's looking for.
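A minimal sketch of building the two request URLs above. The helper names are made up for this example, and the hashes are assumed to be already encoded as URL-safe strings, since the encoding is not fixed here.

```python
def micro_fetch_url(hostname, hashes):
    """URL for fetching specific microdescriptors by hash."""
    if not hashes:
        raise ValueError("need at least one microdescriptor hash")
    # Hashes are joined with '+' and the response is compressed (.z).
    return "http://%s/tor/micro/d/%s.z" % (hostname, "+".join(hashes))


def micro_all_url(hostname):
    """URL for fetching every microdescriptor in the current consensus."""
    return "http://%s/tor/micro/all.z" % hostname
```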
|
||||
|
||||
The format of a microdescriptor is the header line
|
||||
"microdescriptor-header"
|
||||
followed by each element (keyword and body), alphabetically. There's
|
||||
no need to mention what hash it's for, since it's self-identifying:
|
||||
you can hash the elements to learn this.
|
||||
|
||||
(Do we need a footer line to show that it's over, or is the next
|
||||
microdescriptor line or EOF enough of a hint? A footer line wouldn't
|
||||
hurt much. Also, no fair voting for the microdescriptor-element
|
||||
"microdescriptor-header".)
|
||||
|
||||
The hash of the microdescriptor is simply the hash of the concatenated
|
||||
elements -- not counting the header line or hypothetical footer line.
|
||||
Unless you prefer that?
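To make the format and hash rule concrete, here is a rough sketch. Several details are assumptions rather than part of the proposal: elements are held in a dict of keyword to body, each element is written as "keyword body" on its own line, and SHA-256 stands in for the unspecified hash function.

```python
import hashlib

HEADER = "microdescriptor-header\n"


def element_text(elements):
    """Concatenate the elements (keyword and body) in alphabetical order."""
    return "".join("%s %s\n" % (kw, elements[kw]) for kw in sorted(elements))


def format_microdescriptor(elements):
    """Header line followed by each element, alphabetically by keyword."""
    return HEADER + element_text(elements)


def microdescriptor_hash(elements):
    """Hash of the concatenated elements only; the header is not counted."""
    # SHA-256 is a placeholder choice for this sketch.
    return hashlib.sha256(element_text(elements).encode("ascii")).hexdigest()
```

Because the hash covers only the sorted elements, a mirror or client can recompute it from a received microdescriptor by dropping the header line and hashing the rest.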
|
||||
|
||||
Is there a reasonable way to version these things? We could say that
|
||||
the microdescriptor-header line can contain arguments which clients
|
||||
must ignore if they don't understand them. Any better ways?
|
||||
|
||||
Directory mirrors should check to make sure that the microdescriptors
|
||||
they're about to serve match the right hashes (either the hashes from
|
||||
the fetch URL or the hashes from the consensus, respectively).
|
||||
|
||||
We will probably want to consider some sort of smart data structure to
|
||||
be able to quickly convert microdescriptor hashes into the appropriate
|
||||
microdescriptor. Clients will want this anyway when they load their
|
||||
microdescriptor cache and want to match it up with the consensus to
|
||||
see what's missing.
|
||||
|
||||
3.3. Clients fetch them and cache them
|
||||
|
||||
When a client gets a new consensus, it looks to see if there are any
|
||||
microdescriptors it needs to learn. If it needs to learn more than
|
||||
some threshold of the microdescriptors (half?), it requests 'all',
|
||||
else it requests only the missing ones.
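The fetch decision could look roughly like this. The function name, the return convention, and the default threshold of one half are illustrative choices, not settled values.

```python
def plan_fetch(consensus_hashes, cached_hashes, threshold=0.5):
    """Decide what a client should request after getting a new consensus.

    Returns ("none", []), ("all", []), or ("some", [missing hashes]).
    """
    cached = set(cached_hashes)
    missing = [h for h in consensus_hashes if h not in cached]
    if not missing:
        return ("none", [])
    # More than the threshold fraction is missing: ask for everything.
    if len(missing) > threshold * len(consensus_hashes):
        return ("all", [])
    return ("some", missing)
```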
|
||||
|
||||
Clients maintain a cache of microdescriptors along with metadata like
|
||||
when it was last referenced by a consensus. They keep a microdescriptor
|
||||
until it hasn't been mentioned in any consensus for a week. Future
|
||||
clients might cache them for longer or shorter times.
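A small sketch of the cache bookkeeping, assuming the cache is a dict that maps each microdescriptor hash to the time it was last referenced by a consensus. That layout and the function name are assumptions for the example.

```python
ONE_WEEK = 7 * 24 * 60 * 60  # seconds


def refresh_and_expire(cache, consensus_hashes, now, max_age=ONE_WEEK):
    """Update last-referenced times, then drop stale entries.

    cache maps microdescriptor hash -> last time a consensus mentioned it.
    Returns the list of hashes that were expired.
    """
    for h in consensus_hashes:
        if h in cache:
            cache[h] = now
    # Expire anything no consensus has mentioned for max_age seconds.
    expired = [h for h, last_seen in cache.items() if now - last_seen >= max_age]
    for h in expired:
        del cache[h]
    return expired
```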
|
||||
|
||||
3.3.1. Information leaks from clients
|
||||
|
||||
If a client asks you for a set of microdescs, then you know she didn't
|
||||
have them cached before. How much does that leak? What about when
|
||||
we're all using our entry guards as directory guards, and we've seen
|
||||
that user make a bunch of circuits already?
|
||||
|
||||
Fetching "all" when you need at least half is a good first order fix,
|
||||
but might not be all there is to it.
|
||||
|
||||
Another future option would be to fetch some of the microdescriptors
|
||||
anonymously (via a Tor circuit).
|
||||
|
||||
4. Transition and deployment
|
||||
|
||||
Phase one, the directory authorities should start voting on
|
||||
microdescriptors and microdescriptor elements, and putting them in the
|
||||
consensus. This should happen during the 0.2.1.x series, and should
|
||||
be relatively easy to do.
|
||||
|
||||
Phase two, directory mirrors should learn how to serve them, and learn
|
||||
how to read the consensus to find out what they should be serving. This
|
||||
phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
|
||||
on how messy it turns out to be and how quickly we get around to it.
|
||||
|
||||
Phase three, clients should start fetching and caching them instead
|
||||
of normal descriptors. This should happen post 0.2.1.x.
|
||||
|
@ -1,144 +0,0 @@
|
||||
Filename: 159-exit-scanning.txt
|
||||
Title: Exit Scanning
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Mike Perry
|
||||
Created: 13-Feb-2009
|
||||
Status: Open
|
||||
|
||||
Overview:
|
||||
|
||||
This proposal describes the implementation and integration of an
|
||||
automated exit node scanner for scanning the Tor network for malicious,
|
||||
misconfigured, firewalled or filtered nodes.
|
||||
|
||||
Motivation:
|
||||
|
||||
Tor exit nodes can be run by anyone with an Internet connection. Often,
|
||||
these users aren't fully aware of limitations of their networking
|
||||
setup. Content filters, antivirus software, advertisements injected by
|
||||
their service providers, malicious upstream providers, and the resource
|
||||
limitations of their computer or networking equipment have all been
|
||||
observed on the current Tor network.
|
||||
|
||||
It is also possible that some nodes exist purely for malicious
|
||||
purposes. In the past, there have been intermittent instances of
|
||||
nodes spoofing SSH keys, as well as nodes being used for purposes of
|
||||
plaintext surveillance.
|
||||
|
||||
While it is not realistic to expect to catch extremely targeted or
|
||||
completely passive malicious adversaries, the goal is to prevent
|
||||
malicious adversaries from deploying dragnet attacks against large
|
||||
segments of the Tor userbase.
|
||||
|
||||
|
||||
Scanning methodology:
|
||||
|
||||
The first scans to be implemented are HTTP, HTML, Javascript, and
|
||||
SSL scans.
|
||||
|
||||
The HTTP scan scrapes Google for common filetype urls such as exe, msi,
|
||||
doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and
|
||||
compares the SHA1 hashes of the resulting content.
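The comparison step of the HTTP scan amounts to something like the following. Fetching is left out, and the function name is made up for this sketch; it only shows the digest comparison on two already-downloaded bodies.

```python
import hashlib


def content_matches(non_tor_body: bytes, tor_body: bytes) -> bool:
    """True if the content fetched via Tor hashes the same as the direct fetch."""
    return hashlib.sha1(non_tor_body).digest() == hashlib.sha1(tor_body).digest()
```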
|
||||
|
||||
The SSL scan downloads certificates for all IPs a domain will locally
|
||||
resolve to and compares these certificates to those seen over Tor. The
|
||||
scanner notes if a domain had rotated certificates locally in the
|
||||
results for each scan.
|
||||
|
||||
The HTML scan checks HTML, Javascript, and plugin content for
|
||||
modifications. Because of the dynamic nature of most of the web, the
|
||||
scanner has a number of mechanisms built in to filter out false
|
||||
positives that are used when a change is noticed between Tor and
|
||||
Non-Tor.
|
||||
|
||||
All tests also share a URL-based false positive filter that
|
||||
automatically removes results retroactively if the number of failures
|
||||
exceeds a certain percentage of nodes tested with the URL.
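One way to picture the URL-based filter is below. The result layout (URL to per-node failure flags) and the 10% default are assumptions for illustration; the actual percentage is not given here.

```python
def filter_false_positives(results, max_failure_fraction=0.10):
    """Drop every URL whose failure rate across tested nodes is too high.

    results maps url -> {node_id: failed?}. A URL that "fails" on many exits
    is more likely dynamic content than many misbehaving exits.
    """
    kept = {}
    for url, per_node in results.items():
        tested = len(per_node)
        failures = sum(1 for failed in per_node.values() if failed)
        if tested and failures / tested > max_failure_fraction:
            continue  # retroactively discard all results for this URL
        kept[url] = per_node
    return kept
```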
|
||||
|
||||
|
||||
Deployment Stages:
|
||||
|
||||
To avoid instances where bugs cause us to mark exit nodes as BadExit
|
||||
improperly, it is proposed that we begin use of the scanner in stages.
|
||||
|
||||
1. Manual Review:
|
||||
|
||||
In the first stage, basic scans will be run by a small number of
|
||||
people while we stabilize the scanner. The scanner has the ability
|
||||
to resume crashed scans, and to rescan nodes that fail various
|
||||
tests.
|
||||
|
||||
2. Human Review:
|
||||
|
||||
In the second stage, results will be automatically mailed to
|
||||
an email list of interested parties for review. We will also begin
|
||||
classifying failure types into three to four different severity
|
||||
levels, based on both the reliability of the test and the nature of
|
||||
the failure.
|
||||
|
||||
3. Automatic BadExit Marking:
|
||||
|
||||
In the final stage, the scanner will begin marking exits depending
|
||||
on the failure severity level in one of three different ways: by
|
||||
node idhex, by node IP, or by node IP mask. A potential fourth, less
|
||||
severe category of results may still be delivered via email only for
|
||||
review.
|
||||
|
||||
BadExit markings will be delivered in batches upon completion
|
||||
of whole-network scans, so that the final false positive
|
||||
filter has an opportunity to filter out URLs that exhibit
|
||||
dynamic content beyond what we can filter.
|
||||
|
||||
|
||||
Specification of Exit Marking:
|
||||
|
||||
Technically, BadExit could be marked via SETCONF AuthDirBadExit over
|
||||
the control port, but this would allow full access to the directory
|
||||
authority configuration and operation.
|
||||
|
||||
The approved-routers file could also be used, but currently it only
|
||||
supports fingerprints, and it also contains other data unrelated to
|
||||
exit scanning that would be difficult to coordinate.
|
||||
|
||||
Instead, we propose a new badexit-routers file that has three
|
||||
keywords:
|
||||
|
||||
BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
|
||||
BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
|
||||
|
||||
BadExitNet lines would follow the codepaths used by AuthDirBadExit to
|
||||
set authdir_badexit_policy, and BadExitFP would follow the codepaths
|
||||
from approved-router's !badexit lines.
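A hedged sketch of reading such a file follows. It assumes one keyword per line followed by one or more whitespace-separated arguments, treats '#' lines as comments, and checks fingerprints as 40 hex characters. None of these details are fixed by this proposal, and exit patterns are passed through unvalidated.

```python
import string


def parse_badexit_routers(text):
    """Return (exit_patterns, fingerprints) from badexit-routers contents."""
    exit_patterns, fingerprints = [], []
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):  # comment syntax is an assumption
            continue
        keyword, *args = line.split()
        if not args:
            raise ValueError("line %d: %s needs an argument" % (lineno, keyword))
        if keyword == "BadExitNet":
            exit_patterns.extend(args)
        elif keyword == "BadExitFP":
            for fp in args:
                if len(fp) != 40 or any(c not in string.hexdigits for c in fp):
                    raise ValueError("line %d: bad fingerprint %r" % (lineno, fp))
                fingerprints.append(fp.upper())
        else:
            raise ValueError("line %d: unknown keyword %r" % (lineno, keyword))
    return exit_patterns, fingerprints
```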
|
||||
|
||||
The scanner would have exclusive ability to write, append, rewrite,
|
||||
and modify this file. Prior to building a new consensus vote, a
|
||||
participating Tor authority would read in a fresh copy.
|
||||
|
||||
|
||||
Security Implications:
|
||||
|
||||
Aside from evading the scanner's detection, there are two additional
|
||||
high-level security considerations:
|
||||
|
||||
1. Ensure nodes cannot be marked BadExit by an adversary at will
|
||||
|
||||
It is possible individual website owners will be able to target certain
|
||||
Tor nodes, but once they begin to attempt to fail more than the URL
|
||||
filter percentage of the exits, their sites will be automatically
|
||||
discarded.
|
||||
|
||||
Failing specific nodes is possible, but scanned results are fully
|
||||
reproducible, and BadExits should be rare enough that humans are never
|
||||
fully removed from the loop.
|
||||
|
||||
State (cookies, cache, etc) does not otherwise persist in the scanner
|
||||
between exit nodes to enable one exit node to bias the results of a
|
||||
later one.
|
||||
|
||||
2. Ensure that scanner compromise does not yield authority compromise
|
||||
|
||||
Having a separate file that is under the exclusive control of the
|
||||
scanner allows us to heavily isolate the scanner from the Tor
|
||||
authority, potentially even running them on separate machines.
|
||||
|
@ -1,39 +0,0 @@
|
||||
|
||||
Notes on an auto updater:
|
||||
|
||||
steve wants a "latest" symlink so he can always just fetch that.
|
||||
|
||||
roger worries that this will exacerbate the "what version are you
|
||||
using?" "latest." problem.
|
||||
|
||||
weasel suggests putting the latest recommended version in dns. then
|
||||
we don't have to hit the website. it's got caching, it's lightweight,
|
||||
it scales. just put it in a TXT record or something.
|
||||
|
||||
but, no dnssec.
|
||||
|
||||
roger suggests a file on the https website that lists the latest
|
||||
recommended version (or filename or url or something like that).
|
||||
|
||||
(steve seems to already be doing this with xerobank. he additionally
|
||||
suggests a little blurb that can be displayed to the user to describe
|
||||
what's new.)
|
||||
|
||||
how to verify you're getting the right file?
|
||||
a) it's https.
|
||||
b) ship with a signing key, and use some openssl functions to verify.
|
||||
c) both
|
||||
|
||||
andrew reminds us that we have a "recommended versions" line in the
|
||||
consensus directory already.
|
||||
|
||||
if only we had some way to point out the "latest stable recommendation"
|
||||
from this list. we could list it first, or something.
|
||||
|
||||
the recommended versions line also doesn't take into account which
|
||||
packages are available -- e.g. on Windows one version might be the best
|
||||
available, and on OS X it might be a different one.
|
||||
|
||||
aren't there existing solutions to this? surely there is a beautiful,
|
||||
efficient, crypto-correct auto updater lib out there. even for windows.
|
||||
|
@ -1,174 +0,0 @@
|
||||
|
||||
How to hand out bridges.
|
||||
|
||||
Divide bridges into 'strategies' as they come in. Do this uniformly
|
||||
at random for now.
|
||||
|
||||
For each strategy, we'll hand out bridges in a different way to
|
||||
clients. This document describes two strategies: email-based and
|
||||
IP-based.
|
||||
|
||||
0. Notation:
|
||||
|
||||
HMAC(k,v) : an HMAC of v using the key k.
|
||||
|
||||
A|B: The string A concatenated with the string B.
|
||||
|
||||
|
||||
1. Email-based.
|
||||
|
||||
Goal: bootstrap based on one or more popular email services' sybil
|
||||
prevention algorithms.
|
||||
|
||||
|
||||
Parameters:
|
||||
HMAC -- an HMAC function
|
||||
P -- a time period
|
||||
K -- the number of bridges to send in a period.
|
||||
|
||||
Setup: Generate two nonces, N and M.
|
||||
|
||||
As bridges arrive, put them into a ring according to HMAC(N,ID)
|
||||
where ID is the bridge's identity digest.
|
||||
|
||||
Divide time into divisions of length P.
|
||||
|
||||
When we get an email:
|
||||
|
||||
If it's not from a supported email service, reject it.
|
||||
|
||||
If we already sent a response to that email address (normalized)
|
||||
in this period, send _exactly_ the same response.
|
||||
|
||||
If it is from a supported service, generate X = HMAC(M,PS|E) where E
|
||||
is the lowercased normalized email address for the user, and
|
||||
where PS is the start of the current period. Send
|
||||
the first K bridges in the ring after point X.
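The ring lookup above could be sketched like this. HMAC-SHA256, the BridgeRing class, and encoding PS as a decimal timestamp are all assumptions made for the example; A|B is plain concatenation as in the notation section.

```python
import bisect
import hashlib
import hmac


def hmac_hex(key: bytes, value: bytes) -> str:
    return hmac.new(key, value, hashlib.sha256).hexdigest()


class BridgeRing:
    """Bridges ordered by HMAC(N, ID), treated as a circle."""

    def __init__(self, ring_key: bytes):
        self.ring_key = ring_key
        self.positions = []  # sorted list of (position, bridge_id)

    def add(self, bridge_id: bytes):
        position = hmac_hex(self.ring_key, bridge_id)
        bisect.insort(self.positions, (position, bridge_id))

    def after(self, point: str, k: int):
        """Return up to k bridge IDs at or after point, wrapping around."""
        if not self.positions:
            return []
        start = bisect.bisect_right(self.positions, (point,))
        count = min(k, len(self.positions))
        n = len(self.positions)
        return [self.positions[(start + i) % n][1] for i in range(count)]


def bridges_for_email(ring, request_key, period_start, email, k):
    """X = HMAC(M, PS|E); answer with the first k bridges after X."""
    message = str(period_start).encode("ascii") + email.encode("ascii")
    return ring.after(hmac_hex(request_key, message), k)
```

With an unchanged ring, the same address asking twice in one period gets the same answer, which is the property the replay discussion below depends on.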
|
||||
|
||||
[If we want to make sure that repeat queries are given exactly the
|
||||
same results, then we can't let the ring change during the
|
||||
time period. For a long time period like a month, that's quite a
|
||||
hassle. How about instead just keeping a replay cache of addresses
|
||||
that have been answered, and sending them a "sorry, you already got
|
||||
your addresses for the time period; perhaps you should try these
|
||||
other fine distribution strategies while you wait?" response? This
|
||||
approach would also resolve the "Make sure you can't construct a
|
||||
distinct address to match an existing one" note below. -RD]
|
||||
|
||||
[I think, if we get a replay, we need to send back the same
|
||||
answer as we did the first time, not say "try again."
|
||||
Otherwise we need to worry that an attacker can keep people
|
||||
from getting bridges by preemptively asking for them,
|
||||
or that an attacker may force them to prove they haven't
|
||||
gotten any bridges by asking. -NM]
|
||||
|
||||
[While we're at it, if we do the replay cache thing and don't need
|
||||
repeatable answers, we could just pick K random answers from the
|
||||
pool. Is it beneficial that a bridge user who knows about a clump of
|
||||
nodes will be sharing them with other users who know about a similar
|
||||
(overlapping) clump? One good aspect is against an adversary who
|
||||
learns about a clump this way and watches those bridges to learn
|
||||
other users and discover *their* bridges: he doesn't learn about
|
||||
as many new bridges as he might if they were randomly distributed.
|
||||
A drawback is against an adversary who happens to pick two email
|
||||
addresses in P that include overlapping answers: he can measure
|
||||
the difference in clumps and estimate how quickly the bridge pool
|
||||
is growing. -RD]
|
||||
|
||||
[Random is one more darn thing to implement; rings are already
|
||||
there. -NM]
|
||||
|
||||
[If we make the period P be mailbox-specific, and make it a random
|
||||
value around some mean, then we make it harder for an attacker to
|
||||
know when to try using his small army of gmail addresses to gather
|
||||
another harvest. But we also make it harder for users to know when
|
||||
they can try again. -RD]
|
||||
|
||||
[Letting the users know about when they can try again seems
|
||||
worthwhile. Otherwise users and attackers will all probe and
|
||||
probe and probe until they get an answer. No additional
|
||||
security will be achieved, but bandwidth will be lost. -NM]
|
||||
|
||||
To normalize an email address:
|
||||
Start with the RFC822 address. Consider only the mailbox {???}
|
||||
portion of the address (username@domain). Put this into lowercase
|
||||
ascii.
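A minimal sketch of that normalization, using the standard library's address parser. It rejects non-ASCII mailboxes outright, which is one possible answer to the encoding question below rather than a decided policy; the function name is made up here.

```python
from email.utils import parseaddr


def normalize_email(header_value: str) -> str:
    """Reduce an RFC822 From value to a lowercase username@domain string."""
    _display_name, mailbox = parseaddr(header_value)
    if mailbox.count("@") != 1:
        raise ValueError("no usable mailbox in %r" % header_value)
    # Raises UnicodeEncodeError (a ValueError) for non-ASCII mailboxes.
    mailbox.encode("ascii")
    return mailbox.lower()
```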
|
||||
|
||||
Questions:
|
||||
What to do with weird character encodings? Look up the RFC.
|
||||
|
||||
Notes:
|
||||
Make sure that you can't force a single email address to appear
|
||||
in lots of different ways. IOW, if nickm@freehaven.net and
|
||||
NICKM@freehaven.net aren't treated the same, then I can get lots
|
||||
more bridges than I should.
|
||||
|
||||
Make sure you can't construct a distinct address to match an
|
||||
existing one. IOW, if we treat nickm@X and nickm@Y as the same
|
||||
user, then anybody can register nickm@Z and use it to tell which
|
||||
bridges nickm@X got (or would get).
|
||||
|
||||
Make sure that we actually check headers so we can't be trivially
|
||||
used to spam people.
|
||||
|
||||
|
||||
2. IP-based.
|
||||
|
||||
Goal: avoid handing out all the bridges to users in a similar IP
|
||||
space and time.
|
||||
|
||||
Parameters:
|
||||
|
||||
T_Flush -- how long it should take a user on a single network to
|
||||
see a whole cluster of bridges.
|
||||
|
||||
N_C -- the number of clusters that areas are grouped into.
|
||||
|
||||
K -- the number of bridges we hand out in response to a single
|
||||
request.
|
||||
|
||||
Setup: using an AS map or a geoip map or some other flawed input
|
||||
source, divide IP space into "areas" such that surveying a large
|
||||
collection of "areas" is hard. For v0, use /24 address blocks.
|
||||
|
||||
Group areas into N_C clusters.
|
||||
|
||||
Generate secrets L, M, N.
|
||||
|
||||
Set the period P such that P*(bridges-per-cluster/K) = T_Flush.
|
||||
Don't set P to greater than a week, or less than three hours.
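Solving the relation above for P gives P = T_Flush * K / bridges-per-cluster, clamped to the stated bounds. A small sketch, with all times in seconds and a made-up function name:

```python
THREE_HOURS = 3 * 60 * 60
ONE_WEEK = 7 * 24 * 60 * 60


def period_seconds(t_flush, bridges_per_cluster, k):
    """Solve P * (bridges_per_cluster / k) = t_flush for P, then clamp."""
    if bridges_per_cluster <= 0 or k <= 0:
        raise ValueError("bridges_per_cluster and k must be positive")
    p = t_flush * k / bridges_per_cluster
    # Never longer than a week, never shorter than three hours.
    return min(max(p, THREE_HOURS), ONE_WEEK)
```

For example, with T_Flush of 30 days, 90 bridges per cluster and K = 3, a client on one network sees 3 new bridges per period and needs 30 periods to see the cluster, so P works out to one day.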
|
||||
|
||||
When we get a bridge:
|
||||
|
||||
Based on HMAC(L,ID), assign the bridge to a cluster. Within each
|
||||
cluster, keep the bridges in a ring based on HMAC(M,ID).
|
||||
|
||||
[Should we re-sort the rings for each new time period, so the ring
|
||||
for a given cluster is based on HMAC(M,PS|ID)? -RD]
|
||||
|
||||
When we get a connection:
|
||||
|
||||
If it's http, redirect it to https.
|
||||
|
||||
Let area be the incoming IP network. Let PS be the current
|
||||
period. Compute X = HMAC(N, PS|area). Return the next K bridges
|
||||
in the ring after X.
|
||||
|
||||
[Don't we want to compute C = HMAC(key, area) to learn what cluster
|
||||
to answer from, and then X = HMAC(key, PS|area) to pick a point in
|
||||
that ring? -RD]
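The pieces discussed above might be sketched as follows, following the bracketed suggestion of picking a cluster first and a ring point second. This assumes IPv4, the v0 /24 areas, HMAC-SHA256, and reducing the HMAC output modulo the cluster count; the last is only one way to partition.

```python
import hashlib
import hmac
import ipaddress


def area_of(ip_address: str) -> str:
    """Map an IPv4 address to its /24 block, e.g. '192.0.2.0/24'."""
    return str(ipaddress.ip_network(ip_address + "/24", strict=False))


def cluster_index(key: bytes, value: bytes, n_clusters: int) -> int:
    """Partition by HMAC: used for HMAC(L, ID) and for HMAC(key, area)."""
    digest = hmac.new(key, value, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % n_clusters


def ring_point(key: bytes, period_start: int, area: str) -> str:
    """X = HMAC(N, PS|area): where in the cluster's ring to start."""
    message = str(period_start).encode("ascii") + area.encode("ascii")
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```

Two clients in the same /24 during the same period land on the same cluster and the same ring point, so they are handed the same K bridges.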
|
||||
|
||||
|
||||
Need to clarify that some HMACs are for rings, and some are for
|
||||
partitions. How rings scale is clear. How do we grow the number of
|
||||
partitions? Looking at successive bits from the HMAC output is one way.
|
||||
|
||||
3. Open issues
|
||||
|
||||
Denial of service attacks
|
||||
A good view of network topology
|
||||
|
||||
at some point we should learn some reliability stats on our bridges. when
|
||||
we say above 'give out k bridges', we might give out 2 reliable ones and
|
||||
k-2 others. we count around the ring the same way we do now, to find them.
|
||||
|
@ -1,44 +0,0 @@
|
||||
Author: Geoff Goodell
|
||||
Title: Allow controller to manage circuit extensions
|
||||
Date: 12 March 2006
|
||||
|
||||
History:
|
||||
|
||||
This was once bug 268. Moving it into the proposal system for posterity.
|
||||
|
||||
Test:
|
||||
|
||||
Tor controllers should have a means of learning more about circuits built
|
||||
through Tor routers. Specifically, if a Tor controller is connected to a Tor
|
||||
router, it should be able to subscribe to a new class of events, perhaps
|
||||
"onion" or "router" events. A Tor router SHOULD then ensure that the
|
||||
controller is informed:
|
||||
|
||||
(a) (NEW) when it receives a connection from some other location, in which
|
||||
case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a
|
||||
ServerID in the event of an OR connection from another Tor router, and
|
||||
Hostname otherwise.
|
||||
|
||||
(b) (REQUEST) when it receives a request to extend an existing circuit to a
|
||||
successive Tor router, in which case it SHOULD provide (1) the unique
|
||||
identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the
|
||||
previous Tor router in the circuit, and (3) a ServerID for the requested
|
||||
successive Tor router in the circuit;
|
||||
|
||||
(c) (EXTEND) Tor will attempt to extend the circuit to some other router, in
|
||||
which case it SHOULD provide the same fields as provided for REQUEST.
|
||||
|
||||
(d) (SUCCEEDED) The circuit has been successfully extended to some other
|
||||
router, in which case it SHOULD provide the same fields as provided for
|
||||
REQUEST.
|
||||
|
||||
We also need a new configuration option analogous to _leavestreamsunattached,
|
||||
specifying whether the controller is to manage circuit extensions or not.
|
||||
Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor
|
||||
manages everything as usual. When set to 1, a circuit received by the Tor
|
||||
router cannot transition from "REQUEST" to "EXTEND" state without being
|
||||
directed by a new controller command. The controller command probably does
|
||||
not need any arguments, since circuits are extended per client source
|
||||
routing, and all that the controller does is accept or reject the extension.
|
||||
|
||||
This feature can be used as a basis for enforcing routing policy.
|
@ -1,44 +0,0 @@
|
||||
1. Scanning process
|
||||
A. Non-HTML/JS HTTP mime types compared via SHA1 hash
|
||||
B. Dynamic HTTP content filtered at 4 levels:
|
||||
1. IP change+Tor cookie utilization
|
||||
- Tor cookies replayed with new IP in case of changes
|
||||
2. HTML Tag+Attribute+JS comparison
|
||||
- Comparisons made based only on "relevant" HTML tags
|
||||
and attributes
|
||||
3. HTML Tag+Attribute+JS diffing
|
||||
- Tags, attributes and JS AST nodes that change during
|
||||
Non-Tor fetches pruned from comparison
|
||||
4. URLS with > N% of node failures removed
|
||||
- results purged from filesystem at end of scan loop
|
||||
C. SSL scanning handles some forms of dynamic certs
|
||||
1. Catalogs certs for all IPs resolved locally
|
||||
by getaddrinfo over the duration of the scan.
|
||||
- Updated each test.
|
||||
2. If the domain presents a new cert for each IP, this
|
||||
is noted on the failure result for the node
|
||||
3. If the same IP presents two different certs locally,
|
||||
the cert list is first refreshed, and if it happens
|
||||
again, discarded
|
||||
4. A N% node failure filter also applies
|
||||
D. Scanner can be restarted from any point in the event
|
||||
of scanner or system crashes, or graceful shutdown.
|
||||
- Results+scan state pickled to filesystem continuously
|
||||
2. Cron job checks results periodically for reporting
|
||||
A. Divide failures into three types of BadExit based on type
|
||||
and frequency over time and incident rate
|
||||
B. write reject lines to approved-routers for those three types:
|
||||
1. ID Hex based (for misconfig/network problems easily fixed)
|
||||
2. IP based (for content modification)
|
||||
3. IP+mask based (for continuous/egregious content modification)
|
||||
C. Emails results to tor-scanners@freehaven.net
|
||||
3. Human Review and Appeal
|
||||
A. ID Hex-based BadExit is meant to be possible to remove easily
|
||||
without needing to beg us.
|
||||
- Should this behavior be encouraged?
|
||||
B. Optionally can reserve IP based badexits for human review
|
||||
1. Results are encapsulated fully on the filesystem and can be
|
||||
reviewed without network access
|
||||
2. Soat has --rescan to rescan failed nodes from a data directory
|
||||
- New set of URLs used
|
||||
|
@ -1,137 +0,0 @@
|
||||
|
||||
|
||||
Abstract
|
||||
|
||||
This document explains how to tell about how many Tor users there
|
||||
are, and how many there are in which country. Statistics are
|
||||
involved.
|
||||
|
||||
Motivation
|
||||
|
||||
There are a few reasons we need to keep track of which countries
|
||||
Tor users (in aggregate) are coming from:
|
||||
|
||||
- Resource allocation. Knowing about underserved countries with
|
||||
lots of users can let us know about where we need to direct
|
||||
translation and outreach efforts.
|
||||
|
||||
- Anticensorship. Sudden drops in usage on a national basis can
|
||||
indicate the arrival of a censorious firewall.
|
||||
|
||||
- Sponsor outreach and self-evaluation. Many people and
|
||||
organizations who are interested in funding The Tor Project's
|
||||
work want to know that we're successfully serving parts of the
|
||||
world they're interested in, and that efforts to expand our
|
||||
userbase are actually succeeding. So do we.
|
||||
|
||||
Goals
|
||||
|
||||
We want to know approximately how many Tor users there are, and which
|
||||
countries they're in, even in the presence of a hypothetical
|
||||
"directory guard" feature. Some uncertainty is okay, but we'd like
|
||||
to be able to put a bound on the uncertainty.
|
||||
|
||||
We need to make sure this information isn't exposed in a way that
|
||||
helps an adversary.
|
||||
|
||||
Methods for current clients:
|
||||
|
||||
Every client downloads network status documents. There are
|
||||
currently three methods (one hypothetical) for clients to get them.
|
||||
- 0.1.2.x clients (and earlier) fetch a v2 networkstatus
|
||||
document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30
|
||||
minutes].
|
||||
|
||||
- 0.2.0.x clients fetch a v3 networkstatus consensus document
|
||||
at a random interval between when their current document is no
|
||||
longer freshest, and when their current document is about to
|
||||
expire.
|
||||
|
||||
[In both of the above cases, clients choose a running
|
||||
directory cache at random with odds roughly proportional to
|
||||
its bandwidth. If they're just starting, they know a XXXX FIXME -NM]
|
||||
|
||||
- In some future version, clients will choose directory caches
|
||||
to serve as their "directory guards" to avoid profiling
|
||||
attacks, similarly to how clients currently start all their
|
||||
circuits at guard nodes.
|
||||
|
||||
We assume that a directory cache can tell which of these three
|
||||
categories a client is in by the format of its status request.
|
||||
|
||||
A directory cache can be made to count distinct client IP
|
||||
addresses that make a certain request of it in a given timeframe,
|
||||
and total requests made to it over that timeframe. For the first
|
||||
two cases, a cache can get a picture of the overall
|
||||
number and countries of users in the network by dividing the IP
|
||||
count by the probability with which they (as a cache) would be
|
||||
chosen. Assuming that our listed bandwidth is such that we expect
|
||||
to be chosen with probability P for any given request, and we've
|
||||
been counting IPs for long enough that we expect the average
|
||||
client to have made N requests, they will have visited us at least
|
||||
once with probability P' = 1-(1-P)^N, and so we divide the IP
|
||||
counts we've seen by P' for our estimate. To estimate total
|
||||
number of clients of a given type, determine how many requests a
|
||||
client of that type will make over that time, and assume we'll
|
||||
have seen P of them.
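The IP-count estimate above can be written out as a short sketch. The function names are illustrative; P is the per-request probability of this cache being chosen and N is the expected number of requests per client over the measurement window.

```python
def probability_seen(p: float, n: float) -> float:
    """P' = 1 - (1 - P)^N: chance a client visited this cache at least once."""
    return 1.0 - (1.0 - p) ** n


def estimate_total_ips(distinct_ips_seen: int, p: float, n: float) -> float:
    """Scale the locally observed distinct-IP count up by P'."""
    p_seen = probability_seen(p, n)
    if p_seen <= 0:
        raise ValueError("cache is never chosen; cannot estimate")
    return distinct_ips_seen / p_seen
```

The same scaling can be applied per country to the distinct-IP counts for that country.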
|
||||
|
||||
Both of these numbers are useful: the IP counts will give the
|
||||
total number of IPs connecting to the network, and the request
|
||||
counts will give the total number of users on the network at any
|
||||
given time.
|
||||
|
||||
Notes:
|
||||
- [Over H hours, the N for V2 clients is 2*H, and the N for V3
|
||||
clients is currently around H/2 or H/3.]
|
||||
|
||||
- (We should only count requests that we actually intend to answer;
|
||||
503 requests shouldn't count.)
|
||||
|
||||
- These measurements should also be taken at a directory
|
||||
authority if possible: their picture of the network is skewed
|
||||
by clients that fetch from them directly. These clients,
|
||||
however, are all the clients that are just bootstrapping
|
||||
(assuming that the fallback-consensus feature isn't yet used
|
||||
much).
|
||||
|
||||
- These measurements also overestimate the V2 download rate if
|
||||
some downloads fail and clients retry them later after backing
|
||||
off.
|
||||
|
||||
Methods for directory guards:
|
||||
|
||||
If directory guards are in use, directory guards get a picture of
|
||||
all those users who chose them as a guard when they were listed
|
||||
as a good choice for a guard, and who are also on the network
|
||||
now. The cleanest data here will come from nodes that were listed
|
||||
as good new-guards choices for a while, and have not been so for a
|
||||
while longer (to study decay rates); nodes that have been listed
|
||||
as good new-guard choices consistently for a long time (to get a
|
||||
sample of the network); and nodes that have been listed as good
|
||||
new-guard choices only recently (to get a sample of new users and
|
||||
users whose guards have died out.)
|
||||
|
||||
Since directory guards are currently unspecified, we'll need to
|
||||
make some guesses about how they'll turn out to work. Here are
|
||||
a couple of approaches that could work.
|
||||
- We could have clients pick completely new directory guards on
|
||||
a rolling basis every two months or so. This would ensure
|
||||
that staying as a guard for a while would be sufficient to
|
||||
see a sample of users. This is potentially advantageous for
|
||||
load-balancing the network as well, though it might lose some
|
||||
of the benefits of directory guards. We need to quantify the
|
||||
impact of this; it might not actually make stuff worse in
|
||||
practice, if most guards don't stay good guards for a month
|
||||
or two.
|
||||
|
||||
- We could try to collect statistics at several directory
|
||||
guards and combine their statistics, but we would need to make
|
||||
sure that for all time, at least one of the directory guards
|
||||
had been recommended as a good choice for new guards. By
|
||||
looking at new-IP rates for guards, we could get an idea of
|
||||
user uptake; by looking at old-IP decay rates, we could get
|
||||
an idea of turnover. This approach would entail significant
|
||||
complexity, and we'd probably need to record more information
|
||||
than we'd really like to.
|
||||
|
||||
|
@ -1,97 +0,0 @@
|
||||
|
||||
Right now as I understand it, there are four big scaling problems heading
|
||||
our way:
|
||||
|
||||
1) Clients need to learn all the relay descriptors they could use. That's
|
||||
a lot of bytes through a potentially small pipe.
|
||||
2) Relays need to hold open TCP connections to most other relays.
|
||||
3) Clients need to learn the whole networkstatus. Even using v3, as
|
||||
the network grows that will become unwieldy.
|
||||
4) Dir mirrors need to mirror all the relay descriptors; eventually this
|
||||
will get big too.
|
||||
|
||||
Here's my plan.
|
||||
|
||||
--------------------------------------------------------------------
|
||||
|
||||
Piece one: download O(1) descriptors rather than O(n) descriptors.
|
||||
|
||||
We need to change our circuit extend protocol so it fetches a relay
|
||||
descriptor at every 'extend' operation:
|
||||
- Client fetches networkstatus, picks guards, connects to one.
|
||||
- Client picks middle hop out of networkstatus, asks guard for
|
||||
its descriptor, then extends to it.
|
||||
- Client picks exit hop out of networkstatus, asks middle hop
|
||||
for its descriptor, then extends to it. Done.
|
||||
|
||||
The client needs to ask for the descriptor even if it already has a
|
||||
copy, because otherwise we leak too much. Also, the descriptor needs to
|
||||
be padded to some large (but not too large) size to prevent the middle
|
||||
hops from guessing about it.
|
||||
|
||||
The first step towards this is to instrument the current code to see
|
||||
how much of a win this would actually be -- I am guessing it is already
|
||||
a win even with the current number of descriptors.
|
||||
|
||||
We also would need to assign the 'Exit' flag more usefully, and make
|
||||
clients pay attention to it when picking their last hop, since they
|
||||
don't actually know the exit policies of the relays they're choosing from.
|
||||
|
||||
We also need to think harder about other implications -- for example,
|
||||
a relay with a tiny exit policy won't get the Exit flag, and thus won't
|
||||
ever get picked as an exit relay. Plus, our "enclave exit" model is out
|
||||
the window unless we figure out a cool trick.
|
||||
|
||||
More generally, we'll probably want to compress the descriptors that we
|
||||
send back; maybe 8k is a good upper bound? I wonder if we could ask for
|
||||
several descriptors, and bundle back all of the ones that fit in the 8k?
|
||||
|
||||
We'd also want to put the load balancing weights into the networkstatus,
|
||||
so clients can choose fast nodes more often without needing to see the
|
||||
descriptors. This is a good opportunity for the authorities to be able
|
||||
to put "more accurate" weights in if they learn to detect attacks. It
|
||||
also means we should consider running automated audits to make sure the
|
||||
authorities aren't trying to snooker everybody.
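As a rough illustration of choosing fast nodes more often from weights carried in the networkstatus, here is a minimal sketch. The relay names, the weight values, and the function name pick_relay are made up for this example; the text above does not define a weight format.

```python
import random

# Hypothetical consensus entries: (nickname, load-balancing weight).
# Names and numbers are placeholders, not real relays.
RELAYS = [("relayA", 500), ("relayB", 100), ("relayC", 25)]


def pick_relay(relays, rng=random):
    """Pick one relay with probability proportional to its weight."""
    names = [name for name, _ in relays]
    weights = [weight for _, weight in relays]
    # random.choices performs weighted sampling with replacement.
    return rng.choices(names, weights=weights, k=1)[0]
```

Because the client only reads weights from the networkstatus, it never needs the full descriptors to bias its choice; the cost is that it must trust the authorities' numbers, which is why the audits mentioned above matter.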
|
||||
|
||||
I'm aiming to get Peter Palfrader to tackle this problem in mid 2008,
|
||||
but I bet he could use some help.
|
||||
|
||||
--------------------------------------------------------------------
|
||||
|
||||
Piece two: inter-relay communication uses UDP
|
||||
|
||||
If relays send packets to/from other relays via UDP, they don't need a
|
||||
new file descriptor for each such link. Thus we'll still need to keep state
|
||||
for each link, but we won't max out on sockets.
|
||||
|
||||
Clearly a lot more work needs to be done here. Ian Goldberg has a student
|
||||
who has been working on it, and if all goes well we'll be chipping in
|
||||
some funding to continue that. Also, Camilo Viecco has been doing his
|
||||
PhD thesis on it.
|
||||
|
||||
--------------------------------------------------------------------
|
||||
|
||||
Piece three: networkstatus documents get partitioned
|
||||
|
||||
While the authorities should be expected to be able to handle learning
|
||||
about all the relays, there's no reason the clients or the mirrors need
|
||||
to. Authorities should put a cap on the number of relays listed in a
|
||||
single networkstatus, and split them when they get too big.
|
||||
|
||||
We'd need a good way to have each authority come to the same conclusion
|
||||
about which partition a given relay goes into.
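One way every authority could reach the same conclusion is to derive the partition from data they all already share. The sketch below assumes a purely hypothetical rule (hash the relay's identity fingerprint and reduce it modulo the partition count); the function name and the choice of SHA-256 are illustrative only.

```python
import hashlib


def partition_for(fingerprint_hex, num_partitions):
    """Map a relay identity fingerprint to a partition index.

    Hypothetical rule: hash the raw fingerprint bytes and reduce the
    result modulo the partition count, so the answer depends only on
    public data that all authorities see identically.
    """
    digest = hashlib.sha256(bytes.fromhex(fingerprint_hex)).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

A rule this simple lets a relay pick its own partition by generating keys until the hash lands where it wants, so a real design would probably need something stronger.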
|
||||
|
||||
Directory mirrors would then mirror all the relay descriptors in their
|
||||
partition. This is compatible with 'piece one' above, since clients in
|
||||
a given partition will only ask about descriptors in that partition.
|
||||
|
||||
More complex versions of this design would involve overlapping partitions,
|
||||
but that would seem to start contradicting other parts of this proposal
|
||||
right quick.
|
||||
|
||||
Nobody is working on this piece yet. It's hard to say when we'll need
|
||||
it, but it would be nice to have some more thought on it before the week
|
||||
that we need it.
|
||||
|
||||
--------------------------------------------------------------------
|
||||
|
@ -1,39 +0,0 @@
|
||||
Filename: xxx-hide-platform.txt
|
||||
Title: Hide Tor Platform Information
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Jacob Appelbaum
|
||||
Created: 24-July-2008
|
||||
Status: Draft
|
||||
|
||||
|
||||
Hiding Tor Platform Information
|
||||
|
||||
0.0 Introduction
|
||||
|
||||
The current Tor program publishes its specific Tor version and related OS
|
||||
platform information. This information could be misused by an attacker.
|
||||
|
||||
0.1 Current Implementation
|
||||
|
||||
Currently, the Tor binary sends data that looks like the following:
|
||||
|
||||
Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh
|
||||
Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services,
|
||||
single user}
|
||||
|
||||
1.0 Suggested changes
|
||||
|
||||
It would be useful to allow a user to configure the disclosure of such
|
||||
information. Such a change would be an option in the torrc file like so:
|
||||
|
||||
HidePlatform Yes
|
||||
|
||||
1.1 Suggested default behavior in the future
|
||||
|
||||
If a user would like to disclose this information, they could configure their
|
||||
Tor to do so.
|
||||
|
||||
HidePlatform No
|
||||
|
||||
|
@ -1,93 +0,0 @@
|
||||
Filename: xxx-port-knocking.txt
|
||||
Title: Port knocking for bridge scanning resistance
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Jacob Appelbaum
|
||||
Created: 19-April-2009
|
||||
Status: Draft
|
||||
|
||||
Port knocking for bridge scanning resistance
|
||||
|
||||
0.0 Introduction
|
||||
|
||||
This document is a collection of ideas relating to improving scanning
|
||||
resistance for private bridge relays. This is intended to stop opportunistic
|
||||
network scanning and subsequent discovery of private bridge relays.
|
||||
|
||||
|
||||
0.1 Current Implementation
|
||||
|
||||
Currently private bridges are only hidden by their obscurity. If you know
|
||||
a bridge's IP address, the bridge can be detected trivially and added to a block
|
||||
list.
|
||||
|
||||
0.2 Configuring an external port knocking program to control the firewall
|
||||
|
||||
It is currently possible for bridge operators to configure a port knocking
|
||||
daemon that controls access to the incoming OR port. This is currently out of
|
||||
scope for Tor and Tor configuration. This process requires the firewall to know
|
||||
the current nodes in the Tor network.
|
||||
|
||||
1.0 Suggested changes
|
||||
|
||||
Private bridge operators should be able to configure a method of hiding their
|
||||
relay. Only authorized users should be able to communicate with the private
|
||||
bridge. This should be done with Tor and if possible without the help of the
|
||||
firewall. It should be possible for a Tor user to enter a secret key into
|
||||
Tor or optionally Vidalia on a per bridge basis. This secret key should be
|
||||
used to authenticate the bridge user to the private bridge.
|
||||
|
||||
1.x Issues with low ports and bind() for ORPort
|
||||
|
||||
Tor opens low numbered ports during startup and then drops privileges. It is
|
||||
no longer possible to rebind to those lower ports after they are closed.
|
||||
|
||||
1.x Issues with OS level packet filtering
|
||||
|
||||
Tor does not know about any OS level packet filtering. Currently there are no
packet filters that understand the Tor network in real time.
|
||||
|
||||
1.x Possible partitioning of users by bridge operator
|
||||
|
||||
Depending on implementation, it may be possible for bridge operators to
|
||||
uniquely identify users. This appears to be a general bridge issue when a
|
||||
bridge operator uniquely deploys bridges per user.
|
||||
|
||||
2.0 Implementation ideas
|
||||
|
||||
This is a suggested set of methods for port knocking.
|
||||
|
||||
2.x Using SPA port knocking
|
||||
|
||||
Single Packet Authentication port knocking encodes all required data into a
|
||||
single UDP packet. Improperly formatted packets may be simply discarded.
|
||||
Properly formatted packets should be processed and appropriate actions taken.
|
||||
|
||||
2.x Using DNS as a transport for SPA
|
||||
|
||||
It should be possible for Tor to bind to port 53 at startup and merely drop all
|
||||
packets that are not valid. UDP does not require a response and invalid packets
|
||||
will not trigger a response from Tor. With base32 encoding it should be
|
||||
possible to encode SPA as valid DNS requests. This should allow use of the
|
||||
public DNS infrastructure for authorization requests if desired.
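To make the base32 idea concrete, the sketch below packs an opaque SPA payload into DNS labels and back. The payload layout, the suffix "knock.example", and both function names are placeholders; this proposal does not define a packet format or any crypto.

```python
import base64

MAX_LABEL = 63  # DNS limits each label to 63 octets


def spa_to_dns_name(payload, suffix="knock.example"):
    """Encode an opaque SPA payload as base32 labels under a suffix."""
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    return ".".join(labels + [suffix])


def dns_name_to_spa(name, suffix="knock.example"):
    """Invert spa_to_dns_name, restoring the stripped base32 padding."""
    text = name[: -len(suffix) - 1].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # base32 input must be a multiple of 8
    return base64.b32decode(text)
```

A full DNS name is also limited to roughly 253 characters, so the payload has to stay small; that limit is not enforced here.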
|
||||
|
||||
2.x Ghetto firewalling with opportunistic connection closing
|
||||
|
||||
Until a user has authenticated with Tor, Tor only has a UDP listener. This
|
||||
listener should never send data in response; it should only open an ORPort
|
||||
when a user has successfully authenticated. After a user has authenticated
|
||||
with Tor to open an ORPort, only users who have authenticated will be able
|
||||
to use it. All other users, as identified by their IP address, will have their
|
||||
connection closed before any data is sent or received. This should be
|
||||
accomplished with an access policy. By default, the access policy should block
|
||||
all access to the ORPort.
|
||||
|
||||
2.x Timing and reset of access policies
|
||||
|
||||
Access to the ORPort is sensitive. The bridge should remove any exceptions
|
||||
to its access policy regularly when the ORPort is unused. Valid users should
|
||||
reauthenticate if they do not use the ORPort within a given time frame.
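The default-deny policy and its timed reset could look roughly like the sketch below. The class name, the 600-second idle timeout, and the choice to let ORPort use refresh the exception are all assumptions made for illustration; the proposal fixes none of them.

```python
import time


class KnockPolicy:
    """Default-deny ORPort policy with expiring per-IP exceptions."""

    def __init__(self, idle_timeout=600.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # seconds; the value is a guess
        self.clock = clock
        self.last_seen = {}  # ip -> time of last knock or ORPort use

    def authorize(self, ip):
        """Record a successful knock from ip."""
        self.last_seen[ip] = self.clock()

    def allows(self, ip):
        """Return True if ip may use the ORPort right now."""
        seen = self.last_seen.get(ip)
        if seen is None:
            return False  # never authenticated: blocked by default
        if self.clock() - seen > self.idle_timeout:
            del self.last_seen[ip]  # exception expired; knock again
            return False
        self.last_seen[ip] = self.clock()  # use refreshes the exception
        return True
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping.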
|
||||
|
||||
2.x Additional considerations
|
||||
|
||||
There are many. A format of the packet and the crypto involved is a good start.
|
@ -1,63 +0,0 @@
|
||||
|
||||
1. Overview
|
||||
|
||||
We should rate limit the volume of stream creations at exits:
|
||||
|
||||
2.1. Per-circuit limits
|
||||
|
||||
If a given circuit opens more than N streams in X seconds, further
|
||||
stream requests over the next Y seconds should fail with the reason
|
||||
'resourcelimit'. Clients will automatically notice this and switch to
|
||||
a new circuit.
|
||||
|
||||
The goal is to limit the effects of port scans on a given exit relay,
|
||||
so the relay's ISP won't get hassled as much.
|
||||
|
||||
First thoughts for parameters would be N=100 streams in X=5 seconds
|
||||
causes 30 seconds of fails; and N=300 streams in X=30 seconds causes
|
||||
30 seconds of fails.
|
||||
|
||||
We could simplify by, instead of having a "for 30 seconds" parameter,
|
||||
just marking the circuit as forever failing new requests. (We don't want
|
||||
to just close the circuit because it may still have open streams on it.)
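The per-circuit rule above (more than N streams in X seconds leads to Y seconds of failures) can be sketched as a sliding-window counter. The class and method names are invented for this example, and the defaults simply reuse the first set of parameters suggested above.

```python
from collections import deque


class CircuitStreamLimiter:
    """Track stream opens on one circuit and refuse bursts."""

    def __init__(self, max_streams=100, window=5.0, penalty=30.0):
        self.max_streams = max_streams  # N
        self.window = window            # X seconds
        self.penalty = penalty          # Y seconds of forced failures
        self.opened = deque()           # timestamps of recent stream opens
        self.blocked_until = float("-inf")

    def allow_stream(self, now):
        """Return True to open the stream, False to fail it with
        the 'resourcelimit' reason."""
        if now < self.blocked_until:
            return False
        # Forget opens that have fallen out of the window.
        while self.opened and now - self.opened[0] >= self.window:
            self.opened.popleft()
        if len(self.opened) >= self.max_streams:
            self.blocked_until = now + self.penalty
            return False
        self.opened.append(now)
        return True
```

Two limiters with different parameters could be consulted together to cover both suggested windows. The "forever failing" simplification corresponds to setting the penalty to infinity.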
|
||||
|
||||
2.2. Per-destination limits
|
||||
|
||||
If a given circuit opens more than N1 streams in X seconds to a single
|
||||
IP address, or all the circuits combined open more than N2 streams,
|
||||
then we should fail further attempts to reach that address for a while.
|
||||
|
||||
The goal is to limit the abuse that Tor exit relays can dish out
|
||||
to a single target either for socket DoS or for web crawling, in
|
||||
the hopes of a) not triggering their automated defenses, and b) not
|
||||
making them upset at Tor. Hopefully these self-imposed bans will be
|
||||
much shorter-lived than bans or barriers put up by the websites.
|
||||
|
||||
3. Issues
|
||||
|
||||
3.1. Circuit-creation overload
|
||||
|
||||
Making clients move to new circuits more often will cause more circuit
|
||||
creation requests.
|
||||
|
||||
3.2. How to pick the parameters?
|
||||
|
||||
If we pick the numbers too low, then popular sites are effectively
|
||||
cut out of Tor. If we pick them too high, we don't do much good.
|
||||
|
||||
Worse, picking them wrong isn't easy to fix, since the deployed Tor
|
||||
servers will ship with a certain set of numbers.
|
||||
|
||||
We could put numbers (or "general settings") in the networkstatus
|
||||
consensus, and Tor exits would adapt more dynamically.
|
||||
|
||||
We could also have a local config option about how aggressive this
|
||||
server should be with its parameters.
|
||||
|
||||
4. Client-side limitations
|
||||
|
||||
Perhaps the clients should have built-in rate limits too, so they avoid
|
||||
harassing the servers by default?
|
||||
|
||||
Tricky if we want to get Tor clients in use at large enclaves.
|
||||
|
@ -1,61 +0,0 @@
|
||||
Filename: xxx-separate-streams-by-port.txt
|
||||
Title: Separate streams across circuits by destination port
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Robert Hogan
|
||||
Created: 21-Oct-2008
|
||||
Status: Draft
|
||||
|
||||
Here's a patch Robert Hogan wrote to use only one destination port per
|
||||
circuit. It's based on a wishlist item Roger wrote, to never send AIM
|
||||
usernames over the same circuit that we're hoping to browse anonymously
|
||||
through. The remaining open question is: how many extra circuits does this
|
||||
cause an ordinary user to create? My guess is not very many, but I'm wary
|
||||
of putting this in until we have some better estimate. On the other hand,
|
||||
not putting it in means that we have a known security flaw. Hm.
|
||||
|
||||
Index: src/or/or.h
===================================================================
--- src/or/or.h (revision 17143)
+++ src/or/or.h (working copy)
@@ -1874,6 +1874,7 @@

   uint8_t state; /**< Current status of this circuit. */
   uint8_t purpose; /**< Why are we creating this circuit? */
+  uint16_t service; /**< Port conn must have to use this circuit. */

   /** How many relay data cells can we package (read from edge streams)
    * on this circuit before we receive a circuit-level sendme cell asking
Index: src/or/circuituse.c
===================================================================
--- src/or/circuituse.c (revision 17143)
+++ src/or/circuituse.c (working copy)
@@ -62,10 +62,16 @@
       return 0;
   }

-  if (purpose == CIRCUIT_PURPOSE_C_GENERAL)
+  if (purpose == CIRCUIT_PURPOSE_C_GENERAL) {
     if (circ->timestamp_dirty &&
        circ->timestamp_dirty+get_options()->MaxCircuitDirtiness <= now)
       return 0;
+    /* If the circuit is dirty and used for services on another port,
+       then it is not suitable. */
+    if (circ->service && conn->socks_request->port &&
+       (circ->service != conn->socks_request->port))
+      return 0;
+  }

   /* decide if this circ is suitable for this conn */

@@ -1351,7 +1357,9 @@
     if (connection_ap_handshake_send_resolve(conn) < 0)
       return -1;
   }
-
+  if (conn->socks_request->port
+      && (TO_CIRCUIT(circ)->purpose == CIRCUIT_PURPOSE_C_GENERAL))
+      TO_CIRCUIT(circ)->service = conn->socks_request->port;
   return 1;
 }
|
||||
|
@ -1,140 +0,0 @@
|
||||
Filename: xxx-what-uses-sha1.txt
|
||||
Title: Where does Tor use SHA-1 today?
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Mathewson
|
||||
Created: 30-Dec-2008
|
||||
Status: Meta
|
||||
|
||||
|
||||
Introduction:
|
||||
|
||||
Tor uses SHA-1 as a message digest. SHA-1 is showing its age:
|
||||
theoretical attacks for finding collisions against it get better
|
||||
every year or two, and it will likely be broken in practice before
|
||||
too long.
|
||||
|
||||
According to smart crypto people, the SHA-2 functions (SHA-256, etc)
|
||||
share too much of SHA-1's structure to be very good. Some people
|
||||
like other hash functions; most of these have not seen enough
|
||||
analysis to be widely regarded as an extra-good idea.
|
||||
|
||||
By 2012, the NIST SHA-3 competition will be done, and with luck we'll
|
||||
have something good to switch to. But it's probably a bad idea to
|
||||
wait until 2012 to figure out _how_ to migrate to a new hash
|
||||
function, for two reasons:
|
||||
1) It's not inconceivable we'll want to migrate in a hurry
|
||||
some time before then.
|
||||
2) It's likely that migrating to a new hash function will
|
||||
require protocol changes, and it's easiest to make protocol
|
||||
changes backward compatible if we lay the groundwork in
|
||||
advance. It would suck to have to break compatibility with
|
||||
a big hard-to-test "flag day" protocol change.
|
||||
|
||||
This document attempts to list everything Tor uses SHA-1 for today.
|
||||
This is the first step in getting all the design work done to switch
|
||||
to something else.
|
||||
|
||||
This document SHOULD NOT be a clearinghouse of what to do about our
|
||||
use of SHA-1. That's better left for other individual proposals.
|
||||
|
||||
|
||||
Why now?
|
||||
|
||||
The recent publication of "MD5 considered harmful today: Creating a
|
||||
rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob
|
||||
Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de
|
||||
Weger has reminded me that:
|
||||
|
||||
* You can't rely on theoretical attacks to stay theoretical.
|
||||
* It's quite unpleasant when theoretical attacks become practical
|
||||
and public on days you were planning to leave for vacation.
|
||||
* Broken hash functions (which SHA-1 is not quite yet AFAIU)
|
||||
should be dropped like hot potatoes. Failure to do so can make
|
||||
one look silly.
|
||||
|
||||
|
||||
|
||||
What Tor uses hashes for today:
|
||||
|
||||
1. Infrastructure.
|
||||
|
||||
A. Our X.509 certificates are signed with SHA-1.
|
||||
B. TLS uses SHA-1 (and MD5) internally to generate keys.
|
||||
C. Some of the TLS ciphersuites we allow use SHA-1.
|
||||
D. When we sign our code with GPG, it might be using SHA-1.
|
||||
E. Our GPG keys might be authenticated with SHA-1.
|
||||
F. OpenSSL's random number generator uses SHA-1, I believe.
|
||||
|
||||
2. The Tor protocol
|
||||
|
||||
A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
|
||||
B. Our CREATE cell format uses SHA-1 for: OAEP padding.
|
||||
C. Our EXTEND cells use SHA-1 to hash the identity key of the
|
||||
target server.
|
||||
D. Our CREATED cells use SHA-1 to hash the derived key data.
|
||||
E. The data we use in CREATE_FAST cells to generate a key is the
|
||||
length of a SHA-1.
|
||||
F. The data we send back in a CREATED/CREATED_FAST cell is the length
|
||||
of a SHA-1.
|
||||
G. We use SHA-1 to derive our circuit keys from the negotiated g^xy value.
|
||||
H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
|
||||
used more as a checksum than as a strong digest.
|
||||
|
||||
3. Directory services
|
||||
|
||||
A. All signatures are generated on the SHA-1 of their corresponding
|
||||
documents, using PKCS1 padding.
|
||||
B. Router descriptors identify their corresponding extra-info documents
|
||||
by their SHA-1 digest.
|
||||
C. Fingerprints in router descriptors are taken using SHA-1.
|
||||
D. Fingerprints in authority certs are taken using SHA-1.
|
||||
E. Fingerprints in dir-source lines of votes and consensuses are taken
|
||||
using SHA-1.
|
||||
F. Networkstatuses refer to routers' identity keys and descriptors by their
|
||||
SHA-1 digests.
|
||||
G. Directory-signature lines identify which key is doing the signing by
|
||||
the SHA-1 digests of the authority's signing key and its identity key.
|
||||
H. The following items are downloaded by the SHA-1 of their contents:
|
||||
XXXX list them
|
||||
I. The following items are downloaded by the SHA-1 of an identity key:
|
||||
XXXX list them too.
|
||||
|
||||
4. The rendezvous protocol
|
||||
|
||||
A. Hidden servers use SHA-1 to establish introduction points on relays,
|
||||
and relays use SHA-1 to check incoming introduction point
|
||||
establishment requests.
|
||||
B. Hidden servers use SHA-1 in multiple places when generating hidden
|
||||
service descriptors.
|
||||
C. Hidden servers performing basic-type client authorization for their
|
||||
services use SHA-1 when encrypting introduction points contained in
|
||||
hidden service descriptors.
|
||||
D. Hidden service directories use SHA-1 to check whether a given hidden
|
||||
service descriptor may be published under a given descriptor
|
||||
identifier or not.
|
||||
E. Hidden servers use SHA-1 to derive .onion addresses of their
|
||||
services.
|
||||
F. Clients use SHA-1 to generate the current hidden service descriptor
|
||||
identifiers for a given .onion address.
|
||||
G. Hidden servers use SHA-1 to remember digests of the first parts of
|
||||
Diffie-Hellman handshakes contained in introduction requests in order
|
||||
to detect replays.
|
||||
H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
|
||||
a connecting client.
|
||||
|
||||
5. The bridge protocol
|
||||
|
||||
XXXX write me
|
||||
|
||||
6. The Tor user interface
|
||||
|
||||
A. We log information about servers based on SHA-1 hashes of their
|
||||
identity keys.
|
||||
B. The controller identifies servers based on SHA-1 hashes of their
|
||||
identity keys.
|
||||
C. Nearly all of our configuration options that list servers allow SHA-1
|
||||
hashes of their identity keys.
|
||||
D. The deprecated .exit notation uses SHA-1 hashes of identity keys.
|
||||
|
||||
|
@ -1,117 +0,0 @@
|
||||
#!/usr/bin/python

import re, os
class Error(Exception): pass

STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED
   CLOSED SUPERSEDED DEAD""".split()
REQUIRED_FIELDS = [ "Filename", "Status", "Title" ]
CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ],
                       "ACCEPTED" : [ "Target" ],
                       "CLOSED" : [ "Implemented-In" ],
                       "FINISHED" : [ "Implemented-In" ] }
# Proposal files are named NNN-something; skip editor backups ending in "~".
FNAME_RE = re.compile(r'^(\d\d\d)-.*[^\~]$')
DIR = "."
OUTFILE = "000-index.txt"
TMPFILE = OUTFILE+".tmp"

def indexed(seq):
    n = 0
    for i in seq:
        yield n, i
        n += 1

def readProposal(fn):
    # Parse the "Field: value" header block; it ends at the first blank line.
    fields = { }
    f = open(fn, 'r')
    lastField = None
    try:
        for lineno, line in indexed(f):
            line = line.rstrip()
            if not line:
                return fields
            if line[0].isspace():
                # Indented line: continuation of the previous field.
                fields[lastField] += " %s"%(line.strip())
            else:
                parts = line.split(":", 1)
                if len(parts) != 2:
                    raise Error("%s:%s: Neither field nor continuation"%
                                (fn,lineno))
                else:
                    fields[parts[0]] = parts[1].strip()
                    lastField = parts[0]

        return fields
    finally:
        f.close()

def checkProposal(fn, fields):
    status = fields.get("Status")
    need_fields = REQUIRED_FIELDS + CONDITIONAL_FIELDS.get(status, [])
    for f in need_fields:
        if f not in fields:
            raise Error("%s has no %s field"%(fn, f))
    if fn != fields['Filename']:
        print("%r %r"%(fn, fields['Filename']))
        raise Error("Mismatched Filename field in %s"%fn)
    if fields['Title'][-1] == '.':
        fields['Title'] = fields['Title'][:-1]

    status = fields['Status'] = status.upper()
    if status not in STATUSES:
        raise Error("I've never heard of status %s in %s"%(status,fn))
    if status in [ "SUPERSEDED", "DEAD" ]:
        for f in [ 'Implemented-In', 'Target' ]:
            if f in fields: del fields[f]

def readProposals():
    res = []
    for fn in os.listdir(DIR):
        m = FNAME_RE.match(fn)
        if not m: continue
        if not fn.endswith(".txt"):
            raise Error("%s doesn't end with .txt"%fn)
        num = m.group(1)
        fields = readProposal(fn)
        checkProposal(fn, fields)
        fields['num'] = num
        res.append(fields)
    return res

def writeIndexFile(proposals):
    proposals.sort(key=lambda f:f['num'])
    seenStatuses = set()
    for p in proposals:
        seenStatuses.add(p['Status'])

    # Copy the hand-written preamble of the old index, up to the "=====" rule.
    out = open(TMPFILE, 'w')
    inf = open(OUTFILE, 'r')
    for line in inf:
        out.write(line)
        if line.startswith("====="): break
    inf.close()

    out.write("Proposals by number:\n\n")
    for prop in proposals:
        out.write("%(num)s %(Title)s [%(Status)s]\n"%prop)
    out.write("\n\nProposals by status:\n\n")
    for s in STATUSES:
        if s not in seenStatuses: continue
        out.write(" %s:\n"%s)
        for prop in proposals:
            if s == prop['Status']:
                out.write("   %(num)s %(Title)s"%prop)
                if 'Target' in prop:
                    out.write(" [for %(Target)s]"%prop)
                if 'Implemented-In' in prop:
                    out.write(" [in %(Implemented-In)s]"%prop)
                out.write("\n")
    out.close()
    os.rename(TMPFILE, OUTFILE)

# Remove any stale temporary file left by an earlier failed run.
try:
    os.unlink(TMPFILE)
except OSError:
    pass

writeIndexFile(readProposals())
|
@ -1,768 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor Rendezvous Specification
|
||||
|
||||
0. Overview and preliminaries
|
||||
|
||||
Read
|
||||
https://www.torproject.org/doc/design-paper/tor-design.html#sec:rendezvous
|
||||
before you read this specification. It will make more sense.
|
||||
|
||||
Rendezvous points provide location-hidden services (server
|
||||
anonymity) for the onion routing network. With rendezvous points,
|
||||
Bob can offer a TCP service (say, a webserver) via the onion
|
||||
routing network, without revealing the IP of that service.
|
||||
|
||||
Bob does this by anonymously advertising a public key for his
|
||||
service, along with a list of onion routers to act as "Introduction
|
||||
Points" for his service. He creates forward circuits to those
|
||||
introduction points, and tells them about his public key. To
|
||||
connect to Bob, Alice first builds a circuit to an OR to act as
|
||||
her "Rendezvous Point." She then connects to one of Bob's chosen
|
||||
introduction points, optionally provides authentication or
|
||||
authorization information, and asks it to tell him about her Rendezvous
|
||||
Point (RP). If Bob chooses to answer, he builds a circuit to her
|
||||
RP, and tells it to connect him to Alice. The RP joins their
|
||||
circuits together, and begins relaying cells. Alice's 'BEGIN'
|
||||
cells are received directly by Bob's OP, which passes data to
|
||||
and from the local server implementing Bob's service.
|
||||
|
||||
Below we describe a network-level specification of this service,
|
||||
along with interfaces to make this process transparent to Alice
|
||||
(so long as she is using an OP).
|
||||
|
||||
0.1. Notation, conventions and prerequisites
|
||||
|
||||
In the specifications below, we use the same notation and terminology
|
||||
as in "tor-spec.txt". The service specified here also requires the
|
||||
existence of an onion routing network as specified in that file.
|
||||
|
||||
H(x) is a SHA1 digest of x.
|
||||
PKSign(SK,x) is a PKCS.1-padded RSA signature of x with SK.
|
||||
PKEncrypt(SK,x) is a PKCS.1-padded RSA encryption of x with SK.
|
||||
Public keys are all RSA, and encoded in ASN.1.
|
||||
All integers are stored in network (big-endian) order.
|
||||
All symmetric encryption uses AES in counter mode, except where
|
||||
otherwise noted.
|
||||
|
||||
In all discussions, "Alice" will refer to a user connecting to a
|
||||
location-hidden service, and "Bob" will refer to a user running a
|
||||
location-hidden service.
|
||||
|
||||
An OP is (as defined elsewhere) an "Onion Proxy" or Tor client.
|
||||
|
||||
An OR is (as defined elsewhere) an "Onion Router" or Tor server.
|
||||
|
||||
An "Introduction point" is a Tor server chosen to be Bob's medium-term
|
||||
'meeting place'. A "Rendezvous point" is a Tor server chosen by Alice to
|
||||
be a short-term communication relay between her and Bob. All Tor servers
|
||||
potentially act as introduction and rendezvous points.
|
||||
|
||||
0.2. Protocol outline
|
||||
|
||||
1. Bob->Bob's OP: "Offer IP:Port as
|
||||
public-key-name:Port". [configuration]
|
||||
(We do not specify this step; it is left to the implementor of
|
||||
Bob's OP.)
|
||||
|
||||
2. Bob's OP generates keypair and rendezvous service descriptor:
|
||||
"Meet public-key X at introduction point A, B, or C." (signed)
|
||||
|
||||
3. Bob's OP->Introduction point via Tor: [introduction setup]
|
||||
"This pk is me."
|
||||
|
||||
4. Bob's OP->directory service via Tor: publishes Bob's service
|
||||
descriptor [advertisement]
|
||||
|
||||
5. Out of band, Alice receives a [x.y.]z.onion:port address.
|
||||
She opens a SOCKS connection to her OP, and requests
|
||||
x.y.z.onion:port.
|
||||
|
||||
6. Alice's OP retrieves Bob's descriptor via Tor. [descriptor lookup.]
|
||||
|
||||
7. Alice's OP chooses a rendezvous point, opens a circuit to that
|
||||
rendezvous point, and establishes a rendezvous circuit. [rendezvous
|
||||
setup.]
|
||||
|
||||
8. Alice connects to the Introduction point via Tor, and tells it about
|
||||
her rendezvous point and optional authentication/authorization
|
||||
information. (Encrypted to Bob.) [Introduction 1]
|
||||
|
||||
9. The Introduction point passes this on to Bob's OP via Tor, along the
|
||||
introduction circuit. [Introduction 2]
|
||||
|
||||
10. Bob's OP decides whether to connect to Alice, and if so, creates a
|
||||
circuit to Alice's RP via Tor. Establishes a shared circuit.
|
||||
[Rendezvous.]
|
||||
|
||||
11. Alice's OP sends begin cells to Bob's OP. [Connection]
|
||||
|
||||
0.3. Constants and new cell types
|
||||
|
||||
Relay cell types
|
||||
32 -- RELAY_ESTABLISH_INTRO
|
||||
33 -- RELAY_ESTABLISH_RENDEZVOUS
|
||||
34 -- RELAY_INTRODUCE1
|
||||
35 -- RELAY_INTRODUCE2
|
||||
36 -- RELAY_RENDEZVOUS1
|
||||
37 -- RELAY_RENDEZVOUS2
|
||||
38 -- RELAY_INTRO_ESTABLISHED
|
||||
39 -- RELAY_RENDEZVOUS_ESTABLISHED
|
||||
40 -- RELAY_COMMAND_INTRODUCE_ACK
|
||||
|
||||
0.4. Version overview
|
||||
|
||||
There are several parts in the hidden service protocol that have
|
||||
changed over time, each of them having its own version number, whereas
|
||||
other parts remained the same. The following list of potentially
|
||||
versioned protocol parts should help reduce some confusion:
|
||||
|
||||
- Hidden service descriptor: the binary-based v0 was the default for
|
||||
a long time, and an ascii-based v2 has been added by proposal
|
||||
114. See 1.2.
|
||||
|
||||
- Hidden service descriptor propagation mechanism: currently related to
|
||||
the hidden service descriptor version -- v0 publishes to the original
|
||||
hs directory authorities, whereas v2 publishes to a rotating subset
|
||||
of relays with the "hsdir" flag; see 1.4 and 1.6.
|
||||
|
||||
- Introduction protocol for how to generate an introduction cell:
|
||||
v0 specified a nickname for the rendezvous point and assumed the
|
||||
relay would know about it, whereas v2 now specifies IP address,
|
||||
port, and onion key so the relay doesn't need to already recognize
|
||||
it. See 1.8.
|
||||
|
||||
1. The Protocol
|
||||
|
||||
1.1. Bob configures his local OP.
|
||||
|
||||
We do not specify a format for the OP configuration file. However,
|
||||
OPs SHOULD allow Bob to provide more than one advertised service
|
||||
per OP, and MUST allow Bob to specify one or more virtual ports per
|
||||
service. Bob provides a mapping from each of these virtual ports
|
||||
to a local IP:Port pair.
|
||||
|
||||
1.2. Bob's OP generates service descriptors.
|
||||
|
||||
The first time the OP provides an advertised service, it generates
|
||||
a public/private keypair (stored locally). Periodically, the OP
|
||||
generates and publishes a descriptor of type "V0".
|
||||
|
||||
The "V0" descriptor contains:
|
||||
|
||||
KL Key length [2 octets]
|
||||
PK Bob's public key [KL octets]
|
||||
TS A timestamp [4 octets]
|
||||
NI Number of introduction points [2 octets]
|
||||
Ipt A list of NUL-terminated ORs [variable]
|
||||
SIG Signature of above fields [variable]
|
||||
|
||||
KL is the length of PK, in octets.
|
||||
TS is the number of seconds elapsed since Jan 1, 1970.
|
||||
|
||||
The members of Ipt may be either (a) nicknames, or (b) identity key
|
||||
digests, encoded in hex, and prefixed with a '$'. Clients must
|
||||
accept both forms. Services must only generate the second form.
|
||||
Once 0.0.9.x is obsoleted, we can drop the first form.
|
||||
|
||||
[It's ok for Bob to advertise 0 introduction points. He might want
|
||||
to do that if he previously advertised some introduction points,
|
||||
and now he doesn't have any. -RD]
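
The "V0" layout above can be walked with a few fixed-width reads. The following is a minimal sketch, not Tor's implementation: the function name and the returned dictionary keys are illustrative assumptions, integers are read in network order as elsewhere in these specs, and no signature verification is attempted.

```python
import struct


def parse_v0_descriptor(data: bytes) -> dict:
    """Split a V0 descriptor into its fields (hypothetical helper).

    The signature is returned as-is; verifying it is out of scope here.
    """
    offset = 0
    (key_len,) = struct.unpack_from("!H", data, offset)  # KL, 2 octets
    offset += 2
    public_key = data[offset:offset + key_len]            # PK, KL octets
    if len(public_key) != key_len:
        raise ValueError("truncated public key")
    offset += key_len
    # TS (4 octets) followed by NI (2 octets), both big-endian.
    timestamp, num_intro = struct.unpack_from("!IH", data, offset)
    offset += 6
    intro_points = []
    for _ in range(num_intro):
        end = data.index(b"\x00", offset)  # raises ValueError if no NUL
        intro_points.append(data[offset:end].decode("ascii"))
        offset = end + 1
    return {
        "public_key": public_key,
        "timestamp": timestamp,
        "intro_points": intro_points,
        "signed_part": data[:offset],   # everything covered by SIG
        "signature": data[offset:],     # SIG, variable length
    }
```

A descriptor advertising zero introduction points parses to an empty "intro_points" list, which matches the note above that this is permitted.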
|
||||
|
||||
Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in
|
||||
addition to "V0" descriptors. The format of a "V2" descriptor is as
|
||||
follows:
|
||||
|
||||
"rendezvous-service-descriptor" descriptor-id NL
|
||||
|
||||
[At start, exactly once]
|
||||
|
||||
Indicates the beginning of the descriptor. "descriptor-id" is a
|
||||
periodically changing identifier of 160 bits formatted as 32 base32
|
||||
chars that is calculated by the hidden service and its clients. If
|
||||
the optional "descriptor-cookie" is used, this "descriptor-id"
|
||||
cannot be computed by anyone else. (Everyone can verify that this
|
||||
"descriptor-id" belongs to the rest of the descriptor, even without
|
||||
knowing the optional "descriptor-cookie", as described below.) The
|
||||
"descriptor-id" is calculated by performing the following operation:
|
||||
|
||||
    descriptor-id =
        H(permanent-id | H(time-period | descriptor-cookie | replica))
|
||||
|
||||
"permanent-id" is the permanent identifier of the hidden service,
|
||||
consisting of 80 bits. It can be calculated by computing the hash value
|
||||
of the public hidden service key and truncating after the first 80 bits:
|
||||
|
||||
permanent-id = H(public-key)[:10]
|
||||
|
||||
"H(time-period | descriptor-cookie | replica)" is the (possibly
|
||||
secret) id part that is
|
||||
necessary to verify that the hidden service is the true originator
|
||||
of this descriptor. It can only be created by the hidden service
|
||||
and its clients, but the "signature" below can only be created by
|
||||
the service.
|
||||
|
||||
"descriptor-cookie" is an optional secret password of 128 bits that
|
||||
is shared between the hidden service provider and its clients.
|
||||
|
||||
"replica" denotes the number of the non-consecutive replica.
|
||||
|
||||
(Each descriptor is replicated on a number of _consecutive_ nodes
|
||||
in the identifier ring by making every storing node responsible
|
||||
for the identifier intervals starting from its 3rd predecessor's
|
||||
ID to its own ID. In addition to that, every service publishes
|
||||
multiple descriptors with different descriptor IDs in order to
|
||||
distribute them to different places on the ring. Therefore,
|
||||
"replica" chooses one of the _non-consecutive_ replicas. -KL)
|
||||
|
||||
The "time-period" changes periodically depending on the global time and
|
||||
as a function of "permanent-id". The current value for "time-period" can
|
||||
be calculated using the following formula:
|
||||
|
||||
    time-period = (current-time + permanent-id-byte * 86400 / 256)
                    / 86400
|
||||
|
||||
"current-time" contains the current system time in seconds since
|
||||
1970-01-01 00:00, e.g. 1188241957. "permanent-id-byte" is the first
|
||||
(unsigned) byte of the permanent identifier (which is in network
|
||||
order), e.g. 143. Adding the product of "permanent-id-byte" and
|
||||
86400 (seconds per day), divided by 256, prevents "time-period" from
|
||||
changing for all descriptors at the same time of the day. The result
|
||||
of the overall operation is a (network-ordered) 32-bit integer, e.g.
|
||||
13753 or 0x000035B9 with the example values given above.
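
The example values above can be checked with integer arithmetic. This is a small sketch; the function name is an illustrative assumption, and both divisions are assumed to truncate, which is what reproduces the stated result.

```python
def time_period(current_time: int, permanent_id_byte: int) -> int:
    """Compute "time-period" from the formula above.

    Both divisions are assumed to be integer (truncating) divisions.
    """
    # Per-service offset within the day, derived from the first ID byte.
    offset = permanent_id_byte * 86400 // 256
    return (current_time + offset) // 86400
```

With current-time 1188241957 and permanent-id-byte 143, the offset is 48262 seconds and the function returns 13753 (0x000035B9).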
|
||||
|
||||
"version" version-number NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The version number of this descriptor's format. In this case: 2.
|
||||
|
||||
"permanent-key" NL a public key in PEM format
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The public key of the hidden service which is required to verify the
|
||||
"descriptor-id" and the "signature".
|
||||
|
||||
"secret-id-part" secret-id-part NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The result of the following operation as explained above, formatted as
|
||||
32 base32 chars. Using this secret id part, everyone can verify that
|
||||
the signed descriptor belongs to "descriptor-id".
|
||||
|
||||
secret-id-part = H(time-period | descriptor-cookie | replica)
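
Combining the secret id part with the permanent-id gives the descriptor-id described above. The sketch below makes several assumptions that this text does not spell out: H is SHA-1 (consistent with the 160-bit output), "time-period" is packed as a 4-byte big-endian integer, "replica" is packed as a single byte, the public key is supplied as an already-encoded byte string, and base32 output is lowercased. All function names are illustrative.

```python
import base64
import hashlib
import struct


def permanent_id(public_key_bytes: bytes) -> bytes:
    """permanent-id = H(public-key)[:10], i.e. the first 80 bits."""
    return hashlib.sha1(public_key_bytes).digest()[:10]


def secret_id_part(time_period: int, replica: int,
                   descriptor_cookie: bytes = b"") -> bytes:
    """H(time-period | descriptor-cookie | replica).

    Assumed encodings: 4-byte big-endian time-period, 1-byte replica.
    An empty cookie stands for "no descriptor-cookie in use".
    """
    data = struct.pack("!I", time_period) + descriptor_cookie + bytes([replica])
    return hashlib.sha1(data).digest()


def descriptor_id(perm_id: bytes, secret_part: bytes) -> str:
    """H(permanent-id | secret-id-part), formatted as 32 base32 chars."""
    digest = hashlib.sha1(perm_id + secret_part).digest()
    return base64.b32encode(digest).decode("ascii").lower()
```

Because 160 bits divide evenly into 5-bit groups, the base32 form is exactly 32 characters with no padding. Anyone holding the descriptor can recompute descriptor_id from "permanent-key" and "secret-id-part" without knowing the cookie.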
|
||||
|
||||
"publication-time" YYYY-MM-DD HH:MM:SS NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The time at which this descriptor was created.
|
||||
|
||||
"protocol-versions" version-string NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
A comma-separated list of recognized and permitted version numbers
|
||||
for use in INTRODUCE cells; these versions are described in section
|
||||
1.8 below.
|
||||
|
||||
"introduction-points" NL encrypted-string
|
||||
|
||||
[At most once]
|
||||
|
||||
A list of introduction points. If the optional "descriptor-cookie" is
|
||||
used, this list is encrypted with AES in CTR mode with a random
|
||||
initialization vector of 128 bits that is written to
|
||||
the beginning of the encrypted string, and the "descriptor-cookie" as
|
||||
secret key of 128 bits length.
|
||||
|
||||
The string containing the introduction point data (either encrypted
|
||||
or not) is encoded in base64, and surrounded with
|
||||
"-----BEGIN MESSAGE-----" and "-----END MESSAGE-----".
|
||||
|
||||
The unencrypted string may begin with:
|
||||
|
||||
["service-authentication" auth-type NL auth-data ... reserved]
|
||||
|
||||
[At start, any number]
|
||||
|
||||
The service-specific authentication data can be used to perform
|
||||
client authentication. This data is independent of the selected
|
||||
introduction point as opposed to "intro-authentication" below.
|
||||
|
||||
Subsequently, an arbitrary number of introduction point entries may
|
||||
follow, each containing the following data:
|
||||
|
||||
"introduction-point" identifier NL
|
||||
|
||||
[At start, exactly once]
|
||||
|
||||
The identifier of this introduction point: the base-32 encoded
|
||||
hash of this introduction point's identity key.
|
||||
|
||||
"ip-address" ip-address NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The IP address of this introduction point.
|
||||
|
||||
"onion-port" port NL
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The TCP port on which the introduction point is listening for
|
||||
incoming onion requests.
|
||||
|
||||
"onion-key" NL a public key in PEM format
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The public key that can be used to encrypt messages to this
|
||||
introduction point.
|
||||
|
||||
"service-key" NL a public key in PEM format
|
||||
|
||||
[Exactly once]
|
||||
|
||||
The public key that can be used to encrypt messages to the hidden
|
||||
service.
|
||||
|
||||
["intro-authentication" auth-type NL auth-data ... reserved]
|
||||
|
||||
[Any number]
|
||||
|
||||
The introduction-point-specific authentication data can be used
|
||||
to perform client authentication. This data depends on the
|
||||
selected introduction point as opposed to "service-authentication"
|
||||
above.
|
||||
|
||||
(This ends the fields in the encrypted portion of the descriptor.)
|
||||
|
||||
"signature" NL signature-string
|
||||
|
||||
[At end, exactly once]
|
||||
|
||||
A signature of all fields above with the private key of the hidden
|
||||
service.
|
||||
|
||||
1.2.1. Other descriptor formats we don't use.
|
||||
|
||||
The V1 descriptor format was understood and accepted from
|
||||
0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and
|
||||
it was removed:
|
||||
|
||||
V Format byte: set to 255 [1 octet]
|
||||
V Version byte: set to 1 [1 octet]
|
||||
KL Key length [2 octets]
|
||||
PK Bob's public key [KL octets]
|
||||
TS A timestamp [4 octets]
|
||||
PROTO Protocol versions: bitmask [2 octets]
|
||||
NI Number of introduction points [2 octets]
|
||||
For each introduction point: (as in INTRODUCE2 cells)
|
||||
IP Introduction point's address [4 octets]
|
||||
PORT Introduction point's OR port [2 octets]
|
||||
ID Introduction point identity ID [20 octets]
|
||||
KLEN Length of onion key [2 octets]
|
||||
KEY Introduction point onion key [KLEN octets]
|
||||
SIG Signature of above fields [variable]
|
||||
|
||||
A hypothetical "V1" descriptor, that has never been used but might
|
||||
be useful for historical reasons, contains:
|
||||
|
||||
V Format byte: set to 255 [1 octet]
|
||||
V Version byte: set to 1 [1 octet]
|
||||
KL Key length [2 octets]
|
||||
PK Bob's public key [KL octets]
|
||||
TS A timestamp [4 octets]
|
||||
PROTO Rendezvous protocol versions: bitmask [2 octets]
|
||||
NA Number of auth mechanisms accepted [1 octet]
|
||||
For each auth mechanism:
|
||||
AUTHT The auth type that is supported [2 octets]
|
||||
AUTHL Length of auth data [1 octet]
|
||||
AUTHD Auth data [variable]
|
||||
NI Number of introduction points [2 octets]
|
||||
For each introduction point: (as in INTRODUCE2 cells)
|
||||
ATYPE An address type (typically 4) [1 octet]
|
||||
ADDR Introduction point's IP address [4 or 16 octets]
|
||||
PORT Introduction point's OR port [2 octets]
|
||||
AUTHT The auth type that is supported [2 octets]
|
||||
AUTHL Length of auth data [1 octet]
|
||||
AUTHD Auth data [variable]
|
||||
ID Introduction point identity ID [20 octets]
|
||||
KLEN Length of onion key [2 octets]
|
||||
KEY Introduction point onion key [KLEN octets]
|
||||
SIG Signature of above fields [variable]
|
||||
|
||||
AUTHT specifies which authentication/authorization mechanism is
|
||||
required by the hidden service or the introduction point. AUTHD
|
||||
is arbitrary data that can be associated with an auth approach.
|
||||
Currently only AUTHT of [00 00] is supported, with an AUTHL of 0.
|
||||
See section 2 of this document for details on auth mechanisms.
|
||||
|
||||
1.3. Bob's OP establishes his introduction points.
|
||||
|
||||
The OP establishes a new introduction circuit to each introduction
|
||||
point. These circuits MUST NOT be used for anything but hidden service
|
||||
introduction. To establish the introduction, Bob sends a
|
||||
RELAY_ESTABLISH_INTRO cell, containing:
|
||||
|
||||
    KL    Key length                           [2 octets]
    PK    Bob's public key                    [KL octets]
    HS    Hash of session info                [20 octets]
    SIG   Signature of above information       [variable]
|
||||
|
||||
[XXX011, need to add auth information here. -RD]
|
||||
|
||||
To prevent replay attacks, the HS field contains a SHA-1 hash based on the
|
||||
shared secret KH between Bob's OP and the introduction point, as
|
||||
follows:
|
||||
HS = H(KH | "INTRODUCE")
|
||||
That is:
|
||||
HS = H(KH | [49 4E 54 52 4F 44 55 43 45])
|
||||
(KH, as specified in tor-spec.txt, is H(g^xy | [00]) .)
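
The HS computation above is a single SHA-1 over KH followed by the ASCII string "INTRODUCE". A minimal sketch follows; the function names are illustrative assumptions, and g^xy is taken as an opaque byte string whose encoding is defined in tor-spec.txt.

```python
import hashlib


def handshake_digest(g_xy: bytes) -> bytes:
    """KH = H(g^xy | [00]), as quoted from tor-spec.txt above."""
    return hashlib.sha1(g_xy + b"\x00").digest()


def establish_intro_hs(kh: bytes) -> bytes:
    """HS = H(KH | "INTRODUCE")."""
    return hashlib.sha1(kh + b"INTRODUCE").digest()
```

Since KH depends on the circuit's Diffie-Hellman exchange, an HS value captured on one circuit does not verify on another, which is what defeats replay.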
|
||||
|
||||
Upon receiving such a cell, the OR first checks that the signature is
|
||||
correct with the included public key. If so, it checks whether HS is
|
||||
correct given the shared state between Bob's OP and the OR. If either
|
||||
check fails, the OR discards the cell; otherwise, it associates the
|
||||
circuit with Bob's public key, and dissociates any other circuits
|
||||
currently associated with PK. On success, the OR sends Bob a
|
||||
RELAY_INTRO_ESTABLISHED cell with an empty payload.
|
||||
|
||||
If a hidden service is configured to publish only v2 hidden service
|
||||
descriptors, Bob's OP does not include its own public key in the
|
||||
RELAY_ESTABLISH_INTRO cell, but the public key of a freshly generated
|
||||
key pair. The OP also includes these fresh public keys in the v2 hidden
|
||||
service descriptor together with the other introduction point
|
||||
information. The reason is that the introduction point does not need to
|
||||
and therefore should not know for which hidden service it works, so as
|
||||
to prevent it from tracking the hidden service's activity. If the hidden
|
||||
service is configured to publish both v0 and v2 descriptors, two
|
||||
separate sets of introduction points are established.
|
||||
|
||||
1.4. Bob's OP advertises his service descriptor(s).
|
||||
|
||||
Bob's OP opens a stream to each directory server's directory port via Tor.
|
||||
(He may re-use old circuits for this.) Over this stream, Bob's OP makes
|
||||
an HTTP 'POST' request, to a URL "/tor/rendezvous/publish" relative to the
|
||||
directory server's root, containing as its body Bob's service descriptor.
|
||||
|
||||
Bob should upload a service descriptor for each version format that
|
||||
is supported in the current Tor network.
|
||||
|
||||
Upon receiving a descriptor, the directory server checks the signature,
|
||||
and discards the descriptor if the signature does not match the enclosed
|
||||
public key. Next, the directory server checks the timestamp. If the
|
||||
timestamp is more than 24 hours in the past or more than 1 hour in the
|
||||
future, or the directory server already has a newer descriptor with the
|
||||
same public key, the server discards the descriptor. Otherwise, the
|
||||
server discards any older descriptors with the same public key and
|
||||
version format, and associates the new descriptor with the public key.
|
||||
The directory server remembers this descriptor for at least 24 hours
|
||||
after its timestamp. At least every 18 hours, Bob's OP uploads a
|
||||
fresh descriptor.
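
The timestamp checks in the paragraph above can be sketched as a single predicate. This is an illustration only: the function and constant names are assumptions, the signature is assumed to have been verified already, and "stored_ts" stands for the timestamp of any descriptor already held for the same public key and version format.

```python
from typing import Optional

MAX_AGE = 24 * 60 * 60     # reject if more than 24 hours in the past
MAX_FUTURE = 60 * 60       # reject if more than 1 hour in the future


def should_store(new_ts: int, now: int, stored_ts: Optional[int]) -> bool:
    """Decide whether a directory server keeps an uploaded descriptor."""
    if new_ts < now - MAX_AGE:
        return False
    if new_ts > now + MAX_FUTURE:
        return False
    # A newer descriptor for the same key is already on file.
    if stored_ts is not None and stored_ts > new_ts:
        return False
    return True
```

When the predicate returns True, the server would replace any older descriptor for that key and version format with the new one.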
|
||||
|
||||
If Bob's OP is configured to publish v2 descriptors instead of or in
|
||||
addition to v0 descriptors, it does so to a changing subset of all v2
|
||||
hidden service directories instead of the authoritative directory
|
||||
servers. Therefore, Bob's OP opens a stream via Tor to each
|
||||
responsible hidden service directory. (He may re-use old circuits
|
||||
for this.) Over this stream, Bob's OP makes an HTTP 'POST' request to a
|
||||
URL "/tor/rendezvous2/publish" relative to the hidden service
|
||||
directory's root, containing as its body Bob's service descriptor.
|
||||
|
||||
At any time, there are 6 hidden service directories responsible for
|
||||
keeping replicas of a descriptor; they consist of 2 sets of 3 hidden
|
||||
service directories with consecutive onion IDs. Bob's OP learns about
|
||||
the complete list of hidden service directories by filtering the
|
||||
consensus status document received from the directory authorities. A
|
||||
hidden service directory is deemed responsible for all descriptor IDs in
|
||||
the interval from its direct predecessor, exclusive, to its own ID,
|
||||
inclusive; it further holds replicas for its 2 predecessors. A
|
||||
participant only trusts its own routing list and never learns about
|
||||
routing information from other parties.
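
The responsibility rule above amounts to a lookup on a sorted ring: the first hidden service directory whose ID is greater than or equal to the descriptor ID, plus the two directories that follow it. The sketch below assumes IDs are compared as raw byte strings and that the list of HSDir IDs has already been filtered from the consensus; the function name and the "spread" parameter are illustrative.

```python
from bisect import bisect_left
from typing import List


def responsible_hsdirs(descriptor_id: bytes, hsdir_ids: List[bytes],
                       spread: int = 3) -> List[bytes]:
    """Return the directories that should hold one replica's descriptor."""
    ring = sorted(hsdir_ids)
    if not ring:
        return []
    # First directory whose ID is >= descriptor_id (interval is inclusive
    # of the directory's own ID); wraps to the start past the largest ID.
    start = bisect_left(ring, descriptor_id)
    count = min(spread, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

Running this once per replica (each replica has its own descriptor ID) yields the 2 sets of 3 directories mentioned above.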
|
||||
|
||||
Bob's OP publishes a new v2 descriptor once an hour or whenever its
|
||||
content changes. V2 descriptors can be found by clients within a given
|
||||
time period of 24 hours, after which they change their ID as described
|
||||
under 1.2. If a published descriptor would be valid for less than 60
|
||||
minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind
|
||||
and the client 30 minutes ahead), Bob's OP publishes the descriptor
|
||||
under the IDs of both the current and the next publication period.
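
One way to read the overlap rule above: compute how many seconds remain before the service's time period rolls over, and publish under the next period's ID as well when fewer than 60 minutes remain. The sketch below is an interpretation, not Tor's code; the function name and the "overlap" parameter are assumptions, and the per-service offset is the one from the time-period formula in section 1.2.

```python
from typing import List


def publication_periods(current_time: int, permanent_id_byte: int,
                        overlap: int = 3600) -> List[int]:
    """Return the time-period values to publish a v2 descriptor under."""
    shifted = current_time + permanent_id_byte * 86400 // 256
    period = shifted // 86400
    remaining = 86400 - shifted % 86400   # seconds until the ID changes
    if remaining < overlap:
        return [period, period + 1]
    return [period]
```

Each returned period would then be fed into the descriptor-id calculation, once per replica.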
|
||||
|
||||
1.5. Alice receives an x.y.z.onion address.
|
||||
|
||||
When Alice receives a pointer to a location-hidden service, it is as a
|
||||
hostname of the form "z.onion" or "y.z.onion" or "x.y.z.onion", where
|
||||
z is a base-32 encoding of a 10-octet hash of Bob's service's public
|
||||
key, computed as follows:
|
||||
|
||||
1. Let H = H(PK).
|
||||
2. Let H' = the first 80 bits of H, considering each octet from
|
||||
most significant bit to least significant bit.
|
||||
3. Generate a 16-character encoding of H', using base32 as defined
|
||||
in RFC 3548.
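
The three steps above can be sketched as follows. The function name is illustrative; the byte encoding of the public key that is fed to the hash is not restated in this section, so the input here is simply assumed to be the already-encoded key. Lowercasing the result is likewise a presentation assumption.

```python
import base64
import hashlib


def onion_address(public_key_bytes: bytes) -> str:
    """Derive the "z.onion" hostname from an encoded public key."""
    digest = hashlib.sha1(public_key_bytes).digest()   # step 1: H(PK)
    truncated = digest[:10]                            # step 2: first 80 bits
    label = base64.b32encode(truncated).decode("ascii").lower()  # step 3
    return label + ".onion"
```

Eighty bits divide evenly into sixteen 5-bit groups, so the label is always 16 characters and carries no "=" padding.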
|
||||
|
||||
(We only use 80 bits instead of the 160 bits from SHA1 because we
|
||||
don't need to worry about arbitrary collisions, and because it will
|
||||
make handling the URLs more convenient.)
|
||||
|
||||
The string "x", if present, is the base-32 encoding of the
|
||||
authentication/authorization required by the introduction point.
|
||||
The string "y", if present, is the base-32 encoding of the
|
||||
authentication/authorization required by the hidden service.
|
||||
Omitting a string is taken to mean auth type [00 00].
|
||||
See section 2 of this document for details on auth mechanisms.
|
||||
|
||||
[Yes, numbers are allowed at the beginning. See RFC 1123. -NM]
|
||||
|
||||
1.6. Alice's OP retrieves a service descriptor.
|
||||
|
||||
Alice opens a stream to a directory server via Tor, and makes an HTTP GET
|
||||
request for the document '/tor/rendezvous/<z>', where '<z>' is replaced
|
||||
with the encoding of Bob's public key as described above. (She may re-use
|
||||
old circuits for this.) The directory replies with a 404 HTTP response if
|
||||
it does not recognize <z>, and otherwise returns Bob's most recently
|
||||
uploaded service descriptor.
|
||||
|
||||
If Alice's OP receives a 404 response, it tries the other directory
|
||||
servers, and only fails the lookup if none recognize the public key hash.
|
||||
|
||||
Upon receiving a service descriptor, Alice verifies with the same process
|
||||
as the directory server uses, described above in section 1.4.
|
||||
|
||||
The directory server gives a 400 response if it cannot understand Alice's
|
||||
request.
|
||||
|
||||
Alice should cache the descriptor locally, but should not use
|
||||
descriptors whose timestamp is more than 24 hours in the past.
|
||||
[Caching may make her partitionable, but she fetched it anonymously,
|
||||
and we can't very well *not* cache it. -RD]
|
||||
|
||||
Alice's OP fetches v2 descriptors in parallel to v0 descriptors. Similarly
|
||||
to the description in section 1.4, the OP fetches a v2 descriptor from a
|
||||
randomly chosen hidden service directory out of the changing subset of
|
||||
6 nodes. If the request is unsuccessful, Alice retries the other
|
||||
remaining responsible hidden service directories in a random order.
|
||||
Alice relies on Bob to compensate for potential clock skew between the two
|
||||
by possibly storing two sets of descriptors (see end of section 1.4).
|
||||
|
||||
Alice's OP opens a stream via Tor to the chosen v2 hidden service
|
||||
directory. (She may re-use old circuits for this.) Over this stream,
|
||||
Alice's OP makes an HTTP 'GET' request for the document
|
||||
"/tor/rendezvous2/<z>", where z is replaced with the encoding of the
|
||||
descriptor ID. The directory replies with a 404 HTTP response if it does
|
||||
not recognize <z>, and otherwise returns Bob's most recently uploaded
|
||||
service descriptor.
|
||||
|
||||
1.7. Alice's OP establishes a rendezvous point.
|
||||
|
||||
When Alice requests a connection to a given location-hidden service,
|
||||
and Alice's OP does not have an established circuit to that service,
|
||||
the OP builds a rendezvous circuit. It does this by establishing
|
||||
a circuit to a randomly chosen OR, and sending a
|
||||
RELAY_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell
|
||||
contains:
|
||||
|
||||
RC Rendezvous cookie [20 octets]
|
||||
|
||||
[XXX011 this looks like an auth mechanism. should we generalize here? -RD]
|
||||
|
||||
The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by
|
||||
Alice's OP.
|
||||
|
||||
Upon receiving a RELAY_ESTABLISH_RENDEZVOUS cell, the OR associates the
|
||||
RC with the circuit that sent it. It replies to Alice with an empty
|
||||
RELAY_RENDEZVOUS_ESTABLISHED cell to indicate success.
|
||||
|
||||
Alice's OP MUST NOT use the circuit which sent the cell for any purpose
|
||||
other than rendezvous with the given location-hidden service.
|
||||
|
||||
1.8. Introduction: from Alice's OP to Introduction Point
|
||||
|
||||
Alice builds a separate circuit to one of Bob's chosen introduction
|
||||
points, and sends it a RELAY_INTRODUCE1 cell containing:
|
||||
|
||||
    Cleartext
        PK_ID  Identifier for Bob's PK          [20 octets]
    Encrypted to Bob's PK: (in the v0 intro protocol)
        RP     Rendezvous point's nickname      [20 octets]
        RC     Rendezvous cookie                [20 octets]
        g^x    Diffie-Hellman data, part 1     [128 octets]
      OR (in the v1 intro protocol)
        VER    Version byte: set to 1.            [1 octet]
        RP     Rendezvous point nick or ID      [42 octets]
        RC     Rendezvous cookie                [20 octets]
        g^x    Diffie-Hellman data, part 1     [128 octets]
      OR (in the v2 intro protocol)
        VER    Version byte: set to 2.            [1 octet]
        IP     Rendezvous point's address        [4 octets]
        PORT   Rendezvous point's OR port        [2 octets]
        ID     Rendezvous point identity ID     [20 octets]
        KLEN   Length of onion key               [2 octets]
        KEY    Rendezvous point onion key     [KLEN octets]
        RC     Rendezvous cookie                [20 octets]
        g^x    Diffie-Hellman data, part 1     [128 octets]
|
||||
|
||||
PK_ID is the hash of Bob's public key. RP is NUL-padded and
|
||||
terminated. In version 0, it must contain a nickname. In version 1,
|
||||
it must contain EITHER a nickname or an identity key digest that is
|
||||
encoded in hex and prefixed with a '$'.
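
For the v2 intro format, the plaintext that is later hybrid-encrypted to Bob's key is a straightforward concatenation of the listed fields. The sketch below builds only that plaintext; the function and parameter names are illustrative assumptions, integers are packed in network order, and the encryption step itself is not shown.

```python
import socket
import struct


def build_intro_v2_plaintext(rp_ip: str, rp_port: int, rp_identity: bytes,
                             rp_onion_key: bytes, cookie: bytes,
                             dh_public: bytes) -> bytes:
    """Assemble the v2 "encrypted to Bob's PK" fields before encryption."""
    if len(rp_identity) != 20:
        raise ValueError("ID must be 20 octets")
    if len(cookie) != 20:
        raise ValueError("RC must be 20 octets")
    if len(dh_public) != 128:
        raise ValueError("g^x must be 128 octets")
    return (
        bytes([2])                                # VER
        + socket.inet_aton(rp_ip)                 # IP, 4 octets
        + struct.pack("!H", rp_port)              # PORT
        + rp_identity                             # ID
        + struct.pack("!H", len(rp_onion_key))    # KLEN
        + rp_onion_key                            # KEY
        + cookie                                  # RC
        + dh_public                               # g^x
    )
```

The fixed fields total 177 octets, so the plaintext length is 177 plus the length of the rendezvous point's onion key.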
|
||||
|
||||
The hybrid encryption to Bob's PK works just like the hybrid
|
||||
encryption in CREATE cells (see tor-spec). Thus the payload of the
|
||||
version 0 RELAY_INTRODUCE1 cell on the wire will contain
|
||||
20+42+16+20+20+128=246 bytes, and the version 1 and version 2
|
||||
introduction formats have other sizes.
|
||||
|
||||
Through Tor 0.2.0.6-alpha, clients only generated the v0 introduction
|
||||
format, whereas hidden services have understood and accepted v0,
|
||||
v1, and v2 since 0.1.1.x. As of Tor 0.2.0.7-alpha and 0.1.2.18,
|
||||
clients switched to using the v2 intro format.
|
||||
|
||||
If Alice has downloaded a v2 descriptor, she uses the contained public
|
||||
key ("service-key") instead of Bob's public key to create the
|
||||
RELAY_INTRODUCE1 cell as described above.
|
||||
|
||||
1.8.1. Other introduction formats we don't use.
|
||||
|
||||
We briefly speculated about using the following format for the
|
||||
"encrypted to Bob's PK" part of the introduction, but no Tors have
|
||||
ever generated these.
|
||||
|
||||
VER Version byte: set to 3. [1 octet]
|
||||
ATYPE An address type (typically 4) [1 octet]
|
||||
ADDR Rendezvous point's IP address [4 or 16 octets]
|
||||
PORT Rendezvous point's OR port [2 octets]
|
||||
AUTHT The auth type that is supported [2 octets]
|
||||
AUTHL Length of auth data [1 octet]
|
||||
AUTHD Auth data [variable]
|
||||
ID Rendezvous point identity ID [20 octets]
|
||||
KLEN Length of onion key [2 octets]
|
||||
KEY Rendezvous point onion key [KLEN octets]
|
||||
RC Rendezvous cookie [20 octets]
|
||||
g^x Diffie-Hellman data, part 1 [128 octets]
|
||||
|
||||
1.9. Introduction: From the Introduction Point to Bob's OP
|
||||
|
||||
If the Introduction Point recognizes PK_ID as a public key which has
|
||||
established a circuit for introductions as in 1.3 above, it sends the body
|
||||
of the cell in a new RELAY_INTRODUCE2 cell down the corresponding circuit.
|
||||
(If the PK_ID is unrecognized, the RELAY_INTRODUCE1 cell is discarded.)
|
||||
|
||||
After sending the RELAY_INTRODUCE2 cell, the OR replies to Alice with an
|
||||
empty RELAY_COMMAND_INTRODUCE_ACK cell. If no RELAY_INTRODUCE2 cell can
|
||||
be sent, the OR replies to Alice with a non-empty cell to indicate an
|
||||
error. (The semantics of the cell body may be determined later; the
|
||||
current implementation sends a single '1' byte on failure.)
|
||||
|
||||
When Bob's OP receives the RELAY_INTRODUCE2 cell, it decrypts it with
|
||||
the private key for the corresponding hidden service, and extracts the
|
||||
rendezvous point's nickname, the rendezvous cookie, and the value of g^x
|
||||
chosen by Alice.
|
||||
|
||||
1.10. Rendezvous
|
||||
|
||||
Bob's OP builds a new Tor circuit ending at Alice's chosen rendezvous
|
||||
point, and sends a RELAY_RENDEZVOUS1 cell along this circuit, containing:
|
||||
    RC    Rendezvous cookie                   [20 octets]
    g^y   Diffie-Hellman                     [128 octets]
    KH    Handshake digest                    [20 octets]
|
||||
|
||||
(Bob's OP MUST NOT use this circuit for any other purpose.)
|
||||
|
||||
If the RP recognizes RC, it relays the rest of the cell down the
|
||||
corresponding circuit in a RELAY_RENDEZVOUS2 cell, containing:
|
||||
|
||||
    g^y   Diffie-Hellman                     [128 octets]
    KH    Handshake digest                    [20 octets]
|
||||
|
||||
(If the RP does not recognize the RC, it discards the cell and
|
||||
tears down the circuit.)
|
||||
|
||||
When Alice's OP receives a RELAY_RENDEZVOUS2 cell on a circuit which
|
||||
has sent a RELAY_ESTABLISH_RENDEZVOUS cell but which has not yet received
|
||||
a reply, it uses g^y and H(g^xy) to complete the handshake as in the Tor
|
||||
circuit extend process: they establish a 60-octet string as
|
||||
    K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) | SHA1(g^xy | [02])
and generate
    KH = K[0..15]
    Kf = K[16..31]
    Kb = K[32..47]
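
The key expansion above can be sketched directly. The slices below follow the inclusive byte ranges exactly as written in this section; the function name is an illustrative assumption, and g^xy is taken as an opaque byte string.

```python
import hashlib
from typing import Tuple


def rendezvous_keys(g_xy: bytes) -> Tuple[bytes, bytes, bytes]:
    """Expand g^xy into (KH, Kf, Kb) using the ranges given above."""
    k = b"".join(hashlib.sha1(g_xy + bytes([i])).digest() for i in range(3))
    assert len(k) == 60          # three 20-octet SHA-1 outputs
    kh = k[0:16]                 # K[0..15]
    kf = k[16:32]                # K[16..31]
    kb = k[32:48]                # K[32..47]
    return kh, kf, kb
```

Alice encrypts outgoing RELAY cells with Kf and decrypts incoming ones with Kb; Bob's OP uses the same two keys with the roles swapped.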
|
||||
|
||||
Subsequently, the rendezvous point passes relay cells, unchanged, from
|
||||
each of the two circuits to the other. When Alice's OP sends
|
||||
RELAY cells along the circuit, it first encrypts them with the
|
||||
Kf, then with all of the keys for the ORs in Alice's side of the circuit;
|
||||
and when Alice's OP receives RELAY cells from the circuit, it decrypts
|
||||
them with the keys for the ORs in Alice's side of the circuit, then
|
||||
decrypts them with Kb. Bob's OP does the same, with Kf and Kb
|
||||
interchanged.
|
||||
|
||||
1.11. Creating streams
|
||||
|
||||
To open TCP connections to Bob's location-hidden service, Alice's OP sends
|
||||
a RELAY_BEGIN cell along the established circuit, using the special
|
||||
address "", and a chosen port. Bob's OP chooses a destination IP and
|
||||
port, based on the configuration of the service connected to the circuit,
|
||||
and opens a TCP stream. From then on, Bob's OP treats the stream as an
|
||||
ordinary exit connection.
|
||||
[ Except he doesn't include addr in the connected cell or the end
|
||||
cell. -RD]
|
||||
|
||||
Alice MAY send multiple RELAY_BEGIN cells along the circuit, to open
|
||||
multiple streams to Bob. Alice SHOULD NOT send RELAY_BEGIN cells for any
|
||||
other address along her circuit to Bob; if she does, Bob MUST reject them.
|
||||
|
||||
2. Authentication and authorization.
|
||||
|
||||
Foo.
|
||||
|
||||
3. Hidden service directory operation
|
||||
|
||||
This section has been introduced with the v2 hidden service descriptor
|
||||
format. It describes all operations of the v2 hidden service descriptor
|
||||
fetching and propagation mechanism that are required for the protocol
|
||||
described in section 1 to succeed with v2 hidden service descriptors.
|
||||
|
||||
3.1. Configuring as hidden service directory
|
||||
|
||||
Every onion router that has its directory port open can decide whether it
|
||||
wants to store and serve hidden service descriptors. An onion router which
|
||||
is configured as such includes the "hidden-service-dir" flag in its router
|
||||
descriptors that it sends to directory authorities.
|
||||
|
||||
The directory authorities include a new flag "HSDir" for routers that
|
||||
decided to provide storage for hidden service descriptors and that
|
||||
have been running for at least 24 hours.
|
||||
|
||||
3.2. Accepting publish requests
|
||||
|
||||
Hidden service directory nodes accept publish requests for v2 hidden service
|
||||
descriptors and store them to their local memory. (It is not necessary to
|
||||
make descriptors persistent, because after restarting, the onion router
|
||||
would not be accepted as a storing node anyway, because it has not been
|
||||
running for at least 24 hours.) All requests and replies are formatted as
|
||||
HTTP messages. Requests are initiated via BEGIN_DIR cells directed to
|
||||
the router's directory port, and formatted as HTTP POST requests to the URL
|
||||
"/tor/rendezvous2/publish" relative to the hidden service directory's root,
|
||||
containing as its body a v2 service descriptor.
|
||||
|
||||
A hidden service directory node parses every received descriptor and only
|
||||
stores it when it thinks that it is responsible for storing that descriptor
|
||||
based on its own routing table. See section 1.4 for more information on how
|
||||
to determine responsibility for a certain descriptor ID.
|
||||
|
||||
3.3. Processing fetch requests
|
||||
|
||||
Hidden service directory nodes process fetch requests for hidden service
|
||||
descriptors by looking them up in their local memory. (They do not need to
|
||||
determine if they are responsible for the passed ID, because it does no harm
|
||||
if they deliver a descriptor for which they are not (any more) responsible.)
|
||||
All requests and replies are formatted as HTTP messages. Requests are
|
||||
initiated via BEGIN_DIR cells directed to the router's directory port,
|
||||
and formatted as HTTP GET requests for the document "/tor/rendezvous2/<z>",
|
||||
where z is replaced with the encoding of the descriptor ID.
|
||||
|
@ -1,79 +0,0 @@
|
||||
$Id$
|
||||
Tor's extensions to the SOCKS protocol
|
||||
|
||||
1. Overview
|
||||
|
||||
The SOCKS protocol provides a generic interface for TCP proxies. Client
|
||||
software connects to a SOCKS server via TCP, and requests a TCP connection
|
||||
to another address and port. The SOCKS server establishes the connection,
|
||||
and reports success or failure to the client. After the connection has
|
||||
been established, the client application uses the TCP stream as usual.
|
||||
|
||||
Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and
|
||||
SOCKS5 as defined in [3].
|
||||
|
||||
The stickiest issue for Tor in supporting clients, in practice, is forcing
|
||||
DNS lookups to occur at the OR side: if clients do their own DNS lookup,
|
||||
the DNS server can learn which addresses the client wants to reach.
|
||||
SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of
|
||||
SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and
|
||||
hostnames.
|
||||
|
||||
1.1. Extent of support
|
||||
|
||||
Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows:
|
||||
|
||||
BOTH:
|
||||
- The BIND command is not supported.
|
||||
|
||||
SOCKS4,4A:
|
||||
- SOCKS4 usernames are ignored.
|
||||
|
||||
SOCKS5:
|
||||
- The (SOCKS5) "UDP ASSOCIATE" command is not supported.
|
||||
- IPv6 is not supported in CONNECT commands.
|
||||
- Only the "NO AUTHENTICATION" (SOCKS5) authentication method [00] is
|
||||
supported.
|
||||
|
||||
2. Name lookup
|
||||
|
||||
As an extension to SOCKS4A and SOCKS5, Tor implements a new command value,
|
||||
"RESOLVE" [F0]. When Tor receives a "RESOLVE" SOCKS command, it initiates
|
||||
a remote lookup of the hostname provided as the target address in the SOCKS
|
||||
request. The reply is either an error (if the address couldn't be
|
||||
resolved) or a success response. In the case of success, the address is
|
||||
stored in the portion of the SOCKS response reserved for remote IP address.
|
||||
|
||||
(We support RESOLVE in SOCKS4 too, even though it is unnecessary.)
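
As an illustration, a SOCKS5 request using the RESOLVE command differs from an ordinary CONNECT request only in its command byte. The sketch below builds the request that would follow the usual SOCKS5 method negotiation; the function and constant names are assumptions, and sending a port of 0 is also an assumption, since this section does not say what the port field should contain.

```python
import struct

SOCKS5_VERSION = 0x05
CMD_RESOLVE = 0xF0      # Tor extension described above
ATYP_DOMAIN = 0x03      # SOCKS5 "domain name" address type


def build_resolve_request(hostname: str) -> bytes:
    """Build a SOCKS5 request asking Tor to resolve a hostname remotely."""
    name = hostname.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1..255 octets")
    header = bytes([SOCKS5_VERSION, CMD_RESOLVE, 0x00, ATYP_DOMAIN, len(name)])
    # Port field is present in every SOCKS5 request; 0 is assumed here.
    return header + name + struct.pack("!H", 0)
```

On success, the resolved address comes back in the address field of the SOCKS reply, as described above.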
|
||||
|
||||
For SOCKS5 only, we support reverse resolution with a new command value,
|
||||
"RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with
|
||||
an IPv4 address as its target, Tor attempts to find the canonical
|
||||
hostname for that IPv4 record, and returns it in the "server bound
|
||||
address" portion of the reply.
|
||||
(This command was not supported before Tor 0.1.2.2-alpha.)
|
||||
|
||||
3. Other command extensions.
|
||||
|
||||
Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2].
|
||||
In this case, Tor will open an encrypted direct TCP connection to the
|
||||
directory port of the Tor server specified by address:port (the port
|
||||
specified should be the ORPort of the server). It uses a one-hop tunnel
|
||||
and a "BEGIN_DIR" relay cell to accomplish this secure connection.
|
||||
|
||||
The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a
|
||||
new use_begindir flag in edge_connection_t.
|
||||
|
||||
4. HTTP-resistance
|
||||
|
||||
Tor checks the first byte of each SOCKS request to see whether it looks
|
||||
more like an HTTP request (that is, it starts with a "G", "H", or "P"). If
|
||||
so, Tor returns a small webpage, telling the user that their browser is
|
||||
misconfigured. This is helpful for the many users who mistakenly try to
|
||||
use Tor as an HTTP proxy instead of a SOCKS proxy.
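The first-byte check described above can be sketched as follows. This is an illustration only; the function name is hypothetical, and the real check lives in Tor's C code.

```python
# Byte values of "G", "H" and "P", the letters named in the text above.
HTTP_FIRST_BYTES = frozenset(b"GHP")


def looks_like_http(first_byte: int) -> bool:
    """Return True if the first byte of a request suggests HTTP, not SOCKS."""
    return first_byte in HTTP_FIRST_BYTES
```

A genuine SOCKS request starts with the version byte (0x04 or 0x05), so it never matches.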
|
||||
|
||||
References:
|
||||
[1] http://archive.socks.permeo.com/protocol/socks4.protocol
|
||||
[2] http://archive.socks.permeo.com/protocol/socks4a.protocol
|
||||
[3] SOCKS5: RFC1928
|
||||
|
@ -1,993 +0,0 @@
|
||||
$Id$
|
||||
|
||||
Tor Protocol Specification
|
||||
|
||||
Roger Dingledine
|
||||
Nick Mathewson
|
||||
|
||||
Note: This document aims to specify Tor as implemented in 0.2.1.x. Future
|
||||
versions of Tor may implement improved protocols, and compatibility is not
|
||||
guaranteed. Compatibility notes are given for versions 0.1.1.15-rc and
|
||||
later; earlier versions are not compatible with the Tor network as of this
|
||||
writing.
|
||||
|
||||
This specification is not a design document; most design criteria
|
||||
are not examined. For more information on why Tor acts as it does,
|
||||
see tor-design.pdf.
|
||||
|
||||
0. Preliminaries
|
||||
|
||||
0.1. Notation and encoding
|
||||
|
||||
PK -- a public key.
|
||||
SK -- a private key.
|
||||
K -- a key for a symmetric cipher.
|
||||
|
||||
a|b -- concatenation of 'a' and 'b'.
|
||||
|
||||
[A0 B1 C2] -- a three-byte sequence, containing the bytes with
|
||||
hexadecimal values A0, B1, and C2, in that order.
|
||||
|
||||
All numeric values are encoded in network (big-endian) order.
|
||||
|
||||
H(m) -- a cryptographic hash of m.
|
||||
|
||||
0.2. Security parameters
|
||||
|
||||
Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman
|
||||
protocol, and a hash function.
|
||||
|
||||
KEY_LEN -- the length of the stream cipher's key, in bytes.
|
||||
|
||||
PK_ENC_LEN -- the length of a public-key encrypted message, in bytes.
|
||||
PK_PAD_LEN -- the number of bytes added in padding for public-key
|
||||
encryption, in bytes. (The largest number of bytes that can be encrypted
|
||||
in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.)
|
||||
|
||||
DH_LEN -- the number of bytes used to represent a member of the
|
||||
Diffie-Hellman group.
|
||||
DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x).
|
||||
|
||||
HASH_LEN -- the length of the hash function's output, in bytes.
|
||||
|
||||
PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)
|
||||
|
||||
CELL_LEN -- The length of a Tor cell, in bytes.
|
||||
|
||||
0.3. Ciphers
|
||||
|
||||
For a stream cipher, we use 128-bit AES in counter mode, with an IV of all
|
||||
0 bytes.
|
||||
|
||||
For a public-key cipher, we use RSA with 1024-bit keys and a fixed
|
||||
exponent of 65537. We use OAEP-MGF1 padding, with SHA-1 as its digest
|
||||
function. We leave the optional "Label" parameter unset. (For OAEP
|
||||
padding, see ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
|
||||
|
||||
For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we
|
||||
use the 1024-bit safe prime from rfc2409 section 6.2 whose hex
|
||||
representation is:
|
||||
|
||||
"FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
|
||||
"8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
|
||||
"302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
|
||||
"A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
|
||||
"49286651ECE65381FFFFFFFFFFFFFFFF"
|
||||
|
||||
As an optimization, implementations SHOULD choose DH private keys (x) of
|
||||
320 bits. Implementations that do this MUST NOT use any DH key more
|
||||
than once.
|
||||
[May other implementations reuse their DH keys?? -RD]
|
||||
[Probably not. Conceivably, you could get away with changing DH keys once
|
||||
per second, but there are too many oddball attacks for me to be
|
||||
comfortable that this is safe. -NM]
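As an illustration of the Diffie-Hellman parameters above, the following sketch assembles the modulus from the hex string, draws a fresh 320-bit private key, and computes the public value. The function name is hypothetical, and rejecting x < 2 is an extra precaution assumed here, not something the text requires.

```python
import secrets

DH_PRIME_HEX = (
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF"
)
DH_P = int(DH_PRIME_HEX, 16)  # the 1024-bit modulus p
DH_G = 2                      # the generator g
DH_SEC_BITS = 320             # DH_SEC_LEN = 40 bytes


def new_dh_keypair():
    """Return a fresh (x, g^x mod p) pair; never reuse x across handshakes."""
    x = 0
    while x < 2:  # assumption: avoid the trivial exponents 0 and 1
        x = secrets.randbits(DH_SEC_BITS)
    return x, pow(DH_G, x, DH_P)
```

Both sides reach the same secret because pow(gy, x, DH_P) equals pow(gx, y, DH_P).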
|
||||
|
||||
For a hash function, we use SHA-1.
|
||||
|
||||
KEY_LEN=16.
|
||||
DH_LEN=128; DH_SEC_LEN=40.
|
||||
PK_ENC_LEN=128; PK_PAD_LEN=42.
|
||||
HASH_LEN=20.
|
||||
|
||||
When we refer to "the hash of a public key", we mean the SHA-1 hash of the
|
||||
DER encoding of an ASN.1 RSA public key (as specified in PKCS.1).
|
||||
|
||||
All "random" values should be generated with a cryptographically strong
|
||||
random number generator, unless otherwise noted.
|
||||
|
||||
The "hybrid encryption" of a byte sequence M with a public key PK is
|
||||
computed as follows:
|
||||
1. If M is shorter than PK_ENC_LEN-PK_PAD_LEN bytes, pad and encrypt M with PK.
|
||||
2. Otherwise, generate a KEY_LEN byte random key K.
|
||||
Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M,
|
||||
and let M2 = the rest of M.
|
||||
Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher,
|
||||
using the key K. Concatenate these encrypted values.
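The two steps above determine how the plaintext is partitioned before any encryption happens. A sketch of that partitioning only (no RSA or AES is performed; the function name is hypothetical):

```python
KEY_LEN = 16
PK_ENC_LEN = 128
PK_PAD_LEN = 42


def hybrid_plan(m: bytes, k: bytes):
    """Return (bytes to pad and encrypt with PK, bytes for the stream cipher).

    k is the random KEY_LEN-byte key used only in the hybrid case.
    """
    limit = PK_ENC_LEN - PK_PAD_LEN  # 86 bytes fit in one public-key operation
    if len(m) < limit:
        return m, b""                # step 1: public-key encryption alone
    if len(k) != KEY_LEN:
        raise ValueError("k must be KEY_LEN bytes")
    split = limit - KEY_LEN          # 70 bytes of M travel alongside K
    return k + m[:split], m[split:]  # step 2: K|M1 and M2
```

For a DH_LEN-byte (128-byte) value, this gives an 86-byte public-key plaintext and 58 bytes for the stream cipher.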
|
||||
[XXX Note that this "hybrid encryption" approach does not prevent
|
||||
an attacker from adding or removing bytes to the end of M. It also
|
||||
allows attackers to modify the bytes not covered by the OAEP --
|
||||
see Goldberg's PET2006 paper for details. We will add a MAC to this
|
||||
scheme one day. -RD]
|
||||
|
||||
0.4. Other parameter values
|
||||
|
||||
CELL_LEN=512
|
||||
|
||||
1. System overview
|
||||
|
||||
Tor is a distributed overlay network designed to anonymize
|
||||
low-latency TCP-based applications such as web browsing, secure shell,
|
||||
and instant messaging. Clients choose a path through the network and
|
||||
build a ``circuit'', in which each node (or ``onion router'' or ``OR'')
|
||||
in the path knows its predecessor and successor, but no other nodes in
|
||||
the circuit. Traffic flowing down the circuit is sent in fixed-size
|
||||
``cells'', which are unwrapped by a symmetric key at each node (like
|
||||
the layers of an onion) and relayed downstream.
|
||||
|
||||
1.1. Keys and names
|
||||
|
||||
Every Tor server has multiple public/private keypairs:
|
||||
|
||||
- A long-term signing-only "Identity key" used to sign documents and
|
||||
certificates, and used to establish server identity.
|
||||
- A medium-term "Onion key" used to decrypt onion skins when accepting
|
||||
circuit extend attempts. (See 5.1.) Old keys MUST be accepted for at
|
||||
least one week after they are no longer advertised. Because of this,
|
||||
servers MUST retain old keys for a while after they're rotated.
|
||||
- A short-term "Connection key" used to negotiate TLS connections.
|
||||
Tor implementations MAY rotate this key as often as they like, and
|
||||
SHOULD rotate this key at least once a day.
|
||||
|
||||
Tor servers are also identified by "nicknames"; these are specified in
|
||||
dir-spec.txt.
|
||||
|
||||
2. Connections
|
||||
|
||||
Connections between two Tor servers, or between a client and a server,
|
||||
use TLS/SSLv3 for link authentication and encryption. All
|
||||
implementations MUST support the SSLv3 ciphersuite
|
||||
"SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA", and SHOULD support the TLS
|
||||
ciphersuite "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available.
|
||||
|
||||
There are three acceptable ways to perform a TLS handshake when
|
||||
connecting to a Tor server: "certificates up-front", "renegotiation", and
|
||||
"backwards-compatible renegotiation". ("Backwards-compatible
|
||||
renegotiation" is, as the name implies, compatible with both other
|
||||
handshake types.)
|
||||
|
||||
Before Tor 0.2.0.21, only "certificates up-front" was supported. In Tor
|
||||
0.2.0.21 or later, "backwards-compatible renegotiation" is used.
|
||||
|
||||
In "certificates up-front", the connection initiator always sends a
|
||||
two-certificate chain, consisting of an X.509 certificate using a
|
||||
short-term connection public key and a second, self-signed X.509
|
||||
certificate containing its identity key. The other party sends a similar
|
||||
certificate chain. The initiator's ClientHello MUST NOT include any
|
||||
ciphersuites other than:
|
||||
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
|
||||
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
|
||||
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
|
||||
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
|
||||
|
||||
In "renegotiation", the connection initiator sends no certificates, and
|
||||
the responder sends a single connection certificate. Once the TLS
|
||||
handshake is complete, the initiator renegotiates the handshake, with each
|
||||
party sending a two-certificate chain as in "certificates up-front".
|
||||
The initiator's ClientHello MUST include at least one ciphersuite not in
|
||||
the list above. The responder SHOULD NOT select any ciphersuite besides
|
||||
those in the list above.
|
||||
[The above "should not" is because some of the ciphers that
|
||||
clients list may be fake.]
|
||||
|
||||
In "backwards-compatible renegotiation", the connection initiator's
|
||||
ClientHello MUST include at least one ciphersuite other than those listed
|
||||
above. The connection responder examines the initiator's ciphersuite list
|
||||
to see whether it includes any ciphers other than those included in the
|
||||
list above. If extra ciphers are included, the responder proceeds as in
|
||||
"renegotiation": it sends a single certificate and does not request
|
||||
client certificates. Otherwise (in the case that no extra ciphersuites
|
||||
are included in the ClientHello) the responder proceeds as in
|
||||
"certificates up-front": it requests client certificates, and sends a
|
||||
two-certificate chain. In either case, once the responder has sent its
|
||||
certificate or certificates, the initiator counts them. If two
|
||||
certificates have been sent, it proceeds as in "certificates up-front";
|
||||
otherwise, it proceeds as in "renegotiation".
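The decision logic of "backwards-compatible renegotiation" can be sketched as two small functions, one per side. This models only the choice of handshake variant, not TLS itself, and the function names are hypothetical.

```python
# The ciphersuites permitted in a "certificates up-front" ClientHello.
V1_CIPHERSUITES = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def responder_mode(client_ciphersuites) -> str:
    """Responder: pick a variant from the initiator's ciphersuite list."""
    extra = set(client_ciphersuites) - V1_CIPHERSUITES
    return "renegotiation" if extra else "certificates up-front"


def initiator_mode(num_certs_from_responder: int) -> str:
    """Initiator: pick a variant by counting the responder's certificates."""
    if num_certs_from_responder == 2:
        return "certificates up-front"
    return "renegotiation"
```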
|
||||
|
||||
All new implementations of the Tor server protocol MUST support
|
||||
"backwards-compatible renegotiation"; clients SHOULD do this too. If
|
||||
this is not possible, new client implementations MUST support both
|
||||
"renegotiation" and "certificates up-front" and use the router's
|
||||
published link protocols list (see dir-spec.txt on the "protocols" entry)
|
||||
to decide which to use.
|
||||
|
||||
In all of the above handshake variants, certificates sent in the clear
|
||||
SHOULD NOT include any strings to identify the host as a Tor server. In
|
||||
the "renegotiation" and "backwards-compatible renegotiation" handshakes, the
|
||||
initiator SHOULD choose a list of ciphersuites and TLS extensions
|
||||
to mimic one used by a popular web browser.
|
||||
|
||||
Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys,
|
||||
or whose symmetric keys are shorter than KEY_LEN bytes, or whose digests are
|
||||
shorter than HASH_LEN bytes. Responders SHOULD NOT select any SSLv3
|
||||
ciphersuite other than those listed above.
|
||||
|
||||
Even though the connection protocol is identical, we will think of the
|
||||
initiator as either an onion router (OR) if it is willing to relay
|
||||
traffic for other Tor users, or an onion proxy (OP) if it only handles
|
||||
local requests. Onion proxies SHOULD NOT provide long-term-trackable
|
||||
identifiers in their handshakes.
|
||||
|
||||
In all handshake variants, once all certificates are exchanged, all
|
||||
parties receiving certificates must confirm that the identity key is as
|
||||
expected. (When initiating a connection, the expected identity key is
|
||||
the one given in the directory; when creating a connection because of an
|
||||
EXTEND cell, the expected identity key is the one given in the cell.) If
|
||||
the key is not as expected, the party must close the connection.
|
||||
|
||||
When connecting to an OR, all parties SHOULD reject the connection if that
|
||||
OR has a malformed or missing certificate. When accepting an incoming
|
||||
connection, an OR SHOULD NOT reject incoming connections from parties with
|
||||
malformed or missing certificates. (However, an OR should not believe
|
||||
that an incoming connection is from another OR unless the certificates
|
||||
are present and well-formed.)
|
||||
|
||||
[Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and
|
||||
OPs alike if their certificates were missing or malformed.]
|
||||
|
||||
Once a TLS connection is established, the two sides send cells
|
||||
(specified below) to one another. Cells are sent serially. All
|
||||
cells are CELL_LEN bytes long. Cells may be sent embedded in TLS
|
||||
records of any size or divided across TLS records, but the framing
|
||||
of TLS records MUST NOT leak information about the type or contents
|
||||
of the cells.
|
||||
|
||||
TLS connections are not permanent. Either side MAY close a connection
|
||||
if there are no circuits running over it and an amount of time
|
||||
(KeepalivePeriod, defaults to 5 minutes) has passed since the last time
|
||||
any traffic was transmitted over the TLS connection. Clients SHOULD
|
||||
also hold a TLS connection with no circuits open, if it is likely that a
|
||||
circuit will be built soon using that connection.
|
||||
|
||||
(As an exception, directory servers may try to stay connected to all of
|
||||
the ORs -- though this will be phased out for the Tor 0.1.2.x release.)
|
||||
|
||||
To avoid being trivially distinguished from servers, client-only Tor
|
||||
instances are encouraged but not required to use a two-certificate chain
|
||||
as well. Clients SHOULD NOT keep using the same certificates when
|
||||
their IP address changes. Clients MAY send no certificates at all.
|
||||
|
||||
3. Cell Packet format
|
||||
|
||||
The basic unit of communication for onion routers and onion
|
||||
proxies is a fixed-width "cell".
|
||||
|
||||
On a version 1 connection, each cell contains the following
|
||||
fields:
|
||||
|
||||
CircID [2 bytes]
|
||||
Command [1 byte]
|
||||
Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
|
||||
|
||||
On a version 2 connection, all cells are as in version 1 connections,
|
||||
except for the initial VERSIONS cell, whose format is:
|
||||
|
||||
Circuit [2 octets; set to 0]
|
||||
Command [1 octet; set to 7 for VERSIONS]
|
||||
Length [2 octets; big-endian integer]
|
||||
Payload [Length bytes]
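Both layouts above can be serialized with a few lines of code. The following is a sketch with hypothetical function names, using CELL_LEN=512 and PAYLOAD_LEN=509 from section 0.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 509
CMD_VERSIONS = 7


def pack_cell(circ_id: int, command: int, payload: bytes = b"") -> bytes:
    """Serialize a fixed-width cell: CircID, Command, zero-padded payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload longer than PAYLOAD_LEN")
    header = struct.pack(">HB", circ_id, command)  # big-endian, 2 + 1 bytes
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def pack_versions_cell(payload: bytes) -> bytes:
    """Serialize the variable-length VERSIONS cell: circuit 0, command 7."""
    return struct.pack(">HBH", 0, CMD_VERSIONS, len(payload)) + payload
```

Every cell produced by pack_cell is exactly CELL_LEN bytes long, whatever the payload size.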
|
||||
|
||||
The CircID field determines which circuit, if any, the cell is
|
||||
associated with.
|
||||
|
||||
The 'Command' field holds one of the following values:
|
||||
0 -- PADDING (Padding) (See Sec 7.2)
|
||||
1 -- CREATE (Create a circuit) (See Sec 5.1)
|
||||
2 -- CREATED (Acknowledge create) (See Sec 5.1)
|
||||
3 -- RELAY (End-to-end data) (See Sec 5.5 and 6)
|
||||
4 -- DESTROY (Stop using a circuit) (See Sec 5.4)
|
||||
5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1)
|
||||
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
|
||||
7 -- VERSIONS (Negotiate proto version) (See Sec 4)
|
||||
8 -- NETINFO (Time and address info) (See Sec 4)
|
||||
9 -- RELAY_EARLY (End-to-end data; limited) (See Sec 5.6)
|
||||
|
||||
The interpretation of 'Payload' depends on the type of the cell.
|
||||
PADDING: Payload is unused.
|
||||
CREATE: Payload contains the handshake challenge.
|
||||
CREATED: Payload contains the handshake response.
|
||||
RELAY: Payload contains the relay header and relay body.
|
||||
DESTROY: Payload contains a reason for closing the circuit.
|
||||
(see 5.4)
|
||||
Upon receiving any other value for the command field, an OR must
|
||||
drop the cell. Since more cell types may be added in the future, ORs
|
||||
should generally not warn when encountering unrecognized commands.
|
||||
|
||||
The payload is padded with 0 bytes.
|
||||
|
||||
PADDING cells are currently used to implement connection keepalive.
|
||||
If there is no other traffic, ORs and OPs send one another a PADDING
|
||||
cell every few minutes.
|
||||
|
||||
CREATE, CREATED, and DESTROY cells are used to manage circuits;
|
||||
see section 5 below.
|
||||
|
||||
RELAY cells are used to send commands and data along a circuit; see
|
||||
section 6 below.
|
||||
|
||||
VERSIONS and NETINFO cells are used to set up connections. See section 4
|
||||
below.
|
||||
|
||||
4. Negotiating and initializing connections
|
||||
|
||||
4.1. Negotiating versions with VERSIONS cells
|
||||
|
||||
There are multiple instances of the Tor link connection protocol. Any
|
||||
connection negotiated using the "certificates up front" handshake (see
|
||||
section 2 above) is "version 1". In any connection where both parties
|
||||
have behaved as in the "renegotiation" handshake, the link protocol
|
||||
version is 2 or higher.
|
||||
|
||||
To determine the version, in any connection where the "renegotiation"
|
||||
handshake was used (that is, where the server sent only one certificate
|
||||
at first and where the client did not send any certificates until
|
||||
renegotiation), both parties MUST send a VERSIONS cell immediately after
|
||||
the renegotiation is finished, before any other cells are sent. Parties
|
||||
MUST NOT send any other cells on a connection until they have received a
|
||||
VERSIONS cell.
|
||||
|
||||
The payload in a VERSIONS cell is a series of big-endian two-byte
|
||||
integers. Both parties MUST select as the link protocol version the
|
||||
highest number contained both in the VERSIONS cell they sent and in the
|
||||
versions cell they received. If they have no such version in common,
|
||||
they cannot communicate and MUST close the connection.
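The selection rule above can be sketched as follows. The function names are hypothetical; a return value of None stands for "no common version, close the connection".

```python
import struct


def parse_versions(payload: bytes):
    """Decode a VERSIONS payload into a list of two-byte big-endian integers."""
    if len(payload) % 2:
        raise ValueError("VERSIONS payload must have an even length")
    return [v for (v,) in struct.iter_unpack(">H", payload)]


def negotiate_link_version(sent_versions, received_payload: bytes):
    """Return the highest version both sides listed, or None if there is none."""
    common = set(sent_versions) & set(parse_versions(received_payload))
    return max(common) if common else None
```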
|
||||
|
||||
Since the version 1 link protocol does not use the "renegotiation"
|
||||
handshake, implementations MUST NOT list version 1 in their VERSIONS
|
||||
cell.
|
||||
|
||||
4.2. NETINFO cells
|
||||
|
||||
If version 2 or higher is negotiated, each party sends the other a
|
||||
NETINFO cell. The cell's payload is:
|
||||
|
||||
Timestamp [4 bytes]
|
||||
Other OR's address [variable]
|
||||
Number of addresses [1 byte]
|
||||
This OR's addresses [variable]
|
||||
|
||||
The address format is a type/length/value sequence as given in section
|
||||
6.4 below. The timestamp is a big-endian unsigned integer number of
|
||||
seconds since the unix epoch.
|
||||
|
||||
Implementations MAY use the timestamp value to help decide if their
|
||||
clocks are skewed. Initiators MAY use "other OR's address" to help
|
||||
learn which address their connections are originating from, if they do
|
||||
not know it. Initiators SHOULD use "this OR's address" to make sure
|
||||
that they have connected to another OR at its canonical address.
|
||||
|
||||
[As of 0.2.0.23-rc, implementations use none of the above values.]
|
||||
|
||||
|
||||
5. Circuit management
|
||||
|
||||
5.1. CREATE and CREATED cells
|
||||
|
||||
Users set up circuits incrementally, one hop at a time. To create a
|
||||
new circuit, OPs send a CREATE cell to the first node, with the
|
||||
first half of the DH handshake; that node responds with a CREATED
|
||||
cell with the second half of the DH handshake plus the first 20 bytes
|
||||
of derivative key data (see section 5.2). To extend a circuit past
|
||||
the first hop, the OP sends an EXTEND relay cell (see section 5)
|
||||
which instructs the last node in the circuit to send a CREATE cell
|
||||
to extend the circuit.
|
||||
|
||||
The payload for a CREATE cell is an 'onion skin', which consists
|
||||
of the first step of the DH handshake data (also known as g^x).
|
||||
This value is hybrid-encrypted (see 0.3) to Bob's onion key, giving
|
||||
an onion-skin of:
|
||||
PK-encrypted:
|
||||
Padding [PK_PAD_LEN bytes]
|
||||
Symmetric key [KEY_LEN bytes]
|
||||
First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes]
|
||||
Symmetrically encrypted:
|
||||
Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN)
|
||||
bytes]
|
||||
|
||||
The relay payload for an EXTEND relay cell consists of:
|
||||
Address [4 bytes]
|
||||
Port [2 bytes]
|
||||
Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
|
||||
Identity fingerprint [HASH_LEN bytes]
|
||||
|
||||
The port and address fields denote the IPv4 address and port of the next
|
||||
onion router in the circuit; the public key hash is the hash of the PKCS#1
|
||||
ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
|
||||
above.) Including this hash allows the extending OR to verify that it is
|
||||
indeed connected to the correct target OR, and prevents certain
|
||||
man-in-the-middle attacks.
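A sketch of assembling the EXTEND relay payload described above. The function name is hypothetical; the onion skin and fingerprint are taken as opaque byte strings of the stated lengths.

```python
import socket
import struct

DH_LEN = 128
KEY_LEN = 16
PK_PAD_LEN = 42
HASH_LEN = 20
ONION_SKIN_LEN = DH_LEN + KEY_LEN + PK_PAD_LEN  # 186 bytes


def pack_extend_payload(ipv4: str, port: int, onion_skin: bytes,
                        fingerprint: bytes) -> bytes:
    """Concatenate address, port, onion skin and identity fingerprint."""
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin has the wrong length")
    if len(fingerprint) != HASH_LEN:
        raise ValueError("fingerprint has the wrong length")
    return (socket.inet_aton(ipv4) + struct.pack(">H", port)
            + onion_skin + fingerprint)
```

The result is 4 + 2 + 186 + 20 = 212 bytes.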
|
||||
|
||||
The payload for a CREATED cell, or the relay payload for an
|
||||
EXTENDED cell, contains:
|
||||
DH data (g^y) [DH_LEN bytes]
|
||||
Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below>
|
||||
|
||||
The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer,
|
||||
selected by the node (OP or OR) that sends the CREATE cell. To prevent
|
||||
CircID collisions, when one node sends a CREATE cell to another, it chooses
|
||||
from only one half of the possible values based on the ORs' public
|
||||
identity keys: if the sending node has a lower key, it chooses a CircID with
|
||||
an MSB of 0; otherwise, it chooses a CircID with an MSB of 1.
|
||||
|
||||
(An OP with no public key MAY choose any CircID it wishes, since an OP
|
||||
never needs to process a CREATE cell.)
|
||||
|
||||
Public keys are compared numerically by modulus.
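The CircID rule above can be sketched as follows, with identity keys represented by their RSA moduli as integers. The function name is hypothetical, and skipping the value 0 is an assumption made here, not something this section requires.

```python
import secrets


def choose_circ_id(my_modulus: int, peer_modulus: int, in_use=()) -> int:
    """Pick an unused 2-byte CircID from this node's half of the space."""
    # The node with the lower key uses MSB 0; the other node uses MSB 1.
    msb = 0x0000 if my_modulus < peer_modulus else 0x8000
    while True:
        candidate = msb | secrets.randbelow(0x8000)
        if candidate != 0 and candidate not in in_use:  # 0 skipped by assumption
            return candidate
```

Because the two nodes draw from disjoint halves, CREATE cells sent at the same moment in opposite directions cannot collide.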
|
||||
|
||||
As usual with DH, x and y MUST be generated randomly.
|
||||
|
||||
5.1.1. CREATE_FAST/CREATED_FAST cells
|
||||
|
||||
When initializing the first hop of a circuit, the OP has already
|
||||
established the OR's identity and negotiated a secret key using TLS.
|
||||
Because of this, it is not always necessary for the OP to perform the
|
||||
public key operations to create a circuit. In this case, the
|
||||
OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first
|
||||
hop only. The OR responds with a CREATED_FAST cell, and the circuit is
|
||||
created.
|
||||
|
||||
A CREATE_FAST cell contains:
|
||||
|
||||
Key material (X) [HASH_LEN bytes]
|
||||
|
||||
A CREATED_FAST cell contains:
|
||||
|
||||
Key material (Y) [HASH_LEN bytes]
|
||||
Derivative key data [HASH_LEN bytes] (See 5.2 below)
|
||||
|
||||
The values of X and Y must be generated randomly.
|
||||
|
||||
If an OR sees a circuit created with CREATE_FAST, the OR is sure to be the
|
||||
first hop of a circuit. ORs SHOULD reject attempts to create streams with
|
||||
RELAY_BEGIN exiting the circuit at the first hop: letting Tor be used as a
|
||||
single hop proxy makes exit nodes a more attractive target for compromise.
|
||||
|
||||
5.2. Setting circuit keys
|
||||
|
||||
Once the handshake between the OP and an OR is completed, both can
|
||||
now calculate g^xy with ordinary DH. Before computing g^xy, both client
|
||||
and server MUST verify that the received g^x or g^y value is not degenerate;
|
||||
that is, it must be strictly greater than 1 and strictly less than p-1
|
||||
where p is the DH modulus. Implementations MUST NOT complete a handshake
|
||||
with degenerate keys. Implementations MUST NOT discard other "weak"
|
||||
g^x values.
|
||||
|
||||
(Discarding degenerate keys is critical for security; if bad keys
|
||||
are not discarded, an attacker can substitute the server's CREATED
|
||||
cell's g^y with 0 or 1, thus creating a known g^xy and impersonating
|
||||
the server. Discarding other keys may allow attackers to learn bits of
|
||||
the private key.)
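The degeneracy test above amounts to a single range check. A sketch with a hypothetical function name:

```python
def dh_public_value_ok(value: int, p: int) -> bool:
    """Accept g^x or g^y only if it is strictly between 1 and p-1."""
    return 1 < value < p - 1
```

For a toy modulus p = 23, the values 0, 1 and 22 are rejected, and every value from 2 to 21 is accepted. No other filtering is applied, in line with the rule against discarding other "weak" values.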
|
||||
|
||||
If CREATE or EXTEND is used to extend a circuit, the client and server
|
||||
base their key material on K0=g^xy, represented as a big-endian unsigned
|
||||
integer.
|
||||
|
||||
If CREATE_FAST is used, the client and server base their key material on
|
||||
K0=X|Y.
|
||||
|
||||
From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of
|
||||
derivative key data as
|
||||
K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
|
||||
|
||||
The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward
|
||||
digest Df; the next HASH_LEN form the backward digest Db; the next
|
||||
KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K
|
||||
are discarded.
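The derivation above can be sketched with SHA-1 from the standard library. The function name is hypothetical, and the caller is assumed to supply K0 already encoded as bytes (g^xy as a big-endian integer, or X|Y for CREATE_FAST).

```python
import hashlib

KEY_LEN = 16
HASH_LEN = 20


def derive_circuit_keys(k0: bytes) -> dict:
    """Expand K0 into KH, Df, Db, Kf and Kb using H(K0 | [counter])."""
    needed = KEY_LEN * 2 + HASH_LEN * 3  # 92 bytes
    k = b""
    counter = 0
    while len(k) < needed:
        k += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    # Five hashes give 100 bytes; the last 8 are the discarded excess.
    return {
        "KH": k[0:20],
        "Df": k[20:40],
        "Db": k[40:60],
        "Kf": k[60:76],
        "Kb": k[76:92],
    }
```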
|
||||
|
||||
KH is used in the handshake response to demonstrate knowledge of the
|
||||
computed shared key. Df is used to seed the integrity-checking hash
|
||||
for the stream of data going from the OP to the OR, and Db seeds the
|
||||
integrity-checking hash for the data stream from the OR to the OP. Kf
|
||||
is used to encrypt the stream of data going from the OP to the OR, and
|
||||
Kb is used to encrypt the stream of data going from the OR to the OP.
|
||||
|
||||
5.3. Creating circuits
|
||||
|
||||
When creating a circuit through the network, the circuit creator
|
||||
(OP) performs the following steps:
|
||||
|
||||
1. Choose an onion router as an exit node (R_N), such that the onion
|
||||
router's exit policy includes at least one pending stream that
|
||||
needs a circuit (if there are any).
|
||||
|
||||
2. Choose a chain of (N-1) onion routers
|
||||
(R_1...R_N-1) to constitute the path, such that no router
|
||||
appears in the path twice.
|
||||
|
||||
3. If not already connected to the first router in the chain,
|
||||
open a new connection to that router.
|
||||
|
||||
4. Choose a circID not already in use on the connection with the
|
||||
first router in the chain; send a CREATE cell along the
|
||||
connection, to be received by the first onion router.
|
||||
|
||||
5. Wait until a CREATED cell is received; finish the handshake
|
||||
and extract the forward key Kf_1 and the backward key Kb_1.
|
||||
|
||||
6. For each subsequent onion router R (R_2 through R_N), extend
|
||||
the circuit to R.
|
||||
|
||||
To extend the circuit by a single onion router R_M, the OP performs
|
||||
these steps:
|
||||
|
||||
1. Create an onion skin, encrypted to R_M's public onion key.
|
||||
|
||||
2. Send the onion skin in a relay EXTEND cell along
|
||||
the circuit (see section 5).
|
||||
|
||||
3. When a relay EXTENDED cell is received, verify KH, and
|
||||
calculate the shared keys. The circuit is now extended.
|
||||
|
||||
When an onion router receives an EXTEND relay cell, it sends a CREATE
|
||||
cell to the next onion router, with the enclosed onion skin as its
|
||||
payload. As special cases, if the extend cell includes a digest of
|
||||
all zeroes, or asks to extend back to the relay that sent the extend
|
||||
cell, the circuit will fail and be torn down. The initiating onion
|
||||
router chooses some circID not yet used on the connection between the
|
||||
two onion routers. (But see section 5.1. above, concerning choosing
|
||||
circIDs based on a numerical comparison of identity keys.)
|
||||
|
||||
When an onion router receives a CREATE cell, if it already has a
|
||||
circuit on the given connection with the given circID, it drops the
|
||||
cell. Otherwise, after receiving the CREATE cell, it completes the
|
||||
DH handshake, and replies with a CREATED cell. Upon receiving a
|
||||
CREATED cell, an onion router packs its payload into an EXTENDED relay
|
||||
cell (see section 5), and sends that cell up the circuit. Upon
|
||||
receiving the EXTENDED relay cell, the OP can retrieve g^y.
|
||||
|
||||
(As an optimization, OR implementations may delay processing onions
|
||||
until a break in traffic allows time to do so without harming
|
||||
network latency too greatly.)
|
||||
|
||||
5.3.1. Canonical connections
|
||||
|
||||
It is possible for an attacker to launch a man-in-the-middle attack
|
||||
against a connection by telling OR Alice to extend to OR Bob at some
|
||||
address X controlled by the attacker. The attacker cannot read the
|
||||
encrypted traffic, but the attacker is now in a position to count all
|
||||
bytes sent between Alice and Bob (assuming Alice was not already
|
||||
connected to Bob.)
|
||||
|
||||
To prevent this, when an OR gets an extend request, it SHOULD use an
|
||||
existing OR connection if the ID matches, and ANY of the following
|
||||
conditions hold:
|
||||
- The IP matches the requested IP.
|
||||
- The OR knows that the IP of the connection it's using is canonical
|
||||
because it was listed in the NETINFO cell.
|
||||
- The OR knows that the IP of the connection it's using is canonical
|
||||
because it was listed in the server descriptor.
|
||||
|
||||
[This is not implemented in Tor 0.2.0.23-rc.]
|
||||
|
||||
5.4. Tearing down circuits
|
||||
|
||||
Circuits are torn down when an unrecoverable error occurs along
|
||||
the circuit, or when all streams on a circuit are closed and the
|
||||
circuit's intended lifetime is over. Circuits may be torn down
|
||||
either completely or hop-by-hop.
|
||||
|
||||
To tear down a circuit completely, an OR or OP sends a DESTROY
|
||||
cell to the adjacent nodes on that circuit, using the appropriate
|
||||
direction's circID.
|
||||
|
||||
Upon receiving an outgoing DESTROY cell, an OR frees resources
|
||||
associated with the corresponding circuit. If it's not the end of
|
||||
the circuit, it sends a DESTROY cell for that circuit to the next OR
|
||||
in the circuit. If the node is the end of the circuit, then it tears
|
||||
down any associated edge connections (see section 6.1).
|
||||
|
||||
After a DESTROY cell has been processed, an OR ignores all data or
|
||||
destroy cells for the corresponding circuit.
|
||||
|
||||
To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell
|
||||
signaling a given OR (Stream ID zero). That OR sends a DESTROY
|
||||
cell to the next node in the circuit, and replies to the OP with a
|
||||
RELAY_TRUNCATED cell.
|
||||
|
||||
When an unrecoverable error occurs along one connection in a
|
||||
circuit, the nodes on either side of the connection should, if they
|
||||
are able, act as follows: the node closer to the OP should send a
|
||||
RELAY_TRUNCATED cell towards the OP; the node farther from the OP
|
||||
should send a DESTROY cell down the circuit.
|
||||
|
||||
The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet,
|
||||
describing why the circuit is being closed or truncated. When sending a
|
||||
TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell,
|
||||
the error code should be propagated. The origin of a circuit always sets
|
||||
this error code to 0, to avoid leaking its version.
|
||||
|
||||
The error codes are:
|
||||
0 -- NONE (No reason given.)
|
||||
1 -- PROTOCOL (Tor protocol violation.)
|
||||
2 -- INTERNAL (Internal error.)
|
||||
3 -- REQUESTED (A client sent a TRUNCATE command.)
|
||||
4 -- HIBERNATING (Not currently operating; trying to save bandwidth.)
|
||||
5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.)
|
||||
6 -- CONNECTFAILED (Unable to reach server.)
|
||||
7 -- OR_IDENTITY (Connected to server, but its OR identity was not
|
||||
as expected.)
|
||||
8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit
|
||||
died.)
|
||||
9 -- FINISHED (The circuit has expired for being dirty or old.)
|
||||
10 -- TIMEOUT (Circuit construction took too long)
|
||||
11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE)
|
||||
12 -- NOSUCHSERVICE (Request for unknown hidden service)
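The error codes above map naturally onto an enumeration. The sketch below also applies the rule that the origin of a circuit always sends 0. The class and function names are hypothetical.

```python
from enum import IntEnum


class DestroyReason(IntEnum):
    NONE = 0
    PROTOCOL = 1
    INTERNAL = 2
    REQUESTED = 3
    HIBERNATING = 4
    RESOURCELIMIT = 5
    CONNECTFAILED = 6
    OR_IDENTITY = 7
    OR_CONN_CLOSED = 8
    FINISHED = 9
    TIMEOUT = 10
    DESTROYED = 11
    NOSUCHSERVICE = 12


def reason_payload(reason: int, is_origin: bool) -> bytes:
    """Return the single-octet reason for a DESTROY or RELAY_TRUNCATED cell."""
    # The circuit origin hides its reason to avoid leaking its version.
    code = DestroyReason.NONE if is_origin else DestroyReason(reason)
    return bytes([code])
```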
|
||||
|
||||
5.5. Routing relay cells
|
||||
|
||||
When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's
|
||||
circID and determines whether it has a corresponding circuit along that
|
||||
connection. If not, the OR drops the cell.
|
||||
|
||||
Otherwise, if the OR is not at the OP edge of the circuit (that is,
|
||||
either an 'exit node' or a non-edge node), it de/encrypts the payload
|
||||
with the stream cipher, as follows:
|
||||
'Forward' relay cell (same direction as CREATE):
|
||||
Use Kf as key; decrypt.
|
||||
'Back' relay cell (opposite direction from CREATE):
|
||||
Use Kb as key; encrypt.
|
||||
Note that in counter mode, decrypt and encrypt are the same operation.
|
||||
|
||||
The OR then decides whether it recognizes the relay cell, by
|
||||
inspecting the payload as described in section 6.1 below. If the OR
|
||||
recognizes the cell, it processes the contents of the relay cell.
|
||||
Otherwise, it passes the decrypted relay cell along the circuit if
|
||||
the circuit continues. If the OR at the end of the circuit
|
||||
encounters an unrecognized relay cell, an error has occurred: the OR
|
||||
sends a DESTROY cell to tear down the circuit.
|
||||
|
||||
When a relay cell arrives at an OP, the OP decrypts the payload
|
||||
with the stream cipher as follows:
|
||||
OP receives data cell:
|
||||
For I=N...1,
|
||||
Decrypt with Kb_I. If the payload is recognized (see
|
||||
section 6.1), then stop and process the payload.
|
||||
|
||||
For more information, see section 6 below.
|
||||
|
||||
5.6. Handling relay_early cells
|
||||
|
||||
A RELAY_EARLY cell is designed to limit the length any circuit can reach.
|
||||
When an OR receives a RELAY_EARLY cell, and the next node in the circuit
|
||||
is speaking v2 of the link protocol or later, the OR relays the cell as a
|
||||
RELAY_EARLY cell. Otherwise, it relays it as a RELAY cell.
|
||||
|
||||
If a node ever receives more than 8 RELAY_EARLY cells on a given
|
||||
outbound circuit, it SHOULD close the circuit. (For historical reasons,
|
||||
we don't limit the number of inbound RELAY_EARLY cells; they should
|
||||
be harmless anyway because clients won't accept extend requests. See
|
||||
bug 1038.)
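The RELAY_EARLY limit above can be sketched as a per-circuit counter. This is a minimal illustration only; the class and method names are hypothetical and are not taken from the Tor source.

```python
MAX_RELAY_EARLY = 8  # limit stated in the text above


class OutboundCircuit:
    """Hypothetical per-circuit state; names are illustrative only."""

    def __init__(self):
        self.relay_early_seen = 0
        self.closed = False

    def on_relay_early(self):
        """Count one RELAY_EARLY cell; return False once the circuit is closed."""
        self.relay_early_seen += 1
        if self.relay_early_seen > MAX_RELAY_EARLY:
            # More than 8 RELAY_EARLY cells: the node SHOULD close the circuit.
            self.closed = True
        return not self.closed
```

Under this sketch the first eight cells are accepted and the ninth closes the circuit.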
|
||||
|
||||
When speaking v2 of the link protocol or later, clients MUST only send
|
||||
EXTEND cells inside RELAY_EARLY cells. Clients SHOULD send the first ~8
|
||||
RELAY cells that are not targeted at the first hop of any circuit as
|
||||
RELAY_EARLY cells too, in order to partially conceal the circuit length.
|
||||
|
||||
[In a future version of Tor, servers will reject any EXTEND cell not
|
||||
received in a RELAY_EARLY cell. See proposal 110.]
|
||||
|
||||
6. Application connections and stream management
|
||||
|
||||
6.1. Relay cells
|
||||
|
||||
Within a circuit, the OP and the exit node use the contents of
|
||||
RELAY packets to tunnel end-to-end commands and TCP connections
|
||||
("Streams") across circuits. End-to-end commands can be initiated
|
||||
by either edge; streams are initiated by the OP.
|
||||
|
||||
The payload of each unencrypted RELAY cell consists of:
|
||||
Relay command [1 byte]
|
||||
'Recognized' [2 bytes]
|
||||
StreamID [2 bytes]
|
||||
Digest [4 bytes]
|
||||
Length [2 bytes]
|
||||
Data [CELL_LEN-14 bytes]
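As a rough sketch, the layout above can be packed and unpacked with Python's struct module. The value CELL_LEN = 512 and network (big-endian) byte order are assumptions carried over from the rest of the specification; the function names are hypothetical.

```python
import struct

CELL_LEN = 512                       # assumed fixed cell size
MAX_DATA_LEN = CELL_LEN - 14         # size of the Data field, per the table above
# command (1), recognized (2), stream ID (2), digest (4), length (2)
RELAY_HEADER = struct.Struct(">BHH4sH")


def pack_relay_payload(command, stream_id, data, digest=b"\x00" * 4):
    """Build an unencrypted relay payload, padding Data with NUL bytes."""
    if len(data) > MAX_DATA_LEN:
        raise ValueError("relay data too long")
    header = RELAY_HEADER.pack(command, 0, stream_id, digest, len(data))
    return header + data + b"\x00" * (MAX_DATA_LEN - len(data))


def unpack_relay_payload(payload):
    """Split a relay payload into its fields, dropping the padding."""
    command, recognized, stream_id, digest, length = RELAY_HEADER.unpack_from(payload)
    if length > MAX_DATA_LEN:
        raise ValueError("length field exceeds Data size")
    start = RELAY_HEADER.size
    return command, recognized, stream_id, digest, payload[start:start + length]
```

The header occupies 11 bytes, so a payload built this way is always 11 + (CELL_LEN-14) bytes long.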
|
||||
|
||||
The relay commands are:
|
||||
1 -- RELAY_BEGIN [forward]
|
||||
2 -- RELAY_DATA [forward or backward]
|
||||
3 -- RELAY_END [forward or backward]
|
||||
4 -- RELAY_CONNECTED [backward]
|
||||
5 -- RELAY_SENDME [forward or backward] [sometimes control]
|
||||
6 -- RELAY_EXTEND [forward] [control]
|
||||
7 -- RELAY_EXTENDED [backward] [control]
|
||||
8 -- RELAY_TRUNCATE [forward] [control]
|
||||
9 -- RELAY_TRUNCATED [backward] [control]
|
||||
10 -- RELAY_DROP [forward or backward] [control]
|
||||
11 -- RELAY_RESOLVE [forward]
|
||||
12 -- RELAY_RESOLVED [backward]
|
||||
13 -- RELAY_BEGIN_DIR [forward]
|
||||
|
||||
32..40 -- Used for hidden services; see rend-spec.txt.
|
||||
|
||||
Commands labelled as "forward" must only be sent by the originator
|
||||
of the circuit. Commands labelled as "backward" must only be sent by
|
||||
other nodes in the circuit back to the originator. Commands marked
|
||||
as either can be sent either by the originator or other nodes.
|
||||
|
||||
The 'recognized' field in any unencrypted relay payload is always set
|
||||
to zero; the 'digest' field is computed as the first four bytes of
|
||||
the running digest of all the bytes that have been destined for
|
||||
this hop of the circuit or originated from this hop of the circuit,
|
||||
seeded from Df or Db respectively (obtained in section 5.2 above),
|
||||
and including this RELAY cell's entire payload (taken with the digest
|
||||
field set to zero).
|
||||
|
||||
When the 'recognized' field of a RELAY cell is zero, and the digest
|
||||
is correct, the cell is considered "recognized" for the purposes of
|
||||
decryption (see section 5.5 above).
|
||||
|
||||
(The digest does not include any bytes from relay cells that do
|
||||
not start or end at this hop of the circuit. That is, it does not
|
||||
include forwarded data. Therefore if 'recognized' is zero but the
|
||||
digest does not match, the running digest at that node should
|
||||
not be updated, and the cell should be forwarded on.)
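The digest rules above might be sketched as follows. The hash function (SHA-1) and the field offsets are assumptions taken from the rest of the specification and the payload layout; the function names are hypothetical. The receiver works on a copy of the running digest so that a mismatch leaves the real state untouched.

```python
import hashlib

DIGEST_START, DIGEST_END = 5, 9   # digest field offsets within the payload


def _zero_digest(payload):
    """Return the payload with its digest field set to zero."""
    return payload[:DIGEST_START] + b"\x00" * 4 + payload[DIGEST_END:]


def set_digest(running, payload):
    """Sender: fold the payload into the running digest and fill in the field."""
    zeroed = _zero_digest(payload)
    running.update(zeroed)
    return zeroed[:DIGEST_START] + running.digest()[:4] + zeroed[DIGEST_END:]


def check_recognized(running, payload):
    """Receiver: return (recognized, running digest to keep using)."""
    if payload[1:3] != b"\x00\x00":
        return False, running          # 'recognized' field is nonzero
    trial = running.copy()             # do not disturb the real state yet
    trial.update(_zero_digest(payload))
    if trial.digest()[:4] != payload[DIGEST_START:DIGEST_END]:
        return False, running          # mismatch: keep old digest, forward cell
    return True, trial
```

A caller would create each running digest once per hop and direction, for example hashlib.sha1(Df), and replace it with the second return value after every call.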
|
||||
|
||||
All RELAY cells pertaining to the same tunneled stream have the
|
||||
same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY
|
||||
cells that affect the entire circuit rather than a particular
|
||||
stream use a StreamID of zero -- they are marked in the table above
|
||||
as "[control]" style cells. (Sendme cells are marked as "sometimes
|
||||
control" because they can include a StreamID or not depending
|
||||
on their purpose -- see Section 7.)
|
||||
|
||||
The 'Length' field of a relay cell contains the number of bytes in
|
||||
the relay payload which contain real payload data. The remainder of
|
||||
the payload is padded with NUL bytes.
|
||||
|
||||
If the RELAY cell is recognized but the relay command is not
|
||||
understood, the cell must be dropped and ignored. Its contents
|
||||
still count with respect to the digests, though.
|
||||
|
||||
6.2. Opening streams and transferring data
|
||||
|
||||
To open a new anonymized TCP connection, the OP chooses an open
|
||||
circuit to an exit that may be able to connect to the destination
|
||||
address, selects an arbitrary StreamID not yet used on that circuit,
|
||||
and constructs a RELAY_BEGIN cell with a payload encoding the address
|
||||
and port of the destination host. The payload format is:
|
||||
|
||||
ADDRESS | ':' | PORT | [00]
|
||||
|
||||
where ADDRESS can be a DNS hostname, or an IPv4 address in
|
||||
dotted-quad format, or an IPv6 address surrounded by square brackets;
|
||||
and where PORT is a decimal integer between 1 and 65535, inclusive.
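A minimal sketch of encoding and decoding this payload follows. The function names are hypothetical, and bracketing a bare IPv6 literal on the caller's behalf is a convenience of this sketch rather than something the text requires.

```python
def encode_begin(address, port):
    """Build ADDRESS ':' PORT followed by a NUL byte."""
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    if ":" in address and not address.startswith("["):
        address = "[" + address + "]"      # IPv6 literals go in square brackets
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin(data):
    """Parse a RELAY_BEGIN payload into (address, port)."""
    text, nul, _ = data.partition(b"\x00")
    if not nul:
        raise ValueError("missing NUL terminator")
    # Split on the last colon so bracketed IPv6 addresses stay intact.
    address, colon, port_text = text.decode("ascii").rpartition(":")
    if not colon or not address or not port_text.isdigit():
        raise ValueError("malformed ADDRESS:PORT")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    return address, port
```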
|
||||
|
||||
[What is the [00] for? -NM]
|
||||
[It's so the payload is easy to parse out with string funcs -RD]
|
||||
|
||||
Upon receiving this cell, the exit node resolves the address as
|
||||
necessary, and opens a new TCP connection to the target port. If the
|
||||
address cannot be resolved, or a connection can't be established, the
|
||||
exit node replies with a RELAY_END cell. (See 6.4 below.)
|
||||
Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose
|
||||
payload is in one of the following formats:
|
||||
The IPv4 address to which the connection was made [4 octets]
|
||||
A number of seconds (TTL) for which the address may be cached [4 octets]
|
||||
or
|
||||
Four zero-valued octets [4 octets]
|
||||
An address type (6) [1 octet]
|
||||
The IPv6 address to which the connection was made [16 octets]
|
||||
A number of seconds (TTL) for which the address may be cached [4 octets]
|
||||
[XXXX No version of Tor currently generates the IPv6 format.]
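The two RELAY_CONNECTED formats could be told apart roughly as below. This sketch assumes the caller has already trimmed the payload to its Length field and that the TTL is in network (big-endian) byte order; parse_connected is a hypothetical name.

```python
import ipaddress
import struct


def parse_connected(data):
    """Return (address, ttl) from a RELAY_CONNECTED payload."""
    # IPv6 form: four zero octets, address type 6, 16-octet address, TTL.
    if len(data) >= 25 and data[:4] == b"\x00" * 4 and data[4] == 6:
        address = ipaddress.IPv6Address(data[5:21])
        (ttl,) = struct.unpack(">I", data[21:25])
        return address, ttl
    # IPv4 form: 4-octet address followed by the TTL.
    if len(data) >= 8:
        address = ipaddress.IPv4Address(data[:4])
        (ttl,) = struct.unpack(">I", data[4:8])
        return address, ttl
    raise ValueError("RELAY_CONNECTED payload too short")
```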
|
||||
|
||||
[Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later
|
||||
versions set the TTL to the last value seen from a DNS server, and expire
|
||||
their own cached entries after a fixed interval. This prevents certain
|
||||
attacks.]
|
||||
|
||||
The OP waits for a RELAY_CONNECTED cell before sending any data.
|
||||
Once a connection has been established, the OP and exit node
|
||||
package stream data in RELAY_DATA cells, and upon receiving such
|
||||
cells, echo their contents to the corresponding TCP stream.
|
||||
RELAY_DATA cells sent to unrecognized streams are dropped.
|
||||
|
||||
RELAY_DROP cells are long-range dummies; upon receiving such
|
||||
a cell, the OR or OP must drop it.
|
||||
|
||||
6.2.1. Opening a directory stream
|
||||
|
||||
If a Tor server is a directory server, it should respond to a
|
||||
RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a
|
||||
connection to its directory port. RELAY_BEGIN_DIR cells ignore exit
|
||||
policy, since the stream is local to the Tor process.
|
||||
|
||||
If the Tor server is not running a directory service, it should respond
|
||||
with a REASON_NOTDIRECTORY RELAY_END cell.
|
||||
|
||||
Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells,
|
||||
and servers MUST ignore the payload.
|
||||
|
||||
[RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients
|
||||
SHOULD NOT send it to routers running earlier versions of Tor.]
|
||||
|
||||
6.3. Closing streams
|
||||
|
||||
When an anonymized TCP connection is closed, or an edge node
|
||||
encounters an error on any stream, it sends a 'RELAY_END' cell along the
|
||||
circuit (if possible) and closes the TCP connection immediately. If
|
||||
an edge node receives a 'RELAY_END' cell for any stream, it closes
|
||||
the TCP connection completely, and sends nothing more along the
|
||||
circuit for that stream.
|
||||
|
||||
The payload of a RELAY_END cell begins with a single 'reason' byte to
|
||||
describe why the stream is closing, plus optional data (depending on
|
||||
the reason.) The values are:
|
||||
|
||||
1 -- REASON_MISC (catch-all for unlisted reasons)
|
||||
2 -- REASON_RESOLVEFAILED (couldn't look up hostname)
|
||||
3 -- REASON_CONNECTREFUSED (remote host refused connection) [*]
|
||||
4 -- REASON_EXITPOLICY (OR refuses to connect to host or port)
|
||||
5 -- REASON_DESTROY (Circuit is being destroyed)
|
||||
6 -- REASON_DONE (Anonymized TCP connection was closed)
|
||||
7 -- REASON_TIMEOUT (Connection timed out, or OR timed out
|
||||
while connecting)
|
||||
8 -- (unallocated) [**]
|
||||
9 -- REASON_HIBERNATING (OR is temporarily hibernating)
|
||||
10 -- REASON_INTERNAL (Internal error at the OR)
|
||||
11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request)
|
||||
12 -- REASON_CONNRESET (Connection was unexpectedly reset)
|
||||
13 -- REASON_TORPROTOCOL (Sent when closing connection because of
|
||||
Tor protocol violations.)
|
||||
14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a
|
||||
non-directory server.)
|
||||
|
||||
(With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address
|
||||
forms the optional data, along with a 4-byte TTL; no other reason
|
||||
currently has extra data.)
|
||||
|
||||
OPs and ORs MUST accept reasons not on the above list, since future
|
||||
versions of Tor may provide more fine-grained reasons.
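A sketch of decoding a RELAY_END payload under these rules is shown below. It accepts unknown reason codes, and reads the optional REASON_EXITPOLICY data by length (8 bytes for IPv4 plus TTL, 20 for IPv6 plus TTL). Big-endian TTL encoding and the name parse_end are assumptions of the sketch.

```python
import ipaddress
import struct

END_REASONS = {
    1: "REASON_MISC", 2: "REASON_RESOLVEFAILED", 3: "REASON_CONNECTREFUSED",
    4: "REASON_EXITPOLICY", 5: "REASON_DESTROY", 6: "REASON_DONE",
    7: "REASON_TIMEOUT", 9: "REASON_HIBERNATING", 10: "REASON_INTERNAL",
    11: "REASON_RESOURCELIMIT", 12: "REASON_CONNRESET",
    13: "REASON_TORPROTOCOL", 14: "REASON_NOTDIRECTORY",
}


def parse_end(data):
    """Return (code, name or None, address or None, ttl or None)."""
    if not data:
        raise ValueError("RELAY_END payload has no reason byte")
    code, extra = data[0], data[1:]
    name = END_REASONS.get(code)       # None for reasons not on the list
    address = ttl = None
    if code == 4 and len(extra) in (8, 20):
        # Only REASON_EXITPOLICY carries extra data: address, then TTL.
        address = ipaddress.ip_address(extra[:-4])
        (ttl,) = struct.unpack(">I", extra[-4:])
    return code, name, address, ttl
```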
|
||||
|
||||
Tors SHOULD NOT send any reason except REASON_MISC for a stream that they
|
||||
have originated.
|
||||
|
||||
[*] Older versions of Tor also send this reason when connections are
|
||||
reset.
|
||||
[**] Due to a bug in versions of Tor through 0095, error reason 8 must
|
||||
remain allocated until that version is obsolete.
|
||||
|
||||
--- [The rest of this section describes unimplemented functionality.]
|
||||
|
||||
Because TCP connections can be half-open, we follow an equivalent
|
||||
to TCP's FIN/FIN-ACK/ACK protocol to close streams.
|
||||
|
||||
An exit connection can have a TCP stream in one of three states:
|
||||
'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
|
||||
of modeling transitions, we treat 'CLOSED' as a fourth state,
|
||||
although connections in this state are not, in fact, tracked by the
|
||||
onion router.
|
||||
|
||||
A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
|
||||
the corresponding TCP connection, the edge node sends a 'RELAY_FIN'
|
||||
cell along the circuit and changes its state to 'DONE_PACKAGING'.
|
||||
Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to
|
||||
the corresponding TCP connection (e.g., by calling
|
||||
shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'.
|
||||
|
||||
When a stream already in 'DONE_DELIVERING' receives a 'FIN', it
|
||||
also sends a 'RELAY_FIN' along the circuit, and changes its state
|
||||
to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
|
||||
'RELAY_FIN' cell, it sends a 'FIN' and changes its state to
|
||||
'CLOSED'.
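The transitions described above can be written down as a small table. Since this part of the section describes unimplemented functionality, the sketch below is purely a model of the text; the event names "tcp_fin" and "relay_fin" are hypothetical labels.

```python
# (current state, event) -> (next state, action to take)
TRANSITIONS = {
    ("OPEN", "tcp_fin"): ("DONE_PACKAGING", "send RELAY_FIN"),
    ("OPEN", "relay_fin"): ("DONE_DELIVERING", "send FIN"),
    ("DONE_DELIVERING", "tcp_fin"): ("CLOSED", "send RELAY_FIN"),
    ("DONE_PACKAGING", "relay_fin"): ("CLOSED", "send FIN"),
}


def step(state, event):
    """Apply one event; raise KeyError for transitions the text does not define."""
    return TRANSITIONS[(state, event)]
```

Either order of the two events takes a stream from 'OPEN' to 'CLOSED' in two steps.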
|
||||
|
||||
If an edge node encounters an error on any stream, it sends a
|
||||
'RELAY_END' cell (if possible) and closes the stream immediately.
|
||||
|
||||
6.4. Remote hostname lookup
|
||||
|
||||
To find the address associated with a hostname, the OP sends a
|
||||
RELAY_RESOLVE cell containing the hostname to be resolved with a NUL
|
||||
terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE
|
||||
cell containing an in-addr.arpa address.) The OR replies with a
|
||||
RELAY_RESOLVED cell containing a status byte, and any number of
|
||||
answers. Each answer is of the form:
|
||||
Type (1 octet)
|
||||
Length (1 octet)
|
||||
Value (variable-width)
|
||||
TTL (4 octets)
|
||||
"Length" is the length of the Value field.
|
||||
"Type" is one of:
|
||||
0x00 -- Hostname
|
||||
0x04 -- IPv4 address
|
||||
0x06 -- IPv6 address
|
||||
0xF0 -- Error, transient
|
||||
0xF1 -- Error, nontransient
|
||||
|
||||
If any answer has a type of 'Error', then no other answer may be given.
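Parsing the answer list might look like the sketch below. It handles only the sequence of answers and leaves the status byte mentioned above to the caller; the big-endian TTL and the name parse_answers are assumptions.

```python
import struct

ERROR_TYPES = (0xF0, 0xF1)


def parse_answers(data):
    """Return a list of (type, value, ttl) tuples from concatenated answers."""
    answers = []
    pos = 0
    while pos < len(data):
        if len(data) - pos < 2:
            raise ValueError("truncated answer header")
        answer_type, length = data[pos], data[pos + 1]
        end = pos + 2 + length + 4          # header + Value + TTL
        if end > len(data):
            raise ValueError("truncated answer")
        value = data[pos + 2:pos + 2 + length]
        (ttl,) = struct.unpack(">I", data[end - 4:end])
        answers.append((answer_type, value, ttl))
        pos = end
    # An error answer must be the only answer.
    if len(answers) > 1 and any(t in ERROR_TYPES for t, _, _ in answers):
        raise ValueError("error answer mixed with other answers")
    return answers
```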
|
||||
|
||||
The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the
|
||||
corresponding RELAY_RESOLVED cell must use the same streamID. No stream
|
||||
is actually created by the OR when resolving the name.
|
||||
|
||||
7. Flow control
|
||||
|
||||
7.1. Link throttling
|
||||
|
||||
Each client or relay should do appropriate bandwidth throttling to
|
||||
keep its user happy.
|
||||
|
||||
Communicants rely on TCP's default flow control to push back when they
|
||||
stop reading.
|
||||
|
||||
The mainline Tor implementation uses token buckets (one for reads,
|
||||
one for writes) for the rate limiting.
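For readers unfamiliar with the technique, a generic token bucket is sketched below. It illustrates the idea only and is not the mainline implementation; the class name, parameters, and units (bytes and seconds) are assumptions.

```python
class TokenBucket:
    """Generic token bucket: 'rate' tokens per second, capped at 'burst'."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst              # start with a full bucket

    def refill(self, elapsed_seconds):
        """Add tokens for the elapsed time, never exceeding the burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate * elapsed_seconds)

    def consume(self, wanted):
        """Grant as many of the wanted bytes as the bucket currently allows."""
        granted = min(wanted, int(self.tokens))
        self.tokens -= granted
        return granted
```

One such bucket would limit reads and a second, independent one would limit writes.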
|
||||
|
||||
Since 0.2.0.x, Tor has let the user specify an additional pair of
|
||||
token buckets for "relayed" traffic, so people can deploy a Tor relay
|
||||
with strict rate limiting, but also use the same Tor as a client. To
|
||||
avoid partitioning concerns we combine both classes of traffic over a
|
||||
given OR connection, and keep track of the last time we read or wrote
|
||||
a high-priority (non-relayed) cell. If it's been less than N seconds
|
||||
(currently N=30), we give the whole connection high priority, else we
|
||||
give the whole connection low priority. We also give low priority
|
||||
to reads and writes for connections that are serving directory
|
||||
information. See proposal 111 for details.
|
||||
|
||||
7.2. Link padding
|
||||
|
||||
Link padding can be created by sending PADDING cells along the
|
||||
connection; relay cells of type "DROP" can be used for long-range
|
||||
padding.
|
||||
|
||||
Currently nodes are not required to do any sort of link padding or
|
||||
dummy traffic. Because strong attacks exist even with link padding,
|
||||
and because link padding greatly increases the bandwidth requirements
|
||||
for running a node, we plan to leave out link padding until this
|
||||
tradeoff is better understood.
|
||||
|
||||
7.3. Circuit-level flow control
|
||||
|
||||
To control a circuit's bandwidth usage, each OR keeps track of two
|
||||
'windows', consisting of how many RELAY_DATA cells it is allowed to
|
||||
originate (package for transmission), and how many RELAY_DATA cells
|
||||
it is willing to consume (receive for local streams). These limits
|
||||
do not apply to cells that the OR receives from one host and relays
|
||||
to another.
|
||||
|
||||
Each 'window' value is initially set to 1000 data cells
|
||||
in each direction (cells that are not data cells do not affect
|
||||
the window). When an OR is willing to deliver more cells, it sends a
|
||||
RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR
|
||||
receives a RELAY_SENDME cell with stream ID zero, it increments its
|
||||
packaging window.
|
||||
|
||||
Each of these cells increments the corresponding window by 100.
|
||||
|
||||
The OP behaves identically, except that it must track a packaging
|
||||
window and a delivery window for every OR in the circuit.
|
||||
|
||||
An OR or OP sends cells to increment its delivery window when the
|
||||
corresponding window value falls under some threshold (900).
|
||||
|
||||
If a packaging window reaches 0, the OR or OP stops reading from
|
||||
TCP connections for all streams on the corresponding circuit, and
|
||||
sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.
|
||||
[this stuff is badly worded; copy in the tor-design section -RD]
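The window bookkeeping in this section can be sketched as below for a single node. The class and method names are hypothetical, and reading "falls under some threshold (900)" as "is at or below 900" is an assumption of the sketch.

```python
class CircuitWindows:
    START = 1000       # initial window, in RELAY_DATA cells
    INCREMENT = 100    # added by each circuit-level RELAY_SENDME
    THRESHOLD = 900    # assumed to mean "window <= 900"

    def __init__(self):
        self.package = self.START    # cells we may still originate
        self.deliver = self.START    # cells we are still willing to consume

    def can_package(self):
        """False means: stop reading from this circuit's TCP connections."""
        return self.package > 0

    def on_data_packaged(self):
        if not self.can_package():
            raise RuntimeError("packaging window is empty")
        self.package -= 1

    def on_sendme(self):
        """A RELAY_SENDME with stream ID zero arrived."""
        self.package += self.INCREMENT

    def on_data_delivered(self):
        """Return True when a RELAY_SENDME should be sent for this circuit."""
        self.deliver -= 1
        if self.deliver <= self.THRESHOLD:
            self.deliver += self.INCREMENT
            return True
        return False
```

An OP would keep one such pair of windows for every OR in the circuit.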
|
||||
|
||||
7.4. Stream-level flow control
|
||||
|
||||
Edge nodes use RELAY_SENDME cells to implement end-to-end flow
|
||||
control for individual connections across circuits. Similarly to
|
||||
circuit-level flow control, edge nodes begin with a window of cells
|
||||
(500) per stream, and increment the window by a fixed value (50)
|
||||
upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
|
||||
cells when both a) the window is <= 450, and b) there are fewer than
|
||||
ten cell payloads remaining to be flushed at that edge.
|
||||
|
||||
A.1. Differences between spec and implementation
|
||||
|
||||
- The current specification requires all ORs to have IPv4 addresses, but
|
||||
allows servers to exit and resolve to IPv6 addresses, and to declare IPv6
|
||||
addresses in their exit policies. The current codebase has no IPv6
|
||||
support at all.
|
||||
|
@ -1,45 +0,0 @@
|
||||
$Id$
|
||||
|
||||
HOW TOR VERSION NUMBERS WORK
|
||||
|
||||
1. The Old Way
|
||||
|
||||
Before 0.1.0, versions were of the format:
|
||||
MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)?
|
||||
where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one
|
||||
of "pre" (for an alpha release), "rc" (for a release candidate), or
|
||||
"." for a release. As a special case, "a.b.c" was equivalent to
|
||||
"a.b.c.0". We compare the elements in order (major, minor, micro,
|
||||
status, patchlevel, cvs), with "cvs" preceding non-cvs.
|
||||
|
||||
We would start each development branch with a final version in mind:
|
||||
say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by
|
||||
(for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs",
|
||||
"0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release
|
||||
0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs",
|
||||
and any eventual bugfix release would be "0.0.8.1".
|
||||
|
||||
2. The New Way
|
||||
|
||||
After 0.1.0, versions are of the format:
|
||||
MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag)
|
||||
The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO,
|
||||
and PATCHLEVEL are numbers, with an absent number equivalent to 0.
|
||||
All versions should be distinguishable purely by those four
|
||||
numbers. The status tag is purely informational, and lets you know how
|
||||
stable we think the release is: "alpha" is pretty unstable; "rc" is a
|
||||
release candidate; and no tag at all means that we have a final
|
||||
release. If the tag ends with "-cvs" or "-dev", you're looking at a
|
||||
development snapshot that came after a given release. If we *do*
|
||||
encounter two versions that differ only by status tag, we compare them
|
||||
lexically.
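As an illustration of the new format, the sketch below parses a version string into a tuple that sorts by the four numbers first and by status tag second. The regular expression and the name parse_version are assumptions; treating an absent tag as the empty string (which sorts before any tag) is one reading of "lexically".

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-(.+))?$")


def parse_version(text):
    """Return (major, minor, micro, patchlevel, status_tag) for sorting."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a new-style version: %r" % text)
    major, minor, micro, patchlevel, tag = match.groups()
    # An absent patchlevel is equivalent to 0; an absent tag becomes "".
    return (int(major), int(minor), int(micro), int(patchlevel or 0), tag or "")
```

Ordinary tuple comparison then orders versions, for example parse_version("0.1.1.3-alpha") < parse_version("0.1.1.4-rc").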
|
||||
|
||||
Now, we start each development branch with (say) 0.1.1.1-alpha. The
|
||||
patchlevel increments consistently as the status tag changes, for
|
||||
example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc.
|
||||
Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7.
|
||||
|
||||
Between these releases, CVS is versioned with a -cvs tag: after
|
||||
0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with
|
||||
0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev"
|
||||
suffix instead of the "-cvs" suffix.
|