more fixes. i declare this the first draft.

svn:r3598
Roger Dingledine 2005-02-09 10:10:22 +00:00
parent aca8c362bf
commit e3266768f4


@ -1,16 +1,14 @@
\documentclass{llncs} \documentclass{llncs}
% XXXX NM: Fold ``bandwidth and usability'' into ``Tor and file-sharing'' --
% ``bandwidth and file-sharing''.
\usepackage{url} \usepackage{url}
\usepackage{amsmath} \usepackage{amsmath}
\usepackage{epsfig} \usepackage{epsfig}
\setlength{\textwidth}{6.1in} \setlength{\textwidth}{5.9in}
\setlength{\textheight}{8.5in} \setlength{\textheight}{8.4in}
\setlength{\topmargin}{1cm} \setlength{\topmargin}{.5cm}
\setlength{\oddsidemargin}{.5cm} \setlength{\oddsidemargin}{1cm}
\setlength{\evensidemargin}{.5cm} \setlength{\evensidemargin}{1cm}
\newenvironment{tightlist}{\begin{list}{$\bullet$}{ \newenvironment{tightlist}{\begin{list}{$\bullet$}{
\setlength{\itemsep}{0mm} \setlength{\itemsep}{0mm}
@ -122,7 +120,7 @@ giving an effective vector for physical or online attackers.
Tor provides these protections even when a portion of its Tor provides these protections even when a portion of its
infrastructure is compromised. infrastructure is compromised.
To connect to a remove server via Tor, the client software learns a signed To connect to a remote server via Tor, the client software learns a signed
list of Tor nodes from one of several central \emph{directory servers}, and list of Tor nodes from one of several central \emph{directory servers}, and
incrementally creates a private pathway or \emph{circuit} of encrypted incrementally creates a private pathway or \emph{circuit} of encrypted
connections through authenticated Tor nodes on the network, negotiating a connections through authenticated Tor nodes on the network, negotiating a
@ -373,10 +371,10 @@ eavesdropper can perform traffic analysis on the entire network.
%financial health as well as network security. %financial health as well as network security.
The Java The Java
Anon Proxy~\cite{web-mix} provides similar functionality to Tor but Anon Proxy~\cite{web-mix} provides similar functionality to Tor but
handles only web browsing rather than arbitrary TCP\@. handles only web browsing rather than all TCP\@.
%Some peer-to-peer file-sharing overlay networks such as %Some peer-to-peer file-sharing overlay networks such as
%Freenet~\cite{freenet} and Mute~\cite{mute} %Freenet~\cite{freenet} and Mute~\cite{mute}
Zero-Knowledge Systems' commercial Freedom Zero-Knowledge Systems' Freedom
network~\cite{freedom21-security} was even more flexible than Tor in network~\cite{freedom21-security} was even more flexible than Tor in
transporting arbitrary IP packets, and also supported transporting arbitrary IP packets, and also supported
pseudonymity in addition to anonymity; but it has pseudonymity in addition to anonymity; but it has
@ -387,7 +385,7 @@ more scalable peer-to-peer designs like Tarzan~\cite{tarzan:ccs02} and
MorphMix~\cite{morphmix:fc04} have been proposed in the literature, but MorphMix~\cite{morphmix:fc04} have been proposed in the literature, but
have not been fielded. These systems differ somewhat have not been fielded. These systems differ somewhat
in threat model and presumably practical resistance to threats. in threat model and presumably practical resistance to threats.
Note that MorphMix and Tor differ only in Note that MorphMix differs from Tor only in
node discovery and circuit setup; so Tor's architecture is flexible node discovery and circuit setup; so Tor's architecture is flexible
enough to contain a MorphMix experiment. enough to contain a MorphMix experiment.
We direct the interested reader We direct the interested reader
@ -461,7 +459,7 @@ attacks, because its network has fewer edges. JAP was born out of
the ISDN mix design~\cite{isdn-mixes}, where padding made sense because the ISDN mix design~\cite{isdn-mixes}, where padding made sense because
every user had a fixed bandwidth allocation and altering the timing every user had a fixed bandwidth allocation and altering the timing
pattern of packets could be immediately detected. But in its current context pattern of packets could be immediately detected. But in its current context
as a general Internet web anonymizer, adding sufficient padding to JAP as an Internet web anonymizer, adding sufficient padding to JAP
would probably be prohibitively expensive and ineffective against a would probably be prohibitively expensive and ineffective against a
minimally active attacker.\footnote{Even if JAP could minimally active attacker.\footnote{Even if JAP could
fund higher-capacity nodes indefinitely, our experience fund higher-capacity nodes indefinitely, our experience
@ -621,7 +619,7 @@ any anonymizing network: their intensive bandwidth requirement, and the
degree to which they are associated (correctly or not) with copyright degree to which they are associated (correctly or not) with copyright
infringement. infringement.
As noted above, high-bandwidth protocols can make the network unresponsive, High-bandwidth protocols can make the network unresponsive,
but tend to be somewhat self-correcting as lack of bandwidth drives away but tend to be somewhat self-correcting as lack of bandwidth drives away
users who need it. Issues of copyright violation, users who need it. Issues of copyright violation,
however, are more interesting. Typical exit node operators want to help however, are more interesting. Typical exit node operators want to help
@ -636,7 +634,7 @@ So when letters arrive, operators are likely to face
pressure to block file-sharing applications entirely, in order to avoid the pressure to block file-sharing applications entirely, in order to avoid the
hassle. hassle.
But blocking file-sharing is not easy: many popular But blocking file-sharing is not easy: popular
protocols have evolved to run on non-standard ports to protocols have evolved to run on non-standard ports to
get around other port-based bans. Thus, exit node operators who want to get around other port-based bans. Thus, exit node operators who want to
block file-sharing would have to find some way to integrate Tor with a block file-sharing would have to find some way to integrate Tor with a
@ -726,20 +724,20 @@ nodes, open proxies, and service abusers, these systems hope to make
ongoing abuse difficult. Although the system is imperfect, it works ongoing abuse difficult. Although the system is imperfect, it works
tolerably well for them in practice. tolerably well for them in practice.
But of course, we would prefer that legitimate anonymous users be able to Of course, we would prefer that legitimate anonymous users be able to
access abuse-prone services. One conceivable approach would be to require access abuse-prone services. One conceivable approach would require
would-be IRC users, for instance, to register accounts if they want to would-be IRC users, for instance, to register accounts if they want to
access the IRC network from Tor. In practice this would not access the IRC network from Tor. In practice this would not
significantly impede abuse if creating new accounts were easily automatable; significantly impede abuse if creating new accounts were easily automatable;
this is why services use IP blocking. To deter abuse, pseudonymous this is why services use IP blocking. To deter abuse, pseudonymous
identities need to require a significant switching cost in resources or human identities need to require a significant switching cost in resources or human
time. Some popular webmail applications time. Some popular webmail applications
impose cost with Reverse Turing Tests, but these may not be costly enough to impose cost with Reverse Turing Tests, but this step may not deter all
deter abusers. Freedom used blind signatures to limit abusers. Freedom used blind signatures to limit
the number of pseudonyms for each paying account, but Tor has neither the the number of pseudonyms for each paying account, but Tor has neither the
ability nor the desire to collect payment. ability nor the desire to collect payment.
We stress that as far as we can tell, most Tor uses so far are not We stress that as far as we can tell, most Tor uses are not
abusive. Most services have not complained, and others are actively abusive. Most services have not complained, and others are actively
working to find ways besides banning to cope with the abuse. For example, working to find ways besides banning to cope with the abuse. For example,
the Freenode IRC network had a problem with a coordinated group of the Freenode IRC network had a problem with a coordinated group of
@ -891,8 +889,8 @@ prevent individual machines within the enclave from running Tor
clients~\cite{or-jsac98,or-discex00}. clients~\cite{or-jsac98,or-discex00}.
Of course, Tor's default path length of Of course, Tor's default path length of
three is insufficient for these enclaves, since the entry and/or exit three is insufficient for these enclaves, since the entry or exit
themselves are sensitive. Tor thus increments the path length by one themselves are sensitive. Tor thus increments path length by one
for each sensitive endpoint in the circuit. for each sensitive endpoint in the circuit.
Enclaves also help to protect against end-to-end attacks, since it's Enclaves also help to protect against end-to-end attacks, since it's
possible that traffic coming from the node has simply been relayed from possible that traffic coming from the node has simply been relayed from
@ -1208,49 +1206,47 @@ further study.
\subsection{Trust and discovery} \subsection{Trust and discovery}
\label{subsec:trust-and-discovery} \label{subsec:trust-and-discovery}
The published Tor design adopted a deliberately simplistic design for The published Tor design uses a deliberately simplistic design for
authorizing new nodes and informing clients about Tor nodes and their status. authorizing new nodes and informing clients about Tor nodes and their status.
In preliminary Tor designs, all nodes periodically uploaded a All nodes periodically upload a signed description
signed description
of their locations, keys, and capabilities to each of several well-known {\it of their locations, keys, and capabilities to each of several well-known {\it
directory servers}. These directory servers constructed a signed summary directory servers}. These directory servers construct a signed summary
of all known Tor nodes (a ``directory''), and a signed statement of which of all known Tor nodes (a ``directory''), and a signed statement of which
nodes they nodes they
believed to be operational at any given time (a ``network status''). Clients believe to be operational then (a ``network status''). Clients
periodically downloaded a directory to learn the latest nodes and periodically download a directory to learn the latest nodes and
keys, and more frequently downloaded a network status to learn which nodes were keys, and more frequently download a network status to learn which nodes are
likely to be running. Tor nodes also operate as directory caches, to likely to be running. Tor nodes also operate as directory caches, to
lighten the bandwidth on the authoritative directory servers. lighten the bandwidth on the directory servers.
In order to prevent Sybil attacks (wherein an adversary signs up many To prevent Sybil attacks (wherein an adversary signs up many
purportedly independent nodes to increase her chances of observing purportedly independent nodes to increase her network view),
a stream as it enters and leaves the network), the early Tor directory design this design
required the operators of the authoritative directory servers to manually requires the directory server operators to manually
approve new nodes. Unapproved nodes were included in the directory, approve new nodes. Unapproved nodes are included in the directory,
but clients but clients
did not use them at the start or end of their circuits. In practice, do not use them at the start or end of their circuits. In practice,
directory administrators performed little actual verification, and tended to directory administrators perform little actual verification, and tend to
approve any Tor node whose operator could compose a coherent email. approve any Tor node whose operator can compose a coherent email.
This procedure This procedure
may have prevented trivial automated Sybil attacks, but would do little may prevent trivial automated Sybil attacks, but will do little
against a clever and determined attacker. against a clever and determined attacker.
There are a number of flaws in this system that need to be addressed as we There are a number of flaws in this system that need to be addressed as we
move forward. They include: move forward. First,
\begin{tightlist} each directory server represents an independent point of failure: any
\item Each directory server represents an independent point of failure; if compromised directory server could start recommending only compromised
any one were compromised, it could immediately compromise all of its users nodes.
by recommending only compromised nodes. Second, as more nodes join the network, %the more unreasonable it
\item The more nodes join the network, the more unreasonable it %becomes to expect clients to know about them all.
becomes to expect clients to know about them all. Directories directories
become infeasibly large, and downloading the list of nodes becomes become infeasibly large, and downloading the list of nodes becomes
burdensome. burdensome.
\item The validation scheme may do as much harm as it does good. It is not Third, the validation scheme may do as much harm as it does good. It not
only incapable of preventing clever attackers from mounting Sybil attacks, only can't prevent clever attackers from mounting Sybil attacks,
but may deter node operators from joining the network. (For instance, if but it may deter node operators from joining the network, if
they expect the validation process to be difficult, or if they do not share they expect the validation process to be difficult, or they do not share
any languages in common with the directory server operators.) any languages in common with the directory server operators.
\end{tightlist}
We could try to move the system in several directions, depending on our We could try to move the system in several directions, depending on our
choice of threat model and requirements. If we did not need to increase choice of threat model and requirements. If we did not need to increase
@ -1261,18 +1257,17 @@ But, we can only do that if can simultaneously make node capacity
scale much more than we anticipate to be feasible soon, and if we can find scale much more than we anticipate to be feasible soon, and if we can find
entities willing to run such nodes, an equally daunting prospect. entities willing to run such nodes, an equally daunting prospect.
In order to address the first two issues, it seems wise to move to a system In order to address the first two issues, it seems wise to move to a system
including a number of semi-trusted directory servers, no one of which can including a number of semi-trusted directory servers, no one of which can
compromise a user on its own. Ultimately, of course, we cannot escape the compromise a user on its own. Ultimately, of course, we cannot escape the
problem of a first introducer: since most users will run Tor in whatever problem of a first introducer: since most users will run Tor in whatever
configuration the software ships with, the Tor distribution itself will configuration the software ships with, the Tor distribution itself will
remain a potential single point of failure so long as it includes the seed remain a single point of failure so long as it includes the seed
keys for directory servers, a list of directory servers, or any other means keys for directory servers, a list of directory servers, or any other means
to learn which nodes are on the network. But omitting this information to learn which nodes are on the network. But omitting this information
from the Tor distribution would only delegate the trust problem to the from the Tor distribution would only delegate the trust problem to each
individual users, most of whom are presumably less informed about how to make individual user. %, most of whom are presumably less informed about how to make
trust decisions than the Tor developers. %trust decisions than the Tor developers.
%Network discovery, sybil, node admission, scaling. It seems that the code %Network discovery, sybil, node admission, scaling. It seems that the code
%will ship with something and that's our trust root. We could try to get %will ship with something and that's our trust root. We could try to get
@ -1310,20 +1305,19 @@ for views of a node's latency and/or bandwidth to vary wildly between
observers. Further, it is unclear whether total bandwidth is really observers. Further, it is unclear whether total bandwidth is really
the right measure; perhaps clients should instead be considering nodes the right measure; perhaps clients should instead be considering nodes
based on unused bandwidth or observed throughput. based on unused bandwidth or observed throughput.
% XXXX say more here?
%How to measure performance without letting people selectively deny service %How to measure performance without letting people selectively deny service
%by distinguishing pings. Heck, just how to measure performance at all. In %by distinguishing pings. Heck, just how to measure performance at all. In
%practice people have funny firewalls that don't match up to their exit %practice people have funny firewalls that don't match up to their exit
%policies and Tor doesn't deal. %policies and Tor doesn't deal.
%
%Network investigation: Is all this bandwidth publishing thing a good idea? %Network investigation: Is all this bandwidth publishing thing a good idea?
%How can we collect stats better? Note weasel's smokeping, at %How can we collect stats better? Note weasel's smokeping, at
%http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor %http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor
%which probably gives george and steven enough info to break tor? %which probably gives george and steven enough info to break tor?
%
Even if we can collect and use this network information effectively, we need And even if we can collect and use this network information effectively,
to make sure that it is not more useful to attackers than to us. While it we must ensure
that it is not more useful to attackers than to us. While it
seems plausible that bandwidth data alone is not enough to reveal seems plausible that bandwidth data alone is not enough to reveal
sender-recipient connections under most circumstances, it could certainly sender-recipient connections under most circumstances, it could certainly
reveal the path taken by large traffic flows under low-usage circumstances. reveal the path taken by large traffic flows under low-usage circumstances.
@ -1331,24 +1325,27 @@ reveal the path taken by large traffic flows under low-usage circumstances.
\subsection{Non-clique topologies} \subsection{Non-clique topologies}
Tor's comparatively weak threat model may allow easier scaling than Tor's comparatively weak threat model may allow easier scaling than
other mix-net other
designs. High-latency mix networks need to avoid partitioning attacks, where designs. High-latency mix networks need to avoid partitioning attacks, where
network splits let an attacker distinguish users in different partitions. network splits let an attacker distinguish users in different partitions.
Since Tor assumes the adversary cannot cheaply observe nodes at will, Since Tor assumes the adversary cannot cheaply observe nodes at will,
a network split may not decrease protection much. a network split may not decrease protection much.
Thus, one option when the scale of a Tor network Thus, one option when the scale of a Tor network
exceeds some size is simply to split it. Nodes could be allocated into exceeds some size is simply to split it. Nodes could be allocated into
partitions while hampering collobrating hostile nodes from taking over partitions while hampering collaborating hostile nodes from taking over
a single partition~\cite{casc-rep}. a single partition~\cite{casc-rep}.
Clients could switch between Clients could switch between
networks, even on a per-circuit basis. Future analysis may uncover networks, even on a per-circuit basis.
other dangers beyond those affecting mix-nets. %Future analysis may uncover
%other dangers beyond those affecting mix-nets.
More conservatively, we can try to scale a single Tor network. Potential More conservatively, we can try to scale a single Tor network. Likely
problems with adding more servers to a single Tor network include an problems with adding more servers to a single Tor network include an
explosion in the number of sockets needed on each server as more servers explosion in the number of sockets needed on each server as more servers
join, and an increase in coordination overhead as keeping everyone's view of join, and increased coordination overhead to keep each user's view of
the network consistent becomes increasingly difficult. the network consistent. As we grow, we will also have more instances of
servers that can't reach each other simply due to Internet topology or
routing problems.
%include restricting the number of sockets and the amount of bandwidth %include restricting the number of sockets and the amount of bandwidth
%used by each node. The number of sockets is determined by the network's %used by each node. The number of sockets is determined by the network's
@ -1369,9 +1366,7 @@ extend to Tor, which has a weaker threat model but higher performance
requirements: instead of analyzing the requirements: instead of analyzing the
probability of an attacker's viewing whole paths, we will need to examine the probability of an attacker's viewing whole paths, we will need to examine the
attacker's likelihood of compromising the endpoints. attacker's likelihood of compromising the endpoints.
%
% Nick edits these next 2 grafs.
Tor may not need an expander graph per se: it Tor may not need an expander graph per se: it
may be enough to have a single subnet that is highly connected, like may be enough to have a single subnet that is highly connected, like
an internet backbone. % As an an internet backbone. % As an
@ -1382,22 +1377,22 @@ an internet backbone. % As an
%center and anyone out of the center that they want to. Then the %center and anyone out of the center that they want to. Then the
%network easily scales to c. 2500 nodes with commensurate increase in %network easily scales to c. 2500 nodes with commensurate increase in
%bandwidth. %bandwidth.
There are many open questions: how to distribute directory information There are many open questions: how to distribute connectivity information
(presumably information about the center nodes could (presumably nodes will learn about the center nodes
be given to any new nodes with their codebase), whether center nodes when they download Tor), whether center nodes
will need to function as a `backbone', and so one. As above, will need to function as a `backbone', and so on. As above,
this could create problems for the expected anonymity for a mix-net, this could create problems for the expected anonymity for a mix-net,
but for a low-latency network where anonymity derives largely from but for a low-latency network where anonymity derives largely from
the edges, it may be feasible. the edges, it may be feasible.
In a sense, Tor already has a non-clique topology. %In a sense, Tor already has a non-clique topology.
Individuals can set up and run Tor nodes without informing the %Individuals can set up and run Tor nodes without informing the
directory servers. This allows groups to run a %directory servers. This allows groups to run a
local Tor network of private nodes that connects to the public Tor %local Tor network of private nodes that connects to the public Tor
network. This network is hidden behind the Tor network, and its %network. This network is hidden behind the Tor network, and its
only visible connection to Tor is at those points where it connects. %only visible connection to Tor is at those points where it connects.
As far as the public network, or anyone observing it, is concerned, %As far as the public network, or anyone observing it, is concerned,
they are running clients. %they are running clients.
\section{The Future} \section{The Future}
\label{sec:conclusion} \label{sec:conclusion}