more patches on sec2 and sec3; rewrite threat model

svn:r712
2024-11-20 02:09:24 +01:00 · 2003-11-02 06:14:59 +00:00 · 2003-11-02 06:14:59 +00:00 · fddda9a797
commit fddda9a797
parent b0c6a5ea2e
2 changed files with 130 additions and 269 deletions
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,6 +1,10 @@
-mutiny: if none of the ports is defined maybe it shouldn't start.
+mutiny suggests: if none of the ports is defined maybe it shouldn't start.
 aaron got a crash in tor_timegm in tzset on os x, with -l warn but not with -l debug.
 Oct 25 04:29:17.017 [warn] directory_initiate_command(): No running dirservers known. This is really bad.
 rename ACI to CircID
 rotate tls-level connections -- make new ones, expire old ones.
 dirserver shouldn't put you in running-routers list if you haven't
  uploading a descriptor recently
 Legend:
 SPEC!!  - Not specified
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -39,7 +39,7 @@
 %  \pdfpageheight=\the\paperheight
 %\fi
-\title{Tor: Design of a Second-Generation Onion Router}
+\title{Tor: The Second-Generation Onion Router}
 %\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
 %Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
@@ -308,22 +308,20 @@ Concentrating the traffic to a single point increases the anonymity set
 analysis easier: an adversary need only eavesdrop on the proxy to observe
 the entire system.
-More complex are distributed-trust, circuit-based anonymizing systems.  In
+More complex are distributed-trust, circuit-based anonymizing systems.
-these designs, a user establishes one or more medium-term bidirectional
+In these designs, a user establishes one or more medium-term bidirectional
-end-to-end tunnels to exit servers, and uses those tunnels to deliver
+end-to-end circuits, and tunnels TCP streams in fixed-size cells.
-low-latency packets to and from one or more destinations per
+Establishing circuits is expensive and typically requires public-key
-tunnel. %XXX reword
+cryptography, whereas relaying cells is comparatively inexpensive.
-Establishing tunnels is expensive and typically
+Because a circuit crosses several servers, no single server can link a
-requires public-key cryptography, whereas relaying packets along a tunnel is
+user to her communication partners.
 comparatively inexpensive.  Because a tunnel crosses several servers, no
 single server can link a user to her communication partners.
-In some distributed-trust systems, such as the Java Anon Proxy (also known
+The Java Anon Proxy (also known
-as JAP or Web MIXes), users build their tunnels along a fixed shared route
+as JAP or Web MIXes) uses fixed shared routes known as
-or \emph{cascade}.  As with a single-hop proxy, this approach aggregates
+\emph{cascades}.  As with a single-hop proxy, this approach aggregates
 users into larger anonymity sets, but again an attacker only needs to
 observe both ends of the cascade to bridge all the system's traffic.
-The Java Anon Proxy's design seeks to prevent this by padding
+The Java Anon Proxy's design provides protection by padding
 between end users and the head of the cascade \cite{web-mix}. However, the
 current implementation does no padding and thus remains vulnerable
 to both active and passive bridging.
@@ -350,10 +348,10 @@ from the data stream.
 Hordes \cite{hordes-jcs} is based on Crowds but also uses multicast
 responses to hide the initiator. Herbivore \cite{herbivore} and P5
-\cite{p5} go even further, requiring broadcast.  Each uses broadcast
+\cite{p5} go even further, requiring broadcast. They make anonymity
-in different ways, and trade-offs are made to make broadcast more
+and efficiency tradeoffs to make broadcast more practical.
-practical. Both Herbivore and P5 are designed primarily for communication
+These systems are designed primarily for communication between peers,
-between peers, although Herbivore permits external connections by
+although Herbivore users can make external connections by
 requesting a peer to serve as a proxy.  Allowing easy connections to
 nonparticipating responders or recipients is important for usability,
 for example so users can visit nonparticipating Web sites or exchange
@@ -391,273 +389,132 @@ Eternity and Free Haven.
 \SubSection{Goals}
 Like other low-latency anonymity designs, Tor seeks to frustrate
 attackers from linking communication partners, or from linking
-multiple communications to or from a single point.  Within this
+multiple communications to or from a single user.  Within this
 main goal, however, several design considerations have directed
 Tor's evolution.
-\begin{tightlist}
+\textbf{Deployability:} The design must be one which can be implemented,
-\item[Deployability:] The design must be one which can be implemented,
+deployed, and used in the real world.  This requirement precludes designs
-  deployed, and used in the real world.  This requirement precludes designs
+that are expensive to run (for example, by requiring more bandwidth
-  that are expensive to run (for example, by requiring more bandwidth than
+than volunteers are willing to provide); designs that place a heavy
-  volunteers are willing to provide); designs that place a heavy liability
+liability burden on operators (for example, by allowing attackers to
-  burden on operators (for example, by allowing attackers to implicate onion
+implicate onion routers in illegal activities); and designs that are
-  routers in illegal activities); and designs that are difficult or expensive
+difficult or expensive to implement (for example, by requiring kernel
-  to implement (for example, by requiring kernel patches, or separate proxies
+patches, or separate proxies for every protocol).  This requirement also
-  for every protocol).  This requirement also precludes systems in which
+precludes systems in which users who do not benefit from anonymity are
-  users who do not benefit from anonymity are required to run special
+required to run special software in order to communicate with anonymous
-  software in order to communicate with anonymous parties.
+parties. (We do not meet this goal for the current rendezvous design,
-%     Our rendezvous points require clients to use our software to get to
+however; see Section~\ref{sec:rendezvous}.)
-%     the location-hidden servers.
+
-%     Or at least, they require somebody near the client-side running our
+\textbf{Usability:} A hard-to-use system has fewer users---and because
-%     software. We haven't worked out the details of keeping it transparent
+anonymity systems hide users among users, a system with fewer users
-%     for Alice if she's using some other http proxy somewhere. I guess the
+provides less anonymity.  Usability is not only a convenience for Tor:
-%     external http proxy should route through a Tor client, which automatically
+it is a security requirement \cite{econymics,back01}. Tor should not
-%     translates the foo.onion address? -RD
+require modifying applications; should not introduce prohibitive delays;
-%
+and should require the user to make as few configuration decisions
-%  1. Such clients do benefit from anonymity: they can reach the server.
+as possible.
-%  Recall that our goal for location hidden servers is to continue to
+
-%  provide service to priviliged clients when a DoS is happening or
+\textbf{Flexibility:} The protocol must be flexible and well-specified,
-%  to provide access to a location sensitive service. I see no contradiction.
+so that it can serve as a test-bed for future research in low-latency
-%  2. A good idiot check is whether what we require people to download
+anonymity systems.  Many of the open problems in low-latency anonymity
-%  and use is more extreme than downloading the anonymizer toolbar or
+networks, such as generating dummy traffic or preventing Sybil attacks
-%  privacy manager. I don't think so, though I'm not claiming we've already
+\cite{sybil}, may be solvable independently from the issues solved by
-%  got the installation and running of a client down to that simplicity
+Tor. Hopefully future systems will not need to reinvent Tor's design
-%  at this time. -PS
+decisions.  (But note that while a flexible design benefits researchers,
-\item[Usability:] A hard-to-use system has fewer users---and because
+there is a danger that differing choices of extensions will make users
-  anonymity systems hide users among users, a system with fewer users
+distinguishable. Experiments should be run on a separate network.)
-  provides less anonymity.  Usability is not only a convenience for Tor:
+
-  it is a security requirement \cite{econymics,back01}. Tor
+\textbf{Conservative design:} The protocol's design and security
-  should work with most of a user's unmodified applications; shouldn't
+parameters must be conservative. Additional features impose implementation
-  introduce prohibitive delays; and should require the user to make as few
+and complexity costs; adding unproven techniques to the design threatens
-  configuration decisions as possible.
+deployability, readability, and ease of security analysis. Tor aims to
-\item[Flexibility:] The protocol must be flexible and
+deploy a simple and stable system that integrates the best well-understood
-  well-specified, so that it can serve as a test-bed for future research in
+approaches to protecting anonymity.
  low-latency anonymity systems.  Many of the open problems in low-latency
  anonymity networks (such as generating dummy traffic, or preventing
  pseudospoofing attacks) may be solvable independently from the issues
  solved by Tor; it would be beneficial if future systems were not forced to
  reinvent Tor's design decisions.  (But note that while a flexible design
  benefits researchers, there is a danger that differing choices of
  extensions will render users distinguishable.  Thus, experiments
  on extensions should be limited and should not significantly affect
  the distinguishability of ordinary users.
  % To run an experiment researchers must file an
  % anonymity impact statement -PS
  of implementations should
  not permit different protocol extensions to coexist in a single deployed
  network.)
 \item[Conservative design:] The protocol's design and security parameters
  must be conservative.  Because additional features impose implementation
  and complexity costs, Tor should include as few speculative features as
  possible.  (We do not oppose speculative designs in general; however, it is
  our goal with Tor to embody a solution to the problems in low-latency
  anonymity that we can solve today before we plunge into the problems of
  tomorrow.)
  % This last bit sounds completely cheesy.  Somebody should tone it down. -NM 
 \end{tightlist}
 \SubSection{Non-goals}
 \label{subsec:non-goals}
 In favoring conservative, deployable designs, we have explicitly deferred
-a number of goals. Many of these goals are desirable in anonymity systems,
+a number of goals, either because they are solved elsewhere, or because
-but we choose to defer them either because they are solved elsewhere,
+they are an open research question.
 or because they present an area of active research lacking a generally
 accepted solution.
-\begin{tightlist}
+\textbf{Not Peer-to-peer:} Tarzan and MorphMix aim to scale to completely
-\item[Not Peer-to-peer:] Tarzan and MorphMix aim to
+decentralized peer-to-peer environments with thousands of short-lived
-  scale to completely decentralized peer-to-peer environments with thousands
+servers, many of which may be controlled by an adversary.  This approach
-  of short-lived servers, many of which may be controlled by an adversary.
+is appealing, but still has many open problems.
-  Because of the many open problems in this approach, Tor uses a more
+
-  conservative design.
+\textbf{Not secure against end-to-end attacks:} Tor does not claim
-\item[Not secure against end-to-end attacks:] Tor does not claim to provide a
+to provide a definitive solution to end-to-end timing or intersection
-  definitive solution to end-to-end timing or intersection attacks. Some
+attacks. Some approaches, such as running an onion router, may help;
-  approaches, such as running an onion router, may help; see
+see Section~\ref{sec:analysis} for more discussion.
-  Section~\ref{sec:analysis} for more discussion.
+
-\item[No protocol normalization:] Tor does not provide \emph{protocol
+\textbf{No protocol normalization:} Tor does not provide \emph{protocol
-  normalization} like Privoxy or the Anonymizer.  In order to make clients
+normalization} like Privoxy or the Anonymizer. For complex and variable
-  indistinguishable when they use complex and variable protocols such as HTTP,
+protocols such as HTTP, Tor must be layered with a filtering proxy such
-  Tor must be layered with a filtering proxy such as Privoxy to hide
+as Privoxy to hide differences between clients, and expunge protocol
-  differences between clients, expunge protocol features that leak identity,
+features that leak identity. Similarly, Tor does not currently integrate
-  and so on.  Similarly, Tor does not currently integrate tunneling for
+tunneling for non-stream-based protocols like UDP; this too must be
-  non-stream-based protocols like UDP; this too must be provided by
+provided by an external service.
  an external service.
 % Actually, tunneling udp over tcp is probably horrible for some apps.
 % Should this get its own non-goal bulletpoint? The motivation for
-% non-goal-ness would be burden on clients / portability.
+% non-goal-ness would be burden on clients / portability. -RD
-\item[Not steganographic:] Tor does not try to conceal which users are
+% No, leave it as is. -RD
-  sending or receiving communications; it only tries to conceal whom they are
+
-  communicating with.
+\textbf{Not steganographic:} Tor does not try to conceal which users are
-\end{tightlist}
+sending or receiving communications; it only tries to conceal with whom
 they communicate.
 \SubSection{Threat Model}
 \label{subsec:threat-model}
 A global passive adversary is the most commonly assumed threat when
-analyzing theoretical anonymity designs. But like all practical low-latency
+analyzing theoretical anonymity designs. But like all practical
-systems, Tor is not secure against this adversary.  Instead, we assume an
+low-latency systems, Tor does not protect against such a strong
-adversary that is weaker than global with respect to distribution, but that
+adversary. Instead, we expect an adversary who can observe some fraction
-is not merely passive.  Our threat model expands on that from
+of network traffic; who can generate, modify, delete, or delay traffic
-\cite{or-pet00}.
+on the network; who can operate onion routers of its own; and who can
 compromise some fraction of the onion routers on the network.
-%%%% This is really keen analytical stuff, but it isn't our threat model:
+%Large adversaries will be able to compromise a considerable fraction
-%%%% we just go ahead and assume a fraction of hostile nodes for
+%of the network. (In some circumstances---for example, if the Tor
-%%%% convenience. -NM
+%network is running on a hardened network where all operators have
-%
+%had background checks---the number of compromised nodes could be quite
-%% The basic adversary components we consider are:
+%small.) Compromised nodes can arbitrarily manipulate the connections that
-%% \begin{tightlist}
+%pass through them, as well as creating new connections that pass through
-%% \item[Observer:] can observe a connection (e.g., a sniffer on an
+%themselves.  They can observe traffic, and record it for later analysis.
 %%   Internet router), but cannot initiate connections. Observations may
 %%   include timing and/or volume of packets as well as appearance of
 %%   individual packets (including headers and content).
 %% \item[Disrupter:] can delay (indefinitely) or corrupt traffic on a
 %%   link. Can change all those things that an observer can observe up to
 %%   the limits of computational ability (e.g., cannot forge signatures
 %%   unless a key is compromised).
 %% \item[Hostile initiator:] can initiate (or destroy) connections with
 %%   specific routes as well as vary the timing and content of traffic
 %%   on the connections it creates. A special case of the disrupter with
 %%   additional abilities appropriate to its role in forming connections.
 %% \item[Hostile responder:] can vary the traffic on the connections made
 %%   to it including refusing them entirely, intentionally modifying what
 %%   it sends and at what rate, and selectively closing them. Also a
 %%   special case of the disrupter.
 %% \item[Key breaker:] can break the key used to encrypt connection
 %%   initiation requests sent to a Tor-node.
 %% % Er, there are no long-term private decryption keys. They have
 %% % long-term private signing keys, and medium-term onion (decryption)
 %% % keys. Plus short-term link keys. Should we lump them together or
 %% % separate them out? -RD
 %% %
 %% %  Hmmm, I was talking about the keys used to encrypt the onion skin
 %% %  that contains the public DH key from the initiator. Is that what you
 %% %  mean by medium-term onion key? (``Onion key'' used to mean the
 %% %  session keys distributed in the onion, back when there were onions.)
 %% %  Also, why are link keys short-term? By link keys I assume you mean
 %% %  keys that neighbor nodes use to superencrypt all the stuff they send
 %% %  to each other on a link.  Did you mean the session keys? I had been
 %% %  calling session keys short-term and everything else long-term. I
 %% %  know I was being sloppy. (I _have_ written papers formalizing
 %% %  concepts of relative freshness.) But, there's some questions lurking
 %% %  here. First up, I don't see why the onion-skin encryption key should
 %% %  be any shorter term than the signature key in terms of threat
 %% %  resistance. I understand that how we update onion-skin encryption
 %% %  keys makes them depend on the signature keys. But, this is not the
 %% %  basis on which we should be deciding about key rotation. Another
 %% %  question is whether we want to bother with someone who breaks a
 %% %  signature key as a particular adversary. He should be able to do
 %% %  nearly the same as a compromised tor-node, although they're not the
 %% %  same. I reworded above, I'm thinking we should leave other concerns
 %% %  for later. -PS
 %% \item[Hostile Tor node:] can arbitrarily manipulate the
 %%   connections under its control, as well as creating new connections
 %%   (that pass through itself).
 %% \end{tightlist}
 %
 %% All feasible adversaries can be composed out of these basic
 %% adversaries. This includes combinations such as one or more
 %% compromised Tor-nodes cooperating with disrupters of links on which
 %% those nodes are not adjacent, or such as combinations of hostile
 %% outsiders and link observers (who watch links between adjacent
 %% Tor-nodes).  Note that one type of observer might be a Tor-node. This
 %% is sometimes called an honest-but-curious adversary. While an observer
 %% Tor-node will perform only correct protocol interactions, it might
 %% share information about connections and cannot be assumed to destroy
 %% session keys at end of a session.  Note that a compromised Tor-node is
 %% stronger than any other adversary component in the sense that
 %% replacing a component of any adversary with a compromised Tor-node
 %% results in a stronger overall adversary (assuming that the compromised
 %% Tor-node retains the same signature keys and other private
 %% state-information as the component it replaces).
-First, we assume that a threshold of directory servers are honest,
+In low-latency anonymity systems that use layered encryption, the
-reliable, accurate, and trustworthy.
+adversary's typical goal is to observe both the initiator and the
-%% the rest of this isn't needed, if dirservers do threshold concensus dirs
+receiver. Passive attackers can confirm a suspicion that Alice is
-%  To augment this, users can periodically cross-check 
+talking to Bob if the timing and volume properties of the traffic on the
-%directories from each directory server (trust, but verify).
+connection are unique enough; active attackers are even more effective
-%, and that they always have access to at least one directory server that they trust.
+because they can induce timing signatures on the traffic. Tor provides
 some defenses against these \emph{traffic confirmation} attacks, for
 example by encouraging users to run their own onion routers, but it does
 not provide complete protection. Rather, we aim to prevent \emph{traffic
 analysis} attacks, where the adversary uses traffic patterns to learn
 which points in the network he should attack.
-Second, we assume that somewhere between ten percent and twenty
+Our adversary might try to link an initiator Alice with any of her
-percent\footnote{In some circumstances---for example, if the Tor network is
+communication partners, or he might try to build a profile of Alice's
-  running on a hardened network where all operators have had background
+behavior. He might mount passive attacks by observing the edges of the
-  checks---the number of compromised nodes could be much lower.} 
+network and correlating traffic entering and leaving the network---either
-of the Tor nodes accepted by the directory servers are compromised, hostile,
+because of relationships in packet timing; relationships in the volume
-and collaborating in an off-line clique.  These compromised nodes can
+of data sent; or relationships in any externally visible user-selected
-arbitrarily manipulate the connections that pass through them, as well as
+options. The adversary can also mount active attacks by compromising
-creating new connections that pass through themselves.  They can observe
+routers or keys; by replaying traffic; by selectively DoSing trustworthy
-traffic, and record it for later analysis.  Honest participants do not know
+routers to encourage users to send their traffic through compromised
-which servers these are.
+routers, or DoSing users to see if the traffic elsewhere in the
-
+network stops; or by introducing patterns into traffic that can later be
-(In reality, many adversaries might have `bad' servers that are not
+detected. The adversary might attack the directory servers to give users
-fully compromised but simply under observation, or that have had their keys
+differing views of network state. Additionally, he can try to decrease
-compromised.  But for the sake of analysis, we ignore, this possibility,
+the network's reliability by attacking nodes or by performing antisocial
-since the threat model we assume is strictly stronger.)
+activities from reliable servers and trying to get them taken down;
-
+making the network unreliable flushes users to other less anonymous
-% This next paragraph is also more about analysis than it is about our
+systems, where they may be easier to attack.
 % threat model.  Perhaps we can say, ``users can connect to the network and
 % use it in any way; we consider abusive attacks separately.'' ? -NM
 Third, we constrain the impact of hostile users.  Users are assumed to vary
 widely in both the duration and number of times they are connected to the Tor
 network. They can also be assumed to vary widely in the volume and shape of
 the traffic they send and receive. Hostile users are, by definition, limited
 to creating and varying their own connections into or through a Tor
 network. They may attack their own connections to try to gain identity
 information of the responder in a rendezvous connection. They can also try to
 attack sites through the Onion Routing network; however we will consider this
 abuse rather than an attack per se (see
 Section~\ref{subsec:exitpolicies}). Other than abuse, a hostile user's
 motivation to attack his own connections is limited to the network effects of
 such actions, such as denial of service (DoS) attacks.  Thus, in this case,
 we can view user as simply an extreme case of the ordinary user; although
 ordinary users are not likely to engage in, e.g., IP spoofing, to gain their
 objectives.
 In general, we are more focused on traffic analysis attacks than
 traffic confirmation attacks. 
 %A user who runs a Tor proxy on his own
 %machine, connects to some remote Tor-node and makes a connection to an
 %open Internet site, such as a public web server, is vulnerable to
 %traffic confirmation.
 That is, an active attacker who suspects that
 a particular client is communicating with a particular server can
 confirm this if she can modify and observe both the
 connection between the Tor network and the client and that between the
 Tor network and the server. Even a purely passive attacker can
 confirm traffic if the timing and volume properties of the traffic on
 the connection are unique enough.  (This is not to say that Tor offers
 no resistance to traffic confirmation; it does.  We defer discussion
 of this point and of particular attacks until Section~\ref{sec:attacks},
 after we have described Tor in more detail.)
 % XXX We need to say what traffic analysis is:  How about...
 On the other hand, we {\it do} try to prevent an attacker from
 performing traffic analysis: that is, attempting to learn the communication
 partners of an arbitrary user.
 % XXX If that's not right, what is?  It would be silly to have a
 % threat model section without saying what we want to prevent the
 % attacker from doing. -NM
 % XXX Also, do we want to mention linkability or building profiles? -NM
 Our assumptions about our adversary's capabilities imply a number of
 possible attacks against users' anonymity.  Our adversary might try to
 mount passive attacks by observing the edges of the network and
 correlating traffic entering and leaving the network: either because
 of relationships in packet timing; relationships in the volume of data
 sent; [XXX simple observation??]; or relationships in any externally
 visible user-selected options.  The adversary can also mount active
 attacks by trying to compromise all the servers' keys in a
 path---either through illegitimate means or through legal coercion in
 unfriendly jurisdiction; by selectively DoSing trustworthy servers; by
 introducing patterns into entering traffic that can later be detected;
 or by modifying data entering the network and hoping that trashed data
 comes out the other end.  The attacker can additionally try to
 decrease the network's reliability by performing antisocial activities
 from reliable servers and trying to get them taken down.
 % XXX Should there be more or less?  Should we turn this into a
 % bulleted list?  Should we cut it entirely?
 We consider these attacks and more, and describe our defenses against them
 in Section~\ref{sec:attacks}.
 We consider each of these attacks in more detail below, and summarize
 in Section~\ref{sec:attacks} how well the Tor design defends against
 each of them.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -2004,7 +1861,7 @@ issues remaining to be ironed out. In particular:
 % Many of these (Scalability, cover traffic) are duplicates from open problems.
 %
-\begin{itemize}
+\begin{tightlist}
 \item \emph{Scalability:} Tor's emphasis on design simplicity and
  deployability has led us to adopt a clique topology, a
  semi-centralized model for directories and trusts, and a
@@ -2049,7 +1906,7 @@ issues remaining to be ironed out. In particular:
  able to evaluate some of our design decisions, including our
  robustness/latency tradeoffs, our abuse-prevention mechanisms, and
  our overall usability.
-\end{itemize}
+\end{tightlist}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%