diff --git a/doc/cell-struct.png b/doc/cell-struct.png new file mode 100644 index 0000000000..c0afa1c4a1 Binary files /dev/null and b/doc/cell-struct.png differ diff --git a/doc/interaction.png b/doc/interaction.png new file mode 100644 index 0000000000..44db52a126 Binary files /dev/null and b/doc/interaction.png differ diff --git a/doc/tor-design.html b/doc/tor-design.html new file mode 100644 index 0000000000..a02731f174 --- /dev/null +++ b/doc/tor-design.html @@ -0,0 +1,2486 @@ + + + + + + + + + Tor: The Second-Generation Onion Router + + + +

Tor: The Second-Generation Onion Router

+
+ +

+Roger Dingledine, The Free Haven Project, arma@freehaven.net
+Nick Mathewson, The Free Haven Project, nickm@freehaven.net
+Paul Syverson, Naval Research Lab, syverson@itd.nrl.navy.mil

+ +
+ +
+ +

Abstract

+We present Tor, a circuit-based low-latency anonymous communication +service. This second-generation Onion Routing system addresses limitations +in the original design by adding perfect forward secrecy, congestion +control, directory servers, integrity checking, configurable exit policies, +and a practical design for location-hidden services via rendezvous +points. Tor works on the real-world +Internet, requires no special privileges or kernel modifications, requires +little synchronization or coordination between nodes, and provides a +reasonable tradeoff between anonymity, usability, and efficiency. +We briefly describe our experiences with an international network of +more than 30 nodes. We close with a list of open problems in anonymous communication. + +
+ +
+ +
+

+1  Overview

+ + + +
+Onion Routing is a distributed overlay network designed to anonymize +TCP-based applications like web browsing, secure shell, +and instant messaging. Clients choose a path through the network and +build a circuit, in which each node (or "onion router" or "OR") +in the path knows its predecessor and successor, but no other nodes in +the circuit. Traffic flows down the circuit in fixed-size +cells, which are unwrapped by a symmetric key at each node +(like the layers of an onion) and relayed downstream. The +Onion Routing project published several design and analysis +papers [27,41,48,49]. While a wide area Onion +Routing network was deployed briefly, the only long-running +public implementation was a fragile +proof-of-concept that ran on a single machine. Even this simple deployment +processed connections from over sixty thousand distinct IP addresses from +all over the world at a rate of about fifty thousand per day. +But many critical design and deployment issues were never +resolved, and the design has not been updated in years. Here +we describe Tor, a protocol for asynchronous, loosely federated onion +routers that provides the following improvements over the old Onion +Routing design: + +
+Perfect forward secrecy: In the original Onion Routing design, +a single hostile node could record traffic and +later compromise successive nodes in the circuit and force them +to decrypt it. Rather than using a single multiply encrypted data +structure (an onion) to lay each circuit, +Tor now uses an incremental or telescoping path-building design, +where the initiator negotiates session keys with each successive hop in +the circuit. Once these keys are deleted, subsequently compromised nodes +cannot decrypt old traffic. As a side benefit, onion replay detection +is no longer necessary, and the process of building circuits is more +reliable, since the initiator knows when a hop fails and can then try +extending to a new node. + +
+Separation of "protocol cleaning" from anonymity: +Onion Routing originally required a separate "application +proxy" for each supported application protocol-most of which were +never written, so many applications were never supported. Tor uses the +standard and near-ubiquitous SOCKS [32] proxy interface, allowing +us to support most TCP-based programs without modification. Tor now +relies on the filtering features of privacy-enhancing +application-level proxies such as Privoxy [39], without trying +to duplicate those features itself. + +
+No mixing, padding, or traffic shaping (yet): Onion +Routing originally called for batching and reordering cells as they arrived, +assumed padding between ORs, and in +later designs added padding between onion proxies (users) and +ORs [27,41]. Tradeoffs between padding protection +and cost were discussed, and traffic shaping algorithms were +theorized [49] to provide good security without expensive +padding, but no concrete padding scheme was suggested. +Recent research [1] +and deployment experience [4] suggest that this +level of resource use is not practical or economical; and even full +link padding is still vulnerable [33]. Thus, +until we have a proven and convenient design for traffic shaping or +low-latency mixing that improves anonymity against a realistic +adversary, we leave these strategies out. + +
+Many TCP streams can share one circuit: Onion Routing originally +built a separate circuit for each +application-level request, but this required +multiple public key operations for every request, and also presented +a threat to anonymity from building so many circuits; see +Section 9. Tor multiplexes multiple TCP +streams along each circuit to improve efficiency and anonymity. + +
+Leaky-pipe circuit topology: Through in-band signaling +within the circuit, Tor initiators can direct traffic to nodes partway +down the circuit. This novel approach +allows traffic to exit the circuit from the middle-possibly +frustrating traffic shape and volume attacks based on observing the end +of the circuit. (It also allows for long-range padding if +future research shows this to be worthwhile.) + +
+Congestion control: Earlier anonymity designs do not +address traffic bottlenecks. Unfortunately, typical approaches to +load balancing and flow control in overlay networks involve inter-node +control communication and global views of traffic. Tor's decentralized +congestion control uses end-to-end acks to maintain anonymity +while allowing nodes at the edges of the network to detect congestion +or flooding and send less data until the congestion subsides. + +
+Directory servers: The earlier Onion Routing design +planned to flood state information through the network-an approach +that can be unreliable and complex. Tor takes a simplified view toward distributing this +information. Certain more trusted nodes act as directory +servers: they provide signed directories describing known +routers and their current state. Users periodically download them +via HTTP. + +
+Variable exit policies: Tor provides a consistent mechanism +for each node to advertise a policy describing the hosts +and ports to which it will connect. These exit policies are critical +in a volunteer-based distributed infrastructure, because each operator +is comfortable with allowing different types of traffic to exit +from his node. + +
+End-to-end integrity checking: The original Onion Routing +design did no integrity checking on data. Any node on the +circuit could change the contents of data cells as they passed by-for +example, to alter a connection request so it would connect +to a different webserver, or to `tag' encrypted traffic and look for +corresponding corrupted traffic at the network edges [15]. +Tor hampers these attacks by verifying data integrity before it leaves +the network. + +
+ +
+Rendezvous points and hidden services: +Tor provides an integrated mechanism for responder anonymity via +location-protected servers. Previous Onion Routing designs included +long-lived "reply onions" that could be used to build circuits +to a hidden server, but these reply onions did not provide forward +security, and became useless if any node in the path went down +or rotated its keys. In Tor, clients negotiate rendezvous points +to connect with hidden servers; reply onions are no longer required. + +
+Unlike Freedom [8], Tor does not require OS kernel +patches or network stack support. This prevents us from anonymizing +non-TCP protocols, but has greatly helped our portability and +deployability. + +
+ +
+We have implemented all of the above features, including rendezvous +points. Our source code is +available under a free license, and Tor +is not covered by the patent that affected distribution and use of +earlier versions of Onion Routing. +We have deployed a wide-area alpha network +to test the design, to get more experience with usability +and users, and to provide a research platform for experimentation. +As of this writing, the network stands at 32 nodes spread over two continents. + +
+We review previous work in Section 2, describe
+our goals and assumptions in Section 3,
+and then address the above list of improvements in
+Sections 4, 5, and 6.
+We summarize
+in Section 7 how our design stands up to
+known attacks, and talk about our early deployment experiences in
+Section 8. We conclude with a list of open problems in
+Section 9 and future work for the Onion
+Routing project in Section 10.
+ +
+

+2  Related work

+ + + +
+Modern anonymity systems date to Chaum's Mix-Net +design [10]. Chaum +proposed hiding the correspondence between sender and recipient by +wrapping messages in layers of public-key cryptography, and relaying them +through a path composed of "mixes." Each mix in turn +decrypts, delays, and re-orders messages before relaying them +onward. + +
+Subsequent relay-based anonymity designs have diverged in two +main directions. Systems like Babel [28], +Mixmaster [36], +and Mixminion [15] have tried +to maximize anonymity at the cost of introducing comparatively large and +variable latencies. Because of this decision, these high-latency +networks resist strong global adversaries, +but introduce too much lag for interactive tasks like web browsing, +Internet chat, or SSH connections. + +
+Tor belongs to the second category: low-latency designs that +try to anonymize interactive network traffic. These systems handle +a variety of bidirectional protocols. They also provide more convenient +mail delivery than the high-latency anonymous email +networks, because the remote mail server provides explicit and timely +delivery confirmation. But because these designs typically +involve many packets that must be delivered quickly, it is +difficult for them to prevent an attacker who can eavesdrop both ends of the +communication from correlating the timing and volume +of traffic entering the anonymity network with traffic leaving it [45]. +These +protocols are similarly vulnerable to an active adversary who introduces +timing patterns into traffic entering the network and looks +for correlated patterns among exiting traffic. +Although some work has been done to frustrate these attacks, most designs +protect primarily against traffic analysis rather than traffic +confirmation (see Section 3.1). + +
+The simplest low-latency designs are single-hop proxies such as the +Anonymizer [3]: a single trusted server strips the +data's origin before relaying it. These designs are easy to +analyze, but users must trust the anonymizing proxy. +Concentrating the traffic to this single point increases the anonymity set +(the people a given user is hiding among), but it is vulnerable if the +adversary can observe all traffic entering and leaving the proxy. + +
+More complex are distributed-trust, circuit-based anonymizing systems. +In these designs, a user establishes one or more medium-term bidirectional +end-to-end circuits, and tunnels data in fixed-size cells. +Establishing circuits is computationally expensive and typically +requires public-key +cryptography, whereas relaying cells is comparatively inexpensive and +typically requires only symmetric encryption. +Because a circuit crosses several servers, and each server only knows +the adjacent servers in the circuit, no single server can link a +user to her communication partners. + +
+The Java Anon Proxy (also known as JAP or Web MIXes) uses fixed shared +routes known as cascades. As with a single-hop proxy, this +approach aggregates users into larger anonymity sets, but again an +attacker only needs to observe both ends of the cascade to bridge all +the system's traffic. The Java Anon Proxy's design +calls for padding between end users and the head of the +cascade [7]. However, it is not demonstrated whether the current +implementation's padding policy improves anonymity. + +
+PipeNet [5,12], another low-latency design proposed +around the same time as Onion Routing, gave +stronger anonymity but allowed a single user to shut +down the network by not sending. Systems like ISDN +mixes [38] were designed for other environments with +different assumptions. + +
+In P2P designs like Tarzan [24] and +MorphMix [43], all participants both generate +traffic and relay traffic for others. These systems aim to conceal +whether a given peer originated a request +or just relayed it from another peer. While Tarzan and MorphMix use +layered encryption as above, Crowds [42] simply assumes +an adversary who cannot observe the initiator: it uses no public-key +encryption, so any node on a circuit can read users' traffic. + +
+Hordes [34] is based on Crowds but also uses multicast +responses to hide the initiator. Herbivore [25] and +P5 [46] go even further, requiring broadcast. +These systems are designed primarily for communication among peers, +although Herbivore users can make external connections by +requesting a peer to serve as a proxy. + +
+Systems like Freedom and the original Onion Routing build circuits +all at once, using a layered "onion" of public-key encrypted messages, +each layer of which provides session keys and the address of the +next server in the circuit. Tor as described herein, Tarzan, MorphMix, +Cebolla [9], and Rennhard's Anonymity Network [44] +build circuits +in stages, extending them one hop at a time. +Section 4.2 describes how this +approach enables perfect forward secrecy. + +
+Circuit-based designs must choose which protocol layer +to anonymize. They may intercept IP packets directly, and +relay them whole (stripping the source address) along the +circuit [8,24]. Like +Tor, they may accept TCP streams and relay the data in those streams, +ignoring the breakdown of that data into TCP +segments [43,44]. Finally, like Crowds, they may accept +application-level protocols such as HTTP and relay the application +requests themselves. +Making this protocol-layer decision requires a compromise between flexibility +and anonymity. For example, a system that understands HTTP +can strip +identifying information from requests, can take advantage of caching +to limit the number of requests that leave the network, and can batch +or encode requests to minimize the number of connections. +On the other hand, an IP-level anonymizer can handle nearly any protocol, +even ones unforeseen by its designers (though these systems require +kernel-level modifications to some operating systems, and so are more +complex and less portable). TCP-level anonymity networks like Tor present +a middle approach: they are application neutral (so long as the +application supports, or can be tunneled across, TCP), but by treating +application connections as data streams rather than raw TCP packets, +they avoid the inefficiencies of tunneling TCP over +TCP. + +
+Distributed-trust anonymizing systems need to prevent attackers from +adding too many servers and thus compromising user paths. +Tor relies on a small set of well-known directory servers, run by +independent parties, to decide which nodes can +join. Tarzan and MorphMix allow unknown users to run servers, and use +a limited resource (like IP addresses) to prevent an attacker from +controlling too much of the network. Crowds suggests requiring +written, notarized requests from potential crowd members. + +
+Anonymous communication is essential for censorship-resistant +systems like Eternity [2], Free Haven [19], +Publius [53], and Tangler [52]. Tor's rendezvous +points enable connections between mutually anonymous entities; they +are a building block for location-hidden servers, which are needed by +Eternity and Free Haven. + +
+ +
+

+3  Design goals and assumptions

+ + + +
+Goals
+Like other low-latency anonymity designs, Tor seeks to frustrate +attackers from linking communication partners, or from linking +multiple communications to or from a single user. Within this +main goal, however, several considerations have directed +Tor's evolution. + +
+Deployability: The design must be deployed and used in the +real world. Thus it +must not be expensive to run (for example, by requiring more bandwidth +than volunteers are willing to provide); must not place a heavy +liability burden on operators (for example, by allowing attackers to +implicate onion routers in illegal activities); and must not be +difficult or expensive to implement (for example, by requiring kernel +patches, or separate proxies for every protocol). We also cannot +require non-anonymous parties (such as websites) +to run our software. (Our rendezvous point design does not meet +this goal for non-anonymous users talking to hidden servers, +however; see Section 5.) + +
+Usability: A hard-to-use system has fewer users-and because +anonymity systems hide users among users, a system with fewer users +provides less anonymity. Usability is thus not only a convenience: +it is a security requirement [1,5]. Tor should +therefore not +require modifying familiar applications; should not introduce prohibitive +delays; +and should require as few configuration decisions +as possible. Finally, Tor should be easily implementable on all common +platforms; we cannot require users to change their operating system +to be anonymous. (Tor currently runs on Win32, Linux, +Solaris, BSD-style Unix, MacOS X, and probably others.) + +
+Flexibility: The protocol must be flexible and well-specified, +so Tor can serve as a test-bed for future research. +Many of the open problems in low-latency anonymity +networks, such as generating dummy traffic or preventing Sybil +attacks [22], may be solvable independently from the issues +solved by +Tor. Hopefully future systems will not need to reinvent Tor's design. + +
+Simple design: The protocol's design and security +parameters must be well-understood. Additional features impose implementation +and complexity costs; adding unproven techniques to the design threatens +deployability, readability, and ease of security analysis. Tor aims to +deploy a simple and stable system that integrates the best accepted +approaches to protecting anonymity.
+ +
+Non-goals +
+In favoring simple, deployable designs, we have explicitly deferred +several possible goals, either because they are solved elsewhere, or because +they are not yet solved. + +
+Not peer-to-peer: Tarzan and MorphMix aim to scale to completely +decentralized peer-to-peer environments with thousands of short-lived +servers, many of which may be controlled by an adversary. This approach +is appealing, but still has many open +problems [24,43]. + +
+Not secure against end-to-end attacks: Tor does not claim +to completely solve end-to-end timing or intersection +attacks. Some approaches, such as having users run their own onion routers, +may help; +see Section 9 for more discussion. + +
+No protocol normalization: Tor does not provide protocol +normalization like Privoxy or the Anonymizer. If senders want anonymity from +responders while using complex and variable +protocols like HTTP, Tor must be layered with a filtering proxy such +as Privoxy to hide differences between clients, and expunge protocol +features that leak identity. +Note that by this separation Tor can also provide services that +are anonymous to the network yet authenticated to the responder, like +SSH. Similarly, Tor does not integrate +tunneling for non-stream-based protocols like UDP; this must be +provided by an external service if appropriate. + +
+Not steganographic: Tor does not try to conceal who is connected +to the network. + +
+

+3.1  Threat Model

+ + + +
+A global passive adversary is the most commonly assumed threat when +analyzing theoretical anonymity designs. But like all practical +low-latency systems, Tor does not protect against such a strong +adversary. Instead, we assume an adversary who can observe some fraction +of network traffic; who can generate, modify, delete, or delay +traffic; who can operate onion routers of his own; and who can +compromise some fraction of the onion routers. + +
+In low-latency anonymity systems that use layered encryption, the +adversary's typical goal is to observe both the initiator and the +responder. By observing both ends, passive attackers can confirm a +suspicion that Alice is +talking to Bob if the timing and volume patterns of the traffic on the +connection are distinct enough; active attackers can induce timing +signatures on the traffic to force distinct patterns. Rather +than focusing on these traffic confirmation attacks, +we aim to prevent traffic +analysis attacks, where the adversary uses traffic patterns to learn +which points in the network he should attack. + +
+Our adversary might try to link an initiator Alice with her +communication partners, or try to build a profile of Alice's +behavior. He might mount passive attacks by observing the network edges +and correlating traffic entering and leaving the network-by +relationships in packet timing, volume, or externally visible +user-selected +options. The adversary can also mount active attacks by compromising +routers or keys; by replaying traffic; by selectively denying service +to trustworthy routers to move users to +compromised routers, or denying service to users to see if traffic +elsewhere in the +network stops; or by introducing patterns into traffic that can later be +detected. The adversary might subvert the directory servers to give users +differing views of network state. Additionally, he can try to decrease +the network's reliability by attacking nodes or by performing antisocial +activities from reliable nodes and trying to get them taken down-making +the network unreliable flushes users to other less anonymous +systems, where they may be easier to attack. We summarize +in Section 7 how well the Tor design defends against +each of these attacks. + +
+ +
+

+4  The Tor Design

+ + + +
+The Tor network is an overlay network; each onion router (OR) +runs as a normal +user-level process without any special privileges. +Each onion router maintains a TLS [17] +connection to every other onion router. +Each user +runs local software called an onion proxy (OP) to fetch directories, +establish circuits across the network, +and handle connections from user applications. These onion proxies accept +TCP streams and multiplex them across the circuits. The onion +router on the other side +of the circuit connects to the requested destinations +and relays data. + +
+Each onion router maintains a long-term identity key and a short-term +onion key. The identity +key is used to sign TLS certificates, to sign the OR's router +descriptor (a summary of its keys, address, bandwidth, exit policy, +and so on), and (by directory servers) to sign directories. The onion key is used to decrypt requests +from users to set up a circuit and negotiate ephemeral keys. +The TLS protocol also establishes a short-term link key when communicating +between ORs. Short-term keys are rotated periodically and +independently, to limit the impact of key compromise. + +
+Section 4.1 presents the fixed-size +cells that are the unit of communication in Tor. We describe +in Section 4.2 how circuits are +built, extended, truncated, and destroyed. Section 4.3 +describes how TCP streams are routed through the network. We address +integrity checking in Section 4.4, +and resource limiting in Section 4.5. +Finally, +Section 4.6 talks about congestion control and +fairness issues. + +
+

+4.1  Cells

+ + + +
+Onion routers communicate with one another, and with users' OPs, via +TLS connections with ephemeral keys. Using TLS conceals the data on +the connection with perfect forward secrecy, and prevents an attacker +from modifying data on the wire or impersonating an OR. + +
+Traffic passes along these connections in fixed-size cells. Each cell +is 512 bytes, and consists of a header and a payload. The header includes a circuit +identifier (circID) that specifies which circuit the cell refers to +(many circuits can be multiplexed over the single TLS connection), and +a command to describe what to do with the cell's payload. (Circuit +identifiers are connection-specific: each circuit has a different +circID on each OP/OR or OR/OR connection it traverses.) +Based on their command, cells are either control cells, which are +always interpreted by the node that receives them, or relay cells, +which carry end-to-end stream data. The control cell commands are: +padding (currently used for keepalive, but also usable for link +padding); create or created (used to set up a new circuit); +and destroy (to tear down a circuit). + +
+Relay cells have an additional header (the relay header) at the front +of the payload, containing a streamID (stream identifier: many streams can +be multiplexed over a circuit); an end-to-end checksum for integrity +checking; the length of the relay payload; and a relay command. +The entire contents of the relay header and the relay cell payload +are encrypted or decrypted together as the relay cell moves along the +circuit, using the 128-bit AES cipher in counter mode to generate a +cipher stream. The relay commands are: relay +data (for data flowing down the stream), relay begin (to open a +stream), relay end (to close a stream cleanly), relay +teardown (to close a broken stream), relay connected +(to notify the OP that a relay begin has succeeded), relay +extend and relay extended (to extend the circuit by a hop, +and to acknowledge), relay truncate and relay truncated +(to tear down only part of the circuit, and to acknowledge), relay +sendme (used for congestion control), and relay drop (used to +implement long-range dummies). +We give a visual overview of cell structure plus the details of relay +cell structure, and then describe each of these cell types and commands +in more detail below. + +
+ +
+ +
+ +
cell-struct.png +
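As a concrete illustration of the layouts just described, the following Python sketch parses the two cell formats. The field widths follow the cell-structure figure (2-byte circID, 1-byte command, 509-byte payload; relay header: 2-byte streamID, 6-byte digest, 2-byte length, 1-byte relay command); the big-endian byte order and the function names are our own assumptions, not the deployed implementation.

```python
import struct

CELL_LEN = 512  # every Tor cell is fixed-size

# Field widths follow the cell-structure figure above; byte order and
# names are assumptions for illustration only.
#   control cell: circID(2) | CMD(1) | DATA(509)
#   relay cell:   circID(2) | CMD(1) | streamID(2) | digest(6) | len(2) | relay-CMD(1) | DATA(498)

def parse_cell(cell: bytes):
    """Split a 512-byte cell into (circID, command, payload)."""
    assert len(cell) == CELL_LEN
    circ_id, command = struct.unpack_from(">HB", cell, 0)
    return circ_id, command, cell[3:]

def parse_relay_payload(payload: bytes):
    """Split a decrypted relay payload into its relay-header fields."""
    stream_id, = struct.unpack_from(">H", payload, 0)
    digest = payload[2:8]                        # end-to-end integrity check
    length, relay_cmd = struct.unpack_from(">HB", payload, 8)
    return stream_id, digest, relay_cmd, payload[11:11 + length]
```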
+
+

+4.2  Circuits and streams

+ + + +
+Onion Routing originally built one circuit for each +TCP stream. Because building a circuit can take several tenths of a +second (due to public-key cryptography and network latency), +this design imposed high costs on applications like web browsing that +open many TCP streams. + +
+In Tor, each circuit can be shared by many TCP streams. To avoid +delays, users construct circuits preemptively. To limit linkability +among their streams, users' OPs build a new circuit +periodically if the previous ones have been used, +and expire old used circuits that no longer have any open streams. +OPs consider rotating to a new circuit once a minute: thus +even heavy users spend negligible time +building circuits, but a limited number of requests can be linked +to each other through a given exit node. Also, because circuits are built +in the background, OPs can recover from failed circuit creation +without harming user experience.
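The rotation policy above can be sketched as follows. The data structures, names, and exact expiry rule are hypothetical, chosen only to illustrate the described behavior: expire used circuits that no longer have open streams, and keep a fresh circuit built preemptively in the background.

```python
ROTATION_INTERVAL = 60  # OPs consider rotating roughly once a minute

class Circuit:
    """Hypothetical bookkeeping record; not Tor's actual structure."""
    def __init__(self, now):
        self.created = now
        self.used = False        # has any stream ever been attached?
        self.open_streams = 0

def maintain_circuits(circuits, now, build_new):
    # Expire old, used circuits with no open streams, and keep a fresh
    # circuit available so construction stays off the critical path of
    # user requests.
    keep = [c for c in circuits
            if not (c.used and c.open_streams == 0
                    and now - c.created > ROTATION_INTERVAL)]
    if not any(now - c.created <= ROTATION_INTERVAL for c in keep):
        keep.append(build_new(now))
    return keep
```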
+ +
+ +
+ +
interaction.png + +
Figure 1: Alice builds a two-hop circuit and begins fetching a web page.
+ + +
+
+Constructing a circuit +
+A user's OP constructs circuits incrementally, negotiating a
+symmetric key with each OR on the circuit, one hop at a time. To begin
+creating a new circuit, the OP (call her Alice) sends a
+create cell to the first node in her chosen path (call him Bob).
+(She chooses a new
+circID C_AB not currently used on the connection from her to Bob.)
+The create cell's
+payload contains the first half of the Diffie-Hellman handshake
+(g^x), encrypted to the onion key of the OR (call him Bob). Bob
+responds with a created cell containing g^y
+along with a hash of the negotiated key K = g^(xy).
+Once the circuit has been established, Alice and Bob can send one +another relay cells encrypted with the negotiated +key.1 More detail is given in +the next section. + +
+To extend the circuit further, Alice sends a relay extend cell
+to Bob, specifying the address of the next OR (call her Carol), and
+an encrypted g^(x2) for her. Bob copies the half-handshake into a
+create cell, and passes it to Carol to extend the circuit.
+(Bob chooses a new circID C_BC not currently used on the connection
+between him and Carol. Alice never needs to know this circID; only Bob
+associates C_AB on his connection with Alice to C_BC on
+his connection with Carol.)
+When Carol responds with a created cell, Bob wraps the payload
+into a relay extended cell and passes it back to Alice. Now
+the circuit is extended to Carol, and Alice and Carol share a common key
+K2 = g^(x2 y2).
+To extend the circuit to a third node or beyond, Alice +proceeds as above, always telling the last node in the circuit to +extend one hop further. + +
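A minimal sketch of the per-hop circID bookkeeping described above: the node in the middle (Bob) maps the (connection, circID) pair on Alice's side to the pair he chose on Carol's side, so circuit identifiers stay connection-specific. The class and names are ours, not Tor's actual data structures.

```python
class ORCircuitMap:
    """Bob's view: translate a circID arriving on one connection into
    the circID he chose on the next connection (illustrative only)."""
    def __init__(self):
        self.next_hop = {}

    def extend(self, prev_conn, prev_circid, next_conn, next_circid):
        # Only Bob knows that C_AB on Alice's connection corresponds
        # to C_BC on Carol's; Alice never learns C_BC.
        self.next_hop[(prev_conn, prev_circid)] = (next_conn, next_circid)

    def route(self, conn, circid):
        # Replace the circID as the relay cell moves downstream.
        return self.next_hop.get((conn, circid))
```

For example, after `bob.extend("conn-alice", 9, "conn-carol", 14)`, a cell arriving on Alice's connection with circID 9 is forwarded to Carol with circID 14.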
+This circuit-level handshake protocol achieves unilateral entity
+authentication (Alice knows she's handshaking with the OR, but
+the OR doesn't care who is opening the circuit-Alice uses no public key
+and remains anonymous) and unilateral key authentication
+(Alice and the OR agree on a key, and Alice knows only the OR learns
+it). It also achieves forward
+secrecy and key freshness. More formally, the protocol is as follows
+(where E_PKBob(·) is encryption with Bob's public key,
+H is a secure hash function, and | is concatenation):
+ + + + +
Alice -> Bob : E_PKBob(g^x)
Bob -> Alice : g^y, H(K | "handshake")
+ + +
+ In the second step, Bob proves that it was he who received g^x,
+and who chose y. We use PK encryption in the first step
+(rather than, say, using the first two steps of STS, which has a
+signature in the second step) because a single cell is too small to
+hold both a public key and a signature. Preliminary analysis with the
+NRL protocol analyzer [35] shows this protocol to be
+secure (including perfect forward secrecy) under the
+traditional Dolev-Yao model.
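The two-message handshake can be sketched as follows. The toy 32-bit prime stands in for a real Diffie-Hellman group, and we omit the onion-key encryption of g^x entirely, so this illustrates only the message flow and Bob's proof of key knowledge; it provides no security as written.

```python
import hashlib
import secrets

# Toy parameters: a 32-bit prime instead of a real DH group, and no
# onion-key encryption of g^x -- illustration only, NOT secure.
P, G = 0xFFFFFFFB, 5

def kdf(shared: int) -> bytes:
    return hashlib.sha1(shared.to_bytes(8, "big")).digest()

# Alice -> Bob : g^x   (in Tor, encrypted to Bob's onion key)
x = secrets.randbelow(P - 2) + 1
gx = pow(G, x, P)

# Bob -> Alice : g^y, H(K | "handshake")
y = secrets.randbelow(P - 2) + 1
gy = pow(G, y, P)
k_bob = kdf(pow(gx, y, P))
proof = hashlib.sha1(k_bob + b"handshake").digest()

# Alice derives the same K; a valid proof shows Bob both received g^x
# and chose y, while Alice herself uses no public key and stays anonymous.
k_alice = kdf(pow(gy, x, P))
assert hashlib.sha1(k_alice + b"handshake").digest() == proof
```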
+ +
+Relay cells
+Once Alice has established the circuit (so she shares keys with each
+OR on the circuit), she can send relay cells.
+Upon receiving a relay
+cell, an OR looks up the corresponding circuit, and decrypts the relay
+header and payload with the session key for that circuit.
+If the cell is headed away from Alice, the OR then checks whether the
+decrypted cell has a valid digest (as an optimization, the first
+two bytes of the integrity check are zero, so in most cases we can avoid
+computing the hash).
+If valid, it accepts the relay cell and processes it as described
+below. Otherwise,
+the OR looks up the circID and OR for the
+next step in the circuit, replaces the circID as appropriate, and
+sends the decrypted relay cell to the next OR. (If the OR at the end
+of the circuit receives an unrecognized relay cell, an error has
+occurred, and the circuit is torn down.)
+OPs treat incoming relay cells similarly: they iteratively unwrap the +relay header and payload with the session keys shared with each +OR on the circuit, from the closest to farthest. +If at any stage the digest is valid, the cell must have +originated at the OR whose encryption has just been removed. + +
+To construct a relay cell addressed to a given OR, Alice assigns the +digest, and then iteratively +encrypts the cell payload (that is, the relay header and payload) with +the symmetric key of each hop up to that OR. Because the digest is +encrypted to a different value at each step, only at the targeted OR +will it have a meaningful value.2 +This leaky pipe circuit topology +allows Alice's streams to exit at different ORs on a single circuit. +Alice may choose different exit points because of their exit policies, +or to keep the ORs from knowing that two streams +originate from the same person. + +
+When an OR later replies to Alice with a relay cell, it +encrypts the cell's relay header and payload with the single key it +shares with Alice, and sends the cell back toward Alice along the +circuit. Subsequent ORs add further layers of encryption as they +relay the cell back to Alice. + +
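The leaky-pipe unwrapping described above can be sketched as follows. The keystream function below is a toy stream cipher standing in for Tor's AES counter mode (it is NOT secure), and the cell layout is simplified to a 4-byte digest plus payload; only the hop whose key produces a valid digest treats the cell as recognized.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # Toy stream cipher standing in for AES-CTR -- illustration only.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_relay_cell(hop_keys, target, data):
    # Alice sets a digest only the target hop can validate, then
    # onion-encrypts with the key of each hop up to and including it.
    digest = hashlib.sha1(hop_keys[target] + data).digest()[:4]
    cell = digest + data
    for key in reversed(hop_keys[: target + 1]):
        cell = xor(cell, keystream(key, len(cell)))
    return cell

def or_process(key, cell):
    # Each OR strips one layer, then checks whether the digest is now
    # meaningful; if not, the cell is relayed to the next hop.
    cell = xor(cell, keystream(key, len(cell)))
    digest, data = cell[:4], cell[4:]
    recognized = hashlib.sha1(key + data).digest()[:4] == digest
    return recognized, cell, data

keys = [b"hop0-key", b"hop1-key", b"hop2-key"]
cell = make_relay_cell(keys, target=1, data=b"relay begin example.com:80")
rec0, cell, _ = or_process(keys[0], cell)     # hop 0: unrecognized, relay on
rec1, cell, data = or_process(keys[1], cell)  # hop 1: recognized, process
assert rec1 and data == b"relay begin example.com:80"
```

At hop 0 the digest is still encrypted under hop 1's key, so it looks random and the cell is passed along; this is exactly why a cell can exit at any hop of the circuit.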
+To tear down a circuit, Alice sends a destroy control +cell. Each OR in the circuit receives the destroy cell, closes +all streams on that circuit, and passes a new destroy cell +forward. But just as circuits are built incrementally, they can also +be torn down incrementally: Alice can send a relay +truncate cell to a single OR on a circuit. That OR then sends a +destroy cell forward, and acknowledges with a +relay truncated cell. Alice can then extend the circuit to +different nodes, without signaling to the intermediate nodes (or +a limited observer) that she has changed her circuit. +Similarly, if a node on the circuit goes down, the adjacent +node can send a relay truncated cell back to Alice. Thus the +"break a node and see which circuits go down" +attack [4] is weakened. + +
+

+4.3  Opening and closing streams

+ + + +
+When Alice's application wants a TCP connection to a given
+address and port, it asks the OP (via SOCKS) to make the
+connection. The OP chooses the newest open circuit (or creates one if
+needed), and chooses a suitable OR on that circuit to be the
+exit node (usually the last node, but maybe others due to exit policy
+conflicts; see Section 6.2). The OP then opens
+the stream by sending a relay begin cell to the exit node,
+using a new random streamID. Once the
+exit node connects to the remote host, it responds
+with a relay connected cell. Upon receipt, the OP sends a
+SOCKS reply to notify the application of its success. The OP
+now accepts data from the application's TCP stream, packaging it into
+relay data cells and sending those cells along the circuit to
+the chosen OR.
+There's a catch to using SOCKS, however: some applications pass the
+alphanumeric hostname to the Tor client, while others resolve it into
+an IP address first and then pass the IP address to the Tor client. If
+the application does DNS resolution first, Alice thereby reveals her
+destination to the remote DNS server, rather than sending the hostname
+through the Tor network to be resolved at the far end. Common applications
+like Mozilla and SSH have this flaw.
+With Mozilla, the flaw is easy to address: the filtering HTTP
+proxy called Privoxy gives a hostname to the Tor client, so Alice's
+computer never does DNS resolution.
+But a portable general solution, such as is needed for
+SSH, remains an open problem. Modifying or replacing the local nameserver
+can be invasive, brittle, and unportable. Forcing the resolver
+library to prefer TCP rather than UDP is hard, and also has
+portability problems. Dynamically intercepting system calls to the
+resolver library seems a promising direction. We could also provide
+a tool similar to dig to perform a private lookup through the
+Tor network. Currently, we encourage the use of privacy-aware proxies
+like Privoxy wherever possible.
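One way an application can hand the hostname to the proxy unresolved is the SOCKS4a extension: the request carries an invalid IP of the form 0.0.0.x to signal that a hostname follows, so the local resolver is never consulted. The sketch below just builds the request bytes per the SOCKS4a convention; it is a minimal illustration, not Tor's client code (Tor also speaks SOCKS5).

```python
import struct

def socks4a_connect(hostname: str, port: int, userid: bytes = b"") -> bytes:
    # SOCKS4a CONNECT request: destination IP 0.0.0.x (x nonzero) means
    # "hostname follows after the userid", so the proxy resolves the name
    # itself and the application never touches local DNS.
    return (
        struct.pack(">BBH", 4, 1, port)   # version 4, command CONNECT, port
        + bytes([0, 0, 0, 1])             # invalid IP marks "hostname follows"
        + userid + b"\x00"                # null-terminated userid
        + hostname.encode("ascii") + b"\x00"  # null-terminated hostname
    )

req = socks4a_connect("example.com", 80)
assert req[:2] == b"\x04\x01" and req.endswith(b"example.com\x00")
```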
+Closing a Tor stream is analogous to closing a TCP stream: it uses a +two-step handshake for normal operation, or a one-step handshake for +errors. If the stream closes abnormally, the adjacent node simply sends a +relay teardown cell. If the stream closes normally, the node sends +a relay end cell down the circuit, and the other side responds with +its own relay end cell. Because +all relay cells use layered encryption, only the destination OR knows +that a given relay cell is a request to close a stream. This two-step +handshake allows Tor to support TCP-based applications that use half-closed +connections. + +
+

+4.4  Integrity checking on streams

+ + + +
+Because the old Onion Routing design used a stream cipher without integrity +checking, traffic was +vulnerable to a malleability attack: though the attacker could not +decrypt cells, any changes to encrypted data +would create corresponding changes to the data leaving the network. +This weakness allowed an adversary who could guess the encrypted content +to change a padding cell to a destroy +cell; change the destination address in a relay begin cell to the +adversary's webserver; or change an FTP command from +dir to rm *. (Even an external +adversary could do this, because the link encryption similarly used a +stream cipher.) + +
+Because Tor uses TLS on its links, external adversaries cannot modify +data. Addressing the insider malleability attack, however, is +more complex. + +
+We could do integrity checking of the relay cells at each hop, either +by including hashes or by using an authenticating cipher mode like +EAX [6], but there are some problems. First, these approaches +impose a message-expansion overhead at each hop, and so we would have to +either leak the path length or waste bytes by padding to a maximum +path length. Second, these solutions can only verify traffic coming +from Alice: ORs would not be able to produce suitable hashes for +the intermediate hops, since the ORs on a circuit do not know the +other ORs' session keys. Third, we have already accepted that our design +is vulnerable to end-to-end timing attacks; so tagging attacks performed +within the circuit provide no additional information to the attacker. + +
+Thus, we check integrity only at the edges of each stream. (Remember that +in our leaky-pipe circuit topology, a stream's edge could be any hop +in the circuit.) When Alice +negotiates a key with a new hop, they each initialize a SHA-1 +digest with a derivative of that key, +thus beginning with randomness that only the two of them know. +Then they each incrementally add to the SHA-1 digest the contents of +all relay cells they create, and include with each relay cell the +first four bytes of the current digest. Each also keeps a SHA-1 +digest of data received, to verify that the received hashes are correct. + +
+To be sure of removing or modifying a cell, the attacker must be able
+to deduce the current digest state (which depends on all
+traffic between Alice and Bob, starting with their negotiated key).
+Attacks on SHA-1 where the adversary can incrementally add to a hash
+to produce a new valid hash don't work, because all hashes are
+end-to-end encrypted across the circuit. The computational overhead
+of computing the digests is minimal compared to doing the AES
+encryption performed at each hop of the circuit. We use only four
+bytes per cell to minimize overhead; the chance that an adversary will
+correctly guess a valid hash (roughly one in 2<sup>32</sup>) is
+acceptably low, given that the OP or OR tears down the circuit if it
+receives a bad hash.
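The edge-to-edge digest scheme above can be sketched with a pair of running SHA-1 states. This is a simplification: the seed value and tag placement below are illustrative (the deployed protocol, for instance, zeroes the digest field of the cell before hashing), but the mechanism of seeding with shared key material and folding in every cell is the one described.

```python
import hashlib

class StreamDigest:
    """Running end-to-end integrity check: each edge seeds a SHA-1 state
    with a derivative of the negotiated key, folds in every relay cell,
    and tags each cell with the first four bytes of the current digest."""

    def __init__(self, key_seed: bytes):
        # Begins with randomness that only the two edges know.
        self.state = hashlib.sha1(key_seed)

    def tag(self, cell: bytes) -> bytes:
        self.state.update(cell)
        return self.state.digest()[:4]

    def verify(self, cell: bytes, tag: bytes) -> bool:
        self.state.update(cell)
        return self.state.digest()[:4] == tag

seed = b"derived-from-negotiated-key"   # illustrative seed
alice, exit_node = StreamDigest(seed), StreamDigest(seed)

t1 = alice.tag(b"relay data: GET / HTTP/1.0")
t2 = alice.tag(b"relay data: Host: example.com")
assert exit_node.verify(b"relay data: GET / HTTP/1.0", t1)
assert exit_node.verify(b"relay data: Host: example.com", t2)
```

Because each tag depends on every previous cell, an attacker who removes, reorders, or alters any cell desynchronizes the receiver's state and all later tags fail.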
+

+4.5  Rate limiting and fairness

+ + + +
+Volunteers are more willing to run services that can limit +their bandwidth usage. To accommodate them, Tor servers use a +token bucket approach [50] to +enforce a long-term average rate of incoming bytes, while still +permitting short-term bursts above the allowed bandwidth. + +
+ +
+Because the Tor protocol outputs about the same number of bytes as it +takes in, it is sufficient in practice to limit only incoming bytes. +With TCP streams, however, the correspondence is not one-to-one: +relaying a single incoming byte can require an entire 512-byte cell. +(We can't just wait for more bytes, because the local application may +be awaiting a reply.) Therefore, we treat this case as if the entire +cell size had been read, regardless of the cell's fullness. + +
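A minimal token-bucket sketch of this policy follows, including the whole-cell charge for partial reads described above. The rate and burst figures are arbitrary illustrative values, not Tor defaults.

```python
CELL_SIZE = 512

class TokenBucket:
    """Token-bucket rate limiter: refilled at a long-term average rate,
    capped at a burst size; a partial read is charged as a full cell."""

    def __init__(self, rate: int, burst: int):
        self.rate = rate          # bytes per second, long-term average
        self.burst = burst        # maximum short-term burst in bytes
        self.tokens = burst

    def refill(self, seconds: float) -> None:
        self.tokens = min(self.burst, self.tokens + int(self.rate * seconds))

    def charge_read(self, nbytes: int) -> bool:
        # Even one incoming byte may require an entire 512-byte cell,
        # so charge for the cells actually emitted, rounded up.
        cost = CELL_SIZE * ((nbytes + CELL_SIZE - 1) // CELL_SIZE)
        if cost > self.tokens:
            return False          # over budget: stop reading for now
        self.tokens -= cost
        return True

bucket = TokenBucket(rate=50_000, burst=100_000)
assert bucket.charge_read(1)            # one byte still costs a full cell
assert bucket.tokens == 100_000 - 512
```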
+Further, inspired by Rennhard et al.'s design in [44], a
+circuit's edges can heuristically distinguish interactive streams from bulk
+streams by comparing the frequency with which they supply cells. We can
+provide good latency for interactive streams by giving them preferential
+service, while still giving good overall throughput to the bulk
+streams. Such preferential treatment presents a possible end-to-end
+attack, but an adversary observing both
+ends of the stream can already learn this information through timing
+attacks.
+

+4.6  Congestion control

+ + + +
+Even with bandwidth rate limiting, we still need to worry about +congestion, either accidental or intentional. If enough users choose the +same OR-to-OR connection for their circuits, that connection can become +saturated. For example, an attacker could send a large file +through the Tor network to a webserver he runs, and then +refuse to read any of the bytes at the webserver end of the +circuit. Without some congestion control mechanism, these bottlenecks +can propagate back through the entire network. We don't need to +reimplement full TCP windows (with sequence numbers, +the ability to drop cells when we're full and retransmit later, and so +on), +because TCP already guarantees in-order delivery of each +cell. +We describe our response below. + +
+Circuit-level throttling: +To control a circuit's bandwidth usage, each OR keeps track of two +windows. The packaging window tracks how many relay data cells the OR is +allowed to package (from incoming TCP streams) for transmission back to the OP, +and the delivery window tracks how many relay data cells it is willing +to deliver to TCP streams outside the network. Each window is initialized +(say, to 1000 data cells). When a data cell is packaged or delivered, +the appropriate window is decremented. When an OR has received enough +data cells (currently 100), it sends a relay sendme cell towards the OP, +with streamID zero. When an OR receives a relay sendme cell with +streamID zero, it increments its packaging window. Either of these cells +increments the corresponding window by 100. If the packaging window +reaches 0, the OR stops reading from TCP connections for all streams +on the corresponding circuit, and sends no more relay data cells until +receiving a relay sendme cell. + +
+The OP behaves identically, except that it must track a packaging window +and a delivery window for every OR in the circuit. If a packaging window +reaches 0, it stops reading from streams destined for that OR. + +
+Stream-level throttling: +The stream-level congestion control mechanism is similar to the +circuit-level mechanism. ORs and OPs use relay sendme cells +to implement end-to-end flow control for individual streams across +circuits. Each stream begins with a packaging window (currently 500 cells), +and increments the window by a fixed value (50) upon receiving a relay +sendme cell. Rather than always returning a relay sendme cell as soon +as enough cells have arrived, the stream-level congestion control also +has to check whether data has been successfully flushed onto the TCP +stream; it sends the relay sendme cell only when the number of bytes pending +to be flushed is under some threshold (currently 10 cells' worth). + +
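The circuit-level window bookkeeping can be sketched as below, using the constants from the text (windows start at 1000 cells, sendme cells open them by 100). The class and method names are illustrative, not from the Tor source.

```python
CIRCWINDOW_START = 1000       # initial package/deliver window, in cells
CIRCWINDOW_INCREMENT = 100    # cells acknowledged per relay sendme

class CircuitWindows:
    """Sketch of circuit-level congestion control at one OR."""

    def __init__(self):
        self.package_window = CIRCWINDOW_START
        self.deliver_window = CIRCWINDOW_START

    def can_package(self) -> bool:
        # When this hits 0, stop reading from TCP connections for all
        # streams on the circuit until a sendme arrives.
        return self.package_window > 0

    def cell_packaged(self) -> None:
        self.package_window -= 1

    def cell_delivered(self) -> bool:
        # Returns True when it is time to send a relay sendme
        # (streamID zero) towards the OP.
        self.deliver_window -= 1
        if self.deliver_window <= CIRCWINDOW_START - CIRCWINDOW_INCREMENT:
            self.deliver_window += CIRCWINDOW_INCREMENT
            return True
        return False

    def sendme_received(self) -> None:
        self.package_window += CIRCWINDOW_INCREMENT

win = CircuitWindows()
sendmes = sum(win.cell_delivered() for _ in range(250))
assert sendmes == 2            # one sendme per 100 delivered cells
for _ in range(1000):
    win.cell_packaged()
assert not win.can_package()   # window exhausted until a sendme arrives
win.sendme_received()
assert win.can_package()
```

The stream-level mechanism is the same shape with a 500-cell window and 50-cell increments, plus the flushed-bytes check described above.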
+ +
+These arbitrarily chosen parameters seem to give tolerable throughput +and delay; see Section 8. + +
+

+5  Rendezvous Points and hidden services

+ + + +
+Rendezvous points are a building block for location-hidden +services (also known as responder anonymity) in the Tor +network. Location-hidden services allow Bob to offer a TCP +service, such as a webserver, without revealing his IP address. +This type of anonymity protects against distributed DoS attacks: +attackers are forced to attack the onion routing network +because they do not know Bob's IP address. + +
+Our design for location-hidden servers has the following goals. +Access-control: Bob needs a way to filter incoming requests, +so an attacker cannot flood Bob simply by making many connections to him. +Robustness: Bob should be able to maintain a long-term pseudonymous +identity even in the presence of router failure. Bob's service must +not be tied to a single OR, and Bob must be able to migrate his service +across ORs. Smear-resistance: +A social attacker +should not be able to "frame" a rendezvous router by +offering an illegal or disreputable location-hidden service and +making observers believe the router created that service. +Application-transparency: Although we require users +to run special software to access location-hidden servers, we must not +require them to modify their applications. + +
+We provide location-hiding for Bob by allowing him to advertise +several onion routers (his introduction points) as contact +points. He may do this on any robust efficient +key-value lookup system with authenticated updates, such as a +distributed hash table (DHT) like CFS [11].3 Alice, the client, chooses an OR as her +rendezvous point. She connects to one of Bob's introduction +points, informs him of her rendezvous point, and then waits for him +to connect to the rendezvous point. This extra level of indirection +helps Bob's introduction points avoid problems associated with serving +unpopular files directly (for example, if Bob serves +material that the introduction point's community finds objectionable, +or if Bob's service tends to get attacked by network vandals). +The extra level of indirection also allows Bob to respond to some requests +and ignore others. + +
+

+5.1  Rendezvous points in Tor

+ +
+The following steps are +performed on behalf of Alice and Bob by their local OPs; +application integration is described more fully below. + +
+ +
+ +
+
  • Bob generates a long-term public key pair to identify his service.
  • Bob chooses some introduction points, and advertises them on
    the lookup service, signing the advertisement with his public key. He
    can add more later.
  • Bob builds a circuit to each of his introduction points, and tells
    them to wait for requests.
  • Alice learns about Bob's service out of band (perhaps Bob told her,
    or she found it on a website). She retrieves the details of Bob's
    service from the lookup service. If Alice wants to access Bob's
    service anonymously, she must connect to the lookup service via Tor.
  • Alice chooses an OR as the rendezvous point (RP) for her connection to
    Bob's service. She builds a circuit to the RP, and gives it a
    randomly chosen "rendezvous cookie" to recognize Bob.
  • Alice opens an anonymous stream to one of Bob's introduction
    points, and gives it a message (encrypted with Bob's public key)
    telling it about herself, her RP and rendezvous cookie, and the
    start of a DH handshake. The introduction point sends the message to Bob.
  • If Bob wants to talk to Alice, he builds a circuit to Alice's
    RP and sends the rendezvous cookie, the second half of the DH
    handshake, and a hash of the session key they now share. By the same
    argument as in Section 4.2, Alice knows she
    shares the key only with Bob.
  • The RP connects Alice's circuit to Bob's. Note that the RP can't
    recognize Alice, Bob, or the data they transmit.
  • Alice sends a relay begin cell along the circuit. It
    arrives at Bob's OP, which connects to Bob's webserver.
  • An anonymous stream has been established, and Alice and Bob
    communicate as normal.
    + +
    +When establishing an introduction point, Bob provides the onion router +with the public key identifying his service. Bob signs his +messages, so others cannot usurp his introduction point +in the future. He uses the same public key to establish the other +introduction points for his service, and periodically refreshes his +entry in the lookup service. + +
    +The message that Alice gives +the introduction point includes a hash of Bob's public key and an optional initial authorization token (the +introduction point can do prescreening, for example to block replays). Her +message to Bob may include an end-to-end authorization token so Bob +can choose whether to respond. +The authorization tokens can be used to provide selective access: +important users can get uninterrupted access. +During normal situations, Bob's service might simply be offered +directly from mirrors, while Bob gives out tokens to high-priority users. If +the mirrors are knocked down, +those users can switch to accessing Bob's service via +the Tor rendezvous system. + +
+Bob's introduction points are themselves subject to DoS; he must
+open many introduction points or risk such an attack.
+He can provide selected users with a current list or future schedule of
+unadvertised introduction points;
+this is most practical
+if there is a stable and large group of introduction points
+available. Bob could also give secret public keys
+for consulting the lookup service. All of these approaches
+limit exposure even when
+some selected users collude in the DoS.
    +

    +5.2  Integration with user applications

    + +
    +Bob configures his onion proxy to know the local IP address and port of his +service, a strategy for authorizing clients, and his public key. The onion +proxy anonymously publishes a signed statement of Bob's +public key, an expiration time, and +the current introduction points for his service onto the lookup service, +indexed +by the hash of his public key. Bob's webserver is unmodified, +and doesn't even know that it's hidden behind the Tor network. + +
+Alice's applications also work unchanged: her client interface
+remains a SOCKS proxy. We encode all of the necessary information
+into the fully qualified domain name (FQDN) Alice uses when establishing her
+connection. Location-hidden services use a virtual top level domain
+called .onion: thus hostnames take the form x.y.onion where
+x is the authorization cookie and y encodes the hash of
+the public key. Alice's onion proxy
+examines addresses; if they're destined for a hidden server, it decodes
+the key and starts the rendezvous as described above.
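As an illustration of the x.y.onion scheme, one natural encoding for y is a base32 rendering of a truncated hash of the service's public key, so that the name itself commits to the key. The truncation length and encoding below are hypothetical choices for the sketch; the paper does not fix them.

```python
import base64
import hashlib

def onion_hostname(public_key_bytes: bytes, auth_cookie: str = "") -> str:
    # Hypothetical encoding: base32 of the first 80 bits of a SHA-1
    # hash of the service key gives a short, case-insensitive label
    # that fits in a DNS name and commits to the key.
    digest = hashlib.sha1(public_key_bytes).digest()[:10]
    y = base64.b32encode(digest).decode("ascii").lower()
    return f"{auth_cookie}.{y}.onion" if auth_cookie else f"{y}.onion"

name = onion_hostname(b"fake DER-encoded service key for illustration")
assert name.endswith(".onion") and len(name.split(".")[0]) == 16
```

The onion proxy can then recognize the .onion suffix in any address the application asks for, recover the key hash from y, and begin the rendezvous protocol without the application knowing anything unusual happened.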
    +

    +5.3  Previous rendezvous work

    + +
+Rendezvous points in low-latency anonymity systems were first
+described for use in ISDN telephony [30,38].
+Later low-latency designs used rendezvous points for hiding the location
+of mobile phones and low-power location
+trackers [23,40]. Rendezvous for
+anonymizing low-latency
+Internet connections was suggested in early Onion Routing
+work [27], but the first published design was by Ian
+Goldberg [26]. His design differs from
+ours in three ways. First, Goldberg suggests that Alice should manually
+hunt down a current location of the service via Gnutella; our approach
+makes lookup transparent to the user, as well as faster and more robust.
+Second, in Tor the client and server negotiate session keys
+with Diffie-Hellman, so plaintext is not exposed even at the rendezvous
+point. Third,
+our design minimizes the exposure from running the
+service, to encourage volunteers to offer introduction and rendezvous
+services. Tor's introduction points do not output any bytes to the
+clients; the rendezvous points don't know the client or the server,
+and can't read the data being transmitted. The indirection scheme is
+also designed to include authentication/authorization: if Alice doesn't
+include the right cookie with her request for service, Bob need not even
+acknowledge his existence.
    +

    +6  Other design decisions

    + + + +
    +

    +6.1  Denial of service

    + + + +
    +Providing Tor as a public service creates many opportunities for +denial-of-service attacks against the network. While +flow control and rate limiting (discussed in +Section 4.6) prevent users from consuming more +bandwidth than routers are willing to provide, opportunities remain for +users to +consume more network resources than their fair share, or to render the +network unusable for others. + +
    +First of all, there are several CPU-consuming denial-of-service +attacks wherein an attacker can force an OR to perform expensive +cryptographic operations. For example, an attacker can +fake the start of a TLS handshake, forcing the OR to carry out its +(comparatively expensive) half of the handshake at no real computational +cost to the attacker. + +
    +We have not yet implemented any defenses for these attacks, but several +approaches are possible. First, ORs can +require clients to solve a puzzle [16] while beginning new +TLS handshakes or accepting create cells. So long as these +tokens are easy to verify and computationally expensive to produce, this +approach limits the attack multiplier. Additionally, ORs can limit +the rate at which they accept create cells and TLS connections, +so that +the computational work of processing them does not drown out the +symmetric cryptography operations that keep cells +flowing. This rate limiting could, however, allow an attacker +to slow down other users when they build new circuits. + +
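A hashcash-style client puzzle of the kind cited above works by making the client search for a nonce whose hash has a prescribed number of leading zero bits, while the server verifies with a single hash. This is a generic sketch of that idea, not a defense Tor has implemented; the difficulty parameter is illustrative.

```python
import hashlib
import itertools

def check_puzzle(challenge: bytes, nonce: int, bits: int) -> bool:
    # Server side: one hash, cheap to verify.
    h = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(h, "big") >> (256 - bits) == 0

def solve_puzzle(challenge: bytes, bits: int) -> int:
    # Client side: expected 2^bits hash attempts, expensive to produce.
    for nonce in itertools.count():
        if check_puzzle(challenge, nonce, bits):
            return nonce

nonce = solve_puzzle(b"tls-handshake-challenge", bits=12)
assert check_puzzle(b"tls-handshake-challenge", nonce, 12)
```

Raising `bits` by one doubles the attacker's expected work per connection while leaving the OR's verification cost at a single hash, which is the asymmetry the defense relies on.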
    + +
    +Adversaries can also attack the Tor network's hosts and network +links. Disrupting a single circuit or link breaks all streams passing +along that part of the circuit. Users similarly lose service +when a router crashes or its operator restarts it. The current +Tor design treats such attacks as intermittent network failures, and +depends on users and applications to respond or recover as appropriate. A +future design could use an end-to-end TCP-like acknowledgment protocol, +so no streams are lost unless the entry or exit point is +disrupted. This solution would require more buffering at the network +edges, however, and the performance and anonymity implications from this +extra complexity still require investigation. + +
    +

    +6.2  Exit policies and abuse

    + + + +
    + +
    +Exit abuse is a serious barrier to wide-scale Tor deployment. Anonymity +presents would-be vandals and abusers with an opportunity to hide +the origins of their activities. Attackers can harm the Tor network by +implicating exit servers for their abuse. Also, applications that commonly +use IP-based authentication (such as institutional mail or webservers) +can be fooled by the fact that anonymous connections appear to originate +at the exit OR. + +
    +We stress that Tor does not enable any new class of abuse. Spammers +and other attackers already have access to thousands of misconfigured +systems worldwide, and the Tor network is far from the easiest way +to launch attacks. +But because the +onion routers can be mistaken for the originators of the abuse, +and the volunteers who run them may not want to deal with the hassle of +explaining anonymity networks to irate administrators, we must block or limit +abuse through the Tor network. + +
+To mitigate abuse issues, each onion router's exit policy
+describes to which external addresses and ports the router will
+connect. On one end of the spectrum are open exit
+nodes that will connect anywhere. On the other end are middleman
+nodes that only relay traffic to other Tor nodes, and private exit
+nodes that only connect to a local host or network. A private
+exit can allow a client to connect to a given host or
+network more securely: an external adversary cannot eavesdrop on traffic
+between the private exit and the final destination, and so is less sure of
+Alice's destination and activities. Most onion routers in the current
+network function as
+restricted exits that permit connections to the world at large,
+but prevent access to certain abuse-prone addresses and services such
+as SMTP.
+The OR might also be able to authenticate clients to
+prevent exit abuse without harming anonymity [48].
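An exit policy of this kind can be modeled as an ordered list of accept/reject rules where the first match wins. The rule syntax below ("reject *:25", "accept *:*") follows Tor's convention, but the matcher is a simplified sketch: it handles only exact addresses and wildcards, not netmasks.

```python
def parse_policy(lines):
    # Each rule is "accept ADDR:PORT" or "reject ADDR:PORT", where '*'
    # matches any address or any port. Order matters: first match wins.
    policy = []
    for line in lines:
        action, pattern = line.split()
        addr, port = pattern.rsplit(":", 1)
        policy.append((action == "accept", addr, port))
    return policy

def exit_allows(policy, addr: str, port: int) -> bool:
    for allow, p_addr, p_port in policy:
        if p_addr in ("*", addr) and p_port in ("*", str(port)):
            return allow
    return True  # illustrative default when no rule matches

# A restricted exit: block SMTP relaying, allow everything else.
policy = parse_policy(["reject *:25", "accept *:*"])
assert not exit_allows(policy, "192.0.2.7", 25)
assert exit_allows(policy, "192.0.2.7", 80)
```

A middleman node would simply use the single rule "reject *:*", and a private exit would accept only its own host or network before a final reject-all rule.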
    + +
    +Many administrators use port restrictions to support only a +limited set of services, such as HTTP, SSH, or AIM. +This is not a complete solution, of course, since abuse opportunities for these +protocols are still well known. + +
    +We have not yet encountered any abuse in the deployed network, but if +we do we should consider using proxies to clean traffic for certain +protocols as it leaves the network. For example, much abusive HTTP +behavior (such as exploiting buffer overflows or well-known script +vulnerabilities) can be detected in a straightforward manner. +Similarly, one could run automatic spam filtering software (such as +SpamAssassin) on email exiting the OR network. + +
    +ORs may also rewrite exiting traffic to append +headers or other information indicating that the traffic has passed +through an anonymity service. This approach is commonly used +by email-only anonymity systems. ORs can also +run on servers with hostnames like anonymous to further +alert abuse targets to the nature of the anonymous traffic. + +
+A mixture of open and restricted exit nodes allows the most
+flexibility for volunteers running servers. But while having many
+middleman nodes provides a large and robust network,
+having only a few exit nodes reduces the number of points
+an adversary needs to monitor for traffic analysis, and places a
+greater burden on the exit nodes. This tension can be seen in the
+Java Anon Proxy
+cascade model, wherein only one node in each cascade needs to handle
+abuse complaints, but an adversary only needs to observe the entry
+and exit of a cascade to perform traffic analysis on all that
+cascade's users. The hydra model (many entries, few exits) presents a
+different compromise: only a few exit nodes are needed, but an
+adversary needs to work harder to watch all the clients; see
+Section 10.
    +Finally, we note that exit abuse must not be dismissed as a peripheral +issue: when a system's public image suffers, it can reduce the number +and diversity of that system's users, and thereby reduce the anonymity +of the system itself. Like usability, public perception is a +security parameter. Sadly, preventing abuse of open exit nodes is an +unsolved problem, and will probably remain an arms race for the +foreseeable future. The abuse problems faced by Princeton's CoDeeN +project [37] give us a glimpse of likely issues. + +
    +

    +6.3  Directory Servers

    + + + +
    +First-generation Onion Routing designs [8,41] used +in-band network status updates: each router flooded a signed statement +to its neighbors, which propagated it onward. But anonymizing networks +have different security goals than typical link-state routing protocols. +For example, delays (accidental or intentional) +that can cause different parts of the network to have different views +of link-state and topology are not only inconvenient: they give +attackers an opportunity to exploit differences in client knowledge. +We also worry about attacks to deceive a +client about the router membership list, topology, or current network +state. Such partitioning attacks on client knowledge help an +adversary to efficiently deploy resources +against a target [15]. + +
    +Tor uses a small group of redundant, well-known onion routers to +track changes in network topology and node state, including keys and +exit policies. Each such directory server acts as an HTTP +server, so clients can fetch current network state +and router lists, and so other ORs can upload +state information. Onion routers periodically publish signed +statements of their state to each directory server. The directory servers +combine this information with their own views of network liveness, +and generate a signed description (a directory) of the entire +network state. Client software is +pre-loaded with a list of the directory servers and their keys, +to bootstrap each client's view of the network. + +
    +When a directory server receives a signed statement for an OR, it +checks whether the OR's identity key is recognized. Directory +servers do not advertise unrecognized ORs-if they did, +an adversary could take over the network by creating many +servers [22]. Instead, new nodes must be approved by the +directory +server administrator before they are included. Mechanisms for automated +node approval are an area of active research, and are discussed more +in Section 9. + +
+Of course, a variety of attacks remain. An adversary who controls
+a directory server can track clients by providing them different
+information, perhaps by listing only nodes under its control, or by
+informing only certain clients about a given node. Even an external
+adversary can exploit differences in client knowledge: clients who use
+a node listed on one directory server but not the others are vulnerable.
    +Thus these directory servers must be synchronized and redundant, so +that they can agree on a common directory. Clients should only trust +this directory if it is signed by a threshold of the directory +servers. + +
    +The directory servers in Tor are modeled after those in +Mixminion [15], but our situation is easier. First, +we make the +simplifying assumption that all participants agree on the set of +directory servers. Second, while Mixminion needs to predict node +behavior, Tor only needs a threshold consensus of the current +state of the network. Third, we assume that we can fall back to the +human administrators to discover and resolve problems when a consensus +directory cannot be reached. Since there are relatively few directory +servers (currently 3, but we expect as many as 9 as the network scales), +we can afford operations like broadcast to simplify the consensus-building +protocol. + +
    +To avoid attacks where a router connects to all the directory servers +but refuses to relay traffic from other routers, the directory servers +must also build circuits and use them to anonymously test router +reliability [18]. Unfortunately, this defense is not yet +designed or +implemented. + +
    +Using directory servers is simpler and more flexible than flooding. +Flooding is expensive, and complicates the analysis when we +start experimenting with non-clique network topologies. Signed +directories can be cached by other +onion routers, +so directory servers are not a performance +bottleneck when we have many users, and do not aid traffic analysis by +forcing clients to announce their existence to any +central point. + +
    +

    +7  Attacks and Defenses

    + + + +
    +Below we summarize a variety of attacks, and discuss how well our +design withstands them.
    + +
    +Passive attacks
    +Observing user traffic patterns. Observing a user's connection +will not reveal her destination or data, but it will +reveal traffic patterns (both sent and received). Profiling via user +connection patterns requires further processing, because multiple +application streams may be operating simultaneously or in series over +a single circuit. + +
    +Observing user content. While content at the user end is encrypted, +connections to responders may not be (indeed, the responding website +itself may be hostile). While filtering content is not a primary goal +of Onion Routing, Tor can directly use Privoxy and related +filtering services to anonymize application data streams. + +
    +Option distinguishability. We allow clients to choose +configuration options. For example, clients concerned about request +linkability should rotate circuits more often than those concerned +about traceability. Allowing choice may attract users with different +needs; but clients who are +in the minority may lose more anonymity by appearing distinct than they +gain by optimizing their behavior [1]. + +
    +End-to-end timing correlation. Tor only minimally hides +such correlations. An attacker watching patterns of +traffic at the initiator and the responder will be +able to confirm the correspondence with high probability. The +greatest protection currently available against such confirmation is to hide +the connection between the onion proxy and the first Tor node, +by running the OP on the Tor node or behind a firewall. This approach +requires an observer to separate traffic originating at the onion +router from traffic passing through it: a global observer can do this, +but it might be beyond a limited observer's capabilities. + +
    +End-to-end size correlation. Simple packet counting +will also be effective in confirming +endpoints of a stream. However, even without padding, we may have some +limited protection: the leaky pipe topology means different numbers +of packets may enter one end of a circuit than exit at the other. + +
    +Website fingerprinting. All the effective passive +attacks above are traffic confirmation attacks, +which puts them outside our design goals. There is also +a passive traffic analysis attack that is potentially effective. +Rather than searching exit connections for timing and volume +correlations, the adversary may build up a database of +"fingerprints" containing file sizes and access patterns for +targeted websites. He can later confirm a user's connection to a given +site simply by consulting the database. This attack has +been shown to be effective against SafeWeb [29]. +It may be less effective against Tor, since +streams are multiplexed within the same circuit, and +fingerprinting will be limited to +the granularity of cells (currently 512 bytes). Additional +defenses could include +larger cell sizes, padding schemes to group websites +into large sets, and link +padding or long-range dummies.4
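To illustrate why cell granularity blunts fingerprinting, consider how transfer sizes collapse into cell counts. The sketch below is illustrative only: the page sizes are hypothetical, and 498 bytes is the payload capacity of a 512-byte cell (the figure used in Section 8).

```python
import math

CELL_PAYLOAD = 498  # bytes of payload per 512-byte Tor cell

def cells_needed(size_bytes: int) -> int:
    """Number of relay cells required to carry size_bytes of data."""
    return math.ceil(size_bytes / CELL_PAYLOAD)

# Hypothetical page sizes a fingerprinting adversary might record.
pages = {"siteA": 10_100, "siteB": 10_350, "siteC": 14_900}

# At cell granularity, nearby sizes collapse into the same observation,
# shrinking the adversary's fingerprint resolution.
for name, size in pages.items():
    print(name, size, "->", cells_needed(size), "cells")
```

Here siteA and siteB, though 250 bytes apart, both appear as 21 cells to an observer who can only count cells.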
    + +
    +Active attacks
    +Compromise keys. An attacker who learns the TLS session key can +see control cells and encrypted relay cells on every circuit on that +connection; learning a circuit +session key lets him unwrap one layer of the encryption. An attacker +who learns an OR's TLS private key can impersonate that OR for the TLS +key's lifetime, but he must +also learn the onion key to decrypt create cells (and because of +perfect forward secrecy, he cannot hijack already established circuits +without also compromising their session keys). Periodic key rotation +limits the window of opportunity for these attacks. On the other hand, +an attacker who learns a node's identity key can replace that node +indefinitely by sending new forged descriptors to the directory servers. + +
    +Iterated compromise. A roving adversary who can +compromise ORs (by system intrusion, legal coercion, or extralegal +coercion) could march down the circuit compromising the +nodes until he reaches the end. Unless the adversary can complete +this attack within the lifetime of the circuit, however, the ORs +will have discarded the necessary information before the attack can +be completed. (Thanks to the perfect forward secrecy of session +keys, the attacker cannot force nodes to decrypt recorded +traffic once the circuits have been closed.) Additionally, building +circuits that cross jurisdictions can make legal coercion +harder-this phenomenon is commonly called "jurisdictional +arbitrage." The Java Anon Proxy project recently experienced the +need for this approach, when +a German court forced them to add a backdoor to +their nodes [51]. + +
    +Run a recipient. An adversary running a webserver +trivially learns the timing patterns of users connecting to it, and +can introduce arbitrary patterns in its responses. +End-to-end attacks become easier: if the adversary can induce +users to connect to his webserver (perhaps by advertising +content targeted to those users), he now holds one end of their +connection. There is also a danger that application +protocols and associated programs can be induced to reveal information +about the initiator. Tor depends on Privoxy and similar protocol cleaners +to solve this latter problem. + +
    +Run an onion proxy. It is expected that end users will +nearly always run their own local onion proxy. However, in some +settings, it may be necessary for the proxy to run +remotely-typically, in institutions that want +to monitor the activity of those connecting to the proxy. +Compromising an onion proxy compromises all future connections +through it. + +
+DoS non-observed nodes. An observer who can only watch some
of the Tor network can increase the value of the traffic
he observes by attacking non-observed nodes to shut them down, reduce
their reliability, or persuade users that they are not trustworthy.
The best defense here is robustness.
+Run a hostile OR. In addition to being a local observer,
an isolated hostile node can create circuits through itself, or alter
traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
node must be immediately adjacent to both endpoints to compromise the
anonymity of a circuit. If an adversary can
run multiple ORs, and can persuade the directory servers
that those ORs are trustworthy and independent, then occasionally
some user will choose one of those ORs for the start and another
as the end of a circuit. If an adversary
controls m > 1 of N nodes, he can correlate at most
(m/N)² of the traffic-although an
adversary
could still attract a disproportionately large amount of traffic
by running an OR with a permissive exit policy, or by
degrading the reliability of other routers.
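The (m/N)² bound can be checked with a quick simulation. The sketch below assumes a simplified model in which entry and exit nodes are drawn uniformly and independently; it is not Tor's actual path-selection code.

```python
import random

def correlated_fraction(m: int, n: int, trials: int = 100_000) -> float:
    """Estimate the fraction of circuits whose entry AND exit node are
    hostile, with both drawn uniformly from n nodes (simplified model)."""
    hostile = set(range(m))
    hits = sum(
        1
        for _ in range(trials)
        if random.randrange(n) in hostile and random.randrange(n) in hostile
    )
    return hits / trials

random.seed(0)
# With m = 3 hostile nodes out of n = 30, the bound is (3/30)**2 = 0.01.
print(correlated_fraction(3, 30))
```

The estimate converges on (m/N)², matching the bound stated above.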
+Introduce timing into messages. This is simply a stronger
version of the passive timing attacks discussed earlier.
    +Tagging attacks. A hostile node could "tag" a +cell by altering it. If the +stream were, for example, an unencrypted request to a Web site, +the garbled content coming out at the appropriate time would confirm +the association. However, integrity checks on cells prevent +this attack. + +
    +Replace contents of unauthenticated protocols. When +relaying an unauthenticated protocol like HTTP, a hostile exit node +can impersonate the target server. Clients +should prefer protocols with end-to-end authentication. + +
    +Replay attacks. Some anonymity protocols are vulnerable +to replay attacks. Tor is not; replaying one side of a handshake +will result in a different negotiated session key, and so the rest +of the recorded session can't be used. + +
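The handshake property can be illustrated with a toy Diffie-Hellman exchange (toy parameters, far too small for real use; Tor's actual handshake differs in detail). Because the responder contributes fresh randomness every time, replaying the recorded client half yields a different session key:

```python
import secrets

# Toy Diffie-Hellman over a small prime field (illustration only; the
# real handshake uses properly sized groups and key derivation).
P = 2**127 - 1   # a Mersenne prime
G = 3

def respond(client_pub: int) -> tuple[int, int]:
    """Server side: pick a FRESH ephemeral secret for every handshake."""
    y = secrets.randbelow(P - 2) + 1
    server_pub = pow(G, y, P)
    session_key = pow(client_pub, y, P)
    return server_pub, session_key

# Client's half of the handshake, as captured by an attacker.
x = secrets.randbelow(P - 2) + 1
client_pub = pow(G, x, P)

_, key1 = respond(client_pub)   # original session
_, key2 = respond(client_pub)   # attacker replays the same client_pub
print(key1 != key2)             # True: fresh server randomness, new key
```

Since the replayed session negotiates a different key, the rest of the recorded traffic fails to decrypt, exactly as the text claims.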
    +Smear attacks. An attacker could use the Tor network for +socially disapproved acts, to bring the +network into disrepute and get its operators to shut it down. +Exit policies reduce the possibilities for abuse, but +ultimately the network requires volunteers who can tolerate +some political heat. + +
    +Distribute hostile code. An attacker could trick users +into running subverted Tor software that did not, in fact, anonymize +their connections-or worse, could trick ORs into running weakened +software that provided users with less anonymity. We address this +problem (but do not solve it completely) by signing all Tor releases +with an official public key, and including an entry in the directory +that lists which versions are currently believed to be secure. To +prevent an attacker from subverting the official release itself +(through threats, bribery, or insider attacks), we provide all +releases in source code form, encourage source audits, and +frequently warn our users never to trust any software (even from +us) that comes without source.
    + +
    +Directory attacks
    +Destroy directory servers. If a few directory +servers disappear, the others still decide on a valid +directory. So long as any directory servers remain in operation, +they will still broadcast their views of the network and generate a +consensus directory. (If more than half are destroyed, this +directory will not, however, have enough signatures for clients to +use it automatically; human intervention will be necessary for +clients to decide whether to trust the resulting directory.) + +
    +Subvert a directory server. By taking over a directory server, +an attacker can partially influence the final directory. Since ORs +are included or excluded by majority vote, the corrupt directory can +at worst cast a tie-breaking vote to decide whether to include +marginal ORs. It remains to be seen how often such marginal cases +occur in practice. + +
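The majority-vote inclusion rule can be sketched as follows; the router names and ballots are hypothetical, and this simplifies the actual directory agreement protocol:

```python
def consensus(votes: dict[str, list[bool]]) -> list[str]:
    """Include an OR iff a strict majority of directory servers vote
    for it (simplified model of the voting rule described above)."""
    included = []
    for router, ballots in votes.items():
        if sum(ballots) * 2 > len(ballots):
            included.append(router)
    return included

# Three directory servers; suppose one is corrupt and votes oppositely.
votes = {
    "orA": [True, True, False],   # two honest votes outweigh the corrupt one
    "orB": [True, False, False],  # excluded by majority
}
print(consensus(votes))  # ['orA']
```

With honest servers in the majority, a single corrupt directory can at most swing ORs whose honest votes are already split.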
    +Subvert a majority of directory servers. An adversary who controls +more than half the directory servers can include as many compromised +ORs in the final directory as he wishes. We must ensure that directory +server operators are independent and attack-resistant. + +
    +Encourage directory server dissent. The directory +agreement protocol assumes that directory server operators agree on +the set of directory servers. An adversary who can persuade some +of the directory server operators to distrust one another could +split the quorum into mutually hostile camps, thus partitioning +users based on which directory they use. Tor does not address +this attack. + +
    +Trick the directory servers into listing a hostile OR. +Our threat model explicitly assumes directory server operators will +be able to filter out most hostile ORs. + +
    +Convince the directories that a malfunctioning OR is +working. In the current Tor implementation, directory servers +assume that an OR is running correctly if they can start a TLS +connection to it. A hostile OR could easily subvert this test by +accepting TLS connections from ORs but ignoring all cells. Directory +servers must actively test ORs by building circuits and streams as +appropriate. The tradeoffs of a similar approach are discussed +in [18].
    + +
    +Attacks against rendezvous points
    +Make many introduction requests. An attacker could +try to deny Bob service by flooding his introduction points with +requests. Because the introduction points can block requests that +lack authorization tokens, however, Bob can restrict the volume of +requests he receives, or require a certain amount of computation for +every request he receives. + +
    +Attack an introduction point. An attacker could +disrupt a location-hidden service by disabling its introduction +points. But because a service's identity is attached to its public +key, the service can simply re-advertise +itself at a different introduction point. Advertisements can also be +done secretly so that only high-priority clients know the address of +Bob's introduction points or so that different clients know of different +introduction points. This forces the attacker to disable all possible +introduction points. + +
    +Compromise an introduction point. An attacker who controls +Bob's introduction point can flood Bob with +introduction requests, or prevent valid introduction requests from +reaching him. Bob can notice a flood, and close the circuit. To notice +blocking of valid requests, however, he should periodically test the +introduction point by sending rendezvous requests and making +sure he receives them. + +
    +Compromise a rendezvous point. A rendezvous +point is no more sensitive than any other OR on +a circuit, since all data passing through the rendezvous is encrypted +with a session key shared by Alice and Bob. + +
    +

    +8  Early experiences: Tor in the Wild

    + + + +
    +As of mid-May 2004, the Tor network consists of 32 nodes +(24 in the US, 8 in Europe), and more are joining each week as the code +matures. (For comparison, the current remailer network +has about 40 nodes.) Each node has at least a 768Kb/768Kb connection, and +many have 10Mb. The number of users varies (and of course, it's hard to +tell for sure), but we sometimes have several hundred users-administrators at +several companies have begun sending their entire departments' web +traffic through Tor, to block other divisions of +their company from reading their traffic. Tor users have reported using +the network for web browsing, FTP, IRC, AIM, Kazaa, SSH, and +recipient-anonymous email via rendezvous points. One user has anonymously +set up a Wiki as a hidden service, where other users anonymously publish +the addresses of their hidden services. + +
    +Each Tor node currently processes roughly 800,000 relay +cells (a bit under half a gigabyte) per week. On average, about 80% +of each 498-byte payload is full for cells going back to the client, +whereas about 40% is full for cells coming from the client. (The difference +arises because most of the network's traffic is web browsing.) Interactive +traffic like SSH brings down the average a lot-once we have more +experience, and assuming we can resolve the anonymity issues, we may +partition traffic into two relay cell sizes: one to handle +bulk traffic and one for interactive traffic. + +
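The arithmetic behind these figures can be checked directly; the even split between directions below is an assumption made for the sketch.

```python
CELLS_PER_WEEK = 800_000
CELL_SIZE = 512     # bytes per cell on the wire
PAYLOAD = 498       # payload bytes per cell

raw = CELLS_PER_WEEK * CELL_SIZE
print(raw / 1e9)    # 0.4096 GB: "a bit under half a gigabyte" per week

# Average useful payload, assuming half the cells travel each direction,
# ~80% full toward the client and ~40% full from the client:
useful = CELLS_PER_WEEK * PAYLOAD * (0.8 + 0.4) / 2
print(round(useful / 1e9, 3))   # roughly 0.24 GB of useful data
```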
+Thanks in part to our restrictive default exit policy (we
reject SMTP requests) and to our low profile, we have had no abuse
issues since the network was deployed in October
2003. Our slow growth rate gives us time to add features,
resolve bugs, and get a feel for what users actually want from an
anonymity system. Even though having more users would bolster our
anonymity sets, we are not eager to attract the Kazaa or warez
communities-we feel that we must build a reputation for privacy, human
rights, research, and other socially laudable activities.
+As for performance, profiling shows that Tor spends almost
all its CPU time in AES, which is fast. Current latency is attributable
to two factors. First, network latency is critical: we are
intentionally bouncing traffic around the world several times. Second,
our end-to-end congestion control algorithm focuses on protecting
volunteer servers from accidental DoS rather than on optimizing
performance. To quantify these effects, we ran some informal tests using a network of 4
nodes on the same machine (a heavily loaded 1GHz Athlon). We downloaded a 60
megabyte file from debian.org every 30 minutes for 54 hours (108 sample
points). It arrived in about 300 seconds on average, compared to 210s for a
direct download. We ran a similar test on the production Tor network,
fetching the front page of cnn.com (55 kilobytes):
while a direct
download consistently took about 0.3s, performance through Tor varied.
Some downloads were as fast as 0.4s, with a median of 2.8s, and
90% finishing within 5.3s. It seems that as the network expands, the chance
of building a slow circuit (one that includes a slow or heavily loaded node
or link) increases. On the other hand, so long as our users remain satisfied
with this increased latency, we can address performance incrementally as we
proceed with development.
    + +
    + +
    +Although Tor's clique topology and full-visibility directories present +scaling problems, we still expect the network to support a few hundred +nodes and maybe 10,000 users before we're forced to become +more distributed. With luck, the experience we gain running the current +topology will help us choose among alternatives when the time comes. + +
    +

    +9  Open Questions in Low-latency Anonymity

    + + + +
+In addition to the non-goals in
Section 3, many questions must be answered
before we can be confident of Tor's security.
    +Many of these open issues are questions of balance. For example, +how often should users rotate to fresh circuits? Frequent rotation +is inefficient, expensive, and may lead to intersection attacks and +predecessor attacks [54], but infrequent rotation makes the +user's traffic linkable. Besides opening fresh circuits, clients can +also exit from the middle of the circuit, +or truncate and re-extend the circuit. More analysis is +needed to determine the proper tradeoff. + +
    + +
    +How should we choose path lengths? If Alice always uses two hops, +then both ORs can be certain that by colluding they will learn about +Alice and Bob. In our current approach, Alice always chooses at least +three nodes unrelated to herself and her destination. +Should Alice choose a random path length (e.g. from a geometric +distribution) to foil an attacker who +uses timing to learn that he is the fifth hop and thus concludes that +both Alice and the responder are running ORs? + +
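One hypothetical form such a policy could take is sketched below. This is an illustration only: the minimum of three hops matches the text above, but the continuation probability is an assumption, and deployed Tor uses a fixed-length path.

```python
import random

def sample_path_length(min_hops: int = 3, p: float = 0.5) -> int:
    """Draw a path length of at least min_hops, adding extra hops
    geometrically: after the minimum, keep extending with probability p.
    (Hypothetical policy for the question above; not Tor's behavior.)"""
    length = min_hops
    while random.random() < p:
        length += 1
    return length

random.seed(1)
samples = [sample_path_length() for _ in range(10_000)]
# Minimum stays 3; the mean approaches 3 + p/(1-p) = 4 extra-hop-adjusted.
print(min(samples), sum(samples) / len(samples))
```

Such randomization would deny an attacker certainty about his position in the circuit, at the cost of occasional long, slow paths.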
    +Throughout this paper, we have assumed that end-to-end traffic +confirmation will immediately and automatically defeat a low-latency +anonymity system. Even high-latency anonymity systems can be +vulnerable to end-to-end traffic confirmation, if the traffic volumes +are high enough, and if users' habits are sufficiently +distinct [14,31]. Can anything be +done to +make low-latency systems resist these attacks as well as high-latency +systems? Tor already makes some effort to conceal the starts and ends of +streams by wrapping long-range control commands in identical-looking +relay cells. Link padding could frustrate passive observers who count +packets; long-range padding could work against observers who own the +first hop in a circuit. But more research remains to find an efficient +and practical approach. Volunteers prefer not to run constant-bandwidth +padding; but no convincing traffic shaping approach has been +specified. Recent work on long-range padding [33] +shows promise. One could also try to reduce correlation in packet timing +by batching and re-ordering packets, but it is unclear whether this could +improve anonymity without introducing so much latency as to render the +network unusable. + +
    +A cascade topology may better defend against traffic confirmation by +aggregating users, and making padding and +mixing more affordable. Does the hydra topology (many input nodes, +few output nodes) work better against some adversaries? Are we going +to get a hydra anyway because most nodes will be middleman nodes? + +
    +Common wisdom suggests that Alice should run her own OR for best +anonymity, because traffic coming from her node could plausibly have +come from elsewhere. How much mixing does this approach need? Is it +immediately beneficial because of real-world adversaries that can't +observe Alice's router, but can run routers of their own? + +
+To scale to many users, and to prevent an attacker from observing the
whole network, it may be necessary
to support far more servers than Tor currently anticipates.
This introduces several issues. First, if approval by a central set
of directory servers is no longer feasible, what mechanism should be used
to prevent adversaries from signing up many colluding servers? Second,
if clients can no longer have a complete picture of the network,
how can they perform discovery while preventing attackers from
manipulating or exploiting gaps in their knowledge? Third, if there
are too many servers for every server to constantly communicate with
every other, which non-clique topology should the network use?
(Restricted-route topologies promise comparable anonymity with better
scalability [13], but whatever topology we choose, we
need some way to keep attackers from manipulating their position within
it [21].) Fourth, if no central authority is tracking
server reliability, how do we stop unreliable servers from making
the network unusable? Fifth, do clients receive so much anonymity
from running their own ORs that we should expect them all to do
so [1], or do we need another incentive structure to
motivate them? Tarzan [24] and MorphMix [43] present possible solutions.
    + +
    +When a Tor node goes down, all its circuits (and thus streams) must break. +Will users abandon the system because of this brittleness? How well +does the method in Section 6.1 allow streams to survive +node failure? If affected users rebuild circuits immediately, how much +anonymity is lost? It seems the problem is even worse in a peer-to-peer +environment-such systems don't yet provide an incentive for peers to +stay connected when they're done retrieving content, so we would expect +a higher churn rate. + +
    + +
    +

    +10  Future Directions

    + + + +
    +Tor brings together many innovations into a unified deployable system. The +next immediate steps include: + +
    +Scalability: Tor's emphasis on deployability and design simplicity +has led us to adopt a clique topology, semi-centralized +directories, and a full-network-visibility model for client +knowledge. These properties will not scale past a few hundred servers. +Section 9 describes some promising +approaches, but more deployment experience will be helpful in learning +the relative importance of these bottlenecks. + +
    +Bandwidth classes: This paper assumes that all ORs have +good bandwidth and latency. We should instead adopt the MorphMix model, +where nodes advertise their bandwidth level (DSL, T1, T3), and +Alice avoids bottlenecks by choosing nodes that match or +exceed her bandwidth. In this way DSL users can usefully join the Tor +network. + +
    +Incentives: Volunteers who run nodes are rewarded with publicity +and possibly better anonymity [1]. More nodes means increased +scalability, and more users can mean more anonymity. We need to continue +examining the incentive structures for participating in Tor. Further, +we need to explore more approaches to limiting abuse, and understand +why most people don't bother using privacy systems. + +
    +Cover traffic: Currently Tor omits cover traffic-its costs +in performance and bandwidth are clear but its security benefits are +not well understood. We must pursue more research on link-level cover +traffic and long-range cover traffic to determine whether some simple padding +method offers provable protection against our chosen adversary. + +
    + +
    +Caching at exit nodes: Perhaps each exit node should run a +caching web proxy [47], to improve anonymity for cached pages +(Alice's request never +leaves the Tor network), to improve speed, and to reduce bandwidth cost. +On the other hand, forward security is weakened because caches +constitute a record of retrieved files. We must find the right +balance between usability and security. + +
    +Better directory distribution: +Clients currently download a description of +the entire network every 15 minutes. As the state grows larger +and clients more numerous, we may need a solution in which +clients receive incremental updates to directory state. +More generally, we must find more +scalable yet practical ways to distribute up-to-date snapshots of +network status without introducing new attacks. + +
    +Further specification review: Our public +byte-level specification [20] needs +external review. We hope that as Tor +is deployed, more people will examine its +specification. + +
    +Multisystem interoperability: We are currently working with the +designer of MorphMix to unify the specification and implementation of +the common elements of our two systems. So far, this seems +to be relatively straightforward. Interoperability will allow testing +and direct comparison of the two designs for trust and scalability. + +
    +Wider-scale deployment: The original goal of Tor was to +gain experience in deploying an anonymizing overlay network, and +learn from having actual users. We are now at a point in design +and development where we can start deploying a wider network. Once +we have many actual users, we will doubtlessly be better +able to evaluate some of our design decisions, including our +robustness/latency tradeoffs, our performance tradeoffs (including +cell size), our abuse-prevention mechanisms, and +our overall usability. + +
    + +
    + +

    Acknowledgments

+ We thank Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis,
 John Bashinski, and Zack Brown
 for editing and comments;
 Matej Pfajfar, Andrei Serjantov, and Marc Rennhard for design discussions;
 Bram Cohen for congestion control discussions;
 Adam Back for suggesting telescoping circuits; and
 Cathy Meadows for formal analysis of the extend protocol.
 This work has been supported by ONR and DARPA.
    + +
    + +
    +

    References

    + +
    +
    [1]
    +A. Acquisti, R. Dingledine, and P. Syverson. + On the economics of anonymity. + In R. N. Wright, editor, Financial Cryptography. + Springer-Verlag, LNCS 2742, 2003. + +
    +
    +
    [2]
    +R. Anderson. + The eternity service. + In Pragocrypt '96, 1996. + +
    +
    +
    [3]
    +The Anonymizer. + <http://anonymizer.com/>. + +
    +
    +
    [4]
    +A. Back, I. Goldberg, and A. Shostack. + Freedom systems 2.1 security issues and analysis. + White paper, Zero Knowledge Systems, Inc., May 2001. + +
    +
    +
    [5]
    +A. Back, U. Möller, and A. Stiglic. + Traffic analysis attacks and trade-offs in anonymity providing + systems. + In I. S. Moskowitz, editor, Information Hiding (IH 2001), pages + 245-257. Springer-Verlag, LNCS 2137, 2001. + +
    +
    +
    [6]
    +M. Bellare, P. Rogaway, and D. Wagner. + The EAX mode of operation: A two-pass authenticated-encryption + scheme optimized for simplicity and efficiency. + In Fast Software Encryption 2004, February 2004. + +
    +
    +
    [7]
    +O. Berthold, H. Federrath, and S. Köpsell. + Web MIXes: A system for anonymous and unobservable Internet + access. + In H. Federrath, editor, Designing Privacy Enhancing + Technologies: Workshop on Design Issue in Anonymity and Unobservability. + Springer-Verlag, LNCS 2009, 2000. + +
    +
    +
    [8]
    +P. Boucher, A. Shostack, and I. Goldberg. + Freedom systems 2.0 architecture. + White paper, Zero Knowledge Systems, Inc., December 2000. + +
    +
    +
    [9]
    +Z. Brown. + Cebolla: Pragmatic IP Anonymity. + In Ottawa Linux Symposium, June 2002. + +
    +
    +
    [10]
+D. Chaum.
 Untraceable electronic mail, return addresses, and digital
 pseudonyms.
 Communications of the ACM, 24(2), February 1981.
    +
    +
    [11]
    +F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. + Wide-area cooperative storage with CFS. + In 18th ACM Symposium on Operating Systems Principles + (SOSP '01), Chateau Lake Louise, Banff, Canada, October 2001. + +
    +
    +
    [12]
+W. Dai.
 Pipenet 1.1.
 Usenet post, August 1996.
 <http://www.eskimo.com/~weidai/pipenet.txt>. First mentioned in a
 post to the cypherpunks list, Feb. 1995.
    +
    +
    [13]
    +G. Danezis. + Mix-networks with restricted routes. + In R. Dingledine, editor, Privacy Enhancing Technologies (PET + 2003). Springer-Verlag LNCS 2760, 2003. + +
    +
    +
    [14]
    +G. Danezis. + Statistical disclosure attacks. + In Security and Privacy in the Age of Uncertainty (SEC2003), + pages 421-426, Athens, May 2003. IFIP TC11, Kluwer. + +
    +
    +
    [15]
    +G. Danezis, R. Dingledine, and N. Mathewson. + Mixminion: Design of a type III anonymous remailer protocol. + In 2003 IEEE Symposium on Security and Privacy, pages 2-15. + IEEE CS, May 2003. + +
    +
    +
    [16]
    +D. Dean and A. Stubblefield. + Using Client Puzzles to Protect TLS. + In Proceedings of the 10th USENIX Security Symposium. USENIX, + Aug. 2001. + +
    +
    +
    [17]
    +T. Dierks and C. Allen. + The TLS Protocol - Version 1.0. + IETF RFC 2246, January 1999. + +
    +
    +
    [18]
    +R. Dingledine, M. J. Freedman, D. Hopwood, and D. Molnar. + A Reputation System to Increase MIX-net Reliability. + In I. S. Moskowitz, editor, Information Hiding (IH 2001), pages + 126-141. Springer-Verlag, LNCS 2137, 2001. + +
    +
    +
    [19]
    +R. Dingledine, M. J. Freedman, and D. Molnar. + The free haven project: Distributed anonymous storage service. + In H. Federrath, editor, Designing Privacy Enhancing + Technologies: Workshop on Design Issue in Anonymity and Unobservability. + Springer-Verlag, LNCS 2009, July 2000. + +
    +
    +
    [20]
    +R. Dingledine and N. Mathewson. + Tor protocol specifications. + <http://freehaven.net/tor/tor-spec.txt>. + +
    +
    +
    [21]
    +R. Dingledine and P. Syverson. + Reliable MIX Cascade Networks through Reputation. + In M. Blaze, editor, Financial Cryptography. Springer-Verlag, + LNCS 2357, 2002. + +
    +
    +
    [22]
    +J. Douceur. + The Sybil Attack. + In Proceedings of the 1st International Peer To Peer Systems + Workshop (IPTPS), Mar. 2002. + +
    +
    +
    [23]
    +H. Federrath, A. Jerichow, and A. Pfitzmann. + MIXes in mobile communication systems: Location management with + privacy. + In R. Anderson, editor, Information Hiding, First International + Workshop, pages 121-135. Springer-Verlag, LNCS 1174, May 1996. + +
    +
    +
    [24]
    +M. J. Freedman and R. Morris. + Tarzan: A peer-to-peer anonymizing network layer. + In 9th ACM Conference on Computer and Communications + Security (CCS 2002), Washington, DC, November 2002. + +
    +
    +
    [25]
    +S. Goel, M. Robson, M. Polte, and E. G. Sirer. + Herbivore: A scalable and efficient protocol for anonymous + communication. + Technical Report TR2003-1890, Cornell University Computing and + Information Science, February 2003. + +
    +
    +
    [26]
    +I. Goldberg. + A Pseudonymous Communications Infrastructure for the Internet. + PhD thesis, UC Berkeley, Dec 2000. + +
    +
    +
    [27]
    +D. M. Goldschlag, M. G. Reed, and P. F. Syverson. + Hiding routing information. + In R. Anderson, editor, Information Hiding, First International + Workshop, pages 137-150. Springer-Verlag, LNCS 1174, May 1996. + +
    +
    +
    [28]
    +C. Gülcü and G. Tsudik. + Mixing E-mail with Babel. + In Network and Distributed Security Symposium (NDSS 96), + pages 2-16. IEEE, February 1996. + +
    +
    +
    [29]
    +A. Hintz. + Fingerprinting websites using traffic analysis. + In R. Dingledine and P. Syverson, editors, Privacy Enhancing + Technologies (PET 2002), pages 171-178. Springer-Verlag, LNCS 2482, 2002. + +
    +
    +
    [30]
    +A. Jerichow, J. Müller, A. Pfitzmann, B. Pfitzmann, and M. Waidner. + Real-time mixes: A bandwidth-efficient anonymity protocol. + IEEE Journal on Selected Areas in Communications, + 16(4):495-509, May 1998. + +
    +
    +
    [31]
    +D. Kesdogan, D. Agrawal, and S. Penz. + Limits of anonymity in open environments. + In F. Petitcolas, editor, Information Hiding Workshop (IH + 2002). Springer-Verlag, LNCS 2578, October 2002. + +
    +
    +
    [32]
    +D. Koblas and M. R. Koblas. + SOCKS. + In UNIX Security III Symposium (1992 USENIX Security + Symposium), pages 77-83. USENIX, 1992. + +
    +
    +
    [33]
    +B. N. Levine, M. K. Reiter, C. Wang, and M. Wright. + Timing analysis in low-latency mix-based systems. + In A. Juels, editor, Financial Cryptography. Springer-Verlag, + LNCS (forthcoming), 2004. + +
    +
    +
    [34]
    +B. N. Levine and C. Shields. + Hordes: A multicast-based protocol for anonymity. + Journal of Computer Security, 10(3):213-240, 2002. + +
    +
    +
    [35]
    +C. Meadows. + The NRL protocol analyzer: An overview. + Journal of Logic Programming, 26(2):113-131, 1996. + +
    +
    +
    [36]
    +U. Möller, L. Cottrell, P. Palfrader, and L. Sassaman. + Mixmaster Protocol - Version 2. + Draft, July 2003. + <http://www.abditum.com/mixmaster-spec.txt>. + +
    +
    +
    [37]
    +V. S. Pai, L. Wang, K. Park, R. Pang, and L. Peterson. + The Dark Side of the Web: An Open Proxy's View. + <http://codeen.cs.princeton.edu/>. + +
    +
    +
    [38]
    +A. Pfitzmann, B. Pfitzmann, and M. Waidner. + ISDN-mixes: Untraceable communication with very small bandwidth + overhead. + In GI/ITG Conference on Communication in Distributed Systems, + pages 451-463, February 1991. + +
    +
    +
    [39]
    +Privoxy. + <http://www.privoxy.org/>. + +
    +
    +
    [40]
    +M. G. Reed, P. F. Syverson, and D. M. Goldschlag. + Protocols using anonymous connections: Mobile applications. + In B. Christianson, B. Crispo, M. Lomas, and M. Roe, editors, + Security Protocols: 5th International Workshop, pages 13-23. + Springer-Verlag, LNCS 1361, April 1997. + +
    +
    +
    [41]
    +M. G. Reed, P. F. Syverson, and D. M. Goldschlag. + Anonymous connections and onion routing. + IEEE Journal on Selected Areas in Communications, + 16(4):482-494, May 1998. + +
    +
    +
    [42]
    +M. K. Reiter and A. D. Rubin. + Crowds: Anonymity for web transactions. + ACM TISSEC, 1(1):66-92, June 1998. + +
    +
    +
    [43]
    +M. Rennhard and B. Plattner. + Practical anonymity for the masses with morphmix. + In A. Juels, editor, Financial Cryptography. Springer-Verlag, + LNCS (forthcoming), 2004. + +
    +
    +
    [44]
    +M. Rennhard, S. Rafaeli, L. Mathy, B. Plattner, and D. Hutchison. + Analysis of an Anonymity Network for Web Browsing. + In IEEE 7th Intl. Workshop on Enterprise Security (WET ICE + 2002), Pittsburgh, USA, June 2002. + +
    +
    +
    [45]
    +A. Serjantov and P. Sewell. + Passive attack analysis for connection-based anonymity systems. + In Computer Security - ESORICS 2003. Springer-Verlag, LNCS + 2808, October 2003. + +
    +
    +
    [46]
    +R. Sherwood, B. Bhattacharjee, and A. Srinivasan. + p5: A protocol for scalable anonymous communication. + In IEEE Symposium on Security and Privacy, pages 58-70. IEEE + CS, 2002. + +
    +
    +
    [47]
    +A. Shubina and S. Smith. + Using caching for browsing anonymity. + ACM SIGEcom Exchanges, 4(2), Sept 2003. + +
    +
    +
    [48]
    +P. Syverson, M. Reed, and D. Goldschlag. + Onion Routing access configurations. + In DARPA Information Survivability Conference and Exposition + (DISCEX 2000), volume 1, pages 34-40. IEEE CS Press, 2000. + +
    +
    +
    [49]
+P. Syverson, G. Tsudik, M. Reed, and C. Landwehr. + Towards an Analysis of Onion Routing Security. + In H. Federrath, editor, Designing Privacy Enhancing + Technologies: Workshop on Design Issues in Anonymity and Unobservability, + pages 96-114. Springer-Verlag, LNCS 2009, July 2000. + +
    +
    +
    [50]
+A. Tanenbaum. + Computer Networks, 1996. + +
    +
    +
    [51]
    +The AN.ON Project. + German police proceeds against anonymity service. + Press release, September 2003. + + <http://www.datenschutzzentrum.de/material/themen/presse/anon-bka_e.htm>. + +
    +
    +
    [52]
    +M. Waldman and D. Mazières. + Tangler: A censorship-resistant publishing system based on document + entanglements. + In 8th ACM Conference on Computer and Communications + Security (CCS-8), pages 86-135. ACM Press, 2001. + +
    +
    +
    [53]
    +M. Waldman, A. Rubin, and L. Cranor. + Publius: A robust, tamper-evident, censorship-resistant and + source-anonymous web publishing system. + In Proc. 9th USENIX Security Symposium, pages 59-72, August + 2000. + +
    +
    +
    [54]
    +M. Wright, M. Adler, B. N. Levine, and C. Shields. + Defending anonymous communication against passive logging attacks. + In IEEE Symposium on Security and Privacy, pages 28-41. IEEE + CS, May 2003.
    +
    + + +
    +

    Footnotes:

    + +
    +1Actually, the negotiated key is used to derive two + symmetric keys: one for each direction. +
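The directional split described in footnote 1 can be sketched briefly. The hash-with-label derivation below is a hypothetical KDF chosen for illustration only; Tor's actual key schedule is defined in its protocol specification, and the labels `b"forward"`/`b"backward"` are invented here.

```python
import hashlib

def derive_directional_keys(shared_secret: bytes) -> tuple[bytes, bytes]:
    """Derive two independent symmetric keys from one negotiated secret,
    one for each direction of the circuit, by hashing the secret with
    distinct labels. Illustrative only -- not Tor's real KDF."""
    k_forward = hashlib.sha256(shared_secret + b"forward").digest()
    k_backward = hashlib.sha256(shared_secret + b"backward").digest()
    return k_forward, k_backward

fwd, bwd = derive_directional_keys(b"example negotiated secret")
# The two directions get unrelated keys, so traffic in one direction
# reveals nothing about the keystream used in the other.
assert fwd != bwd
```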
    +2 + With 48 bits of digest per cell, the probability of an accidental +collision is far lower than the chance of hardware failure. +
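The claim in footnote 2 is easy to check numerically: a tampered cell's 48-bit digest matches by pure chance with probability 2^-48, and even under an aggressively high assumed cell rate the expected wait for a single accidental collision is years.

```python
# Per-cell probability that a modified cell's 48-bit digest still
# matches by accident: one in 2**48.
p_accident = 2.0 ** -48
print(f"per-cell collision probability: {p_accident:.3e}")

# At an assumed (generous) rate of one million cells per second, the
# expected time until one accidental collision is 2**48 / 1e6 seconds
# -- roughly nine years.
seconds = 2 ** 48 / 1e6
years = seconds / (365 * 24 * 3600)
print(f"expected wait at 1e6 cells/s: {years:.1f} years")
```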
    +3 +Rather than rely on an external infrastructure, the Onion Routing network +can run the lookup service itself. Our current implementation provides a +simple lookup system on the +directory servers. +
    +4Note that this fingerprinting +attack should not be confused with the much more complicated latency +attacks of [5], which require a fingerprint of the latencies +of all circuits through the network, combined with those from the +network edges to the target user and the responder website. +


    File translated from +TEX +by +TTH, +version 3.59.
    On 18 May 2004, 10:45.
    + diff --git a/doc/tor-design.pdf b/doc/tor-design.pdf index 19dce9ba44..6db122d7d5 100644 Binary files a/doc/tor-design.pdf and b/doc/tor-design.pdf differ