rewrite exit abuse section

svn:r721
Roger Dingledine 2003-11-03 01:03:00 +00:00
parent 49b1c0e95c
commit fed6cb8e68

@ -83,23 +83,13 @@ papers
\cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While \cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While
a wide area Onion Routing network was deployed for some weeks, a wide area Onion Routing network was deployed for some weeks,
the only long-running and publicly accessible the only long-running and publicly accessible
implementation was a fragile proof-of-concept that ran on a single implementation of the original design was a fragile proof-of-concept
machine. that ran on a single machine. Even this simple deployment processed tens
% (which nonetheless processed several tens of thousands of connections of thousands of connections daily from thousands of users worldwide. But
%daily from thousands of global users). many critical design and deployment issues were never resolved, and the
%%Do we really want to say this? It softens our motivation for the paper. -RD design has not been updated in several years. Here we describe Tor, a
% protocol for asynchronous, loosely federated onion routers that provides
% In general, I try to emphasize rather than understate past the following improvements over the old Onion Routing design:
% accomplishments so I am giving an accurate comparison,
% which strengthens the claims in the paper. This is true whether
% it is my work or someone else's.
% This is also the only experimental basic viability result we
% can point to for Onion Routing in general at this point. -PS
Many critical design and deployment issues were never resolved,
and the design has not been updated in several years.
Here we describe Tor, a protocol for asynchronous, loosely
federated onion routers that provides the following improvements over
the old Onion Routing design:
\begin{tightlist} \begin{tightlist}
@ -275,8 +265,12 @@ trade-off, these \emph{high-latency} networks are well-suited for anonymous
email, but introduce too much lag for interactive tasks such as web browsing, email, but introduce too much lag for interactive tasks such as web browsing,
internet chat, or SSH connections. internet chat, or SSH connections.
Tor belongs to the second category: \emph{low-latency} designs that attempt Tor belongs to the second category: \emph{low-latency} designs that
to anonymize interactive network traffic. Because these protocols typically attempt to anonymize interactive network traffic. These systems handle
a variety of bidirectional protocols. They also provide more convenient
mail delivery than the high-latency fire-and-forget anonymous email
networks, because the remote mail server provides explicit delivery
confirmation. But because these designs typically
involve a large number of packets that must be delivered quickly, it is involve a large number of packets that must be delivered quickly, it is
difficult for them to prevent an attacker who can eavesdrop both ends of the difficult for them to prevent an attacker who can eavesdrop both ends of the
communication from correlating the timing and volume communication from correlating the timing and volume
@ -373,8 +367,8 @@ protocols (such as HTTP) and relay the application requests themselves
along the circuit. along the circuit.
This protocol-layer decision represents a compromise between flexibility This protocol-layer decision represents a compromise between flexibility
and anonymity. For example, a system that understands HTTP can strip and anonymity. For example, a system that understands HTTP can strip
identifying information from those requests; can take advantage of caching identifying information from those requests, can take advantage of caching
to limit the number of requests that leave the network; and can batch to limit the number of requests that leave the network, and can batch
or encode those requests in order to minimize the number of connections. or encode those requests in order to minimize the number of connections.
On the other hand, an IP-level anonymizer can handle nearly any protocol, On the other hand, an IP-level anonymizer can handle nearly any protocol,
even ones unforeseen by their designers (though these systems require even ones unforeseen by their designers (though these systems require
@ -384,7 +378,7 @@ a middle approach: they are fairly application neutral (so long as the
application supports, or can be tunneled across, TCP), but by treating application supports, or can be tunneled across, TCP), but by treating
application connections as data streams rather than raw TCP packets, application connections as data streams rather than raw TCP packets,
they avoid the well-known inefficiencies of tunneling TCP over TCP they avoid the well-known inefficiencies of tunneling TCP over TCP
\cite{tcp-over-tcp-is-bad}. [XXX what's a better cite?] \cite{tcp-over-tcp-is-bad}.
Distributed-trust anonymizing systems need to prevent attackers from Distributed-trust anonymizing systems need to prevent attackers from
adding too many servers and thus compromising too many user paths. adding too many servers and thus compromising too many user paths.
@ -396,12 +390,12 @@ from becoming too much of the network based on a limited resource such
as number of IPs controlled. Crowds suggests requiring written, notarized as number of IPs controlled. Crowds suggests requiring written, notarized
requests from potential crowd members. requests from potential crowd members.
Anonymous communication is an essential component of censorship-resistant Anonymous communication is essential for censorship-resistant
systems like Eternity \cite{eternity}, Free~Haven \cite{freehaven-berk}, systems like Eternity \cite{eternity}, Free~Haven \cite{freehaven-berk},
Publius \cite{publius}, and Tangler \cite{tangler}. Tor's rendezvous Publius \cite{publius}, and Tangler \cite{tangler}. Tor's rendezvous
points enable connections between mutually anonymous entities; they points enable connections between mutually anonymous entities; they
are a building block for location-hidden servers, which are needed by are a building block for location-hidden servers, which are needed by
Eternity and Free Haven. Eternity and Free~Haven.
% didn't include rewebbers. No clear place to put them, so I'll leave % didn't include rewebbers. No clear place to put them, so I'll leave
% them out for now. -RD % them out for now. -RD
@ -781,7 +775,7 @@ cell to create corresponding changes to the data leaving the network.
This weakness allowed an adversary to change a padding cell to a destroy This weakness allowed an adversary to change a padding cell to a destroy
cell; change the destination address in a relay begin cell to the cell; change the destination address in a relay begin cell to the
adversary's webserver; or change a user on an ftp connection from adversary's webserver; or change a user on an ftp connection from
typing ``dir'' to typing ``delete *''. Any node or external adversary typing ``dir'' to typing ``delete~*''. Any node or external adversary
along the circuit could introduce such corruption in a stream. along the circuit could introduce such corruption in a stream.
Tor prevents external adversaries from mounting this attack simply by Tor prevents external adversaries from mounting this attack simply by
@ -960,7 +954,7 @@ circuit. Indeed, this same loss of service occurs when a router crashes
or its operator restarts it. The current Tor design treats such attacks or its operator restarts it. The current Tor design treats such attacks
as intermittent network failures, and depends on users and applications as intermittent network failures, and depends on users and applications
to respond or recover as appropriate. A future design could use an to respond or recover as appropriate. A future design could use an
end-to-end based TCP-like acknowledgment protocol, so that no streams are end-to-end TCP-like acknowledgment protocol, so that no streams are
lost unless the entry or exit point itself is disrupted. This solution lost unless the entry or exit point itself is disrupted. This solution
would require more buffering at the network edges, however, and the would require more buffering at the network edges, however, and the
performance and anonymity implications from this extra complexity still performance and anonymity implications from this extra complexity still
@ -969,48 +963,38 @@ require investigation.
\SubSection{Exit policies and abuse} \SubSection{Exit policies and abuse}
\label{subsec:exitpolicies} \label{subsec:exitpolicies}
Exit abuse is a serious barrier to wide-scale Tor deployment. Not Exit abuse is a serious barrier to wide-scale Tor deployment. Anonymity
only does anonymity present would-be vandals and abusers with an presents would-be vandals and abusers with an opportunity to hide
opportunity to hide the origins of their activities---but also, the origins of their activities. Attackers can harm the Tor network by
existing sanctions against abuse present an easy way for attackers to implicating exit servers for their abuse. Also, applications that commonly
harm the Tor network by implicating exit servers for their abuse. use IP-based authentication (such as institutional mail or web servers)
Thus, must block or limit attacks and other abuse that travel through can be fooled by the fact that anonymous connections appear to originate
the Tor network. at the exit OR.
Also, applications that commonly use IP-based authentication (such We stress that Tor does not enable any new class of abuse. Spammers and
institutional mail or web servers) can be fooled by the fact that other attackers already have access to thousands of misconfigured systems
anonymous connections appear to originate at the exit OR. Rather than worldwide, and the Tor network is far from the easiest way to launch
expose a private service, an administrator may prefer to prevent Tor these antisocial or illegal attacks. But because the onion routers can
users from connecting to those services from a local OR. easily be mistaken for the originators of the abuse, and the volunteers
who run them may not want to deal with the hassle of repeatedly explaining
anonymity networks, we must block or limit attacks and other abuse that
travel through the Tor network.
To mitigate abuse issues, in Tor, each onion router's \emph{exit To mitigate abuse issues, in Tor, each onion router's \emph{exit policy}
policy} describes to which external addresses and ports the router describes to which external addresses and ports the router will permit
will permit stream connections. On one end of the spectrum are stream connections. On one end of the spectrum are \emph{open exit}
\emph{open exit} nodes that will connect anywhere. As a compromise, nodes that will connect anywhere. On the other end are \emph{middleman}
most onion routers will function as \emph{restricted exits} that nodes that only relay traffic to other Tor nodes, and \emph{private exit}
permit connections to the world at large, but prevent access to nodes that only connect to a local host or network. Using a private
certain abuse-prone addresses and services. on the other end are exit (if one exists) is a more secure way for a client to connect to a
\emph{middleman} nodes that only relay traffic to other Tor nodes, and given host or network---an external adversary cannot eavesdrop traffic
\emph{private exit} nodes that only connect to a local host or between the private exit and the final destination, and so is less sure of
network. (Using a private exit (if one exists) is a more secure way Alice's destination and activities. Most onion routers will function as
for a client to connect to a given host or network---an external \emph{restricted exits} that permit connections to the world at large,
adversary cannot eavesdrop traffic between the private exit and the but prevent access to certain abuse-prone addresses and services. In
final destination, and so is less sure of Alice's destination and general, nodes can require a variety of forms of traffic authentication
activities.) is less sure of Alice's destination. In general,
nodes can require a variety of forms of traffic authentication
\cite{or-discex00}. \cite{or-discex00}.
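
Editorial illustration (not part of the commit): the rewritten paragraph above describes an exit policy as a statement of which external addresses and ports a router will permit, and names four kinds of node (open exit, middleman, private exit, restricted exit). The Python sketch below shows one way such a policy could be modeled, assuming an ordered rule list where the first matching rule wins and unmatched traffic is refused. The `Rule` class, the `permits` function, the example address ranges, and the choice of port 25 as an abuse-prone service are all hypothetical; the paper text in this diff does not specify a rule syntax or matching order.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical exit-policy rule: accept or reject a network and port range."""
    accept: bool
    network: ipaddress.IPv4Network
    port_lo: int
    port_hi: int

    def matches(self, addr: str, port: int) -> bool:
        # A rule applies when the destination lies in its network and port range.
        in_net = ipaddress.IPv4Address(addr) in self.network
        return in_net and self.port_lo <= port <= self.port_hi


def permits(policy, addr: str, port: int, default: bool = False) -> bool:
    """Return the verdict of the first matching rule (assumed semantics)."""
    for rule in policy:
        if rule.matches(addr, port):
            return rule.accept
    return default  # assumption: refuse anything no rule mentions


ANY = ipaddress.IPv4Network("0.0.0.0/0")
LOCAL = ipaddress.IPv4Network("192.168.0.0/16")  # hypothetical local network

# The four node types named in the text, expressed as rule lists.
OPEN_EXIT = [Rule(True, ANY, 1, 65535)]
MIDDLEMAN = [Rule(False, ANY, 1, 65535)]
PRIVATE_EXIT = [Rule(True, LOCAL, 1, 65535), Rule(False, ANY, 1, 65535)]
# Restricted exit: block an abuse-prone port (SMTP, as an example), allow the rest.
RESTRICTED_EXIT = [Rule(False, ANY, 25, 25), Rule(True, ANY, 1, 65535)]

if __name__ == "__main__":
    for name, policy in [("open", OPEN_EXIT), ("middleman", MIDDLEMAN),
                         ("private", PRIVATE_EXIT), ("restricted", RESTRICTED_EXIT)]:
        print(name, permits(policy, "203.0.113.5", 25), permits(policy, "203.0.113.5", 80))
```

Under these assumptions a middleman node is simply a policy that rejects every external destination, and a restricted exit differs from an open exit only by the reject rules placed ahead of its final accept-all rule.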
%Tor offers more reliability than the high-latency fire-and-forget
%anonymous email networks, because the sender opens a TCP stream
%with the remote mail server and receives an explicit confirmation of
%acceptance. But ironically, the private exit node model works poorly for
%email, when Tor nodes are run on volunteer machines that also do other
%things, because it's quite hard to configure mail transport agents so
%normal users can send mail normally, but the Tor process can only deliver
%mail locally. Further, most organizations have specific hosts that will
%deliver mail on behalf of certain IP ranges; Tor operators must be aware
%of these hosts and consider putting them in the Tor exit policy.
%The abuse issues on closed (e.g. military) networks are different %The abuse issues on closed (e.g. military) networks are different
%from the abuse on open networks like the Internet. While these IP-based %from the abuse on open networks like the Internet. While these IP-based
%access controls are still commonplace on the Internet, on closed networks, %access controls are still commonplace on the Internet, on closed networks,
@ -1020,8 +1004,8 @@ nodes can require a variety of forms of traffic authentication
Many administrators will use port restrictions to support only a Many administrators will use port restrictions to support only a
limited set of well-known services, such as HTTP, SSH, or AIM. limited set of well-known services, such as HTTP, SSH, or AIM.
This is not a complete solution, since abuse opportunities for these This is not a complete solution, since abuse opportunities for these
protocols are still well known. Nonetheless, the benefits are real, protocols are still well known. Nonetheless, the benefits are real,
since administrators seem used to the concept of port 80 abuse not since administrators seem used to the concept of port 80 abuse not
coming from the machine's owner. coming from the machine's owner.
A further solution may be to use proxies to clean traffic for certain A further solution may be to use proxies to clean traffic for certain
@ -1029,54 +1013,28 @@ protocols as it leaves the network. For example, much abusive HTTP
behavior (such as exploiting buffer overflows or well-known script behavior (such as exploiting buffer overflows or well-known script
vulnerabilities) can be detected in a straightforward manner. vulnerabilities) can be detected in a straightforward manner.
Similarly, one could run automatic spam filtering software (such as Similarly, one could run automatic spam filtering software (such as
SpamAssassin) on email exiting the OR network. A generic SpamAssassin) on email exiting the OR network.
intrusion detection system (IDS) could be adapted to these purposes.
[XXX Mention possibility of filtering spam-like habits--e.g., many
recipients. -NM]
ORs may also choose to rewrite exiting traffic in order to append ORs may also choose to rewrite exiting traffic in order to append
headers or other information to indicate that the traffic has passed headers or other information to indicate that the traffic has passed
through an anonymity service. This approach is commonly used, to some through an anonymity service. This approach is commonly used
success, by email-only anonymity systems. When possible, ORs can also by email-only anonymity systems. When possible, ORs can also
run on servers with hostnames such as {\it anonymous}, to further run on servers with hostnames such as {\it anonymous}, to further
alert abuse targets to the nature of the anonymous traffic. alert abuse targets to the nature of the anonymous traffic.
%we should run a squid at each exit node, to provide comparable anonymity
%to private exit nodes for cache hits, to speed everything up, and to
%have a buffer for funny stuff coming out of port 80. we could similarly
%have other exit proxies for other protocols, like mail, to check
%delivered mail for being spam.
%[XXX Um, I'm uncomfortable with this for several reasons.
%It's not good for keeping honest nodes honest about discarding
%state after it's no longer needed. Granted it keeps an external
%observer from noticing how often sites are visited, but it also
%allows fishing expeditions. ``We noticed you went to this prohibited
%site an hour ago. Kindly turn over your caches to the authorities.''
%I previously elsewhere suggested bulk transfer proxies to carve
%up big things so that they could be downloaded in less noticeable
%pieces over several normal looking connections. We could suggest
%similarly one or a handful of squid nodes that might serve up
%some of the more sensitive but common material, especially if
%the relevant sites didn't want to or couldn't run their own OR.
%This would be better than having everyone run a squid which would
%just help identify after the fact the different history of that
%node's activity. All this kind of speculation needs to move to
%future work section I guess. -PS]
A mixture of open and restricted exit nodes will allow the most A mixture of open and restricted exit nodes will allow the most
flexibility for volunteers running servers. But while a large number flexibility for volunteers running servers. But while many
of middleman nodes is useful to provide a large and robust network, middleman nodes help provide a large and robust network,
having only a small number of exit nodes reduces the number of nodes having only a small number of exit nodes reduces the number of nodes
an adversary needs to monitor for traffic analysis, and places a an adversary needs to monitor for traffic analysis, and places a
greater burden on the exit nodes. This tension can be seen in the JAP greater burden on the exit nodes. This tension can be seen in the JAP
cascade model, wherein only one node in each cascade needs to handle cascade model, wherein only one node in each cascade needs to handle
abuse complaints---but an adversary only needs to observe the entry abuse complaints---but an adversary only needs to observe the entry
and exit of a cascade to perform traffic analysis on all that and exit of a cascade to perform traffic analysis on all that
cascade's users. The Hydra model (many entries, few exits) presents a cascade's users. The Hydra model (many entries, few exits) presents a
different compromise: only a few exit nodes are needed, but an different compromise: only a few exit nodes are needed, but an
adversary needs to work harder to watch all the clients. adversary needs to work harder to watch all the clients; see
Section~\ref{sec:conclusion}.
Finally, we note that exit abuse must not be dismissed as a peripheral Finally, we note that exit abuse must not be dismissed as a peripheral
issue: when a system's public image suffers, it can reduce the number issue: when a system's public image suffers, it can reduce the number
@ -1090,8 +1048,7 @@ project \cite{darkside} give us a glimpse of likely issues.
\SubSection{Directory Servers} \SubSection{Directory Servers}
\label{subsec:dirservers} \label{subsec:dirservers}
First-generation Onion Routing designs \cite{or-jsac98,freedom2-arch} did First-generation Onion Routing designs \cite{freedom2-arch,or-jsac98} used
% is or-jsac98 the right cite here? what's our stock OR cite? -RD
in-band network status updates: each router flooded a signed statement in-band network status updates: each router flooded a signed statement
to its neighbors, which propagated it onward. But anonymizing networks to its neighbors, which propagated it onward. But anonymizing networks
have different security goals than typical link-state routing protocols. have different security goals than typical link-state routing protocols.
@ -1208,25 +1165,20 @@ privacy also seeks to provide some protection against distributed DoS attacks:
attackers are forced to attack the onion routing network as a whole attackers are forced to attack the onion routing network as a whole
rather than just Bob's IP. rather than just Bob's IP.
\subsection{Goals for rendezvous points} Our design for location-hidden servers has the following properties.
\label{subsec:rendezvous-goals} \textbf{Flood-proof:} An attacker should not be able to flood Bob
Our design for location-hidden servers has the following properties: with traffic simply by sending many requests to talk to Bob. Thus,
\begin{tightlist} Bob needs a way to filter incoming requests. \textbf{Robust:} Bob
\item[Flood-proof:] An attacker should not be able to flood Bob with traffic should be able to maintain a long-term pseudonymous identity even
simply by sending many requests to talk to Bob. Thus, Bob needs a in the presence of router failure. Thus, Bob's service must not be
way to filter incoming requests. tied to a single OR, and Bob must be able to tie his service to new
\item[Robust:] Bob should be able to maintain a long-term pseudonymous ORs. \textbf{Smear-resistant:} An attacker should not be able to use
identity even in the presence of router failure. Thus, Bob's service rendezvous points to smear an OR. That is, if a social attacker tries
must not be tied to a single OR, and Bob must be able to tie his service to host a location-hidden service that is illegal or disreputable, it
to new ORs. should not appear---even to a casual observer---that the OR is hosting
\item[Smear-resistant:] An attacker should not be able to use rendezvous that service. \textbf{Application-transparent:} Although we are willing to
points to smear an OR. That is, if a social attacker tries to host a require users to run special software to access location-hidden servers,
location-hidden service that is illegal or disreputable, it should not we are not willing to require them to modify their applications.
appear---even to a casual observer---that the OR is hosting that service.
\item[Application-transparent:] Although we are willing to require users to
run special software to access location-hidden servers, we are not willing
to require them to modify their applications.
\end{tightlist}
\subsection{Rendezvous design} \subsection{Rendezvous design}
We provide location-hiding for Bob by allowing him to advertise We provide location-hiding for Bob by allowing him to advertise
@ -1404,7 +1356,7 @@ and its resistance to attacks.
% Do we want to say this? I don't think we should talk about this % Do we want to say this? I don't think we should talk about this
% kind of discussion till we have more positive results. % kind of discussion till we have more positive results.
\item[Conservative design:] Tor opts for practicality when there is no \item[Simple design:] Tor opts for practicality when there is no
clear resolution of anonymity tradeoffs or practical means to clear resolution of anonymity tradeoffs or practical means to
achieve resolution. Thus, we do not currently pad or mix; although achieve resolution. Thus, we do not currently pad or mix; although
it would be easy to add either of these. Indeed, our system allows it would be easy to add either of these. Indeed, our system allows
@ -1899,6 +1851,21 @@ presence of unreliable nodes.
% section. After all, we will doubtlessly learn very much about why % section. After all, we will doubtlessly learn very much about why
% people do or don't run and use Tor in the near future. -NM % people do or don't run and use Tor in the near future. -NM
%We should run a squid at each exit node, to provide comparable anonymity
%to private exit nodes for cache hits, to speed everything up, and to
%have a buffer for funny stuff coming out of port 80.
% on the other hand, it hampers PFS, because ORs have pages in the cache.
%I previously elsewhere suggested bulk transfer proxies to carve
%up big things so that they could be downloaded in less noticeable
%pieces over several normal looking connections. We could suggest
%similarly one or a handful of squid nodes that might serve up
%some of the more sensitive but common material, especially if
%the relevant sites didn't want to or couldn't run their own OR.
%This would be better than having everyone run a squid which would
%just help identify after the fact the different history of that
%node's activity. All this kind of speculation needs to move to
%future work section I guess. -PS]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -1962,6 +1929,8 @@ issues remaining to be ironed out. In particular:
able to evaluate some of our design decisions, including our able to evaluate some of our design decisions, including our
robustness/latency tradeoffs, our abuse-prevention mechanisms, and robustness/latency tradeoffs, our abuse-prevention mechanisms, and
our overall usability. our overall usability.
work with morphmix spec
small cells vs large cells
\end{tightlist} \end{tightlist}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%