mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2025-02-21 22:12:03 +01:00
277 lines
9.1 KiB
Text
Hacking Tor: An Incomplete Guide
================================

Useful tools
------------

The buildbot
~~~~~~~~~~~~

https://buildbot.vidalia-project.net/one_line_per_build

Useful command-lines
~~~~~~~~~~~~~~~~~~~~

Dmalloc
^^^^^^^

dmalloc -l ~/dmalloc.log
(run the commands it tells you)
./configure --with-dmalloc

Valgrind
^^^^^^^^

valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor

(Note that if you get a zillion openssl warnings, you will also need to
pass --undef-value-errors=no to valgrind, or rebuild your openssl
with -DPURIFY.)

Running gcov for unit test coverage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-----
  make clean
  make CFLAGS='-g -fprofile-arcs -ftest-coverage'
  ./src/test/test
  cd src/common; gcov *.[ch]
  cd ../or; gcov *.[ch]
-----

Then, look at the .gcov files. '-' before a line means that the
compiler generated no code for that line. '#####' means that the
line was never reached. Lines with numbers were executed that number
of times.

Coding conventions
------------------

Whitespace and C conformance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Invoke "make check-spaces" from time to time, so it can tell you about
deviations from our C whitespace style. Generally, we use:

 - Unix-style line endings
 - K&R-style indentation
 - No space before newlines
 - A blank line at the end of each file
 - Never more than one blank line in a row
 - Always spaces, never tabs
 - No more than 79 columns per line.
 - Two spaces per indent.
 - A space between control keywords and their corresponding paren:
   "if (x)", "while (x)", and "switch (x)", never "if(x)", "while(x)", or
   "switch(x)".
 - A space between anything and an open brace.
 - No space between a function name and an opening paren. "puts(x)", not
   "puts (x)".
 - Function declarations at the start of the line.

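As a rough illustration of the list above, here is a small function laid out in that style. The function name is made up for this example and is not part of Tor. We are also assuming that "function declarations at the start of the line" means the function name begins in the first column, with the return type on the line above it.

```c
#include <stddef.h>

/** Return the number of elements in the <b>len</b>-element array
 * <b>values</b> that are negative.  (Hypothetical example; not Tor code.) */
int
example_count_negative(const int *values, size_t len)
{
  size_t i;
  int count = 0;

  /* Two-space indents, no tabs, and a space after "for" and "if". */
  for (i = 0; i < len; ++i) {
    if (values[i] < 0) {
      ++count;
    }
  }
  return count;
}
```

Running "make check-spaces" remains the real test; this sketch only shows the general shape.
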
We try hard to build without warnings everywhere. In particular, if you're
using gcc, you should invoke the configure script with the option
"--enable-gcc-warnings". This will give a bunch of extra warning flags to
the compiler, and help us find divergences from our preferred C style.

Getting emacs to edit Tor source properly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Hi, folks! Nick here. I like to put the following snippet in my .emacs
file:

-----
    (add-hook 'c-mode-hook
          (lambda ()
            (font-lock-mode 1)
            (set-variable 'show-trailing-whitespace t)

            (let ((fname (expand-file-name (buffer-file-name))))
              (cond
               ((string-match "^/home/nickm/src/libevent" fname)
                (set-variable 'indent-tabs-mode t)
                (set-variable 'c-basic-offset 4)
                (set-variable 'tab-width 4))
               ((string-match "^/home/nickm/src/tor" fname)
                (set-variable 'indent-tabs-mode nil)
                (set-variable 'c-basic-offset 2))
               ((string-match "^/home/nickm/src/openssl" fname)
                (set-variable 'indent-tabs-mode t)
                (set-variable 'c-basic-offset 8)
                (set-variable 'tab-width 8))
            ))))
-----

You'll note that it defaults to showing all trailing whitespace. The "cond"
test detects whether the file is one of a few C free software projects that I
often edit, and sets up the indentation level and tab preferences to match
what they want.

If you want to try this out, you'll need to change the filename regex
patterns to match where you keep your Tor files.

If you *only* use emacs to edit Tor, you could always just say:

-----
   (add-hook 'c-mode-hook
          (lambda ()
            (font-lock-mode 1)
            (set-variable 'show-trailing-whitespace t)
            (set-variable 'indent-tabs-mode nil)
            (set-variable 'c-basic-offset 2)))
-----

There is probably a better way to do this. No, we are probably not going
to clutter the files with emacs stuff.

Details
~~~~~~~

Use tor_malloc, tor_free, tor_strdup, and tor_gettimeofday instead of their
generic equivalents. (They always succeed or exit.)

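To make "succeed or exit" concrete, here is a minimal sketch of what such wrappers might look like. These are hypothetical stand-ins with made-up names, not Tor's actual implementations; in Tor code, call the tor_* functions themselves.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/** Allocate <b>size</b> bytes and return a pointer to them.  Never return
 * NULL: if the allocation fails, print a message and exit.
 * (Hypothetical stand-in; NOT Tor's tor_malloc.) */
void *
example_malloc_or_exit(size_t size)
{
  void *result;

  if (size == 0) {
    size = 1; /* malloc(0) may return NULL; ask for one byte instead. */
  }
  result = malloc(size);
  if (!result) {
    fprintf(stderr, "Out of memory. Dying.\n");
    exit(1);
  }
  return result;
}

/** Return a newly allocated copy of the NUL-terminated string <b>s</b>.
 * Never return NULL.  (Hypothetical stand-in; NOT Tor's tor_strdup.) */
char *
example_strdup_or_exit(const char *s)
{
  size_t len = strlen(s) + 1;
  char *copy = example_malloc_or_exit(len);

  memcpy(copy, s, len);
  return copy;
}
```

The payoff is at the call site: callers of wrappers like these don't need a NULL check after every allocation.
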
You can get a full list of the compatibility functions that Tor provides by
looking through src/common/util.h and src/common/compat.h. You can see the
available containers in src/common/containers.h. You should probably
familiarize yourself with these modules before you write too much code, or
else you'll wind up reinventing the wheel.

Use 'INLINE' instead of 'inline', so that we work properly on Windows.

Calling and naming conventions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Whenever possible, functions should return -1 on error and 0 on success.

For multi-word identifiers, use lowercase words combined with
underscores (e.g., "multi_word_identifier"). Use ALL_CAPS for macros and
constants.

Typenames should end with "_t".

Function names should be prefixed with a module name or object name. (In
general, code to manipulate an object should be a module with the same name
as the object, so it's hard to tell which convention is used.)

Functions that do things should have imperative-verb names
(e.g. buffer_clear, buffer_resize); functions that return booleans should
have predicate names (e.g. buffer_is_empty, buffer_needs_resizing).

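The conventions above can be sketched with a toy module. Everything here (the example_buffer_t type, its functions, and the capacity constant) is hypothetical and carries an "example_" prefix so that it is not mistaken for Tor's real buffer code.

```c
#include <stddef.h>

/* ALL_CAPS for constants. */
#define EXAMPLE_BUFFER_CAPACITY 16

/** A fixed-capacity byte buffer.  The type name ends with "_t". */
typedef struct example_buffer_t {
  char data[EXAMPLE_BUFFER_CAPACITY];
  size_t len;
} example_buffer_t;

/** Remove all data from <b>buf</b>.  (Imperative-verb name.) */
void
example_buffer_clear(example_buffer_t *buf)
{
  buf->len = 0;
}

/** Return true iff <b>buf</b> holds no data.  (Predicate name.) */
int
example_buffer_is_empty(const example_buffer_t *buf)
{
  return buf->len == 0;
}

/** Append the byte <b>c</b> to <b>buf</b>.  Return 0 on success, and -1
 * if <b>buf</b> is already full. */
int
example_buffer_add_byte(example_buffer_t *buf, char c)
{
  if (buf->len >= EXAMPLE_BUFFER_CAPACITY) {
    return -1;
  }
  buf->data[buf->len++] = c;
  return 0;
}
```

Note that every function name starts with the object name, so a reader can tell at the call site which module the code lives in.
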
If you find that you have four or more possible return code values, it's
probably time to create an enum. If you find that you are passing three or
more flags to a function, it's probably time to create a flags argument that
takes a bitfield.

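Here is one way the enum and flags advice might look in practice. The status codes, flag names, and helper functions are all invented for this sketch.

```c
/** Possible outcomes of a hypothetical fetch operation: four return
 * values, so an enum rather than bare integers. */
typedef enum example_fetch_status_t {
  EXAMPLE_FETCH_OK = 0,
  EXAMPLE_FETCH_NOT_FOUND,
  EXAMPLE_FETCH_TIMEOUT,
  EXAMPLE_FETCH_REFUSED
} example_fetch_status_t;

/* Three on/off options, packed into one bitfield argument instead of
 * three separate int parameters. */
#define EXAMPLE_FLAG_RETRY   (1u<<0)
#define EXAMPLE_FLAG_VERBOSE (1u<<1)
#define EXAMPLE_FLAG_CACHE   (1u<<2)

/** Return true iff <b>flag</b> is set in the bitfield <b>flags</b>. */
int
example_flag_is_set(unsigned flags, unsigned flag)
{
  return (flags & flag) != 0;
}

/** Return a human-readable name for <b>status</b>. */
const char *
example_fetch_status_to_string(example_fetch_status_t status)
{
  switch (status) {
    case EXAMPLE_FETCH_OK: return "ok";
    case EXAMPLE_FETCH_NOT_FOUND: return "not found";
    case EXAMPLE_FETCH_TIMEOUT: return "timeout";
    case EXAMPLE_FETCH_REFUSED: return "refused";
  }
  return "unknown";
}
```

A caller would then write something like "EXAMPLE_FLAG_RETRY | EXAMPLE_FLAG_CACHE" at the call site, which is easier to read than a row of 1s and 0s.
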
What To Optimize
~~~~~~~~~~~~~~~~

Don't optimize anything if it's not in the critical path. Right now, the
critical path seems to be AES, logging, and the network itself. Feel free to
do your own profiling to determine otherwise.

Log conventions
~~~~~~~~~~~~~~~

https://wiki.torproject.org/noreply/TheOnionRouter/TorFAQ#LogLevels

No error or warning messages should be expected during normal OR or OP
operation.

If a library function is currently called such that failure always means ERR,
then the library function should log WARN and let the caller log ERR.

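The WARN-in-the-library, ERR-in-the-caller rule might be sketched like this. The logger, severity names, and both functions are hypothetical stand-ins invented for this example; they are not Tor's logging API.

```c
#include <stdio.h>

typedef enum example_severity_t {
  EXAMPLE_LOG_WARN = 0,
  EXAMPLE_LOG_ERR = 1
} example_severity_t;

/** Number of messages logged at each severity, kept for demonstration. */
int example_log_counts[2];

/** Write <b>msg</b> to stderr at <b>severity</b>, and count it. */
void
example_log(example_severity_t severity, const char *msg)
{
  ++example_log_counts[severity];
  fprintf(stderr, "[%s] %s\n",
          severity == EXAMPLE_LOG_ERR ? "err" : "warn", msg);
}

/** Library function: return 0 if <b>port</b> is a valid port number, and
 * -1 otherwise.  On failure, log only a WARN: the library cannot know
 * whether the failure is fatal for the caller. */
int
example_lib_check_port(int port)
{
  if (port < 1 || port > 65535) {
    example_log(EXAMPLE_LOG_WARN, "Port out of range.");
    return -1;
  }
  return 0;
}

/** Caller: for this caller a bad port is fatal, so it logs the ERR.
 * Return 0 on success and -1 on failure. */
int
example_app_configure(int port)
{
  if (example_lib_check_port(port) < 0) {
    example_log(EXAMPLE_LOG_ERR, "Invalid configuration; giving up.");
    return -1;
  }
  return 0;
}
```

If a later caller can recover from a bad port, it can reuse example_lib_check_port() without the library having already claimed an ERR on its behalf.
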
[XXX Proposed convention: every message of severity INFO or higher should
either (A) be intelligible to end-users who don't know the Tor source; or (B)
somehow inform the end-users that they aren't expected to understand the
message (perhaps with a string like "internal error"). Option (A) is to be
preferred to option (B). -NM]

Doxygen
~~~~~~~

We use the 'doxygen' utility to generate documentation from our
source code. Here's how to use it:

  1. Begin every file that should be documented with
         /**
          * \file filename.c
          * \brief Short description of the file.
          **/

     (Doxygen will recognize any comment beginning with /** as special.)

  2. Before any function, structure, #define, or variable you want to
     document, add a comment of the form:

        /** Describe the function's actions in imperative sentences.
         *
         * Use blank lines for paragraph breaks
         *   - and
         *   - hyphens
         *   - for
         *   - lists.
         *
         * Write <b>argument_names</b> in boldface.
         *
         * \code
         *     place_example_code();
         *     between_code_and_endcode_commands();
         * \endcode
         */

  3. Make sure to escape the characters "<", ">", "\", "%" and "#" as "\<",
     "\>", "\\", "\%", and "\#".

  4. To document structure members, you can use two forms:

       struct foo {
         /** You can put the comment before an element; */
         int a;
         int b; /**< Or use the less-than symbol to put the comment
                 * after the element. */
       };

  5. To generate documentation from the Tor source code, first type:

     $ doxygen -g

     This generates a configuration file called 'Doxyfile'. Edit that
     file and run 'doxygen' to generate the API documentation.

  6. See the Doxygen manual for more information; this summary just
     scratches the surface.

Doxygen comment conventions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Say what functions do as a series of one or more imperative sentences, as
though you were telling somebody how to be the function. In other words, DO
NOT say:

     /** The strtol function parses a number.
      *
      * nptr -- the string to parse.  It can include whitespace.
      * endptr -- a string pointer to hold the first thing that is not part
      *    of the number, if present.
      * base -- the numeric base.
      * returns: the resulting number.
      */
     long strtol(const char *nptr, char **endptr, int base);

Instead, please DO say:

     /** Parse a number in radix <b>base</b> from the string <b>nptr</b>,
      * and return the result.  Skip all leading whitespace.  If
      * <b>endptr</b> is not NULL, set *<b>endptr</b> to the first character
      * after the number parsed.
      **/
     long strtol(const char *nptr, char **endptr, int base);

Doxygen comments are the contract in our abstraction-by-contract world: if
the functions that call your function rely on it doing something, then your
function should mention that it does that something in the documentation. If
you rely on a function doing something beyond what is in its documentation,
then you should watch out, or it might do something else later.

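As a sketch of a comment that works as a contract, consider the hypothetical function below (it is not part of Tor). The comment spells out the behaviors that callers are allowed to rely on, including the edge case where the array is empty.

```c
#include <stddef.h>

/** Return the index of the first occurrence of <b>needle</b> in the
 * <b>len</b>-element array <b>haystack</b>, or -1 if <b>needle</b> is not
 * present.  Do not modify <b>haystack</b>.  If <b>len</b> is 0, return -1
 * without reading <b>haystack</b>, so callers may pass NULL in that case.
 **/
int
example_find_int(const int *haystack, size_t len, int needle)
{
  size_t i;

  for (i = 0; i < len; ++i) {
    if (haystack[i] == needle) {
      return (int)i;
    }
  }
  return -1;
}
```

Because the comment promises "first occurrence" and the NULL-when-empty behavior, a later rewrite of the function has to keep both; anything the comment does not promise is fair game to change.
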