Arti 1.0.0 is released: Our Rust Tor implementation is ready for production use

system · September 2, 2022, 8:42pm

by nickm | September 2, 2022

Back in 2020, we started work on a new implementation of the Tor protocols in the Rust programming language. Now we believe it's ready for wider use.

In this blog post, we'll tell you more about the history of the Arti project, where it is now, and where it will go next.

Background: Why Arti? And How?

Why rewrite Tor in Rust? Because despite (or because of) its maturity, the C Tor implementation is showing its age. While C was a reasonable choice back when we started working on Tor in 2001, we've always suffered from its limitations: it encourages a needlessly low-level approach to many programming problems, and using it safely requires painstaking care and effort. Because of these limitations, the pace of development in C has always been slower than we would have liked.

What's more, our existing C implementation has grown over the years to have a not-so-modular design: nearly everything is connected to everything else, which makes it even more difficult to analyze the code and make safe improvements.

A move to Rust seemed like a good answer. Started in 2010 at Mozilla, and now maintained by the Rust Foundation, Rust has grown over the years to become an independently maintained programming language with great ergonomics and performance, and strong safety properties. In 2017, we started experimenting with adding Rust inside the C Tor codebase, with a view to replacing the code bit by bit.

One thing that we found, however, was that our existing C code was not modular enough to be easily rewritten. (Rust's security guarantees depend on Rust code interacting with other Rust code, so to get any benefit, you need to rewrite a module at a time rather than just one function at a time.) The parts of the code that were isolated enough to replace were mostly trivial, and seemed not worth the effort, whereas the parts that most needed replacement were too intertwined with each other to practically disentangle. We tried to disentangle our modules, but it proved impractical to do so without destabilizing the codebase.

So in 2020, we started on a Rust Tor implementation that eventually became Arti. At first, it was a personal project to improve my Rust skills, but by the end of the summer, it could connect to the Tor network, and by September it sent its first anonymized traffic. After some discussion, we decided to adopt Arti as an official part of the Tor Project, and see how far we could take it.

Thanks to generous support from Zcash Community Grants starting in 2021, we were able to hire more developers and speed up the pace of development enormously. By October, we had our first "no major privacy holes" release (0.0.1), and we started putting out monthly releases. In March of this year, we had enough of a public API to be confident in recommending Arti for experimental embedding, and so we released version 0.1.0.

And now, with our latest release, we've reached our 1.0.0 milestone. Let's talk more about what that means.

Arti 1.0.0: Ready for production use

When we defined our set of milestones, we defined Arti 1.0.0 as "ready for production use": You should be able to use it in the real world, to get a similar degree of privacy, usability, and stability to what you would get with a C Tor client. The APIs should be (more or less) stable for embedders.

We believe we have achieved this. You can now use arti proxy to connect to the Tor network to anonymize your network connections.

Note that we don't recommend pointing a conventional web browser at arti (or, indeed, C Tor): web browsers leak much private and identifying information. To browse the web anonymously, use Tor Browser; we have instructions for using it with Arti.

Recent work

To achieve this, we've made many improvements to Arti. (Items marked as NEW are new or substantially improved since last month's 0.6.0 release.)

For a complete list of changes, including a list of just the changes since 0.6.0, see our CHANGELOG.

So, how's Rust been?

Our experience with Rust has been a definite success.

At every stage, we've encountered way fewer bugs than during comparable C development. The bugs that we have encountered have almost all been semantic/algorithmic mistakes (real programming issues), not mistakes in using the Rust language and its facilities. Rust has a reputation for being a difficult language with a picky compiler - but the pickiness of the compiler has been a great boon. Generally speaking, if our Rust code compiles and passes its tests, it is much likelier to be correct than our C code under the same conditions.

Development of comparable features has gone way faster, even considering that we're building most things for the second time. Some of the speed improvement is due to Rust's more expressive semantics and more usable library ecosystem—but a great deal is due to the confidence Rust's safety brings.

Portability has been far easier than with C, though sometimes we're forced to deal with differences between operating systems. (For example, when we've had to get into the fine details of filesystem permissions, we've found that most everything we do takes different handling on Windows.)

One still-uncracked challenge is binary size. Unlike C's standard library, Rust's standard library doesn't come installed by default on our target systems, and so it adds to the size of our downloads. Rust's approach to high-level programming and generic code can make fast code, but also large executables. We've been able to offset this somewhat with the Rust ecosystem's improved support for working with platform-native TLS implementations, but there's more work to do here.

Embedding has been practical so far. We have preliminary work embedding Arti in both Java and Python.

We've found that Arti has attracted volunteer contributions in greater volume and with less friction than C Tor. New contributors are greatly assisted by Rust's strong type system, excellent API documentation support, and safety properties. These features help them find where to make a change, and also enable making changes to unfamiliar code with much greater confidence.

What's coming next?

Our primary focus in Arti 1.1.0 will be to implement Tor's anticensorship features, including support for bridges and pluggable transports. We've identified our primary architectural challenges there, and are working through them now.

In addition, we intend to further solidify our compliance with semantic versioning in our high-level arti-client crate. We are confident that our intentionally exposed APIs there are stable, but before we can promise long-term stability we need to make sure that we have a way to detect and prevent changes to the lower-level APIs that arti-client re-exports. The cargo-public-api and cargo-semver-checks crates both seem promising, but we may need additional thinking.

(This semantic versioning difficulty is the primary reason why arti-client is still at 0.6.0 instead of 1.0.0. When we declare 1.0.0 for arti-client, we want to be sure that we can keep backward compatibility for as long as possible.)

We expect that Arti 1.1.0 will be complete around the end of October. We had originally estimated one month of the team's time for this work, but since we'll all be off for a week for a meeting, and then a few of us have vacations, it seems that we'll need to allocate two months in order to find a month of hacking time. (Such is life!)

And then?

After Arti 1.1.0, we're going to focus on onion services in Arti 1.2.0. They're a complex and important part of the Tor protocols, and will take a significant amount of effort to build. Making onion services work securely and efficiently will require a number of related protocol features, including support for congestion control, DoS protection, vanguards, and circuit padding machines.

After that, Arti 2.0.0 will focus on feature parity with the C tor client implementation, and support for embedding Arti in different languages. (Preliminary embedding work is promising: we have the beginnings of a VPN tool for mobile, embedding Arti in Java.) When we're done, we intend that Arti will be a suitable replacement for C tor as a client implementation in all (or nearly all) use contexts.

We've applied to the Zcash Community Grants for funding to support these next two phases, and we're waiting hopefully to see what they say.

And after that?

We intend that, in the long run, Arti will replace our C tor implementation completely, not only for clients, but also for relays and directory authorities. This will take several more years of work, but we're confident that it's the right direction forward.

(We won't stop support for the C implementation right away; we expect that it will take some time for people to migrate.)

How can you try Arti now?

We rely on users and volunteers to find problems in our software and suggest directions for its improvement. You can test Arti as a SOCKS proxy (if you're willing to compile from source) and as an embeddable library (if you don't mind a little API instability).

Assuming you've installed Arti (with cargo install arti, or directly from a cloned repository), you can use it to start a simple SOCKS proxy for making connections via Tor with:

$ arti proxy -p 9150

and use it more or less as you would use the C Tor implementation!

(It doesn't support onion services yet. If compilation doesn't work, make sure you have development files for libsqlite installed on your platform.)
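To make "use it as a SOCKS proxy" concrete, here is a minimal sketch of what a client sends to a SOCKS5 proxy listening on port 9150, following RFC 1928. Nothing here is an Arti API: the function names socks5_connect_request and open_stream are made up for illustration, and the sketch assumes the proxy accepts the "no authentication" method. The request names the destination by hostname, so the lookup happens on the proxy side rather than locally.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Build a SOCKS5 CONNECT request (RFC 1928) that addresses the target by
/// hostname. Returns None if the hostname does not fit in the one-byte
/// length field.
fn socks5_connect_request(host: &str, port: u16) -> Option<Vec<u8>> {
    if host.is_empty() || host.len() > 255 {
        return None;
    }
    // VER = 5, CMD = 1 (CONNECT), RSV = 0, ATYP = 3 (domain name), length.
    let mut request = vec![0x05, 0x01, 0x00, 0x03, host.len() as u8];
    request.extend_from_slice(host.as_bytes());
    // The port is sent in network (big-endian) byte order.
    request.extend_from_slice(&port.to_be_bytes());
    Some(request)
}

/// Hypothetical helper: perform the handshake against a running proxy and
/// return a stream that is connected to host:port through it.
#[allow(dead_code)]
fn open_stream(proxy: &str, host: &str, port: u16) -> io::Result<TcpStream> {
    let request = socks5_connect_request(host, port)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "bad hostname length"))?;
    let mut stream = TcpStream::connect(proxy)?;

    // Greeting: version 5, one method offered, 0x00 = no authentication.
    stream.write_all(&[0x05, 0x01, 0x00])?;
    let mut choice = [0u8; 2];
    stream.read_exact(&mut choice)?;
    if choice != [0x05, 0x00] {
        return Err(io::Error::new(io::ErrorKind::Other, "proxy refused no-auth method"));
    }

    stream.write_all(&request)?;
    // Reply header: VER, REP, RSV, ATYP. REP = 0 means success.
    let mut head = [0u8; 4];
    stream.read_exact(&mut head)?;
    if head[1] != 0x00 {
        return Err(io::Error::new(io::ErrorKind::Other, "proxy reported a connect failure"));
    }
    // Skip the bound address (its size depends on ATYP) and the 2-byte port.
    let addr_len = match head[3] {
        0x01 => 4,
        0x04 => 16,
        0x03 => {
            let mut len = [0u8; 1];
            stream.read_exact(&mut len)?;
            len[0] as usize
        }
        _ => return Err(io::Error::new(io::ErrorKind::InvalidData, "unknown address type")),
    };
    let mut rest = vec![0u8; addr_len + 2];
    stream.read_exact(&mut rest)?;
    Ok(stream)
}

fn main() {
    // Only build and show the request bytes; no network access is needed.
    let request = socks5_connect_request("example.com", 80).unwrap();
    println!("{:02x?}", request);
}
```

With the proxy from the command above running, open_stream("127.0.0.1:9150", "example.com", 80) would be the call to try; most programs will use an existing SOCKS-capable client rather than hand-rolling this.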

If you want to build a program with Arti, you probably want to start with the arti-client crate. Be sure to check out the examples too.

For more information, check out the README file. (For now, it assumes that you're comfortable building Rust programs from the command line.) Our CONTRIBUTING file has more information on installing development tools, and on using Arti inside of Tor Browser. (If you want to try that, please be aware that Arti doesn't support onion services yet.)

When you find bugs, please report them on our bugtracker. You can request an account or report a bug anonymously.

And if this documentation doesn't make sense, please ask questions! The questions you ask today might help improve the documentation tomorrow.

Whether you're a user or a developer, please give Arti a try, and let us know what you think. The sooner we learn what you need, the better our chances of getting it into an early milestone.

Acknowledgments

Thanks to everybody who has helped take us here from Arti 0.1.0, including: 0x4ndy, Alexander Færøy, Alex Xu, Arturo Marquez, Christian Grigis, Dimitris Apostolou, Emptycup, FAMASoon, feelingnothing, Jim Newsome, Lennart Kloock, Michael, Michael Mccune, Neel Chauhan, Orhun Parmaksız, Richard Pospesel, Samanta Navarro, solanav, spongechameleon, Steven Murdoch, Trinity Pointard, and Yuan Lyu!

And, of course, thanks to Zcash Community Grants for their support of this critical work! The Zcash Community Grants program (formerly known as ZOMG) funds independent teams entering the Zcash ecosystem to perform major ongoing development (or other work) for the public good of the Zcash ecosystem. Zcash is a privacy-focused cryptocurrency, which pioneered the use of zk-SNARKs. The Zcash ecosystem is driven to further individual privacy and freedom.

This is a companion discussion topic for the original entry at https://blog.torproject.org/arti_100_released/

Vort · September 3, 2022, 12:15pm

How much does the performance of the Rust version differ from the C version?

QxXw4vK4PvW · September 3, 2022, 8:13pm

Can it do stream isolation based on SOCKS5 user/password? That would let me help testing it.

nickm · September 5, 2022, 1:30pm

Yes. Socks isolation is on by default.
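For anyone who wants to test this, here is a rough sketch of the client side, based on RFC 1928 and RFC 1929 rather than on anything Arti-specific. The helper name userpass_auth_request and the example credentials are invented for illustration. The idea from the question is that streams opened with different username/password pairs should be kept apart from each other.

```rust
/// SOCKS5 greeting that offers exactly one method: 0x02, username/password
/// (RFC 1928).
const GREETING_USERPASS: [u8; 3] = [0x05, 0x01, 0x02];

/// Build the username/password sub-negotiation message from RFC 1929.
/// Each field must be between 1 and 255 bytes long; otherwise return None.
fn userpass_auth_request(username: &str, password: &str) -> Option<Vec<u8>> {
    let (user, pass) = (username.as_bytes(), password.as_bytes());
    if user.is_empty() || user.len() > 255 || pass.is_empty() || pass.len() > 255 {
        return None;
    }
    let mut message = Vec::with_capacity(3 + user.len() + pass.len());
    message.push(0x01); // version of the sub-negotiation, not of SOCKS
    message.push(user.len() as u8);
    message.extend_from_slice(user);
    message.push(pass.len() as u8);
    message.extend_from_slice(pass);
    Some(message)
}

fn main() {
    println!("greeting: {:02x?}", GREETING_USERPASS);
    // Two made-up identities; each would be sent on its own proxy connection
    // right after the proxy selects method 0x02.
    for identity in &["mail", "chat"] {
        let message = userpass_auth_request(identity, "x").unwrap();
        println!("{}: {:02x?}", identity, message);
    }
}
```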

nickm · September 5, 2022, 1:33pm

They should be roughly comparable for client uses: we’ve done preliminary testing, and gotten similar results.

(If you find cases where the Arti performance is way worse than the C tor performance, please let us know!)

There are some cases where we expect the Rust implementation to be more efficient than C for now: Arti is thoroughly multithreaded by default, whereas the C tor implementation only uses multithreading for limited calculations. There are other cases where we expect the C implementation to be more efficient: in C we have the improved RTT-based congestion control logic, which we have not yet built in Arti.

Vort · September 5, 2022, 1:47pm

Thank you for the answer.
I remember that the TLS library has heavy CPU-specific optimizations in its C code.
And I was wondering if it is possible to make such low-level optimizations in Rust.
However I’m not entirely sure if I remember everything correctly.

nickm · September 5, 2022, 3:05pm

@Vort said:

I remember that the TLS library has heavy CPU-specific optimizations in its C code.
And I was wondering if it is possible to make such low-level optimizations in Rust.

Well, by default, Arti will use your own operating system’s TLS implementation (SecureTransport, schannel, or OpenSSL), so it will get whatever optimizations that has for TLS. If you build with rustls instead, you’ll get the optimized implementations from ring. There are additional options you can set at compile time to use optimized crypto from other sources: see documentation for the arti crate for details.

All that said, though, if you’ve got a reasonable desktop or laptop environment, I’d expect CPU-bound cryptography won’t be a major performance issue for client usage. The CPU efficiency of your cryptography will only be noticeable on low-end mobile (where the CPU is pretty slow itself), or for relays or onion services (since they are processing a lot more traffic—also, Arti doesn’t support them yet).

Vort · September 5, 2022, 3:49pm

I doubt that the TLS implementation from my Windows 7 is usable for modern programs at all.

By the time relay support is implemented, it may be too late to change TLS mechanisms.
But I hope that the other options you mentioned will have performance comparable to the C version.

shadykaty · September 7, 2022, 8:06am

Some thoughts / a bit of context about performance comparability between Rust and C, for those not familiar.

One nice thing about Rust’s compiler is that it is built on LLVM. For those not familiar, LLVM is a compiler infrastructure. It provides an intermediate form called Intermediate Representation or IR, and various optimizations which work on IR semantics. So if you can write a compiler which produces IR, you can use LLVM optimizations. This is a good thing for making new systems langs because you can use many of the same optimizations which allow C to be fast.

If you have some Rust code, and some perfectly equivalent C code, and compile them using rustc and clang, you will get the same or very close to the same machine code.

The subtle thing often missed in discussions of “X lang has C-like performance” is that equivalent code usually isn’t actually equivalent due to differences in the languages’ semantics. One of the ways Rust is different from C is that it injects runtime bounds checks for things which are only checked in C if you wrote a check yourself. So, there are potentially extra branches injected, which may be unreachable, which may impede performance. If your C isn’t bounds checking (and doesn’t need to) and your Rust is, your Rust is doing extra work and that work has a nonzero cost. But:

these can be explicitly guaranteed to be unreachable, thus removing the extra code and giving you identical perf
nearly all of the overhead from a check is failing branch predictions, which should never or almost never happen
if the code in question is not executed very frequently, the average overhead over the process’s lifetime will be negligible
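To illustrate the first point, here is a small sketch of the usual ways to deal with a bounds check in Rust. The function names are made up for this example, and whether the optimizer actually removes a given check depends on the compiler version and the surrounding code, so treat it as a pattern rather than a measurement.

```rust
/// Indexed loop: every `data[i]` may carry a bounds check unless the
/// optimizer can prove that `i < data.len()`. Panics if `n > data.len()`.
fn sum_indexed(data: &[u32], n: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Iterator over a sub-slice: one check when slicing, and no index inside
/// the loop for the compiler to check.
fn sum_iter(data: &[u32], n: usize) -> u32 {
    data[..n].iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Explicit promise: check once up front, then skip the per-element check.
fn sum_unchecked(data: &[u32], n: usize) -> u32 {
    assert!(n <= data.len());
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n, and n <= data.len() was asserted above.
        total = total.wrapping_add(unsafe { *data.get_unchecked(i) });
    }
    total
}

fn main() {
    let data = [1u32, 2, 3, 4, 5];
    println!(
        "{} {} {}",
        sum_indexed(&data, 3),
        sum_iter(&data, 3),
        sum_unchecked(&data, 3)
    );
}
```

The iterator form is usually the first thing to reach for, since it stays in safe code; get_unchecked moves the responsibility for the bound onto the author, which is exactly the trade C makes everywhere.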

Another tricky thing about the “C-like performance” we often demand from other langs is: writing performant C is actually pretty hard. You can’t just write some working C and be done, your C has to be written in a way that the compiler will produce sane code. The relative lack of high level abstractions in C is both a blessing and a curse. Idioms provided by the language are usually highly optimized. Lacking many high level abstractions in C leaves many idioms up to the author, and if the author isn’t an expert, their code will be worse and less performant than if they’d used a high level abstraction someone else wrote. So while Rust’s abstractions may be worse than expertly written C code, they may also be better than an average programmer’s C code. High level abstractions can be a performance improvement - they aren’t necessarily a performance detriment.

tl;dr: As someone who has written Rust and C and has a light obsession with performant code, I would expect the rewrite to be a little better in some places, a little worse in others, and roughly equivalent on the whole. Spots with drastic performance losses can be optimized to as good as they previously were, with near certainty. While I personally don’t enjoy writing Rust and it wouldn’t be my first choice for a rewrite, I still think that on the whole this is an improvement to the Tor project.

Vort · September 7, 2022, 10:53am

@shadykaty thanks for the clarification.
I agree with most of your explanations.
But there is one thing missing, which I implied when I mentioned TLS:
C allows you to go even deeper: to replace some code with parts written in assembly language.
Since not every instruction is available on every CPU, such code is usually written in several variants, wrapped with selection logic.
This is, as far as I know, something compilers usually don't do.
At the same time such code may be very important, for example when it uses hardware crypto (while allowing less optimized code to run on CPUs that lack such acceleration).
Because of this I wonder if it is possible to effectively mix assembly code and Rust code.
By the way, GCC is so good at this that it allows mixing code even within a single function.
While MSVC (at least for x64 code) requires assembly and C functions to be in separate files.

shadykaty · September 7, 2022, 3:45pm

100% possible. Support for FFI is crucial in any systems language and Rust is no exception. You just need to link the external file and specify the appropriate calling convention for its functions:
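A minimal sketch of that pattern, under a few assumptions: abs from the C standard library stands in for a hand-written assembly routine (Rust programs already link against it on common platforms), and c_abs is a made-up wrapper name. For real assembly you would declare your own symbol in the same kind of block and link its object file into the build. The unsafe extern spelling needs Rust 1.82 or newer; older toolchains write the block as extern "C" without the leading unsafe.

```rust
use std::os::raw::c_int;

// Declare a function that is implemented outside Rust. "C" names the
// calling convention the external code expects.
unsafe extern "C" {
    fn abs(value: c_int) -> c_int;
}

/// Safe wrapper around the foreign function. Returns None for i32::MIN,
/// because C leaves abs(INT_MIN) undefined.
fn c_abs(value: i32) -> Option<i32> {
    if value == i32::MIN {
        return None;
    }
    // SAFETY: `abs` takes a plain integer and has no other preconditions
    // once INT_MIN is excluded.
    Some(unsafe { abs(value) })
}

fn main() {
    println!("{:?}", c_abs(-42));
}
```

The compiler cannot check what the foreign code does, which is why every call goes through unsafe; wrapping it once in a small safe function keeps that out of the rest of the program.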

Iheartcake · September 17, 2022, 12:19pm

Why not just use a newer C++ standard to address these old C safety problems? Or contribute to these?
I don’t get how the “not-so-modular” design has anything to do with C. Get rid of your globals then.

I wouldn’t want to use any language that declares a variable integer like this: let mut x = 5;
But I also despise any web languages.

katofrobo · September 18, 2022, 8:00am

I think you guys made a great choice going with Rust. I know that C++ has some great features but to be honest Rust just does stuff out of the box the right way. Rust was a good choice for a production language especially considering the vulnerabilities that can be created in code. I think the ability to use multi-threading and get it up and running fast likely was made possible by Rust, as I’ve used Rust and it does help enforce good practices with threading.

I am using Arti in proxy mode now. I just replaced C Tor with it and it is working well. I feel a lot more confident that a lot of the simple out of bounds accesses, use after free, and other problems associated with the C language can now be a thing of the past.