Stream Redirection API for Arti

JeremyRand · April 8, 2023, 2:06am

Hi Arti devs!

I’m interested in drafting an API for Arti’s replacement for the control port protocol, for redirecting/cancelling streams. My interest is derived in part from several years of experience as the maintainer of StemNS, but is also influenced by other use cases.

Usually I would study in detail the existing API docs you have, but alas Nick has recently informed me that getting our use cases documented is time-sensitive, so I’m putting this out there now with less careful thought than I usually would. If you believe there’s some alternate way to implement the high-level goals behind what I describe, please feel free to suggest it – whatever implementation details may exist in this writeup are definitely influenced by c-tor, and whatever solutions we converge on don’t have to be.

Background

c-tor has the REDIRECTSTREAM and CLOSESTREAM controller commands. These let a controller intercept stream creation and either redirect the stream to another destination or prevent it from connecting at all. Use cases include:

Alternative naming protocols such as Namecoin or GNS.
Curated registries such as FPF’s SecureDrop eTLD.
Pointing DNS names to onion services via TXT records + DNSSEC.
User-specified aliases for hostnames (like a hosts file).
Blocking known-malicious sites (like Mozilla Safe Browsing).
Blocking known-insecure ports (e.g. TCP port 80) for domains on the HSTS preload list.

Unfortunately, these commands in c-tor, in combination with the ATTACHSTREAM command that REDIRECTSTREAM depends on, are much too powerful, and expose a lot of security-sensitive functionality to the controller. This is particularly problematic in environments where Tor is being shared by multiple applications (e.g. a system Tor instance in Debian), and is even worse in high-security environments like Tails and Whonix where the application is assumed to be malicious.

Base API

Arti should support a mode in which it sends destination+port of each stream to a controller, and waits for the controller to reply with one of the following responses:

Redirect to some other destination+port.
Cancel the stream.
Handle the stream without changes.

The redirect/cancel is then applied prior to attaching the stream to a circuit.
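
To make the three responses concrete, here is a minimal sketch of the decision flow. All names in it (`Redirect`, `Cancel`, `Allow`, `apply_decision`, `example_controller`) are hypothetical and not part of any Arti API; the example policy and addresses are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union


@dataclass(frozen=True)
class Redirect:
    """Controller asks for the stream to go to a different destination+port."""
    host: str
    port: int


@dataclass(frozen=True)
class Cancel:
    """Controller asks for the stream to be refused."""


@dataclass(frozen=True)
class Allow:
    """Controller asks for the stream to be handled without changes."""


Decision = Union[Redirect, Cancel, Allow]


def apply_decision(
    host: str, port: int, decide: Callable[[str, int], Decision]
) -> Optional[Tuple[str, int]]:
    """Return the target to attach to a circuit, or None if the stream is cancelled.

    This runs before circuit attachment, as described above.
    """
    decision = decide(host, port)
    if isinstance(decision, Redirect):
        return (decision.host, decision.port)
    if isinstance(decision, Cancel):
        return None
    return (host, port)


def example_controller(host: str, port: int) -> Decision:
    # Hypothetical hosts-file-style policy, for illustration only.
    if host == "printer.home.arpa":
        return Redirect("192.0.2.20", port)
    if port == 80:
        return Cancel()
    return Allow()
```

With this sketch, `apply_decision("example.com", 443, example_controller)` passes the stream through unchanged, while a port 80 request is refused.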

Stream isolation

The controller may need to perform network requests or access caches in order to decide what action to take. To facilitate this, Arti should pass a hash of the stream isolation data (SOCKS5 auth user+pass, source IP, and all other variables that are used as keys to stream isolation). The hash should be salted with a high-entropy string that is unique to the specific controller connection, so that the controller can’t easily detect what the SOCKS5 username (and other potentially sensitive data) are.
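
One way this could look, as a sketch only: the choice of HMAC-SHA256 as the "salted hash", the 32-byte salt, the length-prefixed encoding, and the function names are all assumptions of this example, not something the proposal or Arti specifies. Only three isolation fields are shown; a real implementation would include every isolation key.

```python
import hashlib
import hmac
import secrets


def new_connection_salt() -> bytes:
    """Generate a fresh high-entropy salt, one per controller connection."""
    return secrets.token_bytes(32)


def isolation_token(salt: bytes, socks_user: str, socks_pass: str, source_ip: str) -> str:
    """Derive an opaque token from the isolation data, keyed by the connection salt."""
    fields = (socks_user.encode(), socks_pass.encode(), source_ip.encode())
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    msg = b"".join(len(field).to_bytes(4, "big") + field for field in fields)
    return hmac.new(salt, msg, hashlib.sha256).hexdigest()
```

Streams with identical isolation data get the same token within one controller connection, so the controller can key its own caches and network requests on it, while tokens from different connections are unlinkable without the salt.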

Limiting stream visibility

The controller may be exclusively operated by a single application, or a single client IP. Thus, Arti should support only passing resolution requests to the controller if they originate from a specific application and/or a specific client IP. For the client IP, I think it would be sufficient to have a policy “only send resolve requests to the controller running on $IP if the stream originated from $IP”. For application-specific support, I think it could be done by having the application insert a high-entropy string into the SOCKS5 username, and the controller would only receive resolve requests for streams created with that string in the SOCKS5 username.

Of course, there may also be cases where the controller handles resolution requests from all applications and IPs that use Arti.

Limiting to a domain suffix

Some controllers only operate for a specific set of domain suffixes. E.g. Namecoin only handles .bit; FPF only handles securedrop.tor.onion; local aliases might desire to only handle home.arpa, etc. Arti should support only passing resolve requests to a controller if the hostname matches a specific domain suffix (but should also support controllers that have an empty suffix, meaning that they are passed resolve requests for all streams).
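
A sketch of the suffix check, assuming matching is case-insensitive and happens on label boundaries (so a `bit` suffix does not match a host named `rabbit`). The function name and those matching rules are assumptions of this example.

```python
def matches_suffix(hostname: str, suffix: str) -> bool:
    """Return True if hostname falls under suffix; an empty suffix matches everything."""
    host = hostname.rstrip(".").lower()
    suf = suffix.strip(".").lower()
    if suf == "":
        return True
    # Match the suffix itself or any name ending in ".<suffix>".
    return host == suf or host.endswith("." + suf)
```

For example, `matches_suffix("example.bit", ".bit")` is true, and a controller registered with an empty suffix sees every stream.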

Limiting to cancel-only

Some controllers don’t need to redirect streams, but rather only cancel or allow them. Arti should support this restriction, so that controllers can’t do redirects if that’s not wanted.

Of course, some controllers do need to redirect and cancel streams.
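
A sketch of how the restriction might be enforced on Arti's side. The `Action` enum and `enforce_cancel_only` are hypothetical names, and treating a disallowed redirect as a cancel (failing closed) is an assumption of this example; rejecting the reply with an error would be another reasonable choice.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CANCEL = "cancel"
    REDIRECT = "redirect"


def enforce_cancel_only(action: Action, cancel_only: bool) -> Action:
    """Downgrade a redirect to a cancel when the controller is cancel-only."""
    if cancel_only and action is Action.REDIRECT:
        # Assumption: fail closed rather than silently allowing the stream.
        return Action.CANCEL
    return action
```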

Fingerprinting resistance

In some protocols, like HTTP, subresource loading behavior can be a fingerprinting hazard. Streams opened by Tor Browser can be disambiguated between 1st-party (safe to tamper with) and 3rd-party (fingerprint risk if tampered with) by whether the SOCKS5 username contains the eTLD+1 of the destination hostname. Thus, Arti should support a mode where the controller is only sent resolve requests if the destination eTLD+1 is a substring of the SOCKS5 username.

Of course, some controllers need to handle both 1st-party and 3rd-party streams.
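
The 1st-party filter could be sketched as below. This assumes the eTLD+1 has already been computed elsewhere (that requires the Public Suffix List, which is out of scope here); the function name and the username format in the usage note are hypothetical.

```python
def should_send_to_controller(
    socks_username: str, dest_etld_plus_one: str, first_party_only: bool
) -> bool:
    """Decide whether a stream is visible to the controller."""
    if not first_party_only:
        # Controllers that handle both 1st-party and 3rd-party streams see everything.
        return True
    site = dest_etld_plus_one.strip().lower()
    if not site:
        # Assumption: with no eTLD+1 available, treat the stream as not 1st-party.
        return False
    return site in socks_username.lower()
```

With a username such as `example.com:nonce123`, a stream to `example.com` is forwarded in 1st-party-only mode, while a stream to `tracker.net` under the same username is not.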

Disallowing circuit selection

In c-tor, ATTACHSTREAM allows tampering with the circuit selection. Controllers that act as resolvers should not be required to have this privilege.

Support both IP and domain names

Resolve requests should be sent regardless of whether the destination address is an IP or a domain name (unless restricted to a specific domain suffix, see above), and regardless of whether the stream is a SOCKS5 CONNECT command or a SOCKS5 RESOLVE command.

Thanks for your time!

Looking forward to fine-tuning and/or rewriting these ideas as we converge on a working API. And apologies again if some of these ideas would look better if I were more familiar with the existing API docs… trying to get this out as fast as possible at Nick’s request, and the quality is definitely impacted by this.

Cheers!

nickm · April 12, 2023, 1:41pm

Thanks, Jeremy!

Right now, we’re at a stage where we’ve drafted a meta-design for our RPC protocol, and we’re starting to refine it and implement it. Obviously, name redirection support is probably not going to be in the first version, but I think that the design can handle some version of it if we need to move that direction.

We should try to make sure that, as we go forward with this design, we’re not actively precluding any functionality that naming will rely on.

Some notes from our current design that might be relevant here:

We’re using a capabilities-based model to ensure that controllers can exist in both “regular user” and “super-user” models. Regular user capabilities should only be able to get access to their own streams and circuits; “super user” capabilities can be used to reconfigure and monitor the entire Arti process.
- Since you’d need a whole-process capability (super-user) to change all stream targets, you’ll
- We don’t currently have a plan for intermediary access levels, but they shouldn’t be too hard to build out of capabilities if we need them.
Right now, our data flows for the design assume that everything works as an RPC from the application to Arti. But this naming support seems to naturally fit into a model where the RPC flows from Arti to the naming tool. I see three ways to work around that:
- We can define “installing yourself as a plug-in” to mean running one RPC command to receive a stream of requests, and then sending replies to them as other commands.
- We could adapt our design so that Arti can be the RPC client as well under some circumstances.
- We could extend our design to allow bidirectional RPC.

And here’s another possibility to think about:

The existing prop279 design was built around the assumptions of what C tor would need in order to have external naming support, and was made to look very much like the pluggable transport design.

If there are currently prop279 naming tools that we should use, it might be a good idea to simply implement prop279 in Rust. The Rust PT code (in tor-ptmgr) is much cleaner and more extensible than its C equivalent, and it might be possible to extract the common parts and have them shared with a naming-plugin tool. If we did that, we could remove a layer from our naming story, and not have to go through our controller/RPC layer for it at all.

Alternatively, if the prop279 interface isn’t suitable to the naming tools’ needs, then since it was never implemented in C tor, maybe we should just make an updated plugin for the Rust implementation.