Perhaps we can have this discussion here…
I’ve posted some initial architecture notes; I’m wondering how well they match (or don’t match) other people’s thinking.
I’ve attempted to summarize the API requirements (not including any architectural requirements) of all of the various potential arti consumers listed in the original pad. Some of them may be peculiar to how the current tor-daemon integration story works, so they may not necessarily be required; they reflect how the various apps are currently forced to interact with tor (particularly around how apps connect to endpoints through tor: SOCKS5 vs IP addresses vs sockets/fds). I’ve included them anyway.
Once we figure out what a potential API looks like, we can more easily figure out what the RPC requirements are.
Summarized/Collated API Requirements
Administration
bootstrapping
network configuration; firewall, proxy, etc
managing onion auth client keys (Tor Browser, Orbot, Gosling)
clearing local Tor data (Orbot)
circuit management: close, new (Orbot, Tor Browser)
guard/exit node selection (Orbot)
relay allowlist/blocklist (Orbot)
custom/advanced stuff; bridge+directory authorities (Orbot)
IPV4/IPV6 control (Orbot)
memory controls (Orbot(iOS))
stop all onion services (OnionShare, Gosling)
stop all tor services/daemon (OnionShare)
Debugging/Information
async bootstrap/status events (Gosling, Tor Browser, Orbot)
tor connection status query (Orbot)
log serialization to file on disk (Orbot)
circuit information: IP, geolocation, etc (Tor Browser, Orbot)
all circuits details overview (Orbot)
exit node countries query (Orbot)
getting an onion service’s service id/address (OnionShare)
Pluggable Transports
using tor through PTs
bridge configuration (Tor Browser, Gosling)
standalone PTs for anti-censorship (Tor Browser)
Connectivity
accessing Tor network via SOCKS5 etc proxies (Tor Browser)
clearnet circuit selection/circuit token (Tor Browser)
communication over Tor via SOCKET/fd/TcpStream (Gosling)
proxy-bypass mode; Tor only for onions, clearnet elsewhere (Orbot)
proxy-bypass mode; use own Tor rather than system (OnionShare)
onion service only mode; only allow connections to onion services (Gosling)
customizable proxy-bypass mode; clearnet connections for some apps; this may make more sense to leave to the embedding app rather than exposing via some API (Orbot)
Onion Services
starting onion service w/ ‘TcpListener’ analog (Gosling)
starting onion service pointing to existing TcpListener (OnionShare)
connecting to onion service w/ ‘TcpStream’ analog (Gosling)
connecting to onion service via Proxy interface (Tor Browser, OnionShare)
connecting to onion service w/ authkey (Tor Browser, Gosling)
onion service domain to local IP Address translation (Orbot)
Cryptography
ed25519 keypair primitive generation, conversion, signing (Gosling)
v3onionservice <-> ed25519 public key conversions (Gosling)
v3onionservice validation (Gosling, Tor Browser)
x25519 keypair primitive generation, conversion (Gosling)
x25519 <-> ed25519 keypair conversions (Gosling)
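Much of the crypto above is ‘just’ wrapping existing primitives. As one concrete (hedged) example of the kind of helper Gosling wants, here is a minimal sketch of deriving a v3 onion service address from an ed25519 public key, following the construction in rend-spec-v3; it assumes the sha3 and data-encoding crates, and the function name is made up:

```rust
// A minimal sketch, assuming the sha3 and data-encoding crates; the function
// name is illustrative only. Construction per rend-spec-v3:
//   address = base32(pubkey || checksum || version) + ".onion"
//   checksum = SHA3-256(".onion checksum" || pubkey || version)[0..2]
use data_encoding::BASE32_NOPAD;
use sha3::{Digest, Sha3_256};

fn onion_address_from_pubkey(pubkey: &[u8; 32]) -> String {
    const VERSION: u8 = 0x03;

    let mut hasher = Sha3_256::new();
    hasher.update(b".onion checksum");
    hasher.update(pubkey);
    hasher.update(&[VERSION]);
    let checksum = hasher.finalize();

    // 32-byte pubkey + 2-byte checksum + 1-byte version = 35 bytes -> 56 base32 chars
    let mut raw = Vec::with_capacity(35);
    raw.extend_from_slice(pubkey);
    raw.extend_from_slice(&checksum.as_slice()[0..2]);
    raw.push(VERSION);

    format!("{}.onion", BASE32_NOPAD.encode(&raw).to_lowercase())
}
```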
Since we’re migrating discussion to the forum, here is also the brainstorming system architecture I sketched out/mentioned in the mail thread, included along with that post.
(Three-letter acronym appendix at the end just in case)
Too big architecture diagram
Drafted up a diagram of how I would go about designing the arti ecosystem components (spoiler alert I don’t have any meaningful experience w/ iOS or Android so this is coming from a desktop-centric point-of-view):
Here’s some definitions of the various components (.a for static libs, .so for dynamically-linked libraries):
- arti_server.a : some public API surface based on the ‘arti’ crate hand waving; this is where the hand-written Rust code goes
- arti_server_rpc.a/arti_client_rpc.a : possibly platform-specific glue code that handles the actual RPC mechanism
- arti_client.a : client library which exposes the same API as arti_server.a, but transparently routes things through the RPC layer
A project should be able to program against the same ABI and link to either the client or server.
- arti_client_proxy_ffi.a : cbindgen C FFI wrapper around arti_client.a
- arti_server_ffi.a : cbindgen C FFI wrapper around arti_server.a
Again these two libs should have exactly the same ABI in the end.
- arti.hpp, JNI, Python Bindings, etc : language specific glue to cbindgen’s generated C header
^these should be generated automatically, either from cbindgen’s generated C header with 3rd party tools if they exist, or from using some clang monstrosity if necessary (or even better, from the original IDL source)
Boilerplate, yuck
So, implicit here is a lot of boilerplate to manage/maintain. I would think at a minimum we would want some IDL system in place that lets us define the shared arti_server/arti_client API surface, traits, data types, etc, which then goes and generates the RPC bridge between client/server but also the extern “C” FFI implementation (which is then used to generate the C FFI header using cbindgen). I don’t think this whole pipeline would be unreasonable assuming that the FFI layer is ‘just’ a shim/passthrough layer to the real implementation implemented in arti_server.a.
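To make that concrete, the ‘IDL’ could even just be a shared, annotated Rust trait; codegen would then emit the RPC client stub, the server dispatch, and the extern “C” shims. A rough sketch, where the attribute macro and all names are entirely hypothetical:

```rust
// Hypothetical: the shared API surface defined once, in Rust. A (not yet
// existing) #[arti_api] attribute macro would generate the RPC client stub,
// the server-side dispatch code, and the extern "C" passthrough shims.
#[derive(Debug)]
pub struct ApiError(pub String);

pub trait CircuitApi {
    /// Close the circuit with the given id.
    fn close_circuit(&self, circuit_id: u64) -> Result<(), ApiError>;
    /// Return the ids of all currently open circuits.
    fn list_circuits(&self) -> Result<Vec<u64>, ApiError>;
}

/// arti_server: the real, hand-written implementation on top of the arti crates.
pub struct LocalCircuitApi; // would hold the real client state

impl CircuitApi for LocalCircuitApi {
    fn close_circuit(&self, _circuit_id: u64) -> Result<(), ApiError> {
        todo!("call into the real arti implementation")
    }
    fn list_circuits(&self) -> Result<Vec<u64>, ApiError> {
        todo!()
    }
}

/// arti_client: exposes the same trait, but every call is marshalled over RPC.
pub struct RpcCircuitApi; // would hold the RPC connection

impl CircuitApi for RpcCircuitApi {
    fn close_circuit(&self, _circuit_id: u64) -> Result<(), ApiError> {
        todo!("serialize the request, send it over the RPC link, decode the reply")
    }
    fn list_circuits(&self) -> Result<Vec<u64>, ApiError> {
        todo!()
    }
}
```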
Some general FFI strats
So one strat here to avoid nasty complications is to only deal in POD C types at the FFI boundary layer (explicitly sized ints, except for size_t/usize where it makes sense, client-allocated buffers, etc), and to never pass the actual data out of the Rust layer into unsafe code (so only primitives and handles go out).
Arti’s FFI headers can say functions take pointers to structs but the struct layouts never actually need to be defined anywhere. This way it’s not possible to actually instantiate any of them w/o using arti API calls. So everything public is really uintptr_t handles, but forward declaring as various struct pointers gives us a little type-safety at the boundary layer (to avoid the OpenGL grossness of ‘welp everything is a GLint, good luck with your static analysis’).
This does mean lots of setters and getters everywhere if you need to populate structs member by member (so probably avoid setting up your APIs like that to avoid frequent RPC round-trip to the ‘proxied’ real data on the server).
This setup also helps with ABI stability; you don’t have to worry about accidentally breaking older versions when refactoring your struct layouts if you only ever access them indirectly (and ideally you aren’t changing function signatures between minor releases either).
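Here is a minimal sketch of that handles-and-POD-only pattern (all names hypothetical, not real arti API): the Rust type stays private behind the boundary, cbindgen can emit it as an opaque forward declaration, and C code only ever holds pointers and calls accessors.

```rust
// Hypothetical names, not real arti API. The struct's layout never crosses
// the boundary; cbindgen can emit just
//   typedef struct ArtiCircuitHandle ArtiCircuitHandle;
// so C code can only hold pointers to it and must go through these functions.
pub struct ArtiCircuitHandle {
    // Real state lives here; never visible across the FFI boundary.
    circuit_id: u64,
}

/// Hand a new handle out to C. The caller owns it and must free it below.
#[no_mangle]
pub extern "C" fn arti_circuit_new(circuit_id: u64) -> *mut ArtiCircuitHandle {
    Box::into_raw(Box::new(ArtiCircuitHandle { circuit_id }))
}

/// POD out only: a getter rather than exposing the struct layout.
#[no_mangle]
pub extern "C" fn arti_circuit_get_id(circ: *const ArtiCircuitHandle) -> u64 {
    if circ.is_null() {
        return 0;
    }
    unsafe { (*circ).circuit_id }
}

/// Reclaim the Box allocated in arti_circuit_new.
#[no_mangle]
pub extern "C" fn arti_circuit_free(circ: *mut ArtiCircuitHandle) {
    if !circ.is_null() {
        drop(unsafe { Box::from_raw(circ) });
    }
}
```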
Further Fanciness
If one wanted to make things easier for downstream apps, it would be nice for the arti_server.so implementation to transparently proxy if a ‘system’ arti server is already in place. This way app developers can ship one binary w/o worrying about whether or not the target system has an arti daemon installed. Alternatively, we can continue with the current approach of ‘set a handful of environment variables to use system tor/arti’, which would still require shipping both the proxying logic and ‘real’ logic in the same arti_server.so library.
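As a very rough sketch of the fallback decision (the environment variable name and detection strategy here are placeholders, not anything arti actually does):

```rust
use std::env;

/// Hypothetical: decide at startup whether to proxy to an already-running
/// system arti daemon or to run the full implementation in-process.
enum Backend {
    /// Connect to an existing daemon at the given RPC endpoint.
    SystemDaemon(String),
    /// No daemon found (or no override requested): run arti in-process.
    InProcess,
}

fn pick_backend() -> Backend {
    // Placeholder variable name; a real scheme might also probe a well-known
    // socket path or port before falling back to the in-process implementation.
    match env::var("ARTI_RPC_ENDPOINT") {
        Ok(endpoint) if !endpoint.is_empty() => Backend::SystemDaemon(endpoint),
        _ => Backend::InProcess,
    }
}
```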
SOCKET/file descriptor marshalling?
Another open question which I don’t think was addressed earlier today is how we’re going to handle actual network connections in the proxied scenario (e.g. arti_client.so <-> RPC <-> arti_server.so <-> Tor Network). Are we going to be marshalling SOCKETs through the RPC layer (I know this is possible on Windows, no idea about elsewhere)? Will the server just stick with the existing local SOCKS5 proxy paradigm? vOv
Alternatively… Less work?
Instead of maintaining the C FFI layer that exposes arti’s entire public API surface we could just say ‘nope part of using arti in your non-Rust app is writing your own FFI using Rust+cbindgen that exposes only what you need for your business logic’.
I suspect the realities of how Firefox is architected will mean that we will end up having to write our own API boundary thing to deal with exposing arti to XPCOM and JavaScript anyway (may as well do that in Rust), so I don’t think a lack of a C FFI would be a blocker for Tor’s applications team at least but that could change.
Acronyms Just in case
- API - application programming interface (your header definitions)
- ABI - application binary interface (the actual binary blobs your linker/CPU uses, important to remain consistent between releases or consumers have fun runtime errors)
- FFI - foreign function interface (the general name for the ‘bridge’ between one type of code like Rust/Java/Python/etc and another, usually C)
- JNI - java native interface (the FFI to get native C ABI stuff callable from your Java code)
- IDL - interface description language (some high-level spec of an API that can be used for boilerplate code generation)
- RPC - remote procedure call (a function call that looks like an ordinary function but actually involves data marshaling and data transmission to another process/machine/etc; looks local but actually out-of-process)
- POD - plain old data (in this context primitive types, int8_t, size_t, float, etc; data which trivially passes through FFI boundaries)
I think your diagrams basically unify/generalize mine. I’ve no opinion or investment in the specifics of the RPC protocol/message format, and I very much agree that code-generation is very necessary for the whole pipeline to not become a maintenance nightmare.
Going from your diagram, I believe the ideal scenario is some IDL specification of Rust API which generates:
- RPC Server
- RPC Client
- C FFI
as well as perhaps some portion of the Rust API itself (enums, structs, etc), leaving arti devs to only write the Rust API implementation.
I did a little digging last night and was somewhat surprised to find very little available in the way of general IDL formats + parsers/tooling for code generation (that is actually used/tested). There is lots of stuff for converting/generating bridges from one language to another, and various hobby projects, but nothing like an OMG-IDL-to-AST tool.
One approach we could take here is writing the Rust API interface in Rust, and then just using the Rust parser to generate the rest.
EDIT: I mean use the Rust parser to generate the AST, and from that write code to generate the rest.
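For that approach, the syn crate already gives us the AST. A minimal sketch (assuming syn 2.x with the “full” feature) of walking a trait definition to enumerate the methods the RPC/FFI generators would need to emit code for:

```rust
// Minimal sketch: parse a Rust "IDL" file with syn and list the trait methods
// that the RPC/FFI code generators would have to handle.
// Requires the `syn` crate (2.x) with the "full" feature enabled.
fn list_api_methods(source: &str) -> Result<Vec<String>, syn::Error> {
    let file = syn::parse_file(source)?;
    let mut methods = Vec::new();
    for item in file.items {
        if let syn::Item::Trait(tr) = item {
            for trait_item in tr.items {
                if let syn::TraitItem::Fn(method) = trait_item {
                    methods.push(format!("{}::{}", tr.ident, method.sig.ident));
                }
            }
        }
    }
    Ok(methods)
}
```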
Hi! Here are some ideas I’ve had about making an FFI/RPC interface for arti. See also doc/dev/notes/ffi_and_rpc_sketch.md (api-sketch branch, Nick Mathewson / arti on GitLab) for some of my earlier thoughts.
I’m summarizing some inline comments that @richard made on an earlier draft of this. @diziet was also helpful in getting me to walk back some hasty assumptions.
Each of these ideas stands more or less independently; I’d like it if people would
think about and react to them.
Initial APIs to prototype
What do we build first? We need to pick a minimal set of options that nonetheless are useful, and that demonstrate the whole space of the API that we want to explore.
I suggest:
- Authenticate
- Watch bootstrap status…
  * Poll the current status
  * Get a stream of status updates
- Open a data stream…
  * Poll its status
  * Get updates about its status
  * And use it.
Do we need anything else to be useful?
Will any of our other functionality work differently enough from this functionality
that we need to prototype that too?
(At this point, Richard notes that we maybe want to expose a set of cryptographic operations for key manipulation and management, and wonders whether they should be remote or in-process.)
Idea: Every operation is observable.
Here is a possible principle: Every operation that does not finish in negligible time
should return a handle that you can poll for status information.
For each such handle, you should be able to wait for it to finish, and poll for status
updates.
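A self-contained sketch of what that could mean in Rust terms (all names invented, and a real API would presumably be async and push status updates rather than busy-wait):

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical status for a long-running operation (bootstrapping,
/// opening a stream, ...). Real statuses would carry richer per-operation data.
#[derive(Clone, Debug)]
enum OpStatus {
    InProgress { percent: u8 },
    Done,
    Failed(String),
}

/// Hypothetical handle returned by every operation that doesn't finish in
/// negligible time: you can poll it, or wait for it to complete.
#[derive(Clone)]
struct OpHandle {
    status: Arc<Mutex<OpStatus>>,
}

impl OpHandle {
    /// Non-blocking: what is the operation doing right now?
    fn poll_status(&self) -> OpStatus {
        self.status.lock().unwrap().clone()
    }

    /// Blocking wait for completion (illustration only; see the caveat above).
    fn wait(&self) -> OpStatus {
        loop {
            match self.poll_status() {
                OpStatus::InProgress { .. } => {
                    std::thread::sleep(std::time::Duration::from_millis(50))
                }
                done => return done,
            }
        }
    }
}
```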
Idea: Sessions, views, and capabilities
We’d like to have better isolation between different applications than C tor provides on its control ports. Here is one way to achieve that.
We make our API use a capability-like interface, where you can only get a handle to
an object if you have permission to see it and mess with it.
The root object that you get when you authenticate is a View of a TorClient. With a View, you can see the streams and circuits that were opened for that View, but nothing else. One such view is the Global View; it is equivalent to root access on a TorClient instance.
All the Views of a TorClient share a GuardMgr, a CircMgr, a DirMgr, a ChanMgr, and their configuration. You don’t get to inspect or modify anything global unless you have the Global View, or we declare that it is safe to inspect.
Each View receives stream isolation from the other Views. (I’ll explain how to associate a stream with a View later on.)
There may be a way to enumerate the streams and circuits associated with a View. Access to a stream, circuit, or View is given by objects that are sort of like capabilities (If you have one, you are presumed to own the object), and sort of like weak handles (The object can go away according to Tor’s regular expiration rules, whether you hang on to the handle or not.)
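A rough, self-contained sketch of the View idea in Rust terms (every name here is invented, not real arti API): a View is a capability object scoped to what was opened through it, with the Global View as a special case.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical: the shared, global client state that all Views sit on top of
/// (GuardMgr, CircMgr, DirMgr, ChanMgr, and config would live in here).
struct SharedClientState {}

/// Hypothetical capability-ish handle: knowing a stream's id doesn't help you;
/// you need a handle that was created within your View. The underlying stream
/// may still expire on Tor's schedule whether or not you keep the handle.
#[derive(Clone, Copy, Debug)]
struct StreamHandle(u64);

/// A View of the TorClient: you only see the streams and circuits opened
/// through it, and streams opened via different Views are isolated.
struct View {
    shared: Arc<SharedClientState>,
    is_global: bool,
    my_streams: Mutex<Vec<StreamHandle>>,
}

impl View {
    /// Enumerate the streams this View owns. Works for any View.
    fn list_streams(&self) -> Vec<StreamHandle> {
        self.my_streams.lock().unwrap().clone()
    }

    /// Something genuinely global (e.g. changing shared configuration)
    /// requires the Global View.
    fn set_global_config(&self, _key: &str, _value: &str) -> Result<(), &'static str> {
        if self.is_global {
            // Would mutate self.shared here.
            Ok(())
        } else {
            Err("permission denied: requires the Global View")
        }
    }
}
```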
Idea: Opening streams
If the RPC port is an HTTP(S)-based thing, let’s use some form of HTTP authentication, and also implement an HTTP CONNECT proxy. That way, if
HTTP 2 or later is in use, we get single-socket multiplexing “for free”.
When you’re opening a request via HTTP CONNECT or via SOCKS, let’s define a
way in the request headers to associate your stream with a View. The reply headers
can contain a handle that you can use within the View to refer to the
stream.
(Even if the RPC isn’t HTTPS-based, we can implement HTTP CONNECT and/or an extended SOCKS with support for more options to achieve this, though we might need to put those on another port.)
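To make the CONNECT idea concrete, here is a minimal client-side sketch; the X-Arti-View and X-Arti-Stream-Handle header names (and the idea of Basic auth) are illustrative assumptions, not a spec.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;

/// Hypothetical: open a Tor stream to `target` through arti's HTTP CONNECT
/// proxy, tagging the request with our View token. Header names are made up.
fn connect_via_arti(proxy_addr: &str, target: &str, view_token: &str) -> std::io::Result<TcpStream> {
    let mut sock = TcpStream::connect(proxy_addr)?;
    write!(
        sock,
        "CONNECT {target} HTTP/1.1\r\n\
         Host: {target}\r\n\
         Proxy-Authorization: Basic <credentials>\r\n\
         X-Arti-View: {view_token}\r\n\
         \r\n"
    )?;

    // Read the status line and reply headers; a reply header such as
    // X-Arti-Stream-Handle could carry a handle for referring to this
    // stream later through the View's RPC API.
    let mut reader = BufReader::new(sock.try_clone()?);
    let mut line = String::new();
    while reader.read_line(&mut line)? > 2 {
        print!("{line}");
        line.clear();
    }

    // After a 200 reply, `sock` is the raw stream to `target` over Tor.
    Ok(sock)
}
```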
Idea: Uniform object manipulation API
Most (not all!) of the actions on Tor’s control port come down to:
- Observe changes in X
- Inspect the current state of X
- Make changes in X
But currently, each of these operations uses a different syntax and namespace, and they are not consistently supported across all objects.
For example, circuits are created or extended with EXTENDCIRCUIT, but you can also act on them with SETCIRCUITPURPOSE and CLOSECIRCUIT. (Those aren’t functions we’re planning to implement in Arti any time soon.) Observing a stream of events where circuits change is SETEVENTS CIRC and/or SETEVENTS CIRC_MINOR. And learning about the set of circuits at a single point in time is GETINFO circ/..

By comparison, configuration is changed with SETCONF and RESETCONF, but also LOADCONF. It’s flushed to disk with SAVECONF. Configuration is observed with SETEVENTS CONF_CHANGED. And to get the current value of a configuration option, you use GETCONF.
I suggest that we try to make a more orthogonal API, where objects of each type are discoverable in the same way, observable in the same way, modifiable in the same way. This will ideally turn an M*N API design (operations * objects) into an M+N design (operations + objects).
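One way to picture the M+N shape (purely illustrative, not a proposed wire format or API): a small fixed set of verbs applied uniformly to every object type, so that adding a new object type doesn’t add new verbs, and adding a new verb works for every object.

```rust
/// Hypothetical uniform request shape: a handful of generic verbs...
#[derive(Debug)]
enum Verb {
    /// Inspect the current state of the object.
    Get,
    /// Subscribe to changes on the object.
    Watch,
    /// Make a change to the object (payload elided here).
    Set,
}

/// ...applied to any discoverable object.
#[derive(Debug)]
enum ObjectRef {
    Circuit(u64),
    Stream(u64),
    Config(String),
}

#[derive(Debug)]
struct Request {
    verb: Verb,
    object: ObjectRef,
}

fn main() {
    // The same three verbs cover what SETCONF/GETCONF/SETEVENTS CONF_CHANGED,
    // GETINFO circ/.., SETEVENTS CIRC, CLOSECIRCUIT, etc. do today, but
    // uniformly across object types: M verbs + N objects, not M*N commands.
    let examples = [
        Request { verb: Verb::Get, object: ObjectRef::Circuit(7) },
        Request { verb: Verb::Watch, object: ObjectRef::Config("SocksPort".into()) },
        Request { verb: Verb::Set, object: ObjectRef::Stream(42) },
    ];
    for r in &examples {
        println!("{r:?}");
    }
}
```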