[tor-project] More resources required for Snowflake bridge

It has been great to see all the support and encouragement for people running Snowflake proxies. Thank you!

But there is a problem: the Snowflake bridge (which all the temporary proxies forward their traffic to) is going as fast as it can on its current hardware. The server is running close to 100% on all CPUs more or less constantly. As more people use Snowflake, they each get a smaller share of the limited available performance. The limited capacity of the bridge is the cause of the [recent slowness of Snowflake](https://www.reddit.com/r/TOR/comments/t49i14)--in the past 2 weeks it's gone [from 12,000 to 16,000 users, without a proportional increase in bandwidth](Relay Search).*

We've spent significant engineering resources already to make the most of the hardware, such as [load balancing multiple tor instances](Running a load-balanced Tor bridge · Issue #103 · net4people/bbs · GitHub) since a few weeks ago. This effort has roughly doubled the available bandwidth of the bridge, but it's still not enough. Demand will only continue to rise.

The bridge needs to be moved to faster hardware. Its current hosting is free of charge, but is already on the highest-spec VPS configuration (8 CPUs, 16 GB). Switching to a server with, say, double the CPUs will have an immediate positive effect: the proof of that is that while we were installing the load balancing on the main bridge, I paid for an only slightly higher-spec server to handle Snowflake traffic during the upgrade, and during that week the bandwidth [immediately rose to higher than where it is now](Running a load-balanced Tor bridge · Issue #103 · net4people/bbs · GitHub). I used Snowflake a lot during that week, and the difference was palpable.

The minimum server required has something like 16 CPUs and 32 GB of RAM. meskio found some suitable [dedicated servers for about $200/month]([anti-censorship-team] Snowflake bridge VPS cost estimates) with unlimited bandwidth. (I estimate current needs are something like [100 TB/month of bandwidth](https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2774428), of course expected to grow.)

I'm writing this to make people aware that the current cause of poor Snowflake performance is known: it's limited CPU capacity at the bridge, not general Tor slowness or slowness of the temporary proxies. Solving the problem will cost a few hundred dollars per month, at least for the near future. I am open to suggestions about what to do. I promised myself I would not again get in the situation of paying out of pocket for important infrastructure. I've already contacted the Open Technology Fund about a possible rapid response grant, but have not gotten a response yet. I'm willing to continue administering the bridge, as I do now.

* Since 2022-02-03, Tor Metrics graphs for the Snowflake bridge are 1/4 what they should be, until the fix for "Graphs for multiple relays that have the same fingerprint (#40022) · Issues · The Tor Project / Network Health / Metrics / Onionoo · GitLab" is deployed.


_______________________________________________
tor-project mailing list
tor-project@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-project


Do all snowflake users have to get pointed at the same machine/IP
address? That is, would a bunch of people running bridges each
contributing a fraction of 100 TB/month (~= 300 Mbps) be helpful?
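
The "~= 300 Mbps" figure can be checked by spreading 100 TB evenly over a month. This is only a rough sketch: it assumes decimal terabytes (10^12 bytes) and a 30-day month, neither of which is stated in the thread, and the function name is made up for illustration.

```python
def tb_per_month_to_mbps(tb_per_month: float, days: float = 30.0) -> float:
    """Average rate in Mbps for a monthly volume given in decimal terabytes.

    Assumptions (not from the thread): 1 TB = 10**12 bytes, 30-day month.
    """
    bits = tb_per_month * 1e12 * 8       # bytes -> bits
    seconds = days * 24 * 60 * 60        # length of the month in seconds
    return bits / seconds / 1e6          # bits per second -> megabits per second


rate = tb_per_month_to_mbps(100)
print(f"{rate:.0f} Mbps")  # prints "309 Mbps"
```

This is an average over the whole month; peak-hour demand would presumably be higher than the average.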


On Tue, Mar 01, 2022 at 04:26:48PM -0700, David Fifield wrote:

It has been great to see all the support and encouragement for people running Snowflake proxies. Thank you!

But there is a problem: the Snowflake bridge (which all the temporary proxies forward their traffic to) is going as fast as it can on its current hardware. The server is running close to 100% on all CPUs more or less constantly. As more people use Snowflake, they each get a smaller share of the limited available performance. The limited capacity of the bridge is the cause of the [recent slowness of Snowflake](https://www.reddit.com/r/TOR/comments/t49i14)--in the past 2 weeks it's gone [from 12,000 to 16,000 users, without a proportional increase in bandwidth](Relay Search).*

We've spent significant engineering resources already to make the most of the hardware, such as [load balancing multiple tor instances](Running a load-balanced Tor bridge · Issue #103 · net4people/bbs · GitHub) since a few weeks ago. This effort has roughly doubled the available bandwidth of the bridge, but it's still not enough. Demand will only continue to rise.


That's a good question. The answer is yes: for technical reasons it is currently a requirement that all users go to a centralized-ish bridge. There are two main reasons, one having to do with the "backend" Tor interface and one having to do with the "frontend" Snowflake interface. At the moment, the Tor reason is the more constraining. (For full context, see https://forum.torproject.org/t/1483.) I would like to work toward the situation you propose, but at the moment it's not possible.

For reference, the pipeline looks like this: Snowflake proxies connect to a pluggable transport server, called snowflake-server, using WebSocket. snowflake-server is an HTTPS server that receives the WebSocket traffic, decodes it, and forwards it to tor. snowflake-server and tor do not need to run on the same host, though they do currently. Running them separately is an option for scaling, though ideally they are still located near each other (network-wise), which is not an option on the current hosting.

The "backend" issue with Tor is that Tor clients expect a certain bridge fingerprint. In this case it is 2B280B23E1107BB62ABFC40DDCC8824814F80A72, which is hard-coded in Tor Browser configuration files. So even if there were a pool of multiple bridges, they would all need to share the same identity keys; there's a trust boundary there that's not easy to distribute. (In fact, the load balancing configuration works much like that already: many instances that use the same identity keys. We could run those multiple instances on separate hosts--and that's the next planned step for scaling after upgrading the server--but they can't easily all be administered by different people.)

There are ways to deal with the fingerprint issue, which are considered in "Prepare all pieces of the snowflake pipeline for a second snowflake bridge (#28651) · Issues · The Tor Project / Anti-censorship / Pluggable Transports / Snowflake · GitLab". One way would be for the Tor client not to verify its bridge fingerprint, but as I understand it that enables certain traffic tagging attacks. Another way would be for the client to have an allowlist of permissible fingerprints rather than a single value, but I imagine that would not be a quick change in core tor. shelikhoo on the anti-censorship team has been thinking about this issue, and may have other ideas.
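
To make the difference between a single expected fingerprint and an allowlist concrete, here is a minimal sketch. It is not how core tor implements bridge verification; the function names and the second fingerprint are hypothetical placeholders, and only the first fingerprint comes from this thread.

```python
from typing import Iterable

# Fingerprint quoted in this thread for the current Snowflake bridge.
CURRENT_BRIDGE = "2B280B23E1107BB62ABFC40DDCC8824814F80A72"
# Made-up placeholder for a hypothetical second bridge (not a real fingerprint).
SECOND_BRIDGE = "A" * 40


def normalize(fingerprint: str) -> str:
    """Compare fingerprints case-insensitively and ignore spaces."""
    return fingerprint.replace(" ", "").upper()


def matches_single(presented: str, expected: str) -> bool:
    """Single-value check: exactly one fingerprint is acceptable."""
    return normalize(presented) == normalize(expected)


def matches_allowlist(presented: str, allowed: Iterable[str]) -> bool:
    """Allowlist check: any fingerprint in the permitted set is acceptable."""
    return any(matches_single(presented, fp) for fp in allowed)


print(matches_single(SECOND_BRIDGE, CURRENT_BRIDGE))                      # False
print(matches_allowlist(SECOND_BRIDGE, {CURRENT_BRIDGE, SECOND_BRIDGE}))  # True
```

With the single-value check, every bridge in a pool has to present the same identity; with an allowlist, each bridge could hold its own key, at the cost of changing what the client verifies.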

The "frontend" issue with Snowflake is that snowflake-server does more than decapsulate WebSocket connections: it also manages all the Turbo Tunnel session state. snowflake-server effectively has a "TCB" for each ongoing session in its memory, so that when a temporary proxy WebSocket connection dies, the client can reconnect through a different proxy without losing its state. snowflake-server is multithreaded and can expand to use any available CPU capacity, but it would require significant rearchitecting to share or somehow partition the Turbo Tunnel state across multiple hosts that do not share memory. At the moment, snowflake-server uses about half the CPU and RAM on the server, and Tor-related processes use the other half.
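
The idea of per-session state that outlives any one proxy connection can be illustrated with a toy in-memory table. This is a sketch under stated assumptions, not snowflake-server's actual data structures: `Session`, `SessionTable`, and the client and proxy names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    """Per-client state that outlives any single proxy connection."""
    client_id: str
    proxy: Optional[str] = None                          # proxy currently carrying the session
    pending: List[bytes] = field(default_factory=list)   # data not yet delivered


class SessionTable:
    """In-memory map from client ID to session; visible to one process only."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def attach(self, client_id: str, proxy: str) -> Session:
        # Reuse the existing session if this client has been seen before.
        session = self._sessions.setdefault(client_id, Session(client_id))
        session.proxy = proxy
        return session

    def detach(self, client_id: str) -> None:
        # The proxy connection died: keep the session, drop only the carrier.
        if client_id in self._sessions:
            self._sessions[client_id].proxy = None


table = SessionTable()
s = table.attach("client-1", "proxy-A")
s.pending.append(b"queued data")
table.detach("client-1")                    # proxy-A goes away
s2 = table.attach("client-1", "proxy-B")    # client reconnects via another proxy
print(s2 is s, s2.pending)                  # True [b'queued data']
```

Because the table is an ordinary in-process dictionary, a second host would have no view of it; that is the sharing or partitioning problem described above.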

In summary: it's easy to decouple the frontend pluggable transport and the backend Tor, but the backend Tor instances still need to share an identity key, and the frontend snowflake-server cannot currently scale beyond a single host. Removing either of these limitations is not impossible, but it would be a non-trivial project.


On Tue, Mar 01, 2022 at 07:06:56PM -0500, Ian Goldberg via tor-project wrote:

Do all snowflake users have to get pointed at the same machine/IP address? That is, would a bunch of people running bridges each contributing a fraction of 100 TB/month (~= 300 Mbps) be helpful?



Hi David,

Would you consider crowdfunding for donations? Al, Gus and Yan all got
big responses for their great tweets about Snowflake:
https://twitter.com/genderjokes/status/1497284560811225095
https://twitter.com/0xggus/status/1497224413829283877
https://twitter.com/bcrypt/status/1497657352476000259

Arthur


On Tue, Mar 1, 2022 at 3:27 PM David Fifield <david@bamsoftware.com> wrote:

It has been great to see all the support and encouragement for people running Snowflake proxies. Thank you!


Thanks for the suggestion. I have thought about it a little, and it's a
possibility. But I haven't done a thing like that before, and I'm not
excited at the idea of managing a fundraising campaign on top of
everything else.


On Wed, Mar 02, 2022 at 11:13:15AM -0800, Arthur D. Edelstein wrote:

Would you consider crowdfunding for donations? Al, Gus and Yan all got
big responses for their great tweets about Snowflake:


Thanks to the help of friends and donors, we've been able to bootstrap getting the bridge onto better hardware (since 2022-03-16), and have begun a process of putting it on a long-term stable host.

At this link you can see the effect moving the bridge to a faster server has had. Where formerly the bandwidth was maxing out at about 320 Mbps, it's now reaching about double that. Snowflake is working again for applications like streaming video.

The server the bridge is on now has 32 CPU cores and 128 GB of RAM, which is more than adequate for current levels of use. The next performance bottleneck is likely to be available bandwidth on the network link. There are many complications to deal with, but development is underway to make it possible for there to be multiple bridge sites, each running its own Snowflake server pluggable transport and load-balanced tor installation. Rather than there being multiple bridge sites sharing a fraction of 1 Gbps traffic, it's looking more like it will be multiple bridge sites *each* providing about 1 Gbps traffic. Here are summaries of the anti-censorship team's discussions on this topic over the past few weeks:


On Tue, Mar 01, 2022 at 04:26:48PM -0700, David Fifield wrote:

The bridge needs to be moved to faster hardware. Its current hosting is free of charge, but is already on the highest-spec VPS configuration (8 CPUs, 16 GB). Switching to a server with, say, double the CPUs will have an immediate positive effect: the proof of that is that while we were installing the load balancing on the main bridge, I paid for an only slightly higher-spec server to handle Snowflake traffic during the upgrade, and during that week the bandwidth [immediately rose to higher than where it is now](Running a load-balanced Tor bridge · Issue #103 · net4people/bbs · GitHub). I used Snowflake a lot during that week, and the difference was palpable.


As part of my testing on Orbot iOS, I am regularly streaming podcasts and video (YouTube) over Tor+Snowflake. The experience is definitely good enough. Thanks for the work on this!


On Mar 27, 2022, at 10:23 PM, David Fifield <david@bamsoftware.com> wrote:

On Tue, Mar 01, 2022 at 04:26:48PM -0700, David Fifield wrote:

The bridge needs to be moved to faster hardware. Its current hosting is free of charge, but is already on the highest-spec VPS configuration (8 CPUs, 16 GB). Switching to a server with, say, double the CPUs will have an immediate positive effect: the proof of that is that while we were installing the load balancing on the main bridge, I paid for an only slightly higher-spec server to handle Snowflake traffic during the upgrade, and during that week the bandwidth [immediately rose to higher than where it is now](Running a load-balanced Tor bridge · Issue #103 · net4people/bbs · GitHub). I used Snowflake a lot during that week, and the difference was palpable.

Thanks to the help of friends and donors, we've been able to bootstrap getting the bridge onto better hardware (since 2022-03-16), and have begun a process of putting it on a long-term stable host.

At this link you can see the effect moving the bridge to a faster server has had. Where formerly the bandwidth was maxing out at about 320 Mbps, it's now reaching about double that. Snowflake is working again for applications like streaming video.



David, if you need further assistance in the future, or need to signal boost a fundraising page (Open Collective,[1] maybe?), please feel free to reach out to me directly or to send another message to the list--we could share this with Tor donors who may not be on this list and/or on Tor social.

Thanks for everything you do to run this piece of Snowflake.

Al

[1] https://opencollective.com/


On 3/27/22 7:23 PM, David Fifield wrote:

Thanks to the help of friends and donors, we've been able to bootstrap getting the bridge onto better hardware (since 2022-03-16), and have begun a process of putting it on a long-term stable host.

