[tor-relays] How to reduce tor CPU load on a single bridge?

The main Snowflake bridge (Relay Search)
is starting to become overloaded because of a recent substantial
increase in users. I think the host has sufficient CPU and memory
headroom, and the pluggable transport process (which receives WebSocket
connections and forwards them to tor) is scaling across multiple cores.
But the tor process is constantly using 100% of one CPU core, and I
suspect that it has become a bottleneck.

Here are issues about a recent CPU upgrade on the bridge, with
observations about the proportion of CPU used by different processes.

I have the impression that tor cannot use more than one CPU core—is that
correct? If so, what can be done to permit a bridge to scale beyond
1×100% CPU? We can fairly easily scale the Snowflake-specific components
around the tor process, but ultimately, a tor client process expects to
connect to a bridge having a certain fingerprint, and that is the part I
don't know how to easily scale.

  • Surely it's not possible to run multiple instances of tor with the
    same fingerprint? Or is it? Does the answer change if all instances
    are on the same IP address? If the OR ports are never used?
  • OnionBalance does not help with this, correct?
  • Are there configuration options we could set to increase parallelism?
  • Is migrating to a host with better single-core performance the only
    immediate option for scaling the tor process?

Separate from the topic of scaling a single bridge, here is a past issue
with thoughts on scaling beyond one bridge. It looks as though there is
no way to do it that does not require changes to the way tor handles its
Bridge lines.

  • Using multiple snowflake Bridge lines does not work well, even though
      we could arrange to have the Snowflake proxy connect the user to the
      expected bridge, because tor will try to connect to all of them rather
      than choosing one at random.
  • Removing the fingerprint from the snowflake Bridge line in Tor Browser
      would permit the Snowflake proxies to round-robin clients over several
      bridges, but then the first hop would be unauthenticated (at the Tor
      layer). It would be nice if it were possible to specify a small set of
      permitted bridge fingerprints.
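To illustrate that last wish, a client-side check against a small set of permitted fingerprints might look like the sketch below. This is purely hypothetical: tor does not currently support multiple fingerprints on one Bridge line, and the fingerprints shown are made-up placeholders.

```python
# Hypothetical sketch: accept the first hop if its relay identity
# fingerprint is in a small permitted set, instead of requiring one
# exact fingerprint. The fingerprints below are placeholders, not
# real bridges.
PERMITTED_FINGERPRINTS = {
    "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
    "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
}

def first_hop_allowed(fingerprint: str) -> bool:
    """Return True if the bridge's fingerprint is one we permit."""
    return fingerprint.upper() in PERMITTED_FINGERPRINTS
```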

_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


I have the impression that tor cannot use more than one CPU core—is that
correct? If so, what can be done to permit a bridge to scale beyond
1×100% CPU? We can fairly easily scale the Snowflake-specific components
around the tor process, but ultimately, a tor client process expects to
connect to a bridge having a certain fingerprint, and that is the part I
don't know how to easily scale.

* Surely it's not possible to run multiple instances of tor with the
  same fingerprint? Or is it? Does the answer change if all instances
  are on the same IP address? If the OR ports are never used?

Good timing -- Cecylia pointed out the higher load on Flakey a few days
ago, and I've been meaning to post a suggestion somewhere. You actually
*can* run more than one bridge with the same fingerprint. Just set it
up in two places, with the same identity key, and then whichever one the
client connects to, the client will be satisfied that it's reaching the
right bridge.
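A minimal, untested sketch of what that might look like (paths, ports, and the plugin path are assumptions for illustration):

```
# On the second instance, reuse the first bridge's identity keys, e.g.
# by copying secret_id_key and ed25519_master_id_secret_key from the
# first instance's keys directory:
#   /var/lib/tor/keys/  ->  /var/lib/tor-2/keys/
#
# torrc for the second instance (illustrative values):
DataDirectory /var/lib/tor-2
BridgeRelay 1
ORPort 127.0.0.1:auto
ExtORPort auto
ServerTransportPlugin snowflake exec /usr/local/bin/snowflake-server
```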

There are two catches to the idea:

(A) Even though the bridges will have the same identity key, they won't
have the same circuit-level onion key, so it will be smart to "pin"
each client to a single bridge instance -- so when they fetch the bridge
descriptor, which specifies the onion key, they will continue to use
that bridge instance with that onion key. Snowflake in particular might
also want to pin clients to specific bridges because of the KCP state.

(Another option, instead of pinning clients to specific instances,
would be to try to share state among all the bridges on the backend,
e.g. so they use the same onion key, can resume the same KCP sessions,
etc. This option seems hard.)

(B) It's been a long time since anybody tried this, so there might be
surprises. :) But it *should* work, so if there are surprises, we should
try to fix them.

This overall idea is similar to the "router twins" idea from the distant
distant past:
https://lists.torproject.org/pipermail/tor-dev/2002-July/001122.html
https://lists.torproject.org/pipermail/tor-commits/2003-October/024388.html
https://lists.torproject.org/pipermail/tor-dev/2003-August/000236.html

* Removing the fingerprint from the snowflake Bridge line in Tor Browser
  would permit the Snowflake proxies to round-robin clients over several
  bridges, but then the first hop would be unauthenticated (at the Tor
  layer). It would be nice if it were possible to specify a small set of
  permitted bridge fingerprints.

This approach would also require clients to pin to a particular bridge,
right? Because of the different state that each bridge will have?

--Roger


On Mon, Dec 27, 2021 at 12:05:26PM -0700, David Fifield wrote:


David/Roger:

Search the tor-relays mail archive for my previous responses on load balancing Tor relays, which I've been doing successfully for the past 6 months with Nginx (it's possible with HAProxy as well). I haven't had time to implement it with a Tor bridge, but I assume it will be very similar. Keep in mind it's critical to configure each Tor instance to use the same DirectoryAuthority and to disable the upstream timeouts in Nginx/HAProxy.
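For reference, the HAProxy variant of that setup might look roughly like this (ports and the timeout value are assumptions, not a tested config):

```
frontend tor-in
    mode tcp
    bind :9001
    default_backend tor-instances

backend tor-instances
    mode tcp
    balance source            # keep a given client IP on the same instance
    timeout server 86400s     # effectively disable the upstream timeout
    server tor1 127.0.0.1:10001
    server tor2 127.0.0.1:10002
```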

Happy Tor Loadbalancing!

Respectfully,

Gary

P.S. I believe there’s a torrc config option to specify which cpu core a given Tor instance should use, too.

On Monday, December 27, 2021, 2:00:50 PM MST, Roger Dingledine arma@torproject.org wrote:

BTW… I just fact-checked my postscript, and the CPU-affinity configuration I was thinking of is for Nginx (not Tor). Tor should consider adding a CPU-affinity configuration option. What happens if you configure additional Tor instances on the same machine (my Tor instances are on different machines) and start them up? Do they bind to different cores or to the same core?
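For reference, the Nginx directives in question are worker_processes and worker_cpu_affinity; the core masks below are examples, not a recommendation:

```
worker_processes 4;
worker_cpu_affinity 0001 0010 0100 1000;  # pin each worker to its own core
```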

Respectfully,

Gary

On Monday, December 27, 2021, 2:44:59 PM MST, Gary C. New via tor-relays tor-relays@lists.torproject.org wrote:

Hi Kristian,

Thanks for the screenshot. Nice machine! Not everyone is as fortunate as you when it comes to resources for their Tor deployments. A CPU-affinity option isn't high on the priority list: as you point out, many operating systems do a decent job of load management, and third-party options for CPU affinity exist. Still, it might be helpful for some operators to have a native, application-layer option for tuning their deployments.

As an aside… Presently, are you using a single, public address with many ports or many, public addresses with a single port for your Tor deployments? Have you ever considered putting all those Tor instances behind a single, public address:port (fingerprint) to create one super bridge/relay? I’m just wondering if it makes sense to conserve and rotate through public address space to stay ahead of the blacklisting curve?

Also… Do you mind disclosing what all your screen instances are for? Are you running your Tor instances manually and not in daemon mode? “Inquiring minds want to know.” :grin:

As always… It is great to engage in dialogue with you.

Respectfully,

Gary

Hi Gary,

why would that be needed? Linux has a pretty good thread scheduler imo and will shuffle loads around as needed.

Even Windows’ thread scheduler is quite decent these days and tools like “Process Lasso” exist if additional fine tuning is needed.

Attached is one of my servers running multiple tor instances on a 12/24C platform. The load is spread quite evenly across all cores.

Best Regards,

Kristian

Dec 27, 2021, 22:08 by tor-relays@lists.torproject.org:

On Tuesday, December 28, 2021, 1:39:31 PM MST, abuse@lokodlare.com abuse@lokodlare.com wrote:


Hi Gary,

thanks!

As an aside… Presently, are you using a single, public address with many ports or many, public addresses with a single port for your Tor deployments? Have you ever considered putting all those Tor instances behind a single, public address:port (fingerprint) to create one super bridge/relay? I’m just wondering if it makes sense to conserve and rotate through public address space to stay ahead of the blacklisting curve?

Almost all of my dedicated servers have multiple IPv4 addresses, and you can have up to two tor relays per IPv4. So, the answer is multiple IPs and on multiple different ports. A “super relay” still has no real merit for me. I am not really concerned about my IPs being blacklisted as these are normal relays, not bridges.

What I am doing now for new servers is running them for a week or two as bridges and only then I move them over to hosting relays. In the past I have not seen a lot of traffic on bridges, but this has changed very recently. I saw 200+ unique users in the past 6 hours on one of my new bridges yesterday with close to 100 Mbit/s of consistent traffic. There appears to be an increased need right now, which I am happy to tend to.

Also… Do you mind disclosing what all your screen instances are for? Are you running your Tor instances manually and not in daemon mode? “Inquiring minds want to know.” :grin:

In that area I am a little bit old school, and I am indeed running them manually for now. I don’t think there is a technical reason for it. It’s just me being me.

Best Regards,

Kristian

Dec 29, 2021, 01:46 by tor-relays@lists.torproject.org:


To improve cache locality: in modern CPUs, the L1/L2/L3 cache is
partitioned in various schemes per core or core cluster. So it is
beneficial if a running thread stays stuck to a particular core or set
of cores, as that is where all its data is still warm in cache from its
previous timeslices, rather than being shuffled around to other cores.

But in theory the OS scheduler should be smart enough to ensure that without
manual intervention.

Also I am not sure how relevant that is for the kind of computation that Tor
does. And in any case, it is a "nice to have" which usually shouldn't make a
huge difference.

Ideally though, the application thread handling the incoming data should also
run on the same CPU core that just handled the incoming IRQ from NIC. But that
requires support across all of the application, OS, NIC hardware and driver,
and very careful tuning.
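On Linux, the manual-intervention side of this can be sketched with standard tools; the IRQ number and CPU masks below are examples for illustration, and the IRQ steering requires root:

```shell
# Pin a tor process to core 2:
taskset -c 2 tor -f /etc/tor/torrc

# Find the NIC's IRQ number, then pin that IRQ to the same core:
grep eth0 /proc/interrupts
echo 4 > /proc/irq/42/smp_affinity   # CPU mask 0x4 = core 2
```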


On Tue, 28 Dec 2021 21:39:27 +0100 (CET) abuse--- via tor-relays <tor-relays@lists.torproject.org> wrote:

why would that be needed? Linux has a pretty good thread scheduler imo and
will shuffle loads around as needed.

--
With respect,
Roman


On Mon, Dec 27, 2021 at 04:00:34PM -0500, Roger Dingledine wrote:

On Mon, Dec 27, 2021 at 12:05:26PM -0700, David Fifield wrote:

I have the impression that tor cannot use more than one CPU core—is that
correct? If so, what can be done to permit a bridge to scale beyond
1×100% CPU? We can fairly easily scale the Snowflake-specific components
around the tor process, but ultimately, a tor client process expects to
connect to a bridge having a certain fingerprint, and that is the part I
don’t know how to easily scale.

  • Surely it’s not possible to run multiple instances of tor with the
    same fingerprint? Or is it? Does the answer change if all instances
    are on the same IP address? If the OR ports are never used?

Good timing – Cecylia pointed out the higher load on Flakey a few days
ago, and I’ve been meaning to post a suggestion somewhere. You actually
can run more than one bridge with the same fingerprint. Just set it
up in two places, with the same identity key, and then whichever one the
client connects to, the client will be satisfied that it’s reaching the
right bridge.

Thanks for this information. I’ve done a test with one instance of obfs4proxy forwarding through a load balancer to two instances of tor that have the same keys, and it works. It seems like this could
work for Snowflake as well.

(A) Even though the bridges will have the same identity key, they won’t
have the same circuit-level onion key, so it will be smart to “pin”
each client to a single bridge instance – so when they fetch the bridge
descriptor, which specifies the onion key, they will continue to use
that bridge instance with that onion key. Snowflake in particular might
also want to pin clients to specific bridges because of the KCP state.

(Another option, instead of pinning clients to specific instances,
would be to try to share state among all the bridges on the backend,
e.g. so they use the same onion key, can resume the same KCP sessions,
etc. This option seems hard.)

Let’s make a distinction between the “frontend” snowflake-server pluggable transport process, and the “backend” tor process. These don’t necessarily have to be 1:1; either one could be run in multiple
instances. Currently, the “backend” tor is the limiting factor, because it uses only 1 CPU core. The “frontend” snowflake-server can scale to multiple cores in a single process and is comparatively unrestrained. So I propose to keep snowflake-server as a single process, and to run multiple tor processes. That eliminates the dimension of KCP state coordination, and should last us until snowflake-server outgrows the resources of a single host.

The snowflake-server program is a managed proxy; i.e., it expects to run with certain environment variables set by a managing process, normally tor. We’ll need to instead run snowflake-server apart from any single tor instance. Probably the easiest way to do that in the short term is with ptadapter, which converts a pluggable transport into a TCP proxy, forwarding to an address you specify.

Then we can have ptadapter forward to a load balancer like haproxy. The load balancer will then round-robin over the ORPorts of the available tor instances. The tor instances can all be on the same host (run as many instances as you have CPU cores), which may or may not be the same host on which snowflake-server is running.
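As a sketch of that layout, one tor instance per core could be created like this (an assumption-laden illustration: it uses Debian’s tor-instance-create, as in the demo instructions later in the thread, and placeholder loopback ports 9001, 9002, …):

```shell
# Sketch: create one tor instance per CPU core, each with its own
# loopback ORPort for the load balancer to round-robin over.
for i in $(seq 1 "$(nproc)"); do
    tor-instance-create "o$i"
    printf 'ORPort 127.0.0.1:%d\n' "$((9000 + i))" >> "/etc/tor/instances/o$i/torrc"
done
```

Each instance would then appear as one `server` line in the haproxy backend.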

Currently we have this:

	    ________________     ___
	-->|snowflake-server|-->|tor|
	    ----------------     ---
	      (run by tor)

The design I’m proposing is this:

	                                      ___
	                                  .->|tor|
	    ________________     _______  |   ---
	-->|snowflake-server|-->|haproxy|-+->|tor|
	    ----------------     -------  |   ---
	   (run by ptadapter)             '->|tor|
	                                      ---

I believe that the “pinning” of a client session to particular tor instance will work automatically by the fact that snowflake-server keeps an outgoing connection alive (i.e., through the load balancer) as long
as a KCP session exists.

One complication we’ll have to work out is that ptadapter doesn’t have a setting for ExtORPort forwarding. ptadapter absorbs any ExtORPort information and forwards an unadorned connection onward. The idea I had to work around this limitation is to have ptadapter, rather than execute snowflake-server directly, execute a shell script that sets TOR_PT_EXTENDED_SERVER_PORT to a hardcoded address (i.e., to haproxy) before running snowflake-server. Though, I am not sure what to do about the extended_orport_auth_cookie file, which will be different for different tor instances.

Demo instructions

This is what I did to do a test of one instance of obfs4proxy communicating with two instances of tor that have the same keys, on Debian 11.

Install a first instance of tor and configure it as a bridge:

	# apt install tor
	# tor-instance-create o1

/etc/tor/instances/o1/torrc:

	BridgeRelay 1
	PublishServerDescriptor 0
	AssumeReachable 1
	SocksPort 0
	ORPort 127.0.0.1:9001

Start the first instance, which will generate keys:

	# systemctl start tor@o1

Install a second instance of tor and configure it as a bridge (with a
different ORPort):

	# tor-instance-create o2

/etc/tor/instances/o2/torrc:

	BridgeRelay 1
	PublishServerDescriptor 0
	AssumeReachable 1
	SocksPort 0
	ORPort 127.0.0.1:9002

But before starting the second instance the first time, copy keys from
the first instance:

	# cp -r /var/lib/tor-instances/o1/keys /var/lib/tor-instances/o2/
	# chown -R _tor-o2:_tor-o2 /var/lib/tor-instances/o2/keys/
	# systemctl start tor@o2

The two instances should have the same fingerprint:

	# cat /var/lib/tor-instances/*/fingerprint
	Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F
	Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F

Install haproxy and configure it to forward to the two tor instances:

	# apt install haproxy

/etc/haproxy/haproxy.cfg:

	frontend tor
		mode tcp
		bind 127.0.0.1:9000
		default_backend tor-o
	backend tor-o
		mode tcp
		server o1 127.0.0.1:9001
		server o2 127.0.0.1:9002

Restart haproxy with the new configuration:

	# systemctl restart haproxy

Install ptadapter and configure it to listen on an external address and
forward to haproxy:

	# apt install python3-pip
	# pip3 install ptadapter

ptadapter.ini:

	[server]
	exec = /usr/bin/obfs4proxy
	state = pt_state
	forward = 127.0.0.1:9000
	tunnels = server_obfs4
	[server_obfs4]
	transport = obfs4
	listen = [::]:443

Run ptadapter:

	ptadapter -S ptadapter.ini

On the client, make a torrc file with the information from the pt_state/obfs4_bridgeline.txt file created by ptadapter:

	UseBridges 1
	SocksPort auto
	Bridge obfs4 172.105.3.197:443 4808CD98E4C1D4F282DA741A860A44D755701F2F cert=1SCzqyYyPh/SiXTJa9nLFxMyjWQITVCKeICME+SwxgNcTTSUQ7+vM/ghofU7oaalIRBILg iat-mode=0
	ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy
	DataDir datadir

Then run tor with the torrc:

	tor -f torrc

If you restart tor multiple times on the client, you can see haproxy alternating between the two backend servers (o1 and o2) in /var/log/haproxy.log:

	Dec 31 04:30:31 localhost haproxy[9707]: 127.0.0.1:55500 [31/Dec/2021:04:30:21.235] tor tor-o/o1 1/0/10176 11435 -- 1/1/0/0/0 0/0
	Dec 31 04:30:51 localhost haproxy[9707]: 127.0.0.1:55514 [31/Dec/2021:04:30:46.925] tor tor-o/o2 1/0/4506 17682 -- 1/1/0/0/0 0/0
	Dec 31 04:38:41 localhost haproxy[9707]: 127.0.0.1:55528 [31/Dec/2021:04:30:55.540] tor tor-o/o1 1/0/466049 78751 -- 1/1/0/0/0 0/0
	Dec 31 05:34:52 localhost haproxy[9707]: 127.0.0.1:55594 [31/Dec/2021:05:34:50.083] tor tor-o/o2 1/0/2209 13886 -- 1/1/0/0/0 0/0

On Thu, Dec 30, 2021 at 10:42:51PM -0700, David Fifield wrote:

One complication we’ll have to work out is that ptadapter doesn’t have a
setting for ExtORPort forwarding. ptadapter absorbs any ExtORPort
information and forwards an unadorned connection onward. The idea I had
to work around this limitation is to have ptadapter, rather than
execute snowflake-server directly, execute a shell script that sets
TOR_PT_EXTENDED_SERVER_PORT to a hardcoded address (i.e., to haproxy)
before running snowflake-server. Though, I am not sure what to do about
the extended_orport_auth_cookie file, which will be different for
different tor instances.

There are a number of potential ways to deal with the complication of
ExtORPort authentication, from alternative ExtORPort authentication
types, to ExtORPort-aware load balancing. With a view towards deploying
something in the near future, I wrote this program that enables an
external pluggable transport to talk to tor’s ExtORPort and authenticate
as if it had an unchanging authentication cookie.

The difficulty with load-balancing multiple tor instances, with respect
to ExtORPort, is that to authenticate with the ExtORPort you need to
read a cookie from a file on disk, which tor overwrites randomly every
time it starts. If you do not know which instance of tor will receive
your forwarded traffic, you do not know which ExtORPort cookie to use.

The extor-static-cookie program presents an ExtORPort interface, but
reads its authentication cookie from a file that is independent of any
instance of tor, which you can write once and then leave alone. The external
server pluggable transport can read from the shared authentication
cookie file as well. Every instance of tor runs a copy of
extor-static-cookie, all using the same authentication cookie file. The
extor-static-cookie instances receive ExtORPort authentication from the
external server pluggable transport, along with the USERADDR and
TRANSPORT metadata, then re-authenticate and echo that information to
their respective tor’s ExtORPort.

So we change from this:

	                                    ___
	                                .->|tor|
	   ________________    _______  |   ---
	->|snowflake-server|->|haproxy|-+->|tor|
	   ----------------    -------  |   ---
	                                '->|tor|
	                                    ---

to this:

	                                    ___________________    ___
	                                .->|extor-static-cookie|->|tor|
	   ________________    _______  |   -------------------    ---
	->|snowflake-server|->|haproxy|-+->|extor-static-cookie|->|tor|
	   ----------------    -------  |   -------------------    ---
	                                '->|extor-static-cookie|->|tor|
	                                    -------------------    ---

I have a similar setup running now on a test bridge, with one instance
of obfs4proxy load-balancing to two instances of tor.

Setup notes

Install extor-static-cookie:

	# apt install golang
	# git clone https://gitlab.torproject.org/dcf/extor-static-cookie
	# (cd extor-static-cookie && go build)
	# install -o root -g root extor-static-cookie/extor-static-cookie /usr/local/bin/

Generate a shared authentication cookie file:

	# mkdir -m 755 /var/lib/extor-static-cookie
	# extor-static-cookie/gen-auth-cookie > /var/lib/extor-static-cookie/static_extended_orport_auth_cookie

Install a first instance of tor and configure it as a bridge:

	# apt install tor
	# tor-instance-create o1

/etc/tor/instances/o1/torrc:

	BridgeRelay 1
	PublishServerDescriptor 0
	AssumeReachable 1
	SocksPort 0
	ORPort 127.0.0.1:auto
	ExtORPort auto
	ServerTransportPlugin extor_static_cookie exec /usr/local/bin/extor-static-cookie /var/lib/extor-static-cookie/static_extended_orport_auth_cookie
	ServerTransportListenAddr extor_static_cookie 127.0.0.1:10001

Notice we set ExtORPort auto (this is tor’s own ExtORPort), and also pass 127.0.0.1:10001 to extor-static-cookie, which is the ExtORPort that the external server pluggable transport will talk to. Start the first instance, which will generate keys:

# systemctl start tor@o1

Install a second instance of tor and configure it as a bridge (with a different ServerTransportListenAddr port):

 # tor-instance-create o2

/etc/tor/instances/o2/torrc:

BridgeRelay 1
PublishServerDescriptor 0
AssumeReachable 1
SocksPort 0
ORPort 127.0.0.1:auto
ExtORPort auto
ServerTransportPlugin extor_static_cookie exec /usr/local/bin/extor-static-cookie /var/lib/extor-static-cookie/static_extended_orport_auth_cookie
ServerTransportListenAddr extor_static_cookie 127.0.0.1:10002

But before starting the second instance the first time, copy keys from
the first instance:

# cp -r /var/lib/tor-instances/o1/keys /var/lib/tor-instances/o2/
# chown -R _tor-o2:_tor-o2 /var/lib/tor-instances/o2/keys/
# systemctl start tor@o2

The two instances should have the same fingerprint:

# cat /var/lib/tor-instances/*/fingerprint
Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F
Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F

Install haproxy and configure it to forward to the two instances of
extor-static-cookie (which will then forward to the ExtORPort of their
respective tor instances):

# apt install haproxy

/etc/haproxy/haproxy.cfg:

frontend tor
	mode tcp
	bind 127.0.0.1:10000
	default_backend tor-o
backend tor-o
	mode tcp
	server o1 127.0.0.1:10001
	server o2 127.0.0.1:10002

Restart haproxy with the new configuration:

# systemctl restart haproxy

Instead of ptadapter, I found it more convenient to start the external server pluggable transport with a shell script that sets up the necessary variables:

extor.sh:

#!/bin/sh

# Usage: extor.sh 127.0.0.1:10000 /var/lib/extor-static-cookie/static_extended_orport_auth_cookie /usr/bin/obfs4proxy

set -e

EXTOR_ADDR="${1:?missing ExtORPort address}"
EXTOR_COOKIE_FILE="${2:?missing ExtORPort auth cookie file}"
shift 2

BINDADDR='[::]:443'
TRANSPORT=obfs4

TOR_PT_MANAGED_TRANSPORT_VER=1 \
TOR_PT_SERVER_TRANSPORTS="$TRANSPORT" \
TOR_PT_SERVER_BINDADDR="$TRANSPORT"-"$BINDADDR" \
TOR_PT_EXTENDED_SERVER_PORT="$EXTOR_ADDR" \
TOR_PT_AUTH_COOKIE_FILE="$EXTOR_COOKIE_FILE" \
TOR_PT_STATE_LOCATION=pt_state \
TOR_PT_EXIT_ON_STDIN_CLOSE=1 \
exec "$@"

Then I run the shell script, giving the address of the haproxy frontend,
the path to the shared authentication cookie file, and a command to run:

# ./extor.sh 127.0.0.1:10000 /var/lib/extor-static-cookie/static_extended_orport_auth_cookie /usr/bin/obfs4proxy

On the client, make a torrc file with the information from
pt_state/obfs4_bridgeline.txt:

UseBridges 1
SocksPort auto
Bridge obfs4 172.105.3.197:443 4808CD98E4C1D4F282DA741A860A44D755701F2F cert=1SCzqyYyPh/SiXTJa9nLFxMyjWQITVCKeICME+SwxgNcTTSUQ7+vM/ghofU7oaalIRBILg iat-mode=0
ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy
DataDir datadir

Then run tor with the torrc:

$ tor -f torrc

[I'm about to go off-line for some days, so I am sending my current
suboptimally-organized reply, which I hope is better than waiting another
week to respond :)]

Let's make a distinction between the "frontend" snowflake-server
pluggable transport process, and the "backend" tor process. These don't
necessarily have to be 1:1; either one could be run in multiple
instances. Currently, the "backend" tor is the limiting factor, because
it uses only 1 CPU core. The "frontend" snowflake-server can scale to
multiple cores in a single process and is comparatively unrestrained.

Excellent point, and yes, this simplifies things. Great.

I believe that the "pinning" of a client session to particular tor
instance will work automatically by the fact that snowflake-server keeps
an outgoing connection alive (i.e., through the load balancer) as long
as a KCP session exists.
[...]
But before starting the second instance the first time, copy keys from
the first instance:

Hm. It looks promising! But we might still have a Tor-side problem
remaining. I think it boils down to how long the KCP sessions last.

The details on how exactly these bridge instances will diverge over time:

The keys directory will start out the same, but after four weeks
(DEFAULT_ONION_KEY_LIFETIME_DAYS, used to be one week but in Tor
0.3.1.1-alpha, proposal 274, we bumped it up to four weeks) each
bridge will rotate its onion key (the one clients use for circuit-level
crypto). That is, each instance will generate its own fresh onion key.

The two bridge instances actually haven't diverged completely at that
point, since Tor remembers the previous onion key (i.e. the onion key
from the previous period) and is willing to receive create cells that
use it for one further week (DEFAULT_ONION_KEY_GRACE_PERIOD_DAYS). So it
is after 5 weeks that the original (shared) onion key will no longer work.
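As a quick check of that arithmetic (assuming GNU date; the start date is an arbitrary example, not one from this thread):

```shell
# 28-day onion key lifetime (DEFAULT_ONION_KEY_LIFETIME_DAYS) plus the
# 7-day grace period (DEFAULT_ONION_KEY_GRACE_PERIOD_DAYS) = 35 days.
start=2021-12-31
echo $((28 + 7))                                  # 35
date -u -d "$start + $((28 + 7)) days" +%Y-%m-%d  # 2022-02-04
```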

Where this matters is (after this 5 weeks have passed) if the client
connects to the bridge, fetches and caches the bridge descriptor of
instance A, and then later it connects to the bridge again and gets
passed to instance B. In this case, the create cell that the client
generates will use the onion key for instance A, and instance B won't
know how to decrypt it so it will send a destroy cell back.

If this is an issue, we can definitely work around it, by e.g. disabling
the onion key rotation on the bridges, or setting up a periodic rsync+hup
between the bridges, or teaching clients to use createfast cells in this
situation (this type of circuit crypto doesn't use the onion key at all,
and just relies on TLS for security -- which can only be done for the
first hop of the circuit but that's the one we're talking about here).

But before we think about workarounds, maybe we don't need one: how long
does "the KCP session" last?

Tor clients try to fetch a fresh bridge descriptor every three-ish
hours, and once they fetch a bridge descriptor from their "current"
bridge instance, they should know the onion key that it wants to use. So
it is that up-to-three-hour window where I think things could go wrong.
And that timeframe sounds promising.

(I also want to double-check that clients don't try to use the onion
key from the current cached descriptor while fetching the updated
descriptor. That could become an ugly bug in the wrong circumstances,
and would be something we want to fix if it's happening.)

Here's how you can simulate a pair of bridge instances that have diverged
after five weeks, so you can test how things would work with them:

Copy the keys directory as before, but "rm secret_onion_key*" in the
keys directory on n-1 of the instances before starting them.
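Concretely, reusing the instance names and paths from the demo instructions earlier in the thread (an assumption; adjust to your setup), that simulation might look like:

```shell
# Copy identity keys to the second instance, then remove the shared onion
# key so that instance generates a fresh one on first start, simulating
# the divergence after a rotation.
cp -r /var/lib/tor-instances/o1/keys /var/lib/tor-instances/o2/
rm /var/lib/tor-instances/o2/keys/secret_onion_key*
chown -R _tor-o2:_tor-o2 /var/lib/tor-instances/o2/keys/
systemctl start tor@o2
```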

Thanks!
--Roger



Kristian,

I am not really concerned about my IPs being blacklisted as these are normal relays, not bridges.

I suppose if you have the address space and are running your relays in a server environment–it’s your prerogative. In my case, I’m running my super relay, from home, with limited address space, so it is more suited to my needs.

In that area I am a little bit old school, and I am indeed running them manually for now. I don’t think there is a technical reason for it. It’s just me being me.

I’m a proponent of individuality. Keep being you.

Respectfully,

Gary

Hi Gary,

thanks!

As an aside… Presently, are you using a single, public address with many ports or many, public addresses with a single port for your Tor deployments? Have you ever considered putting all those Tor instances behind a single, public address:port (fingerprint) to create one super bridge/relay? I’m just wondering if it makes sense to conserve and rotate through public address space to stay ahead of the blacklisting curve?

Almost all of my dedicated servers have multiple IPv4 addresses, and you can have up to two tor relays per IPv4. So, the answer is multiple IPs and on multiple different ports. A “super relay” still has no real merit for me. I am not really concerned about my IPs being blacklisted as these are normal relays, not bridges.

What I am doing now for new servers is running them for a week or two as bridges and only then I move them over to hosting relays. In the past I have not seen a lot of traffic on bridges, but this has changed very recently. I saw 200+ unique users in the past 6 hours on one of my new bridges yesterday with close to 100 Mbit/s of consistent traffic. There appears to be an increased need right now, which I am happy to tend to.

Also… Do you mind disclosing what all your screen instances are for? Are you running your Tor instances manually and not in daemon mode? “Inquiring minds want to know.” :grin:

In that area I am a little bit old school, and I am indeed running them manually for now. I don’t think there is a technical reason for it. It’s just me being me.

Best Regards,

Kristian

Dec 29, 2021, 01:46 by tor-relays@lists.torproject.org:


On Wednesday, December 29, 2021, 03:32:55 AM MST, abuse--- via tor-relays <tor-relays@lists.torproject.org> wrote:

Hi Kristian,

Thanks for the screenshot. Nice Machine! Not everyone is as fortunate as you when it comes to resources for their Tor deployments. While a cpu affinity option isn’t high on the priority list, as you point out, many operating systems do a decent job of load management and there are third-party options available for cpu affinity, but it might be helpful for some to have an application layer option to tune their implementations natively.

As an aside… Presently, are you using a single, public address with many ports or many, public addresses with a single port for your Tor deployments? Have you ever considered putting all those Tor instances behind a single, public address:port (fingerprint) to create one super bridge/relay? I’m just wondering if it makes sense to conserve and rotate through public address space to stay ahead of the blacklisting curve?

Also… Do you mind disclosing what all your screen instances are for? Are you running your Tor instances manually and not in daemon mode? “Inquiring minds want to know.” :grin:

As always… It is great to engage in dialogue with you.

Respectfully,

Gary

On Tuesday, December 28, 2021, 1:39:31 PM MST, abuse@lokodlare.com <abuse@lokodlare.com> wrote:

Hi Gary,

why would that be needed? Linux has a pretty good thread scheduler imo and will shuffle loads around as needed.

Even Windows’ thread scheduler is quite decent these days and tools like “Process Lasso” exist if additional fine tuning is needed.

Attached is one of my servers running multiple tor instances on a 12/24C platform. The load is spread quite evenly across all cores.

Best Regards,

Kristian

Dec 27, 2021, 22:08 by tor-relays@lists.torproject.org:

BTW… I just fact-checked my post-script and the cpu affinity configuration I was thinking of is for Nginx (not Tor). Tor should consider adding a cpu affinity configuration option. What happens if you configure additional Tor instances on the same machine (my Tor instances are on different machines) and start them up? Do they bind to a different or the same cpu core?

Respectfully,

Gary

On Monday, December 27, 2021, 2:44:59 PM MST, Gary C. New via tor-relays <tor-relays@lists.torproject.org> wrote:

David/Roger:

Search the tor-relay mail archive for my previous responses on loadbalancing Tor Relays, which I’ve been successfully doing for the past 6 months with Nginx (it’s possible to do with HAProxy as well). I haven’t had time to implement it with a Tor Bridge, but I assume it will be very similar. Keep in mind it’s critical to configure each Tor instance to use the same DirectoryAuthority and to disable the upstream timeouts on Nginx/HAProxy.

Happy Tor Loadbalancing!

Respectfully,

Gary

P.S. I believe there’s a torrc config option to specify which cpu core a given Tor instance should use, too.

On Monday, December 27, 2021, 2:00:50 PM MST, Roger Dingledine <arma@torproject.org> wrote:

On Mon, Dec 27, 2021 at 12:05:26PM -0700, David Fifield wrote:

I have the impression that tor cannot use more than one CPU core—is that

correct? If so, what can be done to permit a bridge to scale beyond

1×100% CPU? We can fairly easily scale the Snowflake-specific components

around the tor process, but ultimately, a tor client process expects to

connect to a bridge having a certain fingerprint, and that is the part I

don’t know how to easily scale.

  • Surely it’s not possible to run multiple instances of tor with the

same fingerprint? Or is it? Does the answer change if all instances

are on the same IP address? If the OR ports are never used?

Good timing – Cecylia pointed out the higher load on Flakey a few days

ago, and I’ve been meaning to post a suggestion somewhere. You actually

can run more than one bridge with the same fingerprint. Just set it

up in two places, with the same identity key, and then whichever one the

client connects to, the client will be satisfied that it’s reaching the

right bridge.

There are two catches to the idea:

(A) Even though the bridges will have the same identity key, they won’t

have the same circuit-level onion key, so it will be smart to “pin”

each client to a single bridge instance – so when they fetch the bridge

descriptor, which specifies the onion key, they will continue to use

that bridge instance with that onion key. Snowflake in particular might

also want to pin clients to specific bridges because of the KCP state.

(Another option, instead of pinning clients to specific instances,

would be to try to share state among all the bridges on the backend,

e.g. so they use the same onion key, can resume the same KCP sessions,

etc. This option seems hard.)

(B) It’s been a long time since anybody tried this, so there might be

surprises. :) But it should work, so if there are surprises, we should

try to fix them.

This overall idea is similar to the “router twins” idea from the distant past:

https://lists.torproject.org/pipermail/tor-dev/2002-July/001122.html

https://lists.torproject.org/pipermail/tor-commits/2003-October/024388.html

https://lists.torproject.org/pipermail/tor-dev/2003-August/000236.html

  • Removing the fingerprint from the snowflake Bridge line in Tor Browser

would permit the Snowflake proxies to round-robin clients over several

bridges, but then the first hop would be unauthenticated (at the Tor

layer). It would be nice if it were possible to specify a small set of

permitted bridge fingerprints.

This approach would also require clients to pin to a particular bridge,

right? Because of the different state that each bridge will have?

–Roger








David, Roger, et al.,

I just got back from holidays and really enjoyed this thread!

I run my Loadbalanced Tor Relay as a Guard/Middle Relay, very similar to David’s topology diagram, without the Snowflake-Server proxy. I’m using Nginx (which forks a child process per core) instead of HAProxy. My Backend Tor Relay Nodes are running on several different Physical Servers; thus, I’m using Private Address Space instead of Loopback Address Space.

In this configuration, I discovered that I had to:

  • Configure Nginx/HAProxy to use Transparent Streaming Mode with Source IP Address Sticky Sessions (Pinning).
  • Configure the Loadbalancer to send the Backend Tor Relay Nodes’ return traffic back through Nginx/HAProxy (Kernel & IPTables).
  • Configure all Backend Tor Relay Nodes to use a copy of the same .tordb. (I wasn’t able to get the Backend Tor Relay Nodes working with a single shared .tordb over NFS without the DirectoryAuthorities complaining.)
  • Configure the Backend Tor Relay Nodes to use the same DirectoryAuthority, to ensure each Backend Tor Relay Node sends Meta-Data to the same DirectoryAuthority.

Moreover, I’ve enabled logging to a central Syslog Server for each Backend Tor Relay Node and created a number of Shell Scripts to help remotely manage each Backend Tor Relay Node.

Here are some sample configurations for reference.

Nginx Config:

upstream orport_tornodes {
	#least_conn;
	hash $remote_addr consistent;
	#server 192.168.0.1:9001 weight=1 max_fails=1 fail_timeout=10s;
	#server 192.168.0.1:9001 down;
	server 192.168.0.11:9001 weight=4 max_fails=0 fail_timeout=0s;
	server 192.168.0.21:9001 weight=4 max_fails=0 fail_timeout=0s;
	#server 192.168.0.31:9001 weight=4 max_fails=3 fail_timeout=300s;
	server 192.168.0.41:9001 weight=4 max_fails=0 fail_timeout=0s;
	server 192.168.0.51:9001 weight=4 max_fails=0 fail_timeout=0s;
	#zone orport_torfarm 64k;
}

HAProxy Config (Alternate):

frontend tornodes
	# Log to global config
	log global
	# Bind to port 9001 on a specified interface
	bind 0.0.0.0:9001 transparent
	# We're proxying TCP here...
	mode tcp
	default_backend orport_tornodes
	# Simple TCP source consistent over several servers using the specified
	source 0.0.0.0 usesrc clientip

backend orport_tornodes
	balance source
	hash-type consistent
	#server tornode1 192.168.0.1:9001 check disabled
	#server tornode11 192.168.0.11:9001 source 192.168.0.1
	server tornode11 192.168.0.11:9001 source 0.0.0.0 usesrc clientip check disabled
	server tornode21 192.168.0.21:9001 source 0.0.0.0 usesrc clientip check disabled
	#server tornode31 192.168.0.31:9001 source 0.0.0.0 usesrc clientip check disabled
	server tornode41 192.168.0.41:9001 source 0.0.0.0 usesrc clientip check disabled
	server tornode51 192.168.0.51:9001 source 0.0.0.0 usesrc clientip check disabled

Linux Kernel & IPTables Config:

modprobe xt_socket
modprobe xt_TPROXY

echo 1 > /proc/sys/net/ipv4/ip_forward; cat /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind; cat /proc/sys/net/ipv4/ip_nonlocal_bind
echo 15000 64000 > /proc/sys/net/ipv4/ip_local_port_range; cat /proc/sys/net/ipv4/ip_local_port_range

ip rule del fwmark 1 lookup 100 2>/dev/null # Ensure Duplicate Rule is not Created
ip rule add fwmark 1 lookup 100 # ip rule show
ip route add local 0.0.0.0/0 dev lo table 100 # ip route show table wan0; ip route show table 100

iptables -I INPUT -p tcp --dport 9001 -j ACCEPT
iptables -t mangle -N TOR
iptables -t mangle -A PREROUTING -p tcp -m socket -j TOR
iptables -t mangle -A TOR -j MARK --set-mark 1
iptables -t mangle -A TOR -j ACCEPT
#iptables -t mangle -A PREROUTING -p tcp -s 192.168.0.0/24 --sport 9001 -j MARK --set-xmark 0x1/0xffffffff
#iptables -t mangle -A PREROUTING -p tcp --dport 9001 -j TPROXY --tproxy-mark 0x1/0x1 --on-port 9001 --on-ip 127.0.0.1
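A few read-only commands to sanity-check that the TPROXY plumbing above took effect (a sketch; exact output formats vary by iproute2/iptables version):

```shell
# The fwmark rule and the local-route table added above should be visible:
ip rule show             # expect a "from all fwmark 0x1 lookup 100" entry
ip route show table 100  # expect the "local default dev lo" route
# Nonzero packet/byte counters on the mangle-table TOR chain confirm
# that traffic is actually matching the rules:
iptables -t mangle -L TOR -v -n
```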

Backend Tor Relay Node Configs:

cat /tmp/torrc

Nickname xxxxxxxxxxxxxxxxxx
ORPort xxx.xxx.xxx.xxx:9001 NoListen
ORPort 192.168.0.11:9001 NoAdvertise
SocksPort 9050
SocksPort 192.168.0.11:9050
ControlPort 9051
DirAuthority longclaw orport=443 no-v2 v3ident=23D15D965BC35114467363C165C4F724B64B4F66 199.58.81.140:80 74A910646BCEEFBCD2E874FC1DC997430F968145
FallbackDir 193.23.244.244:80 orport=443 id=7BE683E65D48141321C5ED92F075C55364AC7123
DirCache 0
ExitRelay 0
MaxMemInQueues 192 MB
GeoIPFile /opt/share/tor/geoip
Log notice file /tmp/torlog
Log notice syslog
VirtualAddrNetwork 10.192.0.0/10
AutomapHostsOnResolve 1
TransPort 192.168.0.11:9040
DNSPort 192.168.0.11:9053
RunAsDaemon 1
DataDirectory /tmp/tor/torrc.d/.tordb
AvoidDiskWrites 1
User tor
ContactInfo tor-operator@your-emailaddress-domain

cat /tmp/torrc

Nickname xxxxxxxxxxxxxxxxxx
ORPort xxx.xxx.xxx.xxx:9001 NoListen
ORPort 192.168.0.41:9001 NoAdvertise
SocksPort 9050
SocksPort 192.168.0.41:9050
ControlPort 9051
DirAuthority longclaw orport=443 no-v2 v3ident=23D15D965BC35114467363C165C4F724B64B4F66 199.58.81.140:80 74A910646BCEEFBCD2E874FC1DC997430F968145
FallbackDir 193.23.244.244:80 orport=443 id=7BE683E65D48141321C5ED92F075C55364AC7123
DirCache 0
ExitRelay 0
MaxMemInQueues 192 MB
GeoIPFile /opt/share/tor/geoip
Log notice file /tmp/torlog
Log notice syslog
VirtualAddrNetwork 10.192.0.0/10
AutomapHostsOnResolve 1
TransPort 192.168.0.41:9040
DNSPort 192.168.0.41:9053
RunAsDaemon 1
DataDirectory /tmp/tor/torrc.d/.tordb
AvoidDiskWrites 1
User tor
ContactInfo tor-operator@your-emailaddress-domain

Shell Scripts to Remotely Manage Tor Relay Nodes:

cat /usr/sbin/stat-tor-nodes

#!/bin/sh
uptime-all-nodes; memfree-all-nodes; netstat-tor-nodes

cat /usr/sbin/uptime-all-nodes

#!/bin/sh
/usr/bin/ssh -t admin@192.168.0.11 'hostname; uptime'
/usr/bin/ssh -t admin@192.168.0.21 'hostname; uptime'
/usr/bin/ssh -t admin@192.168.0.31 'hostname; uptime'
/usr/bin/ssh -t admin@192.168.0.41 'hostname; uptime'
/usr/bin/ssh -t admin@192.168.0.51 'hostname; uptime'

cat /usr/sbin/memfree-all-nodes

#!/bin/sh
/usr/bin/ssh -t admin@192.168.0.11 'hostname; grep MemFree /proc/meminfo'
/usr/bin/ssh -t admin@192.168.0.21 'hostname; grep MemFree /proc/meminfo'
/usr/bin/ssh -t admin@192.168.0.31 'hostname; grep MemFree /proc/meminfo'
/usr/bin/ssh -t admin@192.168.0.41 'hostname; grep MemFree /proc/meminfo'
/usr/bin/ssh -t admin@192.168.0.51 'hostname; grep MemFree /proc/meminfo'

cat /usr/sbin/netstat-tor-nodes

#!/bin/sh
/usr/bin/ssh -t admin@192.168.0.11 'hostname; netstat -anp | grep -i tor | grep -v 192.168.0.1: | wc -l'
/usr/bin/ssh -t admin@192.168.0.21 'hostname; netstat -anp | grep -i tor | grep -v 192.168.0.1: | wc -l'
/usr/bin/ssh -t admin@192.168.0.31 'hostname; netstat -anp | grep -i tor | grep -v 192.168.0.1: | wc -l'
/usr/bin/ssh -t admin@192.168.0.41 'hostname; netstat -anp | grep -i tor | grep -v 192.168.0.1: | wc -l'
/usr/bin/ssh -t admin@192.168.0.51 'hostname; netstat -anp | grep -i tor | grep -v 192.168.0.1: | wc -l'

cat /jffs/sbin/ps-tor-nodes

#!/bin/sh
/usr/bin/ssh -t admin@192.168.0.11 'hostname; ps w | grep -i tor'
/usr/bin/ssh -t admin@192.168.0.21 'hostname; ps w | grep -i tor'
/usr/bin/ssh -t admin@192.168.0.31 'hostname; ps w | grep -i tor'
/usr/bin/ssh -t admin@192.168.0.41 'hostname; ps w | grep -i tor'
/usr/bin/ssh -t admin@192.168.0.51 'hostname; ps w | grep -i tor'

cat /usr/sbin/killall-tor-nodes

#!/bin/sh
read -r -p "Are you sure? [y/N] " input
case "$input" in
[yY])
/usr/bin/ssh -t admin@192.168.0.11 'killall tor'
/usr/bin/ssh -t admin@192.168.0.21 'killall tor'
#/usr/bin/ssh -t admin@192.168.0.31 'killall tor'
/usr/bin/ssh -t admin@192.168.0.41 'killall tor'
/usr/bin/ssh -t admin@192.168.0.51 'killall tor'
exit 0
;;
*)
exit 1
;;
esac

cat /usr/sbin/restart-tor-nodes

#!/bin/sh
read -r -p "Are you sure? [y/N] " input
case "$input" in
[yY])
/usr/bin/ssh -t admin@192.168.0.11 '/usr/sbin/tor -f /tmp/torrc --quiet'
/usr/bin/ssh -t admin@192.168.0.21 '/usr/sbin/tor -f /tmp/torrc --quiet'
#/usr/bin/ssh -t admin@192.168.0.31 '/usr/sbin/tor -f /tmp/torrc --quiet'
/usr/bin/ssh -t admin@192.168.0.41 '/usr/sbin/tor -f /tmp/torrc --quiet'
/usr/bin/ssh -t admin@192.168.0.51 '/usr/sbin/tor -f /tmp/torrc --quiet'
exit 0
;;
*)
exit 1
;;
esac

I’ve been meaning to put together a tutorial on Loadbalancing Tor Relays, but haven’t found the time yet. Perhaps this will help in the meantime.

I appreciate your knowledge sharing and for furthering the topic of Loadbalancing Tor Relays; especially, with regard to Bridging and Exit Relays.

Keep up the Great Work!

Respectfully,

Gary

[I’m about to go off-line for some days, so I am sending my current suboptimally-organized reply, which I hope is better than waiting another week to respond :)]

Let’s make a distinction between the “frontend” snowflake-server
pluggable transport process, and the “backend” tor process. These don’t
necessarily have to be 1:1; either one could be run in multiple
instances. Currently, the “backend” tor is the limiting factor, because
it uses only 1 CPU core. The “frontend” snowflake-server can scale to
multiple cores in a single process and is comparatively unrestrained.

Excellent point, and yes, this simplifies things. Great.

I believe that the “pinning” of a client session to particular tor
instance will work automatically by the fact that snowflake-server keeps
an outgoing connection alive (i.e., through the load balancer) as long
as a KCP session exists.
[…]
But before starting the second instance the first time, copy keys from
the first instance:

Hm. It looks promising! But we might still have a Tor-side problem
remaining. I think it boils down to how long the KCP sessions last.

The details on how exactly these bridge instances will diverge over time:

The keys directory will start out the same, but after four weeks
(DEFAULT_ONION_KEY_LIFETIME_DAYS, used to be one week but in Tor
0.3.1.1-alpha, proposal 274, we bumped it up to four weeks) each
bridge will rotate its onion key (the one clients use for circuit-level
crypto). That is, each instance will generate its own fresh onion key.

The two bridge instances actually haven’t diverged completely at that
point, since Tor remembers the previous onion key (i.e. the onion key
from the previous period) and is willing to receive create cells that
use it for one further week (DEFAULT_ONION_KEY_GRACE_PERIOD_DAYS). So it
is after 5 weeks that the original (shared) onion key will no longer work.

Where this matters is (after this 5 weeks have passed) if the client
connects to the bridge, fetches and caches the bridge descriptor of
instance A, and then later it connects to the bridge again and gets
passed to instance B. In this case, the create cell that the client
generates will use the onion key for instance A, and instance B won’t
know how to decrypt it so it will send a destroy cell back.

If this is an issue, we can definitely work around it, by e.g. disabling
the onion key rotation on the bridges, or setting up a periodic rsync+hup
between the bridges, or teaching clients to use createfast cells in this
situation (this type of circuit crypto doesn’t use the onion key at all,
and just relies on TLS for security – which can only be done for the
first hop of the circuit but that’s the one we’re talking about here).

But before we think about workarounds, maybe we don’t need one: how long
does “the KCP session” last?

Tor clients try to fetch a fresh bridge descriptor every three-ish
hours, and once they fetch a bridge descriptor from their “current”
bridge instance, they should know the onion key that it wants to use. So
it is that up-to-three-hour window where I think things could go wrong.
And that timeframe sounds promising.

(I also want to double-check that clients don’t try to use the onion
key from the current cached descriptor while fetching the updated
descriptor. That could become an ugly bug in the wrong circumstances,
and would be something we want to fix if it’s happening.)

Here’s how you can simulate a pair of bridge instances that have diverged
after five weeks, so you can test how things would work with them:

Copy the keys directory as before, but run "rm secret_onion_key*" in the
keys directory on n-1 of the instances before starting them.
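The procedure above might look like the following local sketch (the directory names and key file contents are illustrative stand-ins; a real deployment would operate on the actual DataDirectory `keys` paths):

```shell
# Simulate n=2 bridge instances that share an identity key but have
# diverged onion keys, per the procedure described above.
set -e
mkdir -p instance1/keys instance2/keys
# Stand-ins for the first instance's key material (illustrative content).
echo id    > instance1/keys/secret_id_key
echo onion > instance1/keys/secret_onion_key
echo ntor  > instance1/keys/secret_onion_key_ntor
# Copy the whole keys directory so both instances share the identity...
cp -a instance1/keys/. instance2/keys/
# ...then remove the onion keys on n-1 instances (here: instance2),
# so each of those generates its own fresh onion key on first start.
rm instance2/keys/secret_onion_key*
```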

Thanks!

–Roger

···

On Tuesday, January 4, 2022, 09:57:52 PM MST, Roger Dingledine arma@torproject.org wrote:
On Thu, Dec 30, 2021 at 10:42:51PM -0700, David Fifield wrote:


tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


Hm. It looks promising! But we might still have a Tor-side problem remaining. I think it boils down to how long the KCP sessions last.

The details on how exactly these bridge instances will diverge over time:

The keys directory will start out the same, but after four weeks (DEFAULT_ONION_KEY_LIFETIME_DAYS, used to be one week but in Tor 0.3.1.1-alpha, proposal 274, we bumped it up to four weeks) each bridge will rotate its onion key (the one clients use for circuit-level crypto). That is, each instance will generate its own fresh onion key.

The two bridge instances actually haven't diverged completely at that point, since Tor remembers the previous onion key (i.e. the onion key from the previous period) and is willing to receive create cells that use it for one further week (DEFAULT_ONION_KEY_GRACE_PERIOD_DAYS). So it is after 5 weeks that the original (shared) onion key will no longer work.

Where this matters is (after this 5 weeks have passed) if the client connects to the bridge, fetches and caches the bridge descriptor of instance A, and then later it connects to the bridge again and gets passed to instance B. In this case, the create cell that the client generates will use the onion key for instance A, and instance B won't know how to decrypt it so it will send a destroy cell back.

I've done an experiment with a second snowflake bridge that has the same identity keys but different onion keys. A client can bootstrap with either one starting from a clean state, but it fails if you bootstrap with one and then try to bootstrap with the other using the same DataDirectory. The error you get is

onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4

The first bridge is the existing "prod" snowflake bridge with nickname:
* flakey

The other "staging" bridge is the load-balanced configuration with four instances. All four instances currently have the same onion keys, which are, however, different from the "prod" bridge's onion keys. (The onion keys actually come from a backup I made.)
* flakey1
* flakey2
* flakey3
* flakey4

Bootstrapping "prod" with a fresh DataDirectory "datadir.prod" works. Here is torrc.prod:

UseBridges 1
SocksPort auto
DataDirectory datadir.prod
ClientTransportPlugin snowflake exec ./client -keep-local-addresses -log snowflake.log
Bridge snowflake 192.0.2.3:1 2B280B23E1107BB62ABFC40DDCC8824814F80A72 url=https://snowflake-broker.torproject.net/ max=1 ice=stun:stun.voip.blackberry.com:3478,stun:stun.altar.com.pl:3478,stun:stun.antisip.com:3478,stun:stun.bluesip.net:3478,stun:stun.dus.net:3478,stun:stun.epygi.com:3478,stun:stun.sonetel.com:3478,stun:stun.sonetel.net:3478,stun:stun.stunprotocol.org:3478,stun:stun.uls.co.za:3478,stun:stun.voipgate.com:3478,stun:stun.voys.nl:3478

Notice `new bridge descriptor 'flakey' (fresh)`:

snowflake/client$ tor -f torrc.prod
[notice] Tor 0.3.5.16 running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.1d, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.8.
[notice] Bootstrapped 0%: Starting
[notice] Starting with guard context "bridges"
[notice] Delaying directory fetches: No running bridges
[notice] Bootstrapped 5%: Connecting to directory server
[notice] Bootstrapped 10%: Finishing handshake with directory server
[notice] Bootstrapped 15%: Establishing an encrypted directory connection
[notice] Bootstrapped 20%: Asking for networkstatus consensus
[notice] new bridge descriptor 'flakey' (fresh): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[notice] Bootstrapped 25%: Loading networkstatus consensus
[notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
[notice] Bootstrapped 40%: Loading authority key certs
[notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
[notice] Bootstrapped 45%: Asking for relay descriptors for internal paths
[notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/6673, and can only build 0% of likely paths. (We have 100% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
[notice] Bootstrapped 50%: Loading relay descriptors for internal paths
[notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
[notice] Bootstrapped 57%: Loading relay descriptors
[notice] Bootstrapped 64%: Loading relay descriptors
[notice] Bootstrapped 73%: Loading relay descriptors
[notice] Bootstrapped 78%: Loading relay descriptors
[notice] Bootstrapped 80%: Connecting to the Tor network
[notice] Bootstrapped 85%: Finishing handshake with first hop
[notice] Bootstrapped 90%: Establishing a Tor circuit
[notice] Bootstrapped 100%: Done

Bootstrapping "staging" with a fresh DataDirectory "datadir.staging" also works. Here is torrc.staging:

UseBridges 1
SocksPort auto
DataDirectory datadir.staging
ClientTransportPlugin snowflake exec ./client -keep-local-addresses -log snowflake.log
Bridge snowflake 192.0.2.3:1 2B280B23E1107BB62ABFC40DDCC8824814F80A72 url=http://127.0.0.1:8000/ max=1 ice=stun:stun.voip.blackberry.com:3478,stun:stun.altar.com.pl:3478,stun:stun.antisip.com:3478,stun:stun.bluesip.net:3478,stun:stun.dus.net:3478,stun:stun.epygi.com:3478,stun:stun.sonetel.com:3478,stun:stun.sonetel.net:3478,stun:stun.stunprotocol.org:3478,stun:stun.uls.co.za:3478,stun:stun.voipgate.com:3478,stun:stun.voys.nl:3478

Notice `new bridge descriptor 'flakey4' (fresh)`:

snowflake/broker$ ./broker -disable-tls -addr 127.0.0.1:8000
snowflake/proxy$ ./proxy -capacity 10 -broker http://127.0.0.1:8000/ -keep-local-addresses -relay wss://snowflake-staging.bamsoftware.com/
snowflake/client$ tor -f torrc.staging
[notice] Tor 0.3.5.16 running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.1d, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.8.
[notice] Bootstrapped 0%: Starting
[notice] Starting with guard context "bridges"
[notice] Delaying directory fetches: No running bridges
[notice] Bootstrapped 5%: Connecting to directory server
[notice] Bootstrapped 10%: Finishing handshake with directory server
[notice] Bootstrapped 15%: Establishing an encrypted directory connection
[notice] Bootstrapped 20%: Asking for networkstatus consensus
[notice] new bridge descriptor 'flakey4' (fresh): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey4 at 192.0.2.3
[notice] Bootstrapped 25%: Loading networkstatus consensus
[notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
[notice] Bootstrapped 40%: Loading authority key certs
[notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
[notice] Bootstrapped 45%: Asking for relay descriptors for internal paths
[notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/6673, and can only build 0% of likely paths. (We have 100% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
[notice] Bootstrapped 50%: Loading relay descriptors for internal paths
[notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
[notice] Bootstrapped 57%: Loading relay descriptors
[notice] Bootstrapped 63%: Loading relay descriptors
[notice] Bootstrapped 72%: Loading relay descriptors
[notice] Bootstrapped 77%: Loading relay descriptors
[notice] Bootstrapped 80%: Connecting to the Tor network
[notice] Bootstrapped 85%: Finishing handshake with first hop
[notice] Bootstrapped 90%: Establishing a Tor circuit
[notice] Bootstrapped 100%: Done

But now, if we try running torrc.staging but give it the DataDirectory "datadir.prod", it fails at 90%. Notice `new bridge descriptor 'flakey' (cached)`: if the descriptor had not been cached it would have been flakey[1234] instead.

$ tor -f torrc.staging DataDirectory datadir.prod Log "notice stderr" Log "info file info.log"
[notice] Tor 0.3.5.16 running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.1d, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.8.
[notice] Bootstrapped 0%: Starting
[notice] Starting with guard context "bridges"
[notice] new bridge descriptor 'flakey' (cached): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[notice] Delaying directory fetches: Pluggable transport proxies still configuring
[notice] Bootstrapped 5%: Connecting to directory server
[notice] Bootstrapped 10%: Finishing handshake with directory server
[notice] Bootstrapped 80%: Connecting to the Tor network
[notice] Bootstrapped 90%: Establishing a Tor circuit
[notice] Delaying directory fetches: No running bridges

Here is an excerpt from the info-level log that shows the error. The important part seems to be `onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4`.

[notice] new bridge descriptor 'flakey' (cached): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[notice] Delaying directory fetches: Pluggable transport proxies still configuring
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] onion_pick_cpath_exit(): Using requested exit node '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Not connected. Connecting.
[notice] Bootstrapped 5%: Connecting to directory server
[info] connection_or_set_canonical(): Channel 0 chose an idle timeout of 247.
[info] connection_or_set_identity_digest(): Set identity digest for 0x55c3f9356770 ([scrubbed]): 2B280B23E1107BB62ABFC40DDCC8824814F80A72 1zOHpg+FxqQfi/6jDLtCpHHqBTH8gjYmCKXkus1D5Ko.
[info] connection_or_set_identity_digest():    (Previously: 0000000000000000000000000000000000000000 <unset>)
[info] connection_or_set_canonical(): Channel 1 chose an idle timeout of 232.
[info] circuit_predict_and_launch_new(): Have 0 clean circs (0 internal), need another exit circ.
[info] choose_good_exit_server_general(): Found 1336 servers that might support 0/0 pending connections.
[info] choose_good_exit_server_general(): Chose exit server '$0F1C8168DFD0AADBE61BD71194D37C867FED5A21~FreeExit at 81.17.18.60'
[info] extend_info_from_node(): Including Ed25519 ID for $0F1C8168DFD0AADBE61BD71194D37C867FED5A21~FreeExit at 81.17.18.60
[info] select_primary_guard_for_circuit(): Selected primary guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72) for circuit.
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] extend_info_from_node(): Including Ed25519 ID for $7158D1E0D9F90F7999ACB3B073DA762C9B2C3275~maltimore at 207.180.224.17
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Connection in progress; waiting.
[info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
[info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
[notice] Bootstrapped 10%: Finishing handshake with directory server
[notice] Bootstrapped 80%: Connecting to the Tor network
[info] parse_socks_client(): SOCKS 5 client: need authentication.
[info] parse_socks_client(): SOCKS 5 client: authentication successful.
[info] connection_read_proxy_handshake(): Proxy Client: connection to 192.0.2.3:1 successful
[info] circuit_predict_and_launch_new(): Have 1 clean circs (0 internal), need another exit circ.
[info] choose_good_exit_server_general(): Found 1336 servers that might support 0/0 pending connections.
[info] choose_good_exit_server_general(): Chose exit server '$D8A1F5A8EA1AF53E3414B9C48FE6B10C31ACC9B2~privexse1exit at 185.130.44.108'
[info] extend_info_from_node(): Including Ed25519 ID for $D8A1F5A8EA1AF53E3414B9C48FE6B10C31ACC9B2~privexse1exit at 185.130.44.108
[info] select_primary_guard_for_circuit(): Selected primary guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72) for circuit.
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] extend_info_from_node(): Including Ed25519 ID for $2F9AFDE43DC8E3F05803304C01BD3DBF329169AC~dutreuil at 213.152.168.27
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Connection in progress; waiting.
[info] circuit_predict_and_launch_new(): Have 2 clean circs (0 uptime-internal, 0 internal), need another hidden service circ.
[info] extend_info_from_node(): Including Ed25519 ID for $8967A8912E61070FCFA9B8EC9869E5AC8F94949A~4Freunde at 145.239.154.56
[info] select_primary_guard_for_circuit(): Selected primary guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72) for circuit.
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] extend_info_from_node(): Including Ed25519 ID for $9367EB01DF75DE6265A0971249204029D6A55877~oddling at 5.182.210.231
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Connection in progress; waiting.
[info] circuit_predict_and_launch_new(): Have 3 clean circs (1 uptime-internal, 1 internal), need another hidden service circ.
[info] extend_info_from_node(): Including Ed25519 ID for $AF85E6556FD5692BC554A93BAC9FACBFC2D79EFD~whoUSicebeer09b at 192.187.103.74
[info] select_primary_guard_for_circuit(): Selected primary guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72) for circuit.
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] extend_info_from_node(): Including Ed25519 ID for $9515B435D8D063E537AB137FCF5A97B1ACE3CA2A~corvuscorone at 135.181.178.197
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Connection in progress; waiting.
[info] circuit_predict_and_launch_new(): Have 4 clean circs (2 uptime-internal, 2 internal), need another hidden service circ.
[info] extend_info_from_node(): Including Ed25519 ID for $68A9F0DFFC7C8F57B3DEA3801D6CF001652A809F~vpskilobug at 213.164.206.145
[info] select_primary_guard_for_circuit(): Selected primary guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72) for circuit.
[info] extend_info_from_node(): Including Ed25519 ID for $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3
[info] extend_info_from_node(): Including Ed25519 ID for $2C13A54E3E8A6AFB18E0DE5890E5B08AAF5B0F36~history at 138.201.123.109
[info] circuit_handle_first_hop(): Next router is [scrubbed]: Connection in progress; waiting.
[info] channel_tls_process_versions_cell(): Negotiated version 5 with [scrubbed]:1; Waiting for CERTS cell
[info] connection_or_client_learned_peer_id(): learned peer id for 0x55c3f9356770 ([scrubbed]): 2B280B23E1107BB62ABFC40DDCC8824814F80A72, 1zOHpg+FxqQfi/6jDLtCpHHqBTH8gjYmCKXkus1D5Ko
[info] channel_tls_process_certs_cell(): Got some good certificates from [scrubbed]:1: Authenticated it with RSA and Ed25519
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[notice] Bootstrapped 90%: Establishing a Tor circuit
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey at 192.0.2.3'
[info] channel_tls_process_netinfo_cell(): Got good NETINFO cell from [scrubbed]:1; OR connection is now open, using protocol version 5. Its ID digest is 2B280B23E1107BB62ABFC40DDCC8824814F80A72. Our address is apparently [scrubbed].
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 3457244666 (id: 1) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 4237434553 (id: 2) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 3082862549 (id: 6) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 2596950236 (id: 4) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] circuit_build_failed(): Our circuit 3457244666 (id: 1) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] connection_ap_fail_onehop(): Closing one-hop stream to '$2B280B23E1107BB62ABFC40DDCC8824814F80A72/192.0.2.3' because the OR conn just failed.
[info] circuit_free_(): Circuit 0 (id: 1) has been freed.
[info] circuit_build_failed(): Our circuit 4237434553 (id: 2) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] circuit_free_(): Circuit 0 (id: 2) has been freed.
[info] circuit_build_failed(): Our circuit 3082862549 (id: 6) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] circuit_free_(): Circuit 0 (id: 6) has been freed.
[info] circuit_build_failed(): Our circuit 2596950236 (id: 4) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] circuit_free_(): Circuit 0 (id: 4) has been freed.
[info] connection_free_minimal(): Freeing linked Socks connection [waiting for circuit] with 121 bytes on inbuf, 0 on outbuf.
[info] connection_dir_client_reached_eof(): 'fetch' response not all here, but we're at eof. Closing.
[info] entry_guards_note_guard_failure(): Recorded failure for primary confirmed guard $2B280B23E1107BB62ABFC40DDCC8824814F80A72 ($2B280B23E1107BB62ABFC40DDCC8824814F80A72)
[info] connection_dir_client_request_failed(): Giving up on serverdesc/extrainfo fetch from directory server at '192.0.2.3'; retrying
[info] connection_free_minimal(): Freeing linked Directory connection [client reading] with 0 bytes on inbuf, 0 on outbuf.
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 2912328161 (id: 5) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] onion_skin_ntor_client_handshake(): Invalid result from curve25519 handshake: 4
[info] circuit_mark_for_close_(): Circuit 2793970028 (id: 3) marked for close at ../src/core/or/command.c:443 (orig reason: 1, new reason: 0)
[info] circuit_build_failed(): Our circuit 2912328161 (id: 5) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] circuit_free_(): Circuit 0 (id: 5) has been freed.
[info] circuit_build_failed(): Our circuit 2793970028 (id: 3) failed to get a response from the first hop (192.0.2.3:1). I'm going to try to rotate to a better connection.
[info] circuit_free_(): Circuit 0 (id: 3) has been freed.
[info] connection_ap_make_link(): Making internal direct tunnel to [scrubbed]:1 ...
[info] connection_ap_make_link(): ... application connection created and linked.
[info] should_delay_dir_fetches(): Delaying dir fetches (no running bridges known)
[notice] Delaying directory fetches: No running bridges

As you suggested, CREATE_FAST in place of CREATE works. I hacked `should_use_create_fast_for_circuit` to always return true:

diff --git a/src/core/or/circuitbuild.c b/src/core/or/circuitbuild.c
index 2bcc642a97..4005ba56ce 100644
--- a/src/core/or/circuitbuild.c
+++ b/src/core/or/circuitbuild.c
@@ -801,6 +801,7 @@ should_use_create_fast_for_circuit(origin_circuit_t *circ)
   tor_assert(circ->cpath);
   tor_assert(circ->cpath->extend_info);

+  return true;
   return ! circuit_has_usable_onion_key(circ);
 }

And then the mixed configuration with the "staging" bridge and the "prod" DataDirectory bootstraps. Notice `new bridge descriptor 'flakey' (cached)` followed later by `new bridge descriptor 'flakey1' (fresh)`.

$ ~/tor/src/app/tor -f torrc.staging DataDirectory datadir.prod
[notice] Tor 0.4.6.8 (git-d5efc2c98619568e) running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.1d, Zlib 1.2.11, Liblzma 5.2.4, Libzstd N/A and Glibc 2.28 as libc.
[notice] Bootstrapped 0% (starting): Starting
[notice] Starting with guard context "bridges"
[notice] new bridge descriptor 'flakey' (cached): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey [1zOHpg+FxqQfi/6jDLtCpHHqBTH8gjYmCKXkus1D5Ko] at 192.0.2.3
[notice] Delaying directory fetches: Pluggable transport proxies still configuring
[notice] Bootstrapped 1% (conn_pt): Connecting to pluggable transport
[notice] Bootstrapped 2% (conn_done_pt): Connected to pluggable transport
[notice] Bootstrapped 10% (conn_done): Connected to a relay
[notice] Bootstrapped 14% (handshake): Handshaking with a relay
[notice] Bootstrapped 15% (handshake_done): Handshake with a relay done
[notice] Bootstrapped 75% (enough_dirinfo): Loaded enough directory info to build circuits
[notice] Bootstrapped 95% (circuit_create): Establishing a Tor circuit
[notice] new bridge descriptor 'flakey1' (fresh): $2B280B23E1107BB62ABFC40DDCC8824814F80A72~flakey1 [1zOHpg+FxqQfi/6jDLtCpHHqBTH8gjYmCKXkus1D5Ko] at 192.0.2.3
[notice] Bootstrapped 100% (done): Done

If this is an issue, we can definitely work around it, by e.g. disabling the onion key rotation on the bridges, or setting up a periodic rsync+hup between the bridges, or teaching clients to use createfast cells in this situation (this type of circuit crypto doesn't use the onion key at all, and just relies on TLS for security -- which can only be done for the first hop of the circuit but that's the one we're talking about here).

What do you recommend trying? I guess the quickest way to get more capacity on the snowflake bridge is to disable onion key rotation by patching the tor source code, though I wouldn't want to maintain that long-term.

Gary, I was wondering how you are dealing with the changing onion key issue, and I suppose it is this:

use Source IP Address Sticky Sessions (Pinning)

The same client source address gets pinned to the same tor instance and therefore the same onion key. If I understand correctly, there's a potential failure if a client changes its IP address and later gets mapped to a different instance. Is that right?

···

On Tue, Jan 04, 2022 at 11:57:36PM -0500, Roger Dingledine wrote:

Yes… That is correct. As long as circuits originate from the same Source IP Address, Nginx/HAProxy ensures they are pinned to the same loadbalanced Upstream Tor Node. The exceptions are when the originating Source IP Address changes (low-risk), or when one of the Upstream Tor Nodes goes down (low-risk with UPS) and surviving circuits migrate to the remaining Upstream Tor Nodes, which effectively forces building of new circuits with the relevant keys.
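A minimal HAProxy sketch of source-IP pinning like this (the listener port, node addresses, and names are illustrative, not Gary's actual configuration):

```
# haproxy.cfg fragment: pin each client source address to one upstream tor node
frontend or-frontend
    mode tcp
    bind *:9001
    default_backend tor-nodes

backend tor-nodes
    mode tcp
    balance source        # hash on the client source address
    hash-type consistent  # limit remapping when a node goes down
    server node1 192.168.0.11:9001 check
    server node2 192.168.0.21:9001 check
    server node3 192.168.0.41:9001 check
    server node4 192.168.0.51:9001 check
```

With `balance source`, the same client address hashes to the same backend node, and therefore to the same onion key, for as long as that node stays up.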

The issue I find more challenging, in loadbalancing Upstream Tor Nodes, is when the Medium-Term Key is updated after running for some time (consistent with the previously mentioned 4 - 5 week time period). At that point, all circuits bleed off from the Upstream Tor Nodes except the one where the Medium-Term Key was successfully updated, and I am forced to shut down all Upstream Tor Nodes, copy the .tordb containing the updated Medium-Term Key to the other Upstream Tor Nodes, and restart them all.
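That re-sync procedure could be sketched like this (local directories stand in for the remote nodes, and the key file name and contents are illustrative; a real run would wrap each step in ssh, as in the management scripts above):

```shell
# Propagate the updated medium-term key material by copying the .tordb
# from the node that rotated successfully onto the stale nodes.
set -e
mkdir -p node-good/.tordb node-stale/.tordb
echo new-key > node-good/.tordb/medium_term_key   # illustrative stand-in
echo old-key > node-stale/.tordb/medium_term_key
# 1. stop tor on all nodes (ssh ... 'killall tor' in a real deployment)
# 2. copy the up-to-date .tordb over the stale ones
cp -a node-good/.tordb/. node-stale/.tordb/
# 3. restart tor on all nodes (ssh ... '/usr/sbin/tor -f /tmp/torrc --quiet')
```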

If there was a way for a Family of Tor Instances to share a Medium-Term Key, I believe that might solve the long-term issue of running a Loadbalanced Tor Relay.

As it stands… I can run my Loadbalanced Tor Relay for 4 - 5 weeks without any intervention.

Hope that answers your question.

Respectfully,

Gary

···

On Monday, January 17, 2022, 11:47:11 AM MST, David Fifield david@bamsoftware.com wrote:

Gary, I was wondering how you are dealing with the changing onion key issue, and I suppose it is this:

use Source IP Address Sticky Sessions (Pinning)

The same client source address gets pinned to the same tor instance and therefore the same onion key. If I understand correctly, there’s a potential failure if a client changes its IP address and later gets mapped to a different instance. Is that right?


This Message Originated by the Sun.
iBigBlue 63W Solar Array (~12 Hour Charge)

  • 2 x Charmast 26800mAh Power Banks
    = iPhone XS Max 512GB (~2 Weeks Charged)

The DNS record for the Snowflake bridge was switched to a temporary staging server, running the load balancing setup, at 2022-01-25 17:41:00. We were debugging some initial problems until 2022-01-25 18:47:00. You can read about it here:

https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2772325

Snowflake sessions are now using the staging bridge, except for those that started before the change happened and haven't finished yet, and perhaps some proxies that still have the IP address of the production bridge in their DNS cache. I am not sure yet what will happen with metrics, but we'll see after a few days.

On the matter of onion key rotation, I had the idea of making the onion key files read-only. Roger did some source code investigation and said that it might work to prevent onion key rotation, with some minor side effects. I plan to give the idea a try on a different bridge. The possible side effects are that tor will continue trying and failing to rotate the onion key every hour, and "force a router descriptor rebuild, so it will try to publish a new descriptor each hour."
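One way to try the read-only idea (a sketch, untested; this assumes the default Debian DataDirectory of /var/lib/tor and tor's standard onion key file names):

```
# Make the onion key files unwritable so tor's hourly rotation attempt fails.
cd /var/lib/tor/keys
chmod a-w secret_onion_key secret_onion_key_ntor
# On ext4, chattr +i additionally prevents even root from replacing the files.
chattr +i secret_onion_key secret_onion_key_ntor
```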

https://gitweb.torproject.org/tor.git/tree/src/feature/relay/router.c?h=tor-0.4.6.9#n523

  if (curve25519_keypair_write_to_file(&new_curve25519_keypair, fname,
                                       "onion") < 0) {
    log_err(LD_FS,"Couldn't write curve25519 onion key to \"%s\".",fname);
    goto error;
  }
  // ...
 error:
  log_warn(LD_GENERAL, "Couldn't rotate onion key.");
  if (prkey)
    crypto_pk_free(prkey);
···



David,

Excellent documentation of your loadbalanced Snowflake endeavors!

The DNS record for the Snowflake bridge was switched to a temporary staging server, running the load balancing setup, at 2022-01-25 17:41:00. We were debugging some initial problems until 2022-01-25 18:47:00. You can read about it here:

https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2772325

It’s nice to see that the Snowflake daemon offers a native configuration option for LimitNOFile. I ran into a similar issue with my initial loadbalanced Tor Relay Nodes that was solved at the O/S level using ulimit. It would be nice if torrc had a similar option.

From your documentation, it sounds like you’re running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you’ll have to expand the usable ports as well.

I’d like to see more of your HAProxy configuration. Do you not have to use transparent proxy mode with Snowflake instances as you do with Tor Relay instances? I hadn’t realized HAProxy had a client timeout. Thank you for that tidbit. And thank you for referencing my comments as well.

Snowflake sessions are now using the staging bridge, except for those that started before the change happened and haven’t finished yet, and perhaps some proxies that still have the IP address of the production bridge in their DNS cache. I am not sure yet what will happen with metrics, but we’ll see after a few days.

Currently, as I only use IPv4, I can’t offer much insight as to the lack of IPv6 connections being reported (that’s what my logs report, too). Your Heartbeat messages are looking good with a symmetric balance of connections and data. They look very similar to my Heartbeat logs; except, you can tell you offer more computing power, which is great to see extrapolated! I’ve found that the Heartbeat logs are key to knowing the health of your loadbalanced Tor implementation. You might consider setting up syslog with a Snowflake filter to aggregate your Snowflake logs for easier readability.

Regarding metrics.torproject.org… I expect you’ll see that written-bytes and read-bytes only reflect that of a single Snowflake instance. However, your consensus weight will reflect the aggregate of all Snowflake instances.

On the matter of onion key rotation, I had the idea of making the onion key files read-only. Roger did some source code investigation and said that it might work to prevent onion key rotation, with some minor side effects. I plan to give the idea a try on a different bridge. The possible side effects are that tor will continue trying and failing to rotate the onion key every hour, and “force a router descriptor rebuild, so it will try to publish a new descriptor each hour.”

I’m interested to hear how the prospective read-only file fix plays out. However, from my observations, I would assume that connections will eventually start bleeding off any instances that fail to update the key. We really need a long-term solution to this issue for this style of deployment.

Keep up the Great Work!

Respectfully,

Gary

···



David,

I’d like to see more of your HAProxy configuration. Do you not have to use transparent proxy mode with Snowflake instances as you do with Tor Relay instances? I hadn’t realized HAProxy had a client timeout. Thank you for that tidbit. And thank you for referencing my comments as well.

I found your HAProxy configuration in your “Draft installation guide.” It seems you’re using regular TCP streaming mode with the Snowflake instances vs transparent TCP streaming mode, which is a notable difference from the directly loadbalanced Tor Relay configuration. I also noticed you’ve configured the backend node timeout globally vs per node, which is just a nuance. You might test using a timeout value of 0s (to disable the timeout at the loadbalancer) and allow the Snowflake instances to perform state checking, to ensure HAProxy isn’t throttling your bridge. I’ve tested both and I’m still not sure which timeout configuration makes the most sense for this style of implementation. Currently, I’m running with the 0s (disabled) timeout.

Any reason why you chose HAProxy over Nginx?

I did notice that you’re using the AssumeReachable 1 directive in your torrc files. Are you running into an issue where your Tor instances are failing the reachability test? Initially, I ran into a reachability issue and after digging through mountains of Tor debug logs discovered I needed to use transparent TCP streaming mode along with the Linux kernel and iptables changes to route the Tor traffic back from the Tor Relay Nodes to the loadbalancer. You shouldn’t need to run your Tor instances with the AssumeReachable 1 directive. This might suggest something in your configuration isn’t quite right.

One of my initial tests was staggering the startup of my instances to see how they randomly reported to the DirectoryAuthorities. It’s how I discovered that Tor instances push rather than poll meta-data (hence the different uptimes). The latter would work better in a loadbalanced style deployment.

Do your Snowflake instances not have issues reporting to different DirectoryAuthorities? My Tor instances have issues if I don’t have them all report to the same DirectoryAuthority.

Keep up the excellent work.

Respectfully,

Gary

···



It's nice to see that the Snowflake daemon offers a native configuration option for LimitNOFile. I ran into a similar issue with my initial loadbalanced Tor Relay Nodes that was solved at the O/S level using ulimit. It would be nice if torrc had a similar option.

LimitNOFile is actually not a Snowflake thing, it's a systemd thing. It's the same as `ulimit -n`. See:
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties
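For example, raising the limit via a systemd drop-in looks like this (the service name here is illustrative; the directive is spelled `LimitNOFILE` in systemd):

```
# /etc/systemd/system/snowflake-server.service.d/override.conf
[Service]
LimitNOFILE=65536
```

followed by `systemctl daemon-reload` and a restart of the service.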

From your documentation, it sounds like you're running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you'll have to expand the usable ports as well.

I don't think I understand your point. At 64K simultaneous connections, you run out of source ports to keep the connection 4-tuple unique, but I don't see how using the same or different hosts makes a difference in that respect.
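To put a number on the limit being discussed: with source IP, destination IP, and destination port fixed, only the source port distinguishes connections, so the ephemeral port range caps simultaneous connections between one pair of endpoints (a quick illustration; 32768–60999 is the usual Linux default for `net.ipv4.ip_local_port_range`):

```python
# A TCP connection is identified by the 4-tuple
# (src_ip, src_port, dst_ip, dst_port). With everything but src_port
# fixed, the ephemeral port range bounds simultaneous connections.
ephemeral_lo, ephemeral_hi = 32768, 60999  # Linux default ip_local_port_range
max_conns = ephemeral_hi - ephemeral_lo + 1
print(max_conns)  # 28232 connections per (src_ip, dst_ip, dst_port) triple
```

Widening the range (or adding source addresses) raises the ceiling, which is why the same-host vs. different-host question matters less than the number of distinct address pairs.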

I found your HAProxy configuration in your “Draft installation guide.” It seems you’re using regular TCP streaming mode with the Snowflake instances vs transparent TCP streaming mode, which is a notable difference with the directly loadbalanced Tor Relay configuration.

I admit I did not understand your point about transparent proxying. If it's about retaining the client's source IP address for source IP address pinning, I don't think that helps us. This is a bridge, not a relay, and the source IP address that haproxy sees is several steps removed from the client's actual IP address: haproxy receives connections from a localhost web server (the server pluggable transport that receives WebSocket connections); the web server receives connections from Snowflake proxies (which can and do have different IP addresses during the lifetime of a client session); and only the Snowflake proxies themselves receive direct traffic from the client's own source IP address. The client's IP address is tunnelled all the way through to tor for metrics purposes, but that uses the ExtORPort protocol, which the load balancer isn't going to understand. I think that transparent proxying would only transparently proxy the localhost IP addresses from the web server, which I don't think would have any benefit.

What's written in the draft installation guide is not the whole file. There's additionally the default settings as follows:

global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

You might test using a timeout value of 0s (to disable the timeout at the loadbalancer) and allow the Snowflake instances to perform state checking to ensure HAProxy isn’t throttling your bridge.

Thanks for that hint. So far, 10-minute timeouts seem not to be causing a problem. I don't know this software too well, but I think it's an idle timeout, not an absolute limit on connection lifetime.

Currently, as I only use IPv4, I can't offer much insight as to the lack of IPv6 connections being reported (that's what my logs report, too).

On further reflection, I don't think there's a problem here. The instances' bridge-stats and end-stats show a mix of countries and v4/v6.

Regarding metrics.torproject.org... I expect you'll see that written-bytes and read-bytes only reflect that of a single Snowflake instance. However, your consensus weight will reflect the aggregate of all Snowflake instances.

Indeed, the first few data points after the switchover show an apparent decrease in read/written bytes per second, even though the on-bridge bandwidth monitors show much more bandwidth being used than before. I suppose it could be selecting from any of 5 instances that currently share the same identity fingerprint: the 4 new load-balanced instances on the "staging" bridge, plus the 1 instance which is still running concurrently on the "production" bridge. When we finish the upgrade and get all the instances back on the production bridge, if the metrics are wrong, they will at least be uniformly wrong.
https://metrics.torproject.org/rs.html#details/5481936581E23D2D178105D44DB6915AB06BFB7F

Any reason why you chose HAProxy over Nginx?

Shelikhoo drafted a configuration using Nginx, which for the time being you can see here:

https://pad.riseup.net/p/pvKoxaIcejfiIbvVAV7j#L416

I don't have a strong preference and I don't have a lot of experience with either one. haproxy seemed to offer fewer opportunities for error, because the default Nginx installation expects to run a web server, which I would have to disable and ensure it did not fight with snowflake-server for port 443. It just seemed simpler to have one configuration file to edit and restart the daemon.
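For comparison, the equivalent TCP load balancing in Nginx lives in a top-level `stream` block rather than the usual `http` context (a rough sketch, not Shelikhoo's actual draft; the addresses and ports are made up):

```
stream {
    upstream tor_instances {
        server 127.0.0.1:10001;
        server 127.0.0.1:10002;
    }
    server {
        listen 127.0.0.1:10000;
        proxy_pass tor_instances;
    }
}
```

The `stream` module must be compiled in (it is in Debian's nginx packages), and the block cannot be placed inside the default `http`-oriented site configuration, which is part of what makes the haproxy single-file setup feel simpler.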

I did notice that you’re using the AssumeReachable 1 directive in your torrc files. Are you running into an issue where your Tor instances are failing the reachability test?

It's because this bridge does not expose its ORPort, which is the recommended configuration for default bridges. The torrc has `ORPort 127.0.0.1:auto`, so the bridges will never be reachable over their ORPort, which is intentional. Bridges that want to be distributed by BridgeDB need to expose their ORPort, which is an unfortunate technical limitation that makes the bridges more detectable (Obfsbridges should be able to "disable" their ORPort (#7349) · Issues · The Tor Project / Core / Tor · GitLab), but for default bridges it's not necessary. To be honest, I'm not sure that `AssumeReachable` is even required anymore for this kind of configuration; it's just something I remember having to do years ago for some reason. It may be superfluous now that we have `BridgeDistribution none`.
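The relevant torrc lines for this kind of unpublished default bridge, paraphrasing the configuration described above:

```
# Bind the ORPort to localhost only: reachable by the local pluggable
# transport, never directly from the Internet.
ORPort 127.0.0.1:auto
# Skip the self-reachability test, which would otherwise always fail.
AssumeReachable 1
# Do not hand this bridge to BridgeDB for distribution.
BridgeDistribution none
```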

Do your Snowflake instances not have issues reporting to different DirectoryAuthorities?

Other than the possible metrics anomalies, I don't know what kind of issue you mean. It could be that, being a bridge, it has fewer constraints than your relays. A bridge doesn't have to be listed in the consensus, for example.

···

On Tue, Jan 25, 2022 at 11:21:10PM +0000, Gary C. New via tor-relays wrote:


David,

Snowflake sessions are now using the staging bridge, except for those that started before the change happened and haven’t finished yet, and perhaps some proxies that still have the IP address of the production bridge in their DNS cache. I am not sure yet what will happen with metrics, but we’ll see after a few days.

With regard to loadbalanced Snowflake sessions, I’m curious to know what connections (i.e., inbound, outbound, directory, control, etc) are being displayed within nyx?

Much Appreciated.

Gary

···

