[tor-relays] How to reduce tor CPU load on a single bridge?

David,

Excellent documentation of your loadbalanced Snowflake endeavors!

> The DNS record for the Snowflake bridge was switched to a temporary staging server, running the load balancing setup, at 2022-01-25 17:41:00. We were debugging some initial problems until 2022-01-25 18:47:00. You can read about it here:
>
> https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2772325

It’s nice to see that the Snowflake daemon offers a native configuration option for LimitNOFile. I ran into a similar issue with my initial loadbalanced Tor Relay Nodes that was solved at the O/S level using ulimit. It would be nice if torrc had a similar option.

From your documentation, it sounds like you’re running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you’ll have to expand the usable ports as well.

I’d like to see more of your HAProxy configuration. Do you not have to use transparent proxy mode with Snowflake instances as you do with Tor Relay instances? I hadn’t realized HAProxy had a client timeout. Thank you for that tidbit. And thank you for referencing my comments as well.

> Snowflake sessions are now using the staging bridge, except for those that started before the change happened and haven’t finished yet, and perhaps some proxies that still have the IP address of the production bridge in their DNS cache. I am not sure yet what will happen with metrics, but we’ll see after a few days.

Currently, as I only use IPv4, I can’t offer much insight as to the lack of IPv6 connections being reported (that’s what my logs report, too). Your Heartbeat messages are looking good with a symmetric balance of connections and data. They look very similar to my Heartbeat logs, except that yours clearly have more computing power behind them, which is great to see extrapolated! I’ve found that the Heartbeat logs are key to knowing the health of your loadbalanced Tor implementation. You might consider setting up syslog with a Snowflake filter to aggregate your Snowflake logs for easier readability.

Regarding metrics.torproject.org… I expect you’ll see that written-bytes and read-bytes only reflect that of a single Snowflake instance. However, your consensus weight will reflect the aggregate of all Snowflake instances.

> On the matter of onion key rotation, I had the idea of making the onion key files read-only. Roger did some source code investigation and said that it might work to prevent onion key rotation, with some minor side effects. I plan to give the idea a try on a different bridge. The possible side effects are that tor will continue trying and failing to rotate the onion key every hour, and “force a router descriptor rebuild, so it will try to publish a new descriptor each hour.”

I’m interested to hear how the prospective read-only file fix plays out. However, from my observations, I would assume that connections will eventually start bleeding off from any instances that fail to update the key. We really need a long-term solution to this issue for this style of deployment.

Keep up the Great Work!

Respectfully,

Gary



This Message Originated by the Sun.
iBigBlue 63W Solar Array (~12 Hour Charge)

  • 2 x Charmast 26800mAh Power Banks
    = iPhone XS Max 512GB (~2 Weeks Charged)

David,

> I’d like to see more of your HAProxy configuration. Do you not have to use transparent proxy mode with Snowflake instances as you do with Tor Relay instances? I hadn’t realized HAProxy had a client timeout. Thank you for that tidbit. And thank you for referencing my comments as well.

I found your HAProxy configuration in your “Draft installation guide.” It seems you’re using regular TCP streaming mode with the Snowflake instances vs transparent TCP streaming mode, which is a notable difference with the directly loadbalanced Tor Relay configuration. I also noticed you’ve configured the backend node timeout globally vs per node, which is just a nuance. You might test using a timeout value of 0s (to disable the timeout at the loadbalancer) and allow the Snowflake instances to perform state checking to ensure HAProxy isn’t throttling your bridge. I’ve tested both and I’m still not sure which timeout configuration makes the most sense for this style of implementation. Currently, I’m running with the 0s (disabled) timeout.

Any reason why you chose HAProxy over Nginx?

I did notice that you’re using the AssumeReachable 1 directive in your torrc files. Are you running into an issue where your Tor instances are failing the reachability test? Initially, I ran into a reachability issue and after digging through mountains of Tor debug logs discovered I needed to use transparent TCP streaming mode along with the Linux kernel and iptables changes to route the Tor traffic back from the Tor Relay Nodes to the loadbalancer. You shouldn’t need to run your Tor instances with the AssumeReachable 1 directive. This might suggest something in your configuration isn’t quite right.

One of my initial tests was staggering the startup of my instances to see how they randomly reported to the DirectoryAuthorities. It’s how I discovered that Tor instances push their metadata rather than having it polled (the reported uptimes differed). The latter would work better in a loadbalanced style deployment.

Do your Snowflake instances not have issues reporting to different DirectoryAuthorities? My Tor instances have issues if I don’t have them all report to the same DirectoryAuthority.

Keep up the excellent work.

Respectfully,

Gary


> It's nice to see that the Snowflake daemon offers a native configuration option for LimitNOFile. I ran into a similar issue with my initial loadbalanced Tor Relay Nodes that was solved at the O/S level using ulimit. It would be nice if torrc had a similar option.

LimitNOFile is actually not a Snowflake thing, it's a systemd thing. It's the same as `ulimit -n`. See:
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties
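
One way to confirm that a limit set through systemd has actually reached a process is to read the kernel's per-process limits listing. What follows is only a minimal sketch, assuming the Linux `/proc/<pid>/limits` layout; the helper name `max_open_files` is made up for illustration and is not part of tor, Snowflake, or systemd.

```shell
# Print the soft "Max open files" limit found in a /proc/<pid>/limits
# style listing read from standard input.
# max_open_files is a hypothetical helper name.
max_open_files() {
    # Columns: "Max" "open" "files" <soft limit> <hard limit> <units>
    awk '/^Max open files/ { print $4 }'
}

# Example: the limit of the current shell, which should agree with `ulimit -n`.
if [ -r /proc/self/limits ]; then
    max_open_files < /proc/self/limits
fi
```

To inspect a running daemon instead, the same helper can be fed `/proc/PID/limits` for that daemon's process ID.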

> From your documentation, it sounds like you're running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you'll have to expand the usable ports as well.

I don't think I understand your point. At 64K simultaneous connections, you run out of source ports for making connection 4-tuple unique, but I don't see how the same or different hosts makes a difference, in that respect.

> I found your HAProxy configuration in your “Draft installation guide.” It seems you’re using regular TCP streaming mode with the Snowflake instances vs transparent TCP streaming mode, which is a notable difference with the directly loadbalanced Tor Relay configuration.

I admit I did not understand your point about transparent proxying. If it's about retaining the client's source IP address for source IP address pinning, I don't think that helps us. This is a bridge, not a relay, and the source IP address that haproxy sees is several steps removed from the client's actual IP address. haproxy receives connections from a localhost web server (the server pluggable transport that receives WebSocket connections); the web server receives connections from Snowflake proxies (which can and do have different IP addresses during the lifetime of a client session); only the Snowflake proxies themselves receive direct traffic from the client's own source IP address. The client's IP address is tunnelled all the way through to tor, for metrics purposes, but that uses the ExtORPort protocol and the load balancer isn't going to understand that. I think that transparent proxying would only transparently proxy the localhost IP addresses from the web server, which doesn't have any benefit, I don't think.

What's written in the draft installation guide is not the whole file. There's additionally the default settings as follows:

global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

> You might test using a timeout value of 0s (to disable the timeout at the loadbalancer) and allow the Snowflake instances to perform state checking to ensure HAProxy isn’t throttling your bridge.

Thanks for that hint. So far, 10-minute timeouts seem not to be causing a problem. I don't know this software too well, but I think it's an idle timeout, not an absolute limit on connection lifetime.

> Currently, as I only use IPv4, I can't offer much insight as to the lack of IPv6 connections being reported (that's what my logs report, too).

On further reflection, I don't think there's a problem here. The instances' bridge-stats and end-stats show a mix of countries and v4/v6.

> Regarding metrics.torproject.org... I expect you'll see that written-bytes and read-bytes only reflect that of a single Snowflake instance. However, your consensus weight will reflect the aggregate of all Snowflake instances.

Indeed, the first few data points after the switchover show an apparent decrease in read/written bytes per second, even though the on-bridge bandwidth monitors show much more bandwidth being used than before. I suppose it could be selecting from any of 5 instances that currently share the same identity fingerprint: the 4 new load-balanced instances on the "staging" bridge, plus the 1 instance which is still running concurrently on the "production" bridge. When we finish the upgrade and get all the instances back on the production bridge, if the metrics are wrong, they will at least be uniformly wrong.
https://metrics.torproject.org/rs.html#details/5481936581E23D2D178105D44DB6915AB06BFB7F

> Any reason why you chose HAProxy over Nginx?

Shelikhoo drafted a configuration using Nginx, which for the time being you can see here:

https://pad.riseup.net/p/pvKoxaIcejfiIbvVAV7j#L416

I don't have a strong preference and I don't have a lot of experience with either one. haproxy seemed to offer fewer opportunities for error, because the default Nginx installation expects to run a web server, which I would have to disable and ensure it did not fight with snowflake-server for port 443. It just seemed simpler to have one configuration file to edit and restart the daemon.

> I did notice that you’re using the AssumeReachable 1 directive in your torrc files. Are you running into an issue where your Tor instances are failing the reachability test?

It's because this bridge does not expose its ORPort, which is the recommended configuration for default bridges. The torrc has `ORPort 127.0.0.1:auto`, so the bridges will never be reachable over their ORPort, which is intentional. Bridges that want to be distributed by BridgeDB need to expose their ORPort, which is an unfortunate technical limitation that makes the bridges more detectable (see tor issue #7349, "Obfsbridges should be able to 'disable' their ORPort"), but for default bridges it's not necessary. To be honest, I'm not sure that `AssumeReachable` is even required anymore for this kind of configuration; it's just something I remember having to do years ago for some reason. It may be superfluous now that we have `BridgeDistribution none`.

> Do your Snowflake instances not have issues reporting to different DirectoryAuthorities?

Other than the possible metrics anomalies, I don't know what kind of issue you mean. It could be that, being a bridge, it has fewer constraints than your relays. A bridge doesn't have to be listed in the consensus, for example.


On Tue, Jan 25, 2022 at 11:21:10PM +0000, Gary C. New via tor-relays wrote:
_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


David,

> Snowflake sessions are now using the staging bridge, except for those that started before the change happened and haven’t finished yet, and perhaps some proxies that still have the IP address of the production bridge in their DNS cache. I am not sure yet what will happen with metrics, but we’ll see after a few days.

With regard to loadbalanced Snowflake sessions, I’m curious to know what connections (i.e., inbound, outbound, directory, control, etc) are being displayed within nyx?

Much Appreciated.

Gary


David,

I’ve been following your progress in the “Add load balancing to bridge (#40095)” issue.

> The apparent decrease has to be spurious, since even at the beginning the bridge was moving more than 10 MB/s in both directions. A couple of hypotheses about what might be happening:
>
>   • Onionoo is only showing us one instance out of the four. The actual numbers are four times higher.

Per my previous response, my findings are consistent with yours in that Onionoo only shows metrics for a single instance, except for consensus weight.

> Here are the most recent heartbeat logs. It looks like the load is fairly balanced, with each of the four tor instances having sent between 400 and 500 GB since being started.

Your Heartbeat logs continue to appear to be in good health. When keys are rotated, the Heartbeat logs will be a key indicator of health, showing whether connections are bleeding off from, or remaining with, a particular instance.

> I worried a bit about the “0 with IPv6” in a previous comment. Looking at the bridge-stats files, I don’t think there’s a problem.

I’m glad to hear you feel the IPv6 reporting appears to be a false-negative. Does this mean there’s something wrong with IPv6 Heartbeat reporting?

> Despite the load balancing, the 8 CPUs are pretty close to maxed. I would not mind having 16 cores right now. We may be in an induced demand situation where we make the bridge faster → the bridge gets more users → bridge gets slower.

I believe your observation is correct with regard to an induced traffic situation. As CPU resources increase, increased traffic will likely follow with some lag, until demand is satisfied or you run out of CPU resources again. Are your existing 8 CPUs only single cores? Is it too difficult to upgrade with your VPS provider? The O/S should detect the virtual hardware changes and add them accordingly. My current resource constraint is RAM, but I’m using bare-metal machines.

Great Progress!

Gary


> With regard to loadbalanced Snowflake sessions, I'm curious to know what connections (i.e., inbound, outbound, directory, control, etc) are being displayed within nyx?

I'm not using nyx. I'm just looking at the bandwidth on the network interface.

> Your Heartbeat logs continue to appear to be in good health. When keys are rotated,

We're trying to avoid rotating keys at all. If the read-only files do not work, we'll instead probably periodically rewrite the state file to push the rotation into the future.
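
The fallback of rewriting the state file could look roughly like the sketch below. It is not a tested procedure: the helper name `set_last_rotated` is hypothetical, the timestamp is assumed to be UTC in the `YYYY-MM-DD HH:MM:SS` form that appears in the state file, and it assumes the instance is stopped while the file is edited, since a running tor may write the state file back out. The `sed` expression is the same one used in the rot1/rot2 test later in this thread.

```shell
# Rewrite the LastRotatedOnionKey line in a tor state file.
# set_last_rotated is a hypothetical helper name.
# Usage: set_last_rotated STATE_FILE "YYYY-MM-DD HH:MM:SS"
set_last_rotated() {
    state_file=$1
    timestamp=$2
    # GNU sed in-place edit; every other line of the file is left untouched.
    sed -i -e "s/^LastRotatedOnionKey .*/LastRotatedOnionKey $timestamp/" "$state_file"
}

# Example, run as root (rot1 is the test instance name used in this thread):
#   service tor@rot1 stop
#   set_last_rotated /var/lib/tor-instances/rot1/state "$(date -u '+%Y-%m-%d %H:%M:%S')"
#   service tor@rot1 start
```

Run from a periodic job at an interval shorter than the rotation period, this would keep the recorded rotation time recent, at the cost of a restart each time.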

> > I worried a bit about the "0 with IPv6" in a previous comment. Looking at the bridge-stats files, I don't think there's a problem.
>
> I'm glad to hear you feel the IPv6 reporting appears to be a false-negative. Does this mean there's something wrong with IPv6 Heartbeat reporting?

I don't know if it's wrong, exactly. It's reporting something different than what ExtORPort is providing. The proximate connections to tor are indeed all IPv4.

> Are your existing 8 CPUs only single cores? Is it too difficult to upgrade with your VPS provider?

Sure, there are plenty of ways to increase resources of the bridge, but I feel that's a different topic.

Thanks for your comments.


> On the matter of onion key rotation, I had the idea of making the onion key files read-only. Roger did some source code investigation and said that it might work to prevent onion key rotation, with some minor side effects. I plan to give the idea a try on a different bridge. The possible side effects are that tor will continue trying and failing to rotate the onion key every hour, and "force a router descriptor rebuild, so it will try to publish a new descriptor each hour."

Making secret_onion_key and secret_onion_key_ntor read-only does not quite work, because tor first renames them to secret_onion_key.old and secret_onion_key_ntor.old before writing new files. (Making the *.old files read-only does not work either, because the `tor_rename` function first unlinks the destination.)
https://gitweb.torproject.org/tor.git/tree/src/feature/relay/router.c?h=tor-0.4.6.9#n497

But a slight variation does work: make secret_onion_key.old and secret_onion_key_ntor.old *directories*, so that tor_rename cannot rename a file over them. It does result in an hourly `BUG` stack trace, but otherwise it seems effective.
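
Since the workaround depends on both `*.old` paths being directories, a small check before starting an instance can catch a missing or half-applied setup. This is a sketch only; the helper name `rotation_guard_present` is invented for illustration, and the check says nothing about ownership or permissions.

```shell
# Return success only if both *.old paths in a tor keys directory exist
# and are directories, i.e. the rename-blocking workaround is in place.
# rotation_guard_present is a hypothetical helper name.
rotation_guard_present() {
    keys_dir=$1
    [ -d "$keys_dir/secret_onion_key.old" ] &&
        [ -d "$keys_dir/secret_onion_key_ntor.old" ]
}

# Example:
#   rotation_guard_present /var/lib/tor-instances/rot1/keys || echo "guard missing"
```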

I did a test with two tor instances. The rot1 instance had the directory hack to prevent onion key rotation. The rot2 had nothing to prevent onion key rotation.

# tor-instance-create rot1
# tor-instance-create rot2

/etc/tor/instances/rot1/torrc:

Log info file /var/lib/tor-instances/rot1/onionrotate.info.log
BridgeRelay 1
AssumeReachable 1
BridgeDistribution none
ORPort 127.0.0.1:auto
ExtORPort auto
SocksPort 0
Nickname onionrotate1

/etc/tor/instances/rot2/torrc:

Log info file /var/lib/tor-instances/rot2/onionrotate.info.log
BridgeRelay 1
AssumeReachable 1
BridgeDistribution none
ORPort 127.0.0.1:auto
ExtORPort auto
SocksPort 0
Nickname onionrotate2

Start rot1, copy its keys to rot2, then start rot2:

# service tor@rot1 start
# cp -r /var/lib/tor-instances/rot1/keys /var/lib/tor-instances/rot2/
# chown -R _tor-rot2:_tor-rot2 /var/lib/tor-instances/rot2/keys
# service tor@rot2 start

Stop the two instances, check that the onion keys are the same, and that `LastRotatedOnionKey` is set in both state files:

# service tor@rot1 stop
# service tor@rot2 stop
# ls -l /var/lib/tor-instances/rot*/keys/secret_onion_key*
-rw------- 1 _tor-rot1 _tor-rot1 888 Jan 28 22:57 /var/lib/tor-instances/rot1/keys/secret_onion_key
-rw------- 1 _tor-rot1 _tor-rot1  96 Jan 28 22:57 /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
-rw------- 1 _tor-rot2 _tor-rot2 888 Jan 28 23:05 /var/lib/tor-instances/rot2/keys/secret_onion_key
-rw------- 1 _tor-rot2 _tor-rot2  96 Jan 28 23:05 /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor
# md5sum /var/lib/tor-instances/rot*/keys/secret_onion_key*
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot1/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot2/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor
# grep LastRotatedOnionKey /var/lib/tor-instances/rot*/state
/var/lib/tor-instances/rot1/state:LastRotatedOnionKey 2022-01-28 22:57:14
/var/lib/tor-instances/rot2/state:LastRotatedOnionKey 2022-01-28 23:11:04
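
The checksum comparison above can also be scripted for use across several instances. A minimal sketch, using a hypothetical helper named `same_onion_keys` and comparing file contents with `cmp` rather than eyeballing `md5sum` output:

```shell
# Return success if two tor keys directories hold byte-identical onion keys.
# same_onion_keys is a hypothetical helper name. cmp -s compares contents
# silently and exits non-zero on any difference or missing file.
same_onion_keys() {
    keys_a=$1
    keys_b=$2
    for name in secret_onion_key secret_onion_key_ntor; do
        cmp -s "$keys_a/$name" "$keys_b/$name" || return 1
    done
    return 0
}

# Example (as root):
#   same_onion_keys /var/lib/tor-instances/rot1/keys /var/lib/tor-instances/rot2/keys \
#       && echo "keys match" || echo "keys differ"
```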

Set `LastRotatedOnionKey` 6 weeks into the past to force an attempt to rotate the keys the next time tor is restarted:

# sed -i -e 's/^LastRotatedOnionKey .*/LastRotatedOnionKey 2021-12-15 00:00:00/' /var/lib/tor-instances/rot*/state
# grep LastRotatedOnionKey /var/lib/tor-instances/rot*/state
/var/lib/tor-instances/rot1/state:LastRotatedOnionKey 2021-12-15 00:00:00
/var/lib/tor-instances/rot2/state:LastRotatedOnionKey 2021-12-15 00:00:00

Create the secret_onion_key.old and secret_onion_key_ntor.old directories in the rot1 instance.

# mkdir -m 700 /var/lib/tor-instances/rot1/keys/secret_onion_key{,_ntor}.old

Check the identity of keys before starting:

# md5sum /var/lib/tor-instances/rot*/keys/secret_onion_key*
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot1/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor.old: Is a directory
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key.old: Is a directory
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot2/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor

Start both instances:

# service tor@rot1 start
# service tor@rot2 start

Verify that the rot1 instance is still using the same onion keys, while rot2 has rotated them:

# ls -ld /var/lib/tor-instances/rot*/keys/secret_onion_key*
-rw------- 1 _tor-rot1 _tor-rot1  888 Jan 28 23:45 /var/lib/tor-instances/rot1/keys/secret_onion_key
-rw------- 1 _tor-rot1 _tor-rot1   96 Jan 28 23:45 /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
drwx--S--- 2 root      _tor-rot1 4096 Jan 28 23:44 /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor.old
drwx--S--- 2 root      _tor-rot1 4096 Jan 28 23:44 /var/lib/tor-instances/rot1/keys/secret_onion_key.old
-rw------- 1 _tor-rot2 _tor-rot2  888 Jan 28 23:47 /var/lib/tor-instances/rot2/keys/secret_onion_key
-rw------- 1 _tor-rot2 _tor-rot2   96 Jan 28 23:47 /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor
-rw------- 1 _tor-rot2 _tor-rot2   96 Jan 28 23:05 /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor.old
-rw------- 1 _tor-rot2 _tor-rot2  888 Jan 28 23:05 /var/lib/tor-instances/rot2/keys/secret_onion_key.old
# md5sum /var/lib/tor-instances/rot*/keys/secret_onion_key*
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot1/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor.old: Is a directory
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key.old: Is a directory
fb8a5e8787141dba4e935267f818cc2a  /var/lib/tor-instances/rot2/keys/secret_onion_key
2c3f7d81e96641e2c04fb9c452296337  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor.old
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot2/keys/secret_onion_key.old

The rot1 instance's `LastRotatedOnionKey` remains the same, while rot2's is updated:

# grep LastRotatedOnionKey /var/lib/tor-instances/rot*/state
/var/lib/tor-instances/rot1/state:LastRotatedOnionKey 2021-12-15 00:00:00
/var/lib/tor-instances/rot2/state:LastRotatedOnionKey 2022-01-28 23:47:02

The rot1 instance's log shows the failure to rotate the keys:

/var/lib/tor-instances/rot1/onionrotate.info.log

Jan 28 23:46:59.000 [info] rotate_onion_key_callback(): Rotating onion key.
Jan 28 23:46:59.000 [warn] Couldn't rotate onion key.
Jan 28 23:46:59.000 [info] router_rebuild_descriptor(): Rebuilding relay descriptor (forced)
...
Jan 28 23:46:59.000 [info] check_onion_keys_expiry_time_callback(): Expiring old onion keys.

While the rot2 rotation was successful:

/var/lib/tor-instances/rot2/onionrotate.info.log

Jan 28 23:47:02.000 [info] rotate_onion_key_callback(): Rotating onion key.
Jan 28 23:47:02.000 [info] rotate_onion_key(): Rotating onion key
Jan 28 23:47:02.000 [info] mark_my_descriptor_dirty(): Decided to publish new relay descriptor: rotated onion key

After 1 hour, the rot1 instance tries to rebuild its relay descriptor, and triggers a `BUG` non-fatal assertion failure in `router_rebuild_descriptor` (src/feature/relay/router.c). I let it run for 1 more hour after that, and it happened again.

/var/lib/tor-instances/rot1/onionrotate.info.log

Jan 29 00:46:59.000 [info] router_rebuild_descriptor(): Rebuilding relay descriptor (forced)
Jan 29 00:46:59.000 [warn] The IPv4 ORPort address 127.0.0.1 does not match the descriptor address 172.105.3.197. If you have a static public IPv4 address, use 'Address <IPv4>' and 'OutboundBindAddress <IPv4>'. If you are behind a NAT, use two ORPort lines: 'ORPort <PublicPort> NoListen' and 'ORPort <InternalPort> NoAdvertise'.
Jan 29 00:46:59.000 [info] extrainfo_dump_to_string_stats_helper(): Adding stats to extra-info descriptor.
Jan 29 00:46:59.000 [info] read_file_to_str(): Could not open "/var/lib/tor-instances/rot1/stats/bridge-stats": No such file or directory
Jan 29 00:46:59.000 [warn] tor_bug_occurred_(): Bug: ../src/feature/relay/router.c:2452: router_rebuild_descriptor: Non-fatal assertion !(desc_gen_reason == NULL) failed. (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug: Tor 0.4.5.10: Non-fatal assertion !(desc_gen_reason == NULL) failed in router_rebuild_descriptor at ../src/feature/relay/router.c:2452. Stack trace: (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(log_backtrace_impl+0x57) [0x5638b9538047] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0x16b) [0x5638b954327b] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(router_rebuild_descriptor+0x13d) [0x5638b94f4e1d] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(+0x21f163) [0x5638b9665163] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(+0x83577) [0x5638b94c9577] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /lib/x86_64-linux-gnu/libevent-2.1.so.7(+0x239ef) [0x7f701bae49ef] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /lib/x86_64-linux-gnu/libevent-2.1.so.7(event_base_loop+0x52f) [0x7f701bae528f] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x101) [0x5638b94b1321] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x1d5) [0x5638b94acdd5] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(tor_main+0x49) [0x5638b94a92e9] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(main+0x19) [0x5638b94a8ec9] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f701b391d0a] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [warn] Bug:     /usr/bin/tor(_start+0x2a) [0x5638b94a8f1a] (on Tor 0.4.5.10 )
Jan 29 00:46:59.000 [info] router_upload_dir_desc_to_dirservers(): Uploading relay descriptor to directory authorities
Jan 29 00:46:59.000 [info] directory_post_to_dirservers(): Uploading an extrainfo too (length 822)
Jan 29 00:46:59.000 [info] rep_hist_note_used_internal(): New port prediction added. Will continue predictive circ building for 3332 more seconds.
Jan 29 00:46:59.000 [info] connection_ap_make_link(): Making internal anonymized tunnel to [scrubbed]:9001 ...
Jan 29 00:46:59.000 [info] connection_ap_make_link(): ... application connection created and linked.
Jan 29 00:46:59.000 [info] check_onion_keys_expiry_time_callback(): Expiring old onion keys.
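
To keep an eye on how often this assertion fires, the log can be counted with `grep`. A sketch, assuming the log format shown above, where each occurrence produces exactly one `tor_bug_occurred_(): Bug:` line followed by the stack trace lines; `count_bug_reports` is a made-up helper name.

```shell
# Count non-fatal assertion reports in a tor log read from standard input.
# Only the tor_bug_occurred_ line is matched, so the "Stack trace" line and
# the individual frames of the same report are not counted again.
# count_bug_reports is a hypothetical helper name.
count_bug_reports() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c 'tor_bug_occurred_(): Bug:' || true
}

# Example:
#   count_bug_reports < /var/lib/tor-instances/rot1/onionrotate.info.log
```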

Stopping and restarting the rot1 instance keeps the same onion keys, and the first rotation attempt does not hit the assertion failure:

# service tor@rot1 stop
# service tor@rot1 start
# md5sum /var/lib/tor-instances/rot*/keys/secret_onion_key*
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot1/keys/secret_onion_key
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key_ntor.old: Is a directory
md5sum: /var/lib/tor-instances/rot1/keys/secret_onion_key.old: Is a directory
fb8a5e8787141dba4e935267f818cc2a  /var/lib/tor-instances/rot2/keys/secret_onion_key
2c3f7d81e96641e2c04fb9c452296337  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor
2066ab7e01595adf42fc791ad36e1fc5  /var/lib/tor-instances/rot2/keys/secret_onion_key_ntor.old
fb2a8a8f9de56f061eccbb3fedd700c4  /var/lib/tor-instances/rot2/keys/secret_onion_key.old

/var/lib/tor-instances/rot1/onionrotate.info.log

Jan 29 02:06:13.000 [info] rotate_onion_key_callback(): Rotating onion key.
Jan 29 02:06:13.000 [warn] Couldn't rotate onion key.
Jan 29 02:06:13.000 [info] router_rebuild_descriptor(): Rebuilding relay descriptor (forced)
...
Jan 29 02:06:13.000 [info] check_onion_keys_expiry_time_callback(): Expiring old onion keys.

David,

> > It’s nice to see that the Snowflake daemon offers a native configuration option for LimitNOFile. I ran into a similar issue with my initial loadbalanced Tor Relay Nodes that was solved at the O/S level using ulimit. It would be nice if torrc had a similar option.
>
> LimitNOFile is actually not a Snowflake thing, it’s a systemd thing. It’s the same as ulimit -n. See:
>
> https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties

Ah… My mistake. In my cursory review of your “Draft installation guide” I only saw “snowflake-server.” and assumed the extension was .conf, when in actuality it is .service. I should have noticed the /etc/systemd path. Thank you for the correction.

> > From your documentation, it sounds like you’re running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you’ll have to expand the usable ports as well.
>
> I don’t think I understand your point. At 64K simultaneous connections, you run out of source ports for making connection 4-tuple unique, but I don’t see how the same or different hosts makes a difference, in that respect.

On many Linux distros, the default ip_local_port_range is 32768 to 61000.

cat /proc/sys/net/ipv4/ip_local_port_range

32768 61000

The Tor Project recommends increasing it.

echo 15000 64000 > /proc/sys/net/ipv4/ip_local_port_range
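
For a sense of scale, the size of each range can be computed directly. The sketch below assumes the range is inclusive at both ends; the helper name `port_range_size` is made up for illustration. The result is the number of local source ports available, which bounds simultaneous outgoing connections from one source address to a single destination address and port.

```shell
# Print how many local ports an inclusive LOW..HIGH range provides.
# port_range_size is a hypothetical helper name.
port_range_size() {
    low=$1
    high=$2
    echo $(( high - low + 1 ))
}

port_range_size 32768 61000   # default range quoted above
port_range_size 15000 64000   # widened range

# Live value on a Linux host (unquoted on purpose, so the two numbers split):
#   port_range_size $(cat /proc/sys/net/ipv4/ip_local_port_range)
```

With these numbers the default range gives 28233 ports and the widened range 49001.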

> > I found your HAProxy configuration in your “Draft installation guide.” It seems you’re using regular TCP streaming mode with the Snowflake instances vs transparent TCP streaming mode, which is a notable difference with the directly loadbalanced Tor Relay configuration.
>
> I admit I did not understand your point about transparent proxying. If it’s about retaining the client’s source IP address for source IP address pinning, I don’t think that helps us.

In Transparent TCP Streaming mode, the Loadbalancer clones the IP address of the connecting Tor Client/Relay for use on the internal interface with connections to the upstream Tor Relay Nodes, so the Upstream Tor Relay Nodes believe they’re talking to the actual connecting Tor Client/Relay.

> This is a bridge, not a relay, and the source IP address that haproxy sees is several steps removed from the client’s actual IP address. haproxy receives connections from a localhost web server (the server pluggable transport that receives WebSocket connections); the web server receives connections from Snowflake proxies (which can and do have different IP addresses during the lifetime of a client session); only the Snowflake proxies themselves receive direct traffic from the client’s own source IP address.

You are correct. This makes more sense why HAProxy’s Regular TCP Streaming Mode works in this paradigm. I believe what was confusing was the naming convention of your Tor instances (i.e., snowflake#), which led me to believe that your Snowflake proxy instances were upstream and not downstream. However, correlating the IP address assignments between configurations confirms HAProxy is loadbalancing upstream to your Tor Nodes.

The client’s IP address is tunnelled all the way through to tor, for metrics purposes, but that uses the ExtORPort protocol and the load balancer isn’t going to understand that.

As long as HAProxy is configured to use TCP Streaming Mode, it doesn’t matter what protocol is used as it will be passed through encapsulated in TCP. That’s the beauty of TCP Streaming Mode.

I think that transparent proxying would only transparently proxy the localhost IP addresses from the web server, which doesn’t have any benefit, I don’t think.

Agreed.

You might test using a timeout value of 0s (to disable the timeout at the loadbalancer) and allow the Snowflake instances to perform state checking to ensure HAProxy isn’t throttling your bridge.

Thanks for that hint. So far, 10-minute timeouts seem not to be causing a problem. I don’t know this software too well, but I think it’s an idle timeout, not an absolute limit on connection lifetime.

It’s HAProxy’s Passive Health Check Timeout. The reason why I disabled (0s) this timeout is I felt that the Tor instances know their state threshold better and if they became overloaded would tell the DirectoryAuthorities. One scenario where a lengthy HAProxy timeout might be of value is if a single instance was having issues and causing a reported overloaded state for the rest. However, this would more likely occur in a multi-physical/virtual-node environment. You’ll have to continue to update me with your thoughts on this subject as you continue your testing.

Any reason why you chose HAProxy over Nginx?

Shelikhoo drafted a configuration using Nginx, which for the time being you can see here:

https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40091#note_2768891

https://pad.riseup.net/p/pvKoxaIcejfiIbvVAV7j#L416

I don’t have a strong preference and I don’t have a lot of experience with either one. haproxy seemed to offer fewer opportunities for error, because the default Nginx installation expects to run a web server, which I would have to disable and ensure it did not fight with snowflake-server for port 443. It just seemed simpler to have one configuration file to edit and restart the daemon.

My Nginx configuration is actually smaller than my HAProxy configuration. All you really need from either Nginx/HAProxy configurations are the Global Default settings (especially the file/connection limits) and your TCP Streaming settings. As stated previously, I would recommend using Nginx simply for the fact that it forks additional child processes as connections/demand increases, which I could never figure out with HAProxy.

I did notice that you’re using the AssumeReachable 1 directive in your torrc files. Are you running into an issue where your Tor instances are failing the reachability test?

It’s because this bridge does not expose its ORPort, which is the recommended configuration for default bridges. The torrc has ORPort 127.0.0.1:auto, so the bridges will never be reachable over their ORPort, which is intentional. Bridges that want to be distributed by BridgeDB need to expose their ORPort, which is an unfortunate technical limitation that makes the bridges more detectable (https://bugs.torproject.org/tpo/core/tor/7349), but for default bridges it’s not necessary. To be honest, I’m not sure that AssumeReachable is even required anymore for this kind of configuration; it’s just something I remember having to do years ago for some reason. It may be superfluous now that we have BridgeDistribution none.

Interesting… This shows my lack of knowledge regarding bridges as I have never run a bridge. Additionally, it highlights the major differences in running a Loadbalanced Tor Bridge vs a Loadbalanced Tor Relay and the necessity of using Transparent TCP Streaming Mode when the ORPort is exposed vs using Regular TCP Streaming Mode when the ORPort is not exposed. My Nginx Loadbalancer sits on the border of my network, listens on ORPort 9001, and uses Transparent TCP Streaming to loadbalance connections upstream to my Tor Relay Nodes.

Do your Snowflake instances not have issues reporting to different DirectoryAuthorities?

Other than the possible metrics anomalies, I don’t know what kind of issue you mean. It could be that, being a bridge, it has fewer constraints than your relays. A bridge doesn’t have to be listed in the consensus, for example.

Yes… It’s issues with consensus that I run into, if I don’t configure my Tor Relay Nodes to send updates to a single DirectoryAuthority. This appears to be another major difference between running a Loadbalanced Tor Bridge vs a Loadbalanced Tor Relay.

With regard to loadbalanced Snowflake sessions, I’m curious to know what connections (i.e., inbound, outbound, directory, control, etc) are being displayed within nyx?

I’m not using nyx. I’m just looking at the bandwidth on the network interface.

If you have time, would you mind installing nyx to validate observed similarities/differences between our loadbalanced configurations?

Your Heartbeat logs continue to appear to be in good health. When keys are rotated,

We’re trying to avoid rotating keys at all. If the read-only files do not work, we’ll instead probably periodically rewrite the state file to push the rotation into the future.

I’m especially interested in this topic. Please keep me updated!

I worried a bit about the “0 with IPv6” in a previous comment. Looking at the bridge-stats files, I don’t think there’s a problem.

I’m glad to hear you feel the IPv6 reporting appears to be a false-negative. Does this mean there’s something wrong with IPv6 Heartbeat reporting?

I don’t know if it’s wrong, exactly. It’s reporting something different than what ExtORPort is providing. The proximate connections to tor are indeed all IPv4.

I see. Perhaps IPv6 connections are less prolific and require more time to ramp?

Are your existing 8 CPUs only single cores? Is it too difficult to upgrade with your VPS provider?

Sure, there are plenty of ways to increase resources of the bridge, but I feel that’s a different topic.

After expanding my reading of your related “issues,” I see that your VPS provider only offers up to 8 cores. Is it possible to spin-up another VPS environment, with the same provider, on a separate VLAN, allowing route/firewall access between the two VPS environments? This way you could test loadbalancing a Tor Bridge over a local network using multiple virtual environments. Perhaps, the Tor Project might even assist you with such a short-term investment (I read the meeting notes). :wink:

Thanks for your comments.

Thank you for your responses.

Respectfully,

Gary

···

On Thursday, January 27, 2022, 1:03:25 AM MST, David Fifield david@bamsoftware.com wrote:

This Message Originated by the Sun.

iBigBlue 63W Solar Array (~12 Hour Charge)

  • 2 x Charmast 26800mAh Power Banks

= iPhone XS Max 512GB (~2 Weeks Charged)

David,

Making secret_onion_key and secret_onion_key_ntor read-only does not quite work, because tor first renames them to secret_onion_key.old and secret_onion_key_ntor.old before writing new files. (Making the *.old files read-only does not work either, because the tor_rename function first unlinks the destination.)
https://gitweb.torproject.org/tor.git/tree/src/feature/relay/router.c?h=tor-0.4.6.9#n497

But a slight variation does work: make secret_onion_key.old and secret_onion_key_ntor.old directories, so that tor_rename cannot rename a file over them. It does result in an hourly BUG stack trace, but otherwise it seems effective.
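Why a directory blocks the rotation while a read-only file does not can be reproduced outside of tor. The Python sketch below only imitates the unlink-then-rename sequence described above in a temporary directory; `try_rotate` is a made-up helper, not tor's code, and the key file is a placeholder.

```python
import os
import tempfile


def try_rotate(datadir: str, name: str) -> bool:
    """Imitate the rename step: unlink '<name>.old', then rename '<name>' over it.

    Returns True if the rename went through, False if the filesystem refused.
    """
    current = os.path.join(datadir, name)
    old = current + ".old"
    try:
        if os.path.lexists(old):
            os.unlink(old)  # raises OSError when 'old' is a directory
        os.rename(current, old)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as datadir:
        key = os.path.join(datadir, "secret_onion_key")
        with open(key, "w") as f:
            f.write("placeholder, not a real key\n")
        os.mkdir(key + ".old")  # the workaround: a directory where the .old file would go
        print("rotation possible:", try_rotate(datadir, "secret_onion_key"))
        print("key still in place:", os.path.exists(key))
```

A read-only .old file is simply unlinked and replaced, because removing a file depends on the permissions of the containing directory rather than of the file itself. On a real bridge the same treatment would apply to both secret_onion_key.old and secret_onion_key_ntor.old.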

Directories instead of read-only files. Nice Out-Of-The-Box Thinking!

Now, the question becomes whether there are any adverse side-effects, with the DirectoryAuthorities, from the secret_onion_keys not being updated over time?

Excellent Work!

Much Respect.

Gary

···



> > From your documentation, it sounds like you're running everything on the same machine? When expanding to additional machines, similar to the file limit issue, you'll have to expand the usable ports as well.

> I don't think I understand your point. At 64K simultaneous connections, you run out of source ports for making connection 4-tuple unique, but I don't see how the same or different hosts makes a difference, in that respect.

On many Linux distros, the default ip_local_port_range is 32768 to 61000.

The Tor Project recommends increasing it.

# echo 15000 64000 > /proc/sys/net/ipv4/ip_local_port_range

Thanks, that's a good tip. I added it to the installation guide.

> I'm not using nyx. I'm just looking at the bandwidth on the network interface.

If you have time, would you mind installing nyx to validate observed similarities/differences between our loadbalanced configurations?

I don't have plans to do that.

> > I'm glad to hear you feel the IPv6 reporting appears to be a false-negative. Does this mean there's something wrong with IPv6 Heartbeat reporting?

> I don't know if it's wrong, exactly. It's reporting something different than what ExtORPort is providing. The proximate connections to tor are indeed all IPv4.

I see. Perhaps IPv6 connections are less prolific and require more time to ramp?

No, it's not that. The bridge has plenty of connections from clients that use an IPv6 address, as the bridge-stats file shows:

bridge-ip-versions v4=15352,v6=1160

It's just that, unlike a direct TCP connection as is the case with a guard relay, the client connections pass through a chain of proxies and processes on the way to tor: client → Snowflake proxy → snowflake-server WebSocket server → extor-static-cookie adapter → tor. The last link in the chain is IPv4, and evidently that is what the heartbeat log reports. The client's actual IP address is tunnelled, for metrics purposes, through this chain of proxies and processes, to tor using a special protocol called ExtORPort (see USERADDR at https://gitweb.torproject.org/torspec.git/tree/proposals/196-transport-control-ports.txt). It looks like the bridge-stats descriptor pays attention to the USERADDR information and the heartbeat log does not, that's all.
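For anyone who wants to check their own bridge-stats file the same way, a minimal Python sketch for reading that line follows. The function name `parse_bridge_ip_versions` is invented here, and the parser assumes the line has exactly the `family=count` comma-separated shape quoted above.

```python
def parse_bridge_ip_versions(line: str) -> dict[str, int]:
    """Parse a 'bridge-ip-versions v4=N,v6=M' line into {'v4': N, 'v6': M}."""
    keyword, _, values = line.strip().partition(" ")
    if keyword != "bridge-ip-versions":
        raise ValueError(f"not a bridge-ip-versions line: {line!r}")
    counts = {}
    for item in values.split(","):
        family, _, count = item.partition("=")
        counts[family] = int(count)
    return counts


if __name__ == "__main__":
    counts = parse_bridge_ip_versions("bridge-ip-versions v4=15352,v6=1160")
    share = counts["v6"] / sum(counts.values())
    print(counts, f"IPv6 share: {share:.1%}")
```

With the figures quoted above, roughly 7% of the counted client addresses are IPv6, even though the heartbeat log shows none.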

After expanding my reading of your related "issues," I see that your VPS provider only offers up to 8 cores. Is it possible to spin-up another VPS environment, with the same provider, on a separate VLAN, allowing route/ firewall access between the two VPS environments? This way you could test loadbalancing a Tor Bridge over a local network using multiple virtual environments.

Yes, there are many other potential ways to further expand the deployment, but I do not have much interest in that topic right now. I started the thread for help with a non-obvious point, namely getting past the bottleneck of a single-core tor process. I think that we have collectively found a satisfactory solution for that. The steps after that for further scaling are relatively straightforward, I think. Running one instance of snowflake-server on one host and all the instances of tor on a nearby host is a logical next step.

···

On Sat, Jan 29, 2022 at 02:54:40AM +0000, Gary C. New via tor-relays wrote:
_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays

I did not follow the thread closely, but if you want a file or directory
contents unchangeable, and not allowed to rename/delete even by root, there's
the "immutable" attribute (chattr +i).

···

On Fri, 28 Jan 2022 19:58:49 -0700 David Fifield <david@bamsoftware.com> wrote:

> On the matter of onion key rotation, I had the idea of making the onion key files read-only. Roger did some source code investigation and said that it might work to prevent onion key rotation, with some minor side effects. I plan to give the idea a try on a different bridge. The possible side effects are that tor will continue trying and failing to rotate the onion key every hour, and "force a router descriptor rebuild, so it will try to publish a new descriptor each hour."

Making secret_onion_key and secret_onion_key_ntor read-only does not quite work, because tor first renames them to secret_onion_key.old and secret_onion_key_ntor.old before writing new files. (Making the *.old files read-only does not work either, because the `tor_rename` function first unlinks the destination.)
https://gitweb.torproject.org/tor.git/tree/src/feature/relay/router.c?h=tor-0.4.6.9#n497

But a slight variation does work: make secret_onion_key.old and secret_onion_key_ntor.old *directories*, so that tor_rename cannot rename a file over them. It does result in an hourly `BUG` stack trace, but otherwise it seems effective.

I did a test with two tor instances. The rot1 instance had the directory hack to prevent onion key rotation. The rot2 had nothing to prevent onion key rotation.

--
With respect,
Roman

I’m not using nyx. I’m just looking at the bandwidth on the network interface.

If you have time, would you mind installing nyx to validate observed similarities/differences between our loadbalanced configurations?

I don’t have plans to do that.

I appreciate you setting expectations.

I’m glad to hear you feel the IPv6 reporting appears to be a false-negative. Does this mean there’s something wrong with IPv6 Heartbeat reporting?

I don’t know if it’s wrong, exactly. It’s reporting something different than what ExtORPort is providing. The proximate connections to tor are indeed all IPv4.

I see. Perhaps IPv6 connections are less prolific and require more time to ramp?

No, it’s not that. The bridge has plenty of connections from clients that use an IPv6 address, as the bridge-stats file shows:

bridge-ip-versions v4=15352,v6=1160

It’s just that, unlike a direct TCP connection as is the case with a guard relay, the client connections pass through a chain of proxies and processes on the way to tor: client → Snowflake proxy → snowflake-server WebSocket server → extor-static-cookie adapter → tor. The last link in the chain is IPv4, and evidently that is what the heartbeat log reports. The client’s actual IP address is tunnelled, for metrics purposes, through this chain of proxies and processes, to tor using a special protocol called ExtORPort (see USERADDR at https://gitweb.torproject.org/torspec.git/tree/proposals/196-transport-control-ports.txt). It looks like the bridge-stats descriptor pays attention to the USERADDR information and the heartbeat log does not, that’s all.

Ah… Gotcha. Thank you for clarifying.

After expanding my reading of your related “issues,” I see that your VPS provider only offers up to 8 cores. Is it possible to spin-up another VPS environment, with the same provider, on a separate VLAN, allowing route/ firewall access between the two VPS environments? This way you could test loadbalancing a Tor Bridge over a local network using multiple virtual environments.

Yes, there are many other potential ways to further expand the deployment, but I do not have much interest in that topic right now. I started the thread for help with a non-obvious point, namely getting past the bottleneck of a single-core tor process. I think that we have collectively found a satisfactory solution for that. The steps after that for further scaling are relatively straightforward, I think. Running one instance of snowflake-server on one host and all the instances of tor on a nearby host is a logical next step.

Understood. I appreciate the work you have done and the opportunity to compare and contrast Loadbalanced Tor Bridges vs Loadbalanced Tor Relays.

Please update the tor-relays mailing-list with any new findings related to subverting onion key rotation.

Excellent Work!

Respectfully,

Gary

···

On Saturday, January 29, 2022, 9:46:59 PM PST, David Fifield david@bamsoftware.com wrote:


I did not follow the thread closely, but if you want a file or directory contents unchangeable, and not allowed to rename/delete even by root, there’s the “immutable” attribute (chattr +i).

I like the immutable attribute approach. It can be applied to the original secret_onion_key and secret_onion_key_ntor files.

Appreciate the input.

Respectfully,

Gary

···

On Sunday, January 30, 2022, 2:26:08 AM PST, Roman Mamedov rm@romanrm.net wrote:

On Fri, 28 Jan 2022 19:58:49 -0700 David Fifield <david@bamsoftware.com> wrote:

But a slight variation does work: make secret_onion_key.old and secret_onion_key_ntor.old directories, so that tor_rename cannot rename a file over them. It does result in an hourly BUG stack trace, but otherwise it seems effective.

I did a test with two tor instances. The rot1 instance had the directory hack to prevent onion key rotation. The rot2 had nothing to prevent onion key rotation.


The load-balanced Snowflake bridge is running in production since
2022-01-31. Thanks Roger, Gary, Roman for your input.

Hopefully reproducible installation instructions:
  Snowflake Bridge Installation Guide · Wiki · The Tor Project / Anti-censorship / Team · GitLab
Observations since:
  Add load balancing to bridge (#40095) · Issues · The Tor Project / Anti-censorship / Pluggable Transports / Snowflake · GitLab

Metrics graphs are currently confused by multiple instances of tor
uploading descriptors under the same fingerprint. Particularly in the
interval between 2022-01-25 and 2022-02-03, when a production bridge and
staging bridge were running in parallel, with four instances being used
and another four being mostly unused.
  Relay Search
  Users – Tor Metrics
Since 2022-02-03, it appears that Metrics is showing only one of the
four running instances per day. Because all four instances are about
equally used (as if load balanced, go figure), the values on the graph
are 1/4 what they should be. The reported bandwidth of 5 MB/s is
actually 20 MB/s, and the 2500 clients are actually 10000. All the
necessary data are present in Collector, it's just a question of data
processing. I opened an issue for the Metrics graphs, where you can also
see some manually made graphs that are closer to the true values.
  Graphs for multiple relays that have the same fingerprint (#40022) · Issues · The Tor Project / Network Health / Metrics / Onionoo · GitLab
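The correction described above is plain multiplication, sketched here in Python. It assumes, as the message states, that all four instances carry about the same load; `estimate_total` is a name made up for this example.

```python
INSTANCES = 4  # tor instances behind the load balancer, per the message above


def estimate_total(reported_single_instance: float, instances: int = INSTANCES) -> float:
    """Scale a figure reported for one instance up to the whole bridge.

    Only meaningful when the instances are about equally loaded.
    """
    return reported_single_instance * instances


if __name__ == "__main__":
    print("bandwidth (MB/s):", estimate_total(5))
    print("clients         :", estimate_total(2500))
```

A proper fix on the Metrics side would sum the per-instance descriptors rather than assume equal load; this shortcut only reproduces the estimate given in the message.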

I started a thread on tor-dev about the issues of onion key rotation and
ExtORPort authentication.
  The tor-dev February 2022 Archive by thread

···


David,

Excellent Documentation and References!

I hope the proposed RFCs (auth, key, and metrics) for loadbalanced Tor topologies are seriously considered and implemented by Tor Core and Tor Metrics.

Great Work!

Respectfully,

Gary

···



On Tuesday, February 8, 2022, 10:02:53 AM MST, David Fifield david@bamsoftware.com wrote:

The load-balanced Snowflake bridge is running in production since
2022-01-31. Thanks Roger, Gary, Roman for your input.

Hopefully reproducible installation instructions:
https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guides/Snowflake-Bridge-Installation-Guide?version_id=6de6facbb0fd047de978a561213c59224511445f
Observations since:
https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2774428

Metrics graphs are currently confused by multiple instances of tor
uploading descriptors under the same fingerprint. Particularly in the
interval between 2022-01-25 and 2022-02-03, when a production bridge and
staging bridge were running in parallel, with four instances being used
and another four being mostly unused.
https://metrics.torproject.org/rs.html#details/5481936581E23D2D178105D44DB6915AB06BFB7F
https://metrics.torproject.org/userstats-bridge-transport.html?start=2021-11-10&end=2022-02-08&transport=snowflake
Since 2022-02-03, it appears that Metrics is showing only one of the
four running instances per day. Because all four instances are about
equally used (as if load balanced, go figure), the values on the graph
are 1/4 what they should be. The reported bandwidth of 5 MB/s is
actually 20 MB/s, and the 2500 clients are actually 10000. All the
necessary data are present in Collector, it’s just a question of data
processing. I opened an issue for the Metrics graphs, where you can also
see some manually made graphs that are closer to the true values.
https://bugs.torproject.org/tpo/network-health/metrics/onionoo/40022

I started a thread on tor-dev about the issues of onion key rotation and
ExtORPort authentication.
https://lists.torproject.org/pipermail/tor-dev/2022-February/thread.html



David,

Has Tor Metrics implemented your RFC related to Written Bytes per Second and Read Bytes per Second on Onionoo?

As of the 27th of February, I’ve noticed a change in reporting that accurately reflects the aggregate of my Tor Relay Nodes opposed to the previously reported Single Tor Node. Are you seeing a similar change for snowflake.torproject.org?

Additionally, other than the hourly stacktrace errors in the syslog, the secret_onion_key workaround seems to be working well without any ill side-effects. I’ve been able to operate with the same secret_onion_key for close to 5 weeks, now. Have you run into any issues?

Thank you for your response.

Respectfully,

Gary

···



On Tuesday, February 8, 2022, 11:49:47 PM MST, Gary C. New via tor-relays tor-relays@lists.torproject.org wrote:

David,

Excellent Documentation and References!

I hope the proposed RFCs (auth, key, and metrics) for loadbalanced Tor topologies are seriously considered and implemented by Tor Core and Tor Metrics.

Great Work!

Respectfully,

Gary


Gary C. New via tor-relays:

David,
Has Tor Metrics implemented your RFC related to Written Bytes per Second and Read Bytes per Second on Onionoo?

That's probably https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/issues/40022, no?

Georg

···


Has Tor Metrics implemented your RFC related to Written Bytes per Second and
Read Bytes per Second on Onionoo?

As of the 27th of February, I've noticed a change in reporting that accurately
reflects the aggregate of my Tor Relay Nodes opposed to the previously reported
Single Tor Node. Are you seeing a similar change for snowflake.torproject.org?

You're right. I see a change since 2022-02-27, but in the case of the
snowflake bridge the numbers look wrong, about 8× too high. I posted an
update on the issue. Thanks for noticing.

Additionally, other than the hourly stacktrace errors in the syslog, the
secret_onion_key workaround seems to be working well without any ill
side-effects. I've been able to operate with the same secret_onion_key for
close to 5 weeks, now. Have you run into any issues?

Yes, it's still working well here.

···

On Thu, Mar 03, 2022 at 08:13:34PM +0000, Gary C. New wrote:

Georg,

Yes! That is precisely it!

Please know that the change appears to be working with my loadbalanced Tor Relay deployment as well.

Are there any “Issues” submitted for a similar change to Consensus Weight and Relay Probability to Tor Metrics on Onionoo? It appears these values are still only being reported for a Single Tor Node.

A BIG Thank You to the Tor Metrics Team for the Issue-40022 implementation.

Respectfully,

Gary

···



On Thursday, March 3, 2022, 1:28:12 PM MST, Georg Koppen gk@torproject.org wrote:

Gary C. New via tor-relays:

David,
Has Tor Metrics implemented your RFC related to Written Bytes per Second and Read Bytes per Second on Onionoo?

That’s probably

https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/issues/40022

, no?

Georg

As of the 27th of February, I’ve noticed a change in reporting that accurately reflects the aggregate of my Tor Relay Nodes opposed to the previously reported Single Tor Node. Are you seeing a similar change for snowflake.torproject.org?
Additionally, other than the hourly stacktrace errors in the syslog, the secure_onion_key workaround seems to be working well without any ill side-effects. I’ve been able to operate with the same secure_onion_key for close to 5 weeks, now. Have you run into any issues?
Thank you for your response.
Respectfully,

Gary—
This Message Originated by the Sun.
iBigBlue 63W Solar Array (~12 Hour Charge)

  • 2 x Charmast 26800mAh Power Banks
    = iPhone XS Max 512GB (~2 Weeks Charged)

On Tuesday, February 8, 2022, 11:49:47 PM MST, Gary C. New via tor-relays <tor-relays@lists.torproject.org> wrote:

David,
Excellent Documentation and References!
I hope the proposed RFC’s (auth, key, and metrics) for loadbalanced Tor topologies are seriously considered and implemented by Tor Core and Tor Metrics.
Great Work!
Respectfully,

Gary

On Tuesday, February 8, 2022, 10:02:53 AM MST, David Fifield <david@bamsoftware.com> wrote:

The load-balanced Snowflake bridge has been running in production since
2022-01-31. Thanks Roger, Gary, Roman for your input.

Hopefully reproducible installation instructions:
https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guides/Snowflake-Bridge-Installation-Guide?version_id=6de6facbb0fd047de978a561213c59224511445f
Observations since:
https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2774428

Metrics graphs are currently confused by multiple instances of tor
uploading descriptors under the same fingerprint. Particularly in the
interval between 2022-01-25 and 2022-02-03, when a production bridge and
staging bridge were running in parallel, with four instances being used
and another four being mostly unused.
https://metrics.torproject.org/rs.html#details/5481936581E23D2D178105D44DB6915AB06BFB7F
https://metrics.torproject.org/userstats-bridge-transport.html?start=2021-11-10&end=2022-02-08&transport=snowflake
Since 2022-02-03, it appears that Metrics is showing only one of the
four running instances per day. Because all four instances are about
equally used (as if load balanced, go figure), the values on the graph
are 1/4 what they should be. The reported bandwidth of 5 MB/s is
actually 20 MB/s, and the 2500 clients are actually 10000. All the
necessary data are present in Collector; it’s just a question of data
processing. I opened an issue for the Metrics graphs, where you can also
see some manually made graphs that are closer to the true values.
https://bugs.torproject.org/tpo/network-health/metrics/onionoo/40022
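
To make the arithmetic above concrete, here is a minimal sketch in Python of the kind of processing the graphs would need: sum the per-instance values for each fingerprint and day instead of keeping only one instance. The record layout (the `fingerprint`, `date`, `instance`, `bandwidth_mb_per_s`, and `clients` fields) is a hypothetical one chosen for illustration, not the Collector or Onionoo format, and the four equal per-instance values are idealized from the "about equally used" figures quoted above.

```python
from collections import defaultdict


def aggregate_by_fingerprint(records):
    """Sum per-instance values for each (fingerprint, date) pair.

    `records` is an iterable of dicts in a hypothetical layout; it is
    not the format used by Collector or Onionoo.
    """
    totals = defaultdict(lambda: {"bandwidth_mb_per_s": 0.0, "clients": 0})
    for rec in records:
        key = (rec["fingerprint"], rec["date"])
        totals[key]["bandwidth_mb_per_s"] += rec["bandwidth_mb_per_s"]
        totals[key]["clients"] += rec["clients"]
    return dict(totals)


# Illustrative input: four instances sharing one fingerprint, each
# reporting roughly a quarter of the true load on the same day.
FP = "5481936581E23D2D178105D44DB6915AB06BFB7F"
records = [
    {
        "fingerprint": FP,
        "date": "2022-02-04",
        "instance": i,  # hypothetical label for the tor instance
        "bandwidth_mb_per_s": 5.0,
        "clients": 2500,
    }
    for i in range(4)
]

result = aggregate_by_fingerprint(records)
print(result[(FP, "2022-02-04")])
# prints {'bandwidth_mb_per_s': 20.0, 'clients': 10000}
```

Summing is only appropriate when the instances serve disjoint sets of connections, as they do behind a load balancer; whether client counts can be added this way in the real Metrics pipeline is an assumption of this sketch.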

I started a thread on tor-dev about the issues of onion key rotation and
ExtORPort authentication.
https://lists.torproject.org/pipermail/tor-dev/2022-February/thread.html


tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays



Gary C. New via tor-relays:

Georg,
Yes! That is precisely it!
Please know that the change appears to be working with my loadbalanced Tor Relay deployment as well.
Are there any "Issues" submitted for a similar change to Consensus Weight and Relay Probability in Tor Metrics on Onionoo? It appears these values are still only being reported for a Single Tor Node.

Hrm, good question. I don't think so, and I am not sure yet whether we should make such a change.

A BIG Thank You to the Tor Metrics Team for the Issue-40022 implementation.

You are welcome. It seems, though, the implementation was not correct. We therefore reverted it for now. However, we are on it. :)

Georg
