Standalone proxy - traffic relayed almost zero with restricted NAT type

After approx. 25h zero traffic, snowflake proxy reported io errors, like
io.Copy inside CopyLoop generated an error: websocket: close 1006 (abnormal closure): unexpected EOF
This did start today, approx. 07:28:14 UTC.
Three seconds later a couple of lines like this
error dialing relay: wss://snowflake.torproject.net/?client_ip=[scrubbed] = dial tcp [scrubbed]: con nect: connection refused
did show up in proxy log.
First connected to relay ...did show on 07:31:27 UTC.
Probably a snowflake server restart?
Relaying did restart on 07:35:53 UTC.

I was having the same problem with my standalone proxy the last few days and updating the proxy worked. I recommend trying this.

1 Like

Now, approx. 50h after having been restarted, standalone proxy stopped to write to stdout that gets redirected into nohup.out!
Regardless of this, process is still running but hasn’t relayed any traffic during last 24h.
Process still holds 5 TCP connections to host snowflake-01. This is a waste of resources on server side!
I’ll restart proxy now.

David Fifield documented restart of snowflake-01 yesterday (see commit a179646e on The Tor Project / Network Health / Metrics / Timeline · GitLab). This correlates with my reported stop of stdout messages in nohup.out, although I don’t understand why this might have been triggered by a server restart.

Anyway, snowflake proxy did relay messages for 5 hours. Then drained out. Now 12 hours no relaying!! Process has 5 TCP connections to snowflake-01 established.

I guess there’s something going wrong on server side. As mentioned in previous message: Not using existing connections is a waste of resources. I can’t imagine that there so little demand that during 12 hours not a single message gets relayed.

This is how it looks in snowflake proxy log now:

2022/10/20 21:39:13 Traffic throughput (up|down): 1 MB|163 KB -- (456 OnMessages, 1147 Sends, over 9 seconds)
2022/10/20 21:39:13 copy loop ended
2022/10/20 21:39:13 datachannelHandler ends
2022/10/20 21:39:14 sdp offer successfully received.
2022/10/20 21:39:14 Generating answer...
2022/10/20 21:39:22 OnDataChannel
2022/10/20 21:39:22 Connection successful.
2022/10/20 21:39:22 OnOpen channel
2022/10/20 21:39:23 connected to relay: wss://snowflake.torproject.net/
2022/10/20 22:00:27 In the last 1h0m0s, there were 5 connections. Traffic Relayed ↑ 30 MB, ↓ 1 MB.
2022/10/20 23:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 00:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 01:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 02:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 03:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 04:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 05:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 06:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 07:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 08:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 09:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
2022/10/21 10:00:27 In the last 1h0m0s, there were 0 connections. Traffic Relayed ↑ 0 B, ↓ 0 B.
1 Like

@mjacobs-de can you test how commit 1bc54948 will work?

1 Like

What do you mean?
Please tell me how can i implement that workaround in my docker build?
One if my builds has stoped for 12hours and another had low traffic…i will upload logs to Reduction in traffic relayed by standalone proxy (#40211) · Issues · The Tor Project / Anti-censorship / Pluggable Transports / Snowflake · GitLab maybe it helps but i also want to try that workaround

I have no experience with Docker to say exactly what to do.
I think that it may be possible to change this line to RUN git clone --depth=1 -b issue/40220 https://gitlab.torproject.org/cohosh/snowflake.git
But I don’t know what to do next.

Another option is just to wait for merge of !108 and for release of new version.

1 Like

Yes, cloned, compiled, started, running, … I’ll keep you updated.

2 Likes

@Vort Sorry, I didn’t recognize that your commit is a branch, so my git clone operation just did get the same code as before.
Now I did git clone of branch issue/40220 with commit 1bc54948, compiled, deployed, startet snowflake proxy, now running …

I’ll keep you updated.

@Vort Now 11 non zero traffic summary lines in sequence since restart. Looks promising :slight_smile:

Update
Now 21 non zero traffic summary lines in sequence since restart. That’s a significant improvement.

Update #2 (Sorry, not more then 3 answers in sequence are allowed)
Since restart:
35 non zero traffic summary lines in sequence! That’s really good :slight_smile:

New findings in nohup.out
5 times this (timestamp is local time instead of UTC):
sctp ERROR: 2022/10/22 21:24:18 [0xc001108000] stream 1 not found)
and some hours later again but 7 times
sctp ERROR: 2022/10/23 08:04:29 [0xc0009ba000] stream 1 not found)

There is nothing special in regular snowflake proxy log at given time stamp.

1 Like

There are many topics about it already on forum:
https://forum.torproject.org/t/sctp-error-0x378e7e0-stream-1-not-found/4984
https://forum.torproject.org/t/snowflake-sctp-error-0xc0001bf880/4983
https://forum.torproject.org/t/sctp-error-in-snowflake-log/3854
No one can say for sure why it is happening, but this problem do not significantly affect proxy operation.

2 Likes

free message to allow @mjacobs-de to post updates :slight_smile:

@SirNeo Thanks!

I’ll ignore those sctp errors now.

Short summary, since restart (now more than 64h continues operation):
613 log lines reporting “Traffic throughput”, among those
53 with with 51[0123] seconds.

Many other values found, lowest value 0, top scorer 21994 seconds!

Question: Does this log message report on a per connection level? If yes, wouldn’t it be helpful to report something like a thread id or (local) port number?

1 Like

Hi. OP here. Big thanks everybody! :slightly_smiling_face:

Any news about the fix?
I got tired restarting the bridges every 24/48 hours…

1 Like

@SirNeo There is certainly a lot of work going on right now as far as I can see in the issue tracker. It’s worth keeping an eye on. The problem is limited resources (as in time and people and funding).

@Quartermarsh I disagree with you.

1 Like

!108 is merged.
I applied it manually, so I’m fine.
However I don’t know how fast this change will propagate to various official releases.
If anyone knows, please tell about it.

Also docker image has been updated so i have recreated my images with the latest version :slight_smile:

1 Like

Just a thought. I have always had problems with snowflake and restricted NAT. Have you thought about running it as unrestricted? All NAT types can run snowflake as unrestricted. You can open a limited number of UDP ports and tell snowflake to only use those ports.

For example you can allow -let’s say- udp ports 10000-20000 through your firewall and then add this line to your docker-compose.yml

command: [ "-ephemeral-ports-range", "10000:20000" ]

and restart. You’ll show as having unrestricted NAT in the logs and you’ll be relaying far more traffic. The port range could be smaller in size or larger.