[tor-relays] inet_csk_bind_conflict

Christopher_Sheats · December 29, 2022, 8:04pm

I am happy to report that we have upgraded all our relays to Tor 0.4.8.0-alpha-dev, and for the past 8 days since the upgrade the bind conflict has ceased. No firewall rules are being used. No sysctl settings helped.

–
Christopher Sheats (yawnbox)
Executive Director
Emerald Onion
Signal: +1 206.739.3390
Website: https://emeraldonion.org/
Mastodon: AS 396507 (@EmeraldOnion@digitalcourage.social) - digitalcourage.social

On Dec 12, 2022, at 1:18 PM, Anders Trier Olesen anders.trier.olesen@gmail.com wrote:

> It is surprising, isn’t it? It certainly feels like calling connect
> without first binding to an address should have the same effect as
> manually binding to an address and then calling connect, especially if
> the address you bind to is the same as the kernel would have chosen
> automatically. It seems like it might be a bug, but I’m not qualified to
> judge that.

Yes, I’m starting to think so too. And strange that Cloudflare doesn’t mention stumbling upon this problem in their blogpost on running out of ephemeral ports. [1]

If I find the time, I’ll make an attempt at understanding exactly what is going on in the kernel.

> If I am interpreting your results correctly, it means that either of the
> two extremes is safe

Yes. That is what I think too.

> Anyway, thank you for the insight. I apologize if I was inconsiderate
> in my prior reply.

Likewise!

Best regards

Anders Trier Olesen

[1] https://blog.cloudflare.com/how-to-stop-running-out-of-ephemeral-ports-and-start-to-love-long-lived-connections/

On Mon, Dec 12, 2022 at 4:16 PM David Fifield <david@bamsoftware.com> wrote:

On Mon, Dec 12, 2022 at 12:39:50AM +0100, Anders Trier Olesen wrote:

> I wrote some tests[1] which showed behaviour I did not expect.
> IP_BIND_ADDRESS_NO_PORT seems to work as it should, but calling bind without it
> enabled turns out to be even worse than I thought.
>
> This is what I think is happening: A successful bind() on a socket without
> IP_BIND_ADDRESS_NO_PORT enabled, with or without an explicit port configured,
> makes the assigned (or supplied) port unavailable for new connect()s (on
> different sockets), no matter the destination. I.e., if you exhaust the entire
> net.ipv4.ip_local_port_range with bind() (no matter what IP you bind to!),
> connect() will stop working, no matter what IP you attempt to connect to. You
> can work around this by manually doing a bind() (with or without an explicit
> port, but without IP_BIND_ADDRESS_NO_PORT) on the socket before connect().
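
To put a number on "the entire net.ipv4.ip_local_port_range" mentioned above, here is a minimal Python sketch (not from the thread) that reads the range from procfs and counts the ports in it. The helper names `parse_port_range` and `ephemeral_port_count` are made up for illustration. The count treats the range as inclusive and ignores anything excluded by other settings, so take it as an upper bound.

```python
from pathlib import Path

# Standard procfs location of the net.ipv4.ip_local_port_range sysctl on Linux.
PORT_RANGE_PATH = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text: str) -> tuple[int, int]:
    """Parse the whitespace-separated 'low high' pair exposed by the kernel."""
    low, high = (int(field) for field in text.split())
    return low, high


def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


if __name__ == "__main__":
    # Only meaningful on Linux; skip quietly elsewhere.
    if PORT_RANGE_PATH.exists():
        low, high = parse_port_range(PORT_RANGE_PATH.read_text())
        total = ephemeral_port_count(low, high)
        print(f"ephemeral ports: {low}-{high} ({total} total)")
```

With the common Linux default of 32768 to 60999 this comes to 28232 ports, which gives a sense of how many concrete bind() calls it would take to exhaust the pool in the way described.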

> What blows my mind is that after running test2, you cannot connect to anything
> without manually doing a bind() beforehand (as shown by test1 and test3 above)!
> This also means that after running test2, software like ssh stops working:
>
> When using IP_BIND_ADDRESS_NO_PORT, we don’t have this problem (1 5 6 can be
> run in any order):

Thank you for preparing that experiment. It’s really valuable, and it
looks a lot like what I was seeing on the Snowflake bridge: calls to
connect would fail with EADDRNOTAVAIL unless first bound concretely to a
port number. IP_BIND_ADDRESS_NO_PORT causes bind not to set a concrete
port number, so in that respect it’s the same as calling connect without
calling bind first.
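
The point that IP_BIND_ADDRESS_NO_PORT "causes bind not to set a concrete port number" can be observed directly. The following is a small Python sketch, not code from the thread; `port_after_bind` is a hypothetical helper name, and the numeric fallback 24 is the Linux value of the option from `<linux/in.h>`, used in case the Python build does not export the constant by name. It is Linux-specific.

```python
import socket

# Linux value from <linux/in.h>; fallback for Python builds that do not
# export the constant by name. Assumption: this runs on Linux.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def port_after_bind(source_ip: str, defer_port: bool) -> int:
    """Bind a TCP socket to (source_ip, 0) and return the local port that
    getsockname() reports right after bind(), before any connect()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if defer_port:
            # Ask the kernel to postpone port selection until connect().
            sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((source_ip, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    try:
        print("plain bind:   ", port_after_bind("127.0.0.1", defer_port=False))
        print("deferred bind:", port_after_bind("127.0.0.1", defer_port=True))
    except OSError as exc:
        print("socket option not supported here:", exc)
```

On a kernel that supports the option, I would expect the plain bind to report a concrete ephemeral port and the deferred bind to report 0, meaning no port has been reserved yet.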

It is surprising, isn’t it? It certainly feels like calling connect
without first binding to an address should have the same effect as
manually binding to an address and then calling connect, especially if
the address you bind to is the same as the kernel would have chosen
automatically. It seems like it might be a bug, but I’m not qualified to
judge that.

If I am interpreting your results correctly, it means that either of the
two extremes is safe: either everything that needs to bind to a source
address should call bind with IP_BIND_ADDRESS_NO_PORT, or else
everything (whether it needs a specific source address or not) should
call bind without IP_BIND_ADDRESS_NO_PORT. (The latter situation is
what we’ve arrived at on the Snowflake bridge.) The middle ground, where
some connections use IP_BIND_ADDRESS_NO_PORT and some do not, is what
causes trouble, because connections that do not use
IP_BIND_ADDRESS_NO_PORT somehow “poison” the ephemeral port pool for
connections that do use IP_BIND_ADDRESS_NO_PORT (and for connections
that do not bind at all). It would explain why causing HAProxy not to
use IP_BIND_ADDRESS_NO_PORT resolved errors in my case.
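
Within a single program, one way to stay at one of the two extremes is to send every outbound connection through one helper that always binds the same way. The sketch below is only an illustration in Python: `connect_from` and `DEFER_PORT` are hypothetical names, the fallback value 24 is the Linux constant from `<linux/in.h>`, and nothing here can control how other processes on the same host bind their sockets.

```python
import socket

# Linux value from <linux/in.h>; fallback for Python builds that do not
# export the constant by name.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)

# Hypothetical program-wide policy switch. False means every connection binds
# to a concrete port before connect(); True means every connection defers
# port selection. The point is to pick one and use it everywhere.
DEFER_PORT = False


def connect_from(source_ip, dest, defer_port=DEFER_PORT, timeout=5.0):
    """Open a TCP connection to dest, always binding to source_ip first."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        if defer_port:
            sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((source_ip, 0))
        sock.connect(dest)
        return sock
    except OSError:
        sock.close()
        raise


if __name__ == "__main__":
    # Loopback demonstration against a throwaway listener.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = connect_from("127.0.0.1", listener.getsockname())
    print("connected from", client.getsockname(), "to", client.getpeername())
    client.close()
    listener.close()
```

This only keeps one program consistent. If another process on the host uses the opposite approach, the mixed situation described above could still arise.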

> > Removing the IP_BIND_ADDRESS_NO_PORT option from Haproxy and
> > doing nothing else is sufficient to resolve the problem.
>
> Maybe there are other processes on the same host which call bind() without
> IP_BIND_ADDRESS_NO_PORT, and block the ports? E.g., OutboundBindAddress or
> similar in torrc?

OutboundBindAddress is a likely culprit. We did end up setting
OutboundBindAddress on the bridge during the period of intense
performance debugging at the end of September.

One thing doesn’t quite add up, though. The earliest EADDRNOTAVAIL log
messages started at 2022-09-28 10:57:26:
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40198
Whereas according to the change history of /etc on the bridge,
OutboundBindAddress was first set some time between 2022-09-29 21:38:37
and 2022-09-29 22:37:06, over 30 hours later. I would be tempted to say
this is a case of what you initially suspected, simple tuple exhaustion
between two static IP addresses, if not for the fact that pre-binding an
address resolved the problem in that case as well (“I get EADDRNOTAVAIL
sometimes even with netcat, making a connection to the haproxy port—but
not if I specify a source address in netcat”). But I only ran that
netcat test after OutboundBindAddress had been set, so there may have
been many factors being conflated.

Anyway, thank you for the insight. I apologize if I was inconsiderate
in my prior reply.

tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


dcf · March 21, 2023, 1:27am

On Mon, Dec 12, 2022 at 10:18:53PM +0100, Anders Trier Olesen wrote:

> > It is surprising, isn't it? It certainly feels like calling connect
> > without first binding to an address should have the same effect as
> > manually binding to an address and then calling connect, especially if
> > the address you bind to is the same as the kernel would have chosen
> > automatically. It seems like it might be a bug, but I'm not qualified to
> > judge that.
>
> Yes, I'm starting to think so too. And strange that Cloudflare doesn't mention
> stumbling upon this problem in their blogpost on running out of ephemeral
> ports. [1]
>
> If I find the time, I'll make an attempt at understanding exactly what is going
> on in the kernel.
>
> [1] https://blog.cloudflare.com/how-to-stop-running-out-of-ephemeral-ports-and-start-to-love-long-lived-connections/

Cloudflare has another blog post today that gets into this topic.

It investigates the difference in behavior between
inet_csk_bind_conflict and __inet_hash_connect that I commented on at
"[tor-relays] inet_csk_bind_conflict - #13 by dcf" and
"Out of ephemeral ports on link between haproxy and extor-static-cookie"
(Snowflake issue #40201 on the Tor Project GitLab).
Setting the IP_BIND_ADDRESS_NO_PORT option leads to __inet_hash_connect;
not setting it leads to inet_csk_bind_conflict.

The author attributes the difference in behavior to the fastreuse field
in the bind hash bucket:

> The bucket might already exist or we might have to create it first.
> But once it exists, its fastreuse field is in one of three possible
> states: -1, 0, or +1.
>
> …
>
> …inet_csk_get_port() skips conflict check for fastreuse == 1 buckets.
> …__inet_hash_connect() skips buckets with fastreuse != -1.
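
The two rules quoted above can be written down as a toy truth table. This is a Python sketch of the quoted sentences only, not kernel code: the function names are invented for illustration, and what puts a bucket into each of the three states is not covered by the excerpt, so it is not modeled here.

```python
# The three fastreuse states named in the quoted post.
FASTREUSE_STATES = (-1, 0, 1)


def bind_skips_conflict_check(fastreuse: int) -> bool:
    """bind() path, per the quote: inet_csk_get_port() skips the conflict
    check for buckets with fastreuse == 1."""
    return fastreuse == 1


def connect_skips_bucket(fastreuse: int) -> bool:
    """connect() path, per the quote: __inet_hash_connect() skips buckets
    with fastreuse != -1."""
    return fastreuse != -1


if __name__ == "__main__":
    for state in FASTREUSE_STATES:
        print(
            f"fastreuse={state:+d}: "
            f"bind skips conflict check={bind_skips_conflict_check(state)}, "
            f"connect skips bucket={connect_skips_bucket(state)}"
        )
```

Read this way, a port whose bucket is in state 0 or +1 is passed over on the connect path, which would be consistent with the "poisoning" of the ephemeral port pool discussed earlier in the thread.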
