[tor-relays] Next Tor relay operator meetup - November 19th at 19.00 UTC

Dear relay operators,

The next Tor relay operator meetup will happen on Saturday, November 19th at 19.00 UTC.

## Where

BigBlueButton: https://tor.meet.coop/gus-og0-x74-dzn

## Agenda (WIP)

  • Announcements
  • State of DoS attack
  • Q&A
  • Next Tor Relay Operator Meetup

Meeting pad:

Everyone is free to bring up additional questions or topics at the
meeting itself.

## Registration

No registration is needed; just use the room link above. We will open
the room 10 minutes before the start so you can test your mic setup.

Please share with your friends, social media and other mailing lists!

cheers,
Gus


Hi,

Thanks for joining the relay operator meetup!

Here are the meeting notes.

cheers,
Gus

Notes - Tor Relay Operator Meetup - 2022-11-19 - 19.00 UTC

Announcements

  • State of the Onion 2022: "You're Invited: State of the Onion 2022" (The Tor Project)

Idea: for next year's community edition of SOTO, let's consider
inviting relay operator associations to tell us about their work -- what
they did over the year, why they run relays, who they are. First because
users want to know who runs the relays, and second because it is an
opportunity for the relay non-profits to ask for and receive donations
directly.

Also new this year, we had the SOTO livestream available over an
onion address, thanks to the Bornhack streaming server. We need to
assess how well it worked, to decide whether to do it again next year;
let us know if you watched that way and how it went!

Initial feedback: Some Tor paths (circuits) had good enough
performance to watch it, and some paths didn't.

Forum thread for feedback on the State of the Onion:

  • Snowflake proxies (new record):
      Sources – Tor Metrics
      
    Running Snowflake is very popular in Germany in particular, and many
    people are doing grassroots advocacy. This is amazing!

On Nov 17, we had 128k people running Snowflakes around the world.
Of those, 60k were in Germany. Snowflake on German public TV:
https://twitter.com/alexschnapper/status/1593980648636747776
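To put the two figures above in proportion, here is a small worked calculation. It uses only the rounded numbers quoted in these notes (128k and 60k); the function name is just for illustration.

```python
# Figures quoted in the notes for Nov 17, rounded to thousands.
total_proxies = 128_000
german_proxies = 60_000


def share_percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


german_share = share_percent(german_proxies, total_proxies)
print(f"Germany: {german_share}% of Snowflake proxies")
```

So, going by these rounded figures, a little under half of all Snowflake proxies that day were in Germany.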

We have more than 100k running the webextension. If you're on a good
network connection, especially if you're not behind a NAT, consider
running the standalone version of Snowflake, because it will scale
better for users.

Snowflakes not behind NATs are especially useful to the world. If you
find yourself in an advocacy situation, make it clear to people that
Snowflakes that are not behind restrictive NATs are much more useful
to censored users.
(Still, having 100k+ people running Snowflakes is a *political*
statement -- saying you don't support censorship and you want the world
to be different.)

  • Bridge (obfs4) usage spike in China:

https://metrics.torproject.org/userstats-bridge-combined.html?start=2022-10-31&end=2022-11-19&country=cn

Tor Browser has a new feature named "Connection Assist" which will
help your Tor Browser walk through recommended circumvention mechanisms
for your country. So we are getting better at steering users who need
obfs4 bridges into finding and using the right flavor of bridge.

There are still usability improvements to make, e.g. sometimes it
takes many minutes to time out and move to the next mechanism in the
list. We continue to tune it.

Seeing obfs4 traffic going up in China is neat because the last spike
was in meek usage, and meek relies on domain fronting and is expensive
to operate. So the future needs to be something more sustainable.

State of DoS attack

Things have improved in the recent past, e.g. the past week or two.
They are not great yet, but they are better than they were a month ago.

There's also a new Tor version released last week that has some new
features to help here: there are some bugfixes in onion service
stability and reachability, and there are more metrics published via the
metrics port.

The performance for public (non onion) Tor traffic is way better than
it was a few weeks ago. But onion service DoS remains and may have
changed lately.

So, the DoS issues are not over yet -- there were many attacks
happening in parallel, and only some of them have gotten better.

For the onion service overload in particular, one of our future hopes
is the PoW design: see "prop327: Implement PoW over Introduction
Circuits" (#40634, The Tor Project / Core / Tor on GitLab) for details.

We are currently advertising for a new software engineer to join the
network team, specifically to work on C-Tor and onion services. This
role will overlap a lot with the DDoS questions. Consider applying!

The metrics port is especially useful for us to actually understand
how the attacks evolve.
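For operators who want to watch how their own numbers evolve, here is a minimal sketch of parsing metrics text and comparing two scrapes. It assumes the metrics port output is in the Prometheus text format with no trailing timestamps. The metric names in the sample are made up for illustration and are not real Tor metric names.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus-style exposition text into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. The series
    key keeps its label set verbatim, e.g. 'name{label="x"}'.
    Assumes there is no timestamp after the value.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, _, value = line.rpartition(" ")
        if not series:
            continue
        samples[series.strip()] = float(value)
    return samples


def deltas(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Return the change per series for series present in both scrapes."""
    return {key: after[key] - before[key] for key in after if key in before}


# Hypothetical sample scrapes; the metric name is NOT a real Tor metric.
SCRAPE_T0 = """\
# HELP example_relay_connections Hypothetical metric for illustration
# TYPE example_relay_connections gauge
example_relay_connections{direction="inbound"} 120
example_relay_connections{direction="outbound"} 80
"""
SCRAPE_T1 = """\
example_relay_connections{direction="inbound"} 450
example_relay_connections{direction="outbound"} 85
"""

change = deltas(parse_metrics(SCRAPE_T0), parse_metrics(SCRAPE_T1))
for series, delta in sorted(change.items()):
    print(f"{series}: {delta:+.0f}")
```

In practice the two texts would come from two fetches of your own metrics port some minutes apart; a sudden jump in one series is the kind of thing worth writing down with a timestamp.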

The "compression subsystem" in Tor is one possible place where relays
would use a surprising amount of memory, and the next metrics port
changes will start exporting this detail.

Arti development is still focused on the client side for 2023, so we are
hoping to use this new onion service developer position to give some
love to the server side of C-Tor.

Another reason to get a new onion service developer is that right now
we have one-ish person on the network team with onion service clue, so
if that person is busy / on vacation then we have nobody with enough
onion service clue. So it is also about building redundancy in our org.

Q: Last time we discussed scripts, like iptables rules, to help relays
survive the DoS attacks better. What is the state of those scripts now?
Is there one that emerged as the consensus winner? Are relay operators
happy with them?

A: The Artikel10 script seems to have worked for some operators. It is
not fully automatic, so it is safe to suggest to people.
Specifically, the recommended mode of operation is to run the script
once, to learn which IP addresses are overloading you the most, and then
to block those addresses manually. If you ran it in an automated way,
an attacker could perhaps use the script itself to start censoring
Tor.
Artikel10: We're running this script in a cronjob (with human eyes on
the blocked IP addresses regularly), and we've not seen this type of
attack. The set of targeted IP addresses also keeps changing, so running
the script only once will not protect you for long. Please note: the
script deals with *outgoing* connection DoS, not the incoming one.
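The "learn first, block by hand" workflow described above can be sketched as follows. This is not the Artikel10 script. It is a minimal illustration that counts how often each remote address appears in a list of outgoing connection endpoints and prints the top entries for a human to review. The input format (`ip:port` strings) and all addresses are assumptions; the addresses come from documentation ranges.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_destinations(remote_endpoints: Iterable[str], limit: int = 3) -> List[Tuple[str, int]]:
    """Count connections per remote IP and return the busiest ones.

    Each endpoint is expected as 'ip:port' or '[ipv6]:port'. Nothing is
    blocked here: the result is meant for a human to look at.
    """
    counts: Counter = Counter()
    for endpoint in remote_endpoints:
        host, _, _port = endpoint.rpartition(":")
        counts[host.strip("[]")] += 1
    return counts.most_common(limit)


# Hypothetical snapshot of outgoing connections (documentation addresses).
snapshot = (
    ["198.51.100.7:80"] * 5
    + ["192.0.2.10:443"] * 2
    + ["[2001:db8::1]:443"]
)

for ip, count in top_destinations(snapshot, limit=2):
    print(f"{ip}: {count} outgoing connections")
```

Keeping the blocking step manual matches the concern raised above: a fully automated loop could be steered by an attacker into blocking legitimate destinations.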

Idea: if there are things the network team could help with, e.g. to
export IP addresses or whatever on the control port, file a ticket! We
want to support these external scripts better.

Q: We recently had some RAM issues. What's a good way to investigate how
a certain Tor process uses RAM?

A: In the distant past, "kill -USR1" would dump info on connections and
circuits and also dump a memory summary. I wonder if that memory summary
part still works.
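Independent of what Tor itself dumps, one generic way to track a process's memory from the outside is to read its resident set size from procfs. The sketch below assumes Linux and the usual `VmRSS:` line in `/proc/<pid>/status`. It only measures the process total and says nothing about which part of Tor is using the memory.

```python
import re
from pathlib import Path
from typing import Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident set size of a running process (Linux only)."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())


# Hypothetical excerpt of a status file, used here instead of a live process.
SAMPLE_STATUS = "Name:\ttor\nVmRSS:\t  524288 kB\nThreads:\t4\n"

rss_kib = parse_vm_rss_kib(SAMPLE_STATUS)
print(f"resident memory: {rss_kib / 1024:.0f} MiB")
```

Logging this value periodically for each Tor instance gives a timeline that can be lined up against events such as the start or end of an attack.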

Next Tor Relay Operator meetup

December meetup: we will check with leibi if he can organize it.

Tor @ FOSDEM 2023

More details here: "Tor activities at FOSDEM 2023" (#40017, The Tor Project / Community / Outreach on GitLab)

We are planning some Tor activities at FOSDEM on this ticket. It would
be great for somebody in the relay operator community to organize a
relay operator meetup. Volunteers?

Rumors of alex, emmapeel, rene, hackerncoder all attending.

Q&A

  • Has anyone else recently noticed increased memory usage?
    At Artikel10, we seem to be seeing (according to htop) instances
    consuming significantly more than MaxMemInQueues, not just a bit
    more, as documented.

A: Knowing whether the increased memory usage happened *before* the
exit DDoS stopped, or after, could be useful. When did it stop?
Around Oct 28 or 29. Looks like the memory increase happened *after*
that. Maybe, now that the bytes are flowing better, more bytes are
flowing?

  • When filtering inbound connections to reduce parallel connections, how many connections are considered too many?

A: If Tor is behaving properly, then there should not be many
parallel connections, and so you should not be killing any of them. So,
it depends on what is causing these extra connections -- is it
carrier-grade NAT and many Tor clients? Is it a modified Tor client? We need to
understand the attacks better to be able to answer this question, but
the conservative answer is to try not to kill connections.

There have been plans for a public letter, but we have been waiting
for documents that the police were supposed to provide, but they did not
provide them, so maybe we should push forward with the letter anyway.

Let us all know if we can help with the letter!

Right now, awareness of Snowflake, anti-censorship, and human rights in
general is high, so it could be a good time to push things
forward.

  • Are Snowflake family flags planned?

No. Families are most useful to stop clients from using multiple
relays controlled by one org in a single circuit. Since
Snowflakes are only the first hop, it's less urgent to get Family
information about them.

Though! If the big Snowflake operators are also big relay
operators, then it could be useful still.

Overall, I would say don't worry too much about it. There is a
good argument that the Family flag idea itself is harmful,
because it steers traffic away from honest relay operators toward
people who don't set the flags.
   

  • Any news on the board?

There is a board meeting this coming Monday. I believe there were a
bunch of nominated positions, and some interviews happened, and no
results are known yet.
   

  • Is there a bigger need for Snowflake standalone proxies than for
      obfs4proxy bridges?

If you have two servers, run one of each! Just, don't run both on a
single IP address, because then whichever one gets blocked first will
implicitly get the other one blocked.

  • What's the biggest difference between obfs4/azure bridge and
    snowflake? Is it the difficulty of hosting that sets them apart, or
    is it something about the protocol?

A: Detailed answer: https://www.youtube.com/watch?v=ZB8ODpw_om8

  • Can you adjust the weights so exits only get used as exits and we exit
    ops can mostly stop accepting connections from non-authenticating
    clients by setting DoSConnectionMaxConcurrentCount to 1? and we could
    restrict incoming connections to known Tor Relay IPs

A: The reasoning here is that, in theory, clients should only be using
exits for the third hop, so clients should never be connecting directly
to exits, and so it would be great if exits could simply *block*
connections from clients. I think the summary is that we would like to
get to that point, but we don't know if we are at that point now. So
don't just block all the clients yet.

Would it kill my consensus weight if I block all non-Tor IPs? Yes, it
would, because there are edge cases like bandwidth authorities, which
measure relays but are not themselves relays. There are probably other
surprise edge cases like this too.
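The bandwidth authority edge case can be illustrated with a toy allowlist check. This is only a sketch: the addresses are from documentation ranges, the "scanner" address is hypothetical, and a real allowlist would have to be built and refreshed from current network data.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def is_allowed(source_ip: str, allowed_networks: Iterable[Network]) -> bool:
    """Return True if source_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in allowed_networks)


# Hypothetical allowlist containing only known relay addresses.
relay_only_allowlist = [
    ipaddress.ip_network("192.0.2.10/32"),
    ipaddress.ip_network("192.0.2.11/32"),
]

# Hypothetical address of a bandwidth scanner, which is not itself a relay.
scanner_ip = "198.51.100.20"

print("relay allowed:  ", is_allowed("192.0.2.10", relay_only_allowlist))
print("scanner allowed:", is_allowed(scanner_ip, relay_only_allowlist))
```

With a relay-only list the scanner is rejected, so the relay cannot be measured, which is how a strict allowlist could end up hurting consensus weight.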

  • Back in the days when teor was managing the fallbackdir list, it was
      possible to opt-out. Why do you no longer allow opt-out?

A: We automated the process of picking fallbackdirs, to remove the
laborious human interaction steps from it. Now it is more automated,
which is great because it saves time, but sad because we pick relays
that disappear faster. So we need to refresh the fallbackdir list at
each release now.

Follow-up question: why did you want to opt out?
Why: because we are already under load that we can hardly handle,
and because we are exits (see the points above regarding blocking
non-authenticating connections).

Suggestion: I would say, don't worry about it. The goal is that
there are enough fallbackdirs that some of them work and are
available. So it is an explicit tradeoff between automation and having
the ideal fallbackdir list. The fix should be that we push out a fresh
list more often.

Especially with the denial of service issues over the past months,
we had higher relay churn than usual so the timeline for refreshing the
list was accelerated.

Remember that the Tor network is a tiny network on a much bigger
internet, so we need to think carefully about our approaches to network
stability.

  • Please offer an option to not get the guard flag to run relays with
      less hassle (ddos)

A: You can switch it off for a day and then come back, roughly every 2 weeks or so.
There is a new MiddleOnly option that *directory authorities* can
pick, to avoid giving a relay the Guard flag. But I think this person
wants their relay to self-nominate that it never wants to get the Guard
flag.
In the past we have avoided adding a feature like this on the relay
operator side, because we want flexibility to assign Guard flags in new
and smarter ways in the future.

  • Doesn't setting a daily/monthly accounting bandwidth limit help with the
    above issue?

I think if you wanted to avoid getting the Guard flag today, you could
set DirCache 0 and no dir auth will assign the Guard flag to you.

Is this feature wanted by the really fast relays, or by the tiny
Raspberry Pi relays?

  • What is the state of the Snowflake Debian package?

There is a package but it is not as easy to use as it should be.

We are talking to the deb.torproject.org operator about

  • One of the remaining issues with regards to cpu load on exits are
    outbound floods, are there any plans to rate limit outbound
    connections per circuit on exits? (allow a circuit to consume a certain
    budget and limit after that is used)

A: Current answer, no, no plans.
iptables cannot do this, because it cannot link outbound TCP connections
to inbound circuits; it can only limit globally, not per circuit. PoW?
Summary: we ultimately need to fix this inside Tor, but fixing it well
is really hard. Keep it on the agenda!
We also have distant future research plans to use privcount so that all
relays across the network can coordinate, in a privacy preserving way,
to share info about overloaders. This way people could block attackers
without needing to know who exactly they attacked.

Make the circuit ID of connections available via stem [ahf: stem is
deprecated; I don't know what the alternative is right now, but I think
there is some work on that]
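To make the "budget per circuit" idea from the question concrete, here is a toy sketch of the bookkeeping it implies. It is not how Tor works today and not a proposed design: the class, the method names, and the fixed connection-count budget are all hypothetical. It also leaves out the hard parts mentioned above, such as doing this well inside Tor.

```python
from collections import defaultdict
from typing import DefaultDict


class OutboundBudget:
    """Toy per-circuit budget for outbound connection attempts."""

    def __init__(self, max_connections_per_circuit: int) -> None:
        self.max_connections = max_connections_per_circuit
        self.used: DefaultDict[int, int] = defaultdict(int)

    def try_open(self, circuit_id: int) -> bool:
        """Charge one connection to the circuit; refuse once the budget is spent."""
        if self.used[circuit_id] >= self.max_connections:
            return False
        self.used[circuit_id] += 1
        return True

    def circuit_closed(self, circuit_id: int) -> None:
        """Forget the accounting for a circuit that has gone away."""
        self.used.pop(circuit_id, None)


budget = OutboundBudget(max_connections_per_circuit=2)
results = [budget.try_open(circuit_id=7) for _ in range(3)]
print(results)
```

The sketch needs to know which circuit each outbound connection belongs to, which is exactly the information iptables does not have.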

  • Do you know why middle probability of guards dropped to 0? was that a
    deliberate decision by directory authorities to reduce load on guards?

A: No deliberate decision. The change in load simply shifted the weights
automatically.
We should check if this is still happening right now. Because if Guard
capacity is scarce, this is a network bottleneck that we can fix simply
by assigning more Guard flags.

  • Please allow exit operators to update their exit policy without a
    restart and have it take effect immediately, that is: kill existing
    connections to then-forbidden destinations

See: "ExitPolicy should apply to already established outbound connections (with a config option, off by default)" (#40676, The Tor Project / Core / Tor on GitLab)

  • Please allow exit operators to extract the IPs that the DDoS
    protections triggered so it can be used to feed into the other tor
    instances or iptables rules or exit policies

Follow-up: somebody should make a ticket for this idea. Tor has the info
in its logs [EDIT: actually it is in the logs in the moria1 branch, but
not in mainline Tor yet. See "Connection DDoS defenses never applied to
DirPort so dir auths still impacted" (#40622, The Tor Project / Core /
Tor on GitLab)] and we could turn that into controller events.

It's planned for 2023.
