[tor-relays] Relay MIGHTYWANG consensus issues and loss of STABLE flag

Mighty_Wang · October 29, 2021, 4:10pm

Hello fellow operators

I have one pretty large relay, MIGHTYWANG, which is an IP4/6 guard on dedicated hardware with an uncontended 1Gb line. It is usually one of the top 5 relays by consensus weight, but on the morning of 14th October it lost Guard status on account of losing the Stable flag.

I checked logs, connectivity and server health - nothing unusual. Everything is generally pretty bulletproof in and around the relay and it had been running for well over a year without a reboot - just the very occasional Tor daemon restart following upgrades, but no such activity prior to the 14th.

So next I checked the consensus and I see that around half of the directory authorities are not assigning the Stable flag. See attached screenshot showing current consensus.
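
For anyone wanting to repeat this check without a screenshot, here is a minimal sketch in Python. It assumes you have already downloaded each authority's vote document as text, and it relies on the directory-spec layout where a relay entry begins with an "r <nickname> ..." line followed by an "s <flags>" line. The function names, authority names and sample votes are made up for illustration; they are not real votes and not part of any Tor tooling.

```python
from typing import Dict, Set


def flags_in_vote(vote_text: str, nickname: str) -> Set[str]:
    """Return the flags one vote document assigns to the relay `nickname`.

    Relies on each relay entry starting with an "r <nickname> ..." line
    and carrying its flags on a following "s <flag> <flag> ..." line.
    """
    in_entry = False
    for line in vote_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "r":
            # A new relay entry begins; note whether it is the one we want.
            in_entry = len(parts) > 1 and parts[1] == nickname
        elif parts[0] == "s" and in_entry:
            return set(parts[1:])
    return set()


def tally_flag(votes: Dict[str, str], nickname: str, flag: str) -> Dict[str, bool]:
    """Map each authority name to whether its vote gives `nickname` the flag."""
    return {auth: flag in flags_in_vote(text, nickname)
            for auth, text in votes.items()}


# Made-up vote fragments, for illustration only (not real votes).
SAMPLE_VOTES = {
    "authA": "r ExampleRelay AAAA BBBB 2021-10-14 06:00:00 192.0.2.1 443 0\n"
             "s Fast Guard Running Stable Valid\n",
    "authB": "r ExampleRelay AAAA BBBB 2021-10-14 06:00:00 192.0.2.1 443 0\n"
             "s Fast Running Valid\n",
    "authC": "r OtherRelay CCCC DDDD 2021-10-14 06:00:00 192.0.2.2 9001 0\n"
             "s Fast Running Stable Valid\n",
}

result = tally_flag(SAMPLE_VOTES, "ExampleRelay", "Stable")
for authority, has_flag in sorted(result.items()):
    print(f"{authority}: {'Stable' if has_flag else 'no Stable'}")
```

With the sample data only authA reports Stable for ExampleRelay. Run against real votes, the same tally would show which authorities are withholding the flag.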

The peering to each of those relays seems OK from what I can see (IP4 and IP6) so any idea what gives?

I’ve got a MIGHTYWANG sitting here twiddling its thumbs because half the directory authorities don’t want to use it. Bit of a waste.

I had something similar happen a few years ago with one of my old relays; again no obvious reason, it just seemed to be a random whim of the directory authorities.

I’ve noticed a couple of other long-term relays are in a similar position - is this some kind of attack, deliberate action or just Tor magic?

Wang

-- 
MIGHTYWANG 9B2BC7EFD661072AFADC533BE8DCF1C19D8C2DCC

Sebastian_Hahn · October 29, 2021, 6:04pm

Hi Wang,

I operate gabelmoo and your relay seems to be unreachable via IPv6 from here. Here's a traceroute:

traceroute to 2a02:29d0:8008:c0de:bad:beef:: (2a02:29d0:8008:c0de:bad:beef::), 30 hops max, 80 byte packets
1 informatik.gate.uni-erlangen.de (2001:638:a000:4140::1) 1.966 ms 2.037 ms 2.214 ms
2 constellation.gate.uni-erlangen.de (2001:638:a000::3341:33) 0.718 ms 0.770 ms 0.831 ms
3 yamato.gate.uni-erlangen.de (2001:638:a000::3033:30) 0.829 ms 1.122 ms 1.234 ms
4 * * *
5 * * *
6 * * *
7 ffm-bb1-v6.ip.twelve99.net (2001:2034:1:6b::1) 19.795 ms 19.786 ms 19.779 ms
8 prs-bb1-v6.ip.twelve99.net (2001:2034:1:be::1) 20.489 ms prs-bb2-v6.ip.twelve99.net (2001:2034:1:c1::1) 20.931 ms prs-bb1-v6.ip.twelve99.net (2001:2034:1:be::1) 20.509 ms
9 ldn-bb4-v6.ip.twelve99.net (2001:2034:1:7b::1) 19.517 ms ldn-bb1-v6.ip.twelve99.net (2001:2034:1:7a::1) 19.390 ms 19.334 ms
10 * * *
11 vaioni-ic326121-ldn-b2.ip.twelve99-cust.net (2001:2000:3080:937::2) 20.387 ms 19.464 ms 20.446 ms
12 2a02:29d0:0:1:: (2a02:29d0:0:1::) 39.577 ms 39.414 ms 39.363 ms
13 2a02:29d0:3:1003::1 (2a02:29d0:3:1003::1) 20.520 ms 20.514 ms *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

Perhaps this helps analyze the problem?

Cheers
Sebastian

_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays

Eddie · October 29, 2021, 5:57pm

Welcome to the club: https://gitlab.torproject.org/tpo/network-health/team/-/issues/128

Since Georg opened that (on my behalf) I too have lost the Stable flag.

Cheers.

Roman_Mamedov · October 29, 2021, 6:15pm

Sebastian Hahn wrote:
> I operate gabelmoo and your relay seems to be unreachable via IPv6 from here. Here's a traceroute:

Ping and traceroute to that IP don't get through for me either, from anywhere*, but a TCP connection to port 443 works. Perhaps you could recheck that too, on your end?

* Dropping ping and traceroute probes is a terrible firewalling practice; see what confusion it leads to.
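
Since ping and traceroute are filtered here, a plain TCP connect to the ORPort is a more reliable reachability test. A minimal sketch in Python follows; the helper name `tcp_reachable` and the timeout value are just illustrative choices, and it only shows that the port accepts connections from wherever you run it, not that Tor itself is healthy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection resolves the host and tries each resulting
        # address (IPv6 and/or IPv4) until one connects or all have failed.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

For example, `tcp_reachable("2a02:29d0:8008:c0de:bad:beef::", 443)` tests the relay's IPv6 address from the traceroute above on the port mentioned here. Running it from several vantage points would help separate a firewall that drops probes from a real routing problem.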

--
With respect,
Roman

Mighty_Wang · October 29, 2021, 10:17pm

Thanks Sebastian

Strangely, your relay gabelmoo is one of the relays I checked IP4/IP6 connectivity to, and I can hit your IP6 OK from MIGHTYWANG, so there is a route:

traceroute to 2001:638:a000:4140::ffff:189 (2001:638:a000:4140::ffff:189), 30 hops max, 80 byte packets
1 beijing.dsd-labs.com (2a02:29d0:8008::1) 0.091 ms 0.076 ms 0.088 ms
2 2a02:29d0:3:1003:: (2a02:29d0:3:1003::) 1.367 ms 1.378 ms 1.364 ms
3 2a02:29d0:0:1::1 (2a02:29d0:0:1::1) 1.487 ms 1.443 ms 1.458 ms
4 * * *
5 ldn-bb4-v6.ip.twelve99.net (2001:2034:1:7b::1) 1.839 ms ldn-bb1-v6.ip.twelve99.net (2001:2034:1:7a::1) 17.684 ms 17.402 ms
6 prs-bb1-v6.ip.twelve99.net (2001:2034:1:be::1) 17.454 ms prs-bb2-v6.ip.twelve99.net (2001:2034:1:c1::1) 18.639 ms 18.623 ms
7 ffm-bb1-v6.ip.twelve99.net (2001:2034:1:6b::1) 18.859 ms 17.696 ms ffm-bb2-v6.ip.twelve99.net (2001:2034:1:6c::1) 18.092 ms
8 kr-erl156-0.x-win.dfn.de (2001:638:c:a039::2) 21.150 ms 20.963 ms 21.541 ms
9 constellation.gate.uni-erlangen.de (2001:638:a000::3033:33) 21.509 ms 21.291 ms 22.232 ms
10 * informatik.gate.uni-erlangen.de (2001:638:a000::3341:41) 20.767 ms 21.339 ms
11 despari.informatik.uni-erlangen.de (2001:638:a000:4140::ffff:189) 21.215 ms 20.971 ms 20.725 ms

I think your UDP-based traceroute is hitting my firewall and getting dropped, but you do have a route to me - in fact your relay has a long-term active connection to mine via IP6 right now:

tcp6 0 0 2a02:29d0:8008:c0de:bad:beef:::443 2001:638:a000:4140::ffff:189:41011 ESTABLISHED

So it isn't an IP6 issue from what I can see (although that was an issue about 18 months ago as a result of some temporary peering issues).

I checked all the DA relays on IP6 and IP4 and all have active connections to me via IP6 (where they support it) or IP4 so if it is a connectivity issue it must be transient and so far undetectable.
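
A check like this can be scripted. Below is a minimal sketch in Python, assuming numeric netstat output (e.g. `netstat -tn`) in the same column layout as the line quoted above. The function names are made up for illustration. The gabelmoo address comes from the traceroute in this thread, while "exampleauth" and its address are placeholders rather than a real authority.

```python
import ipaddress
from typing import Dict, Set, Tuple


def split_host_port(endpoint: str) -> Tuple[str, int]:
    """Split netstat's 'address:port' form; the port follows the last colon."""
    host, _, port = endpoint.rpartition(":")
    return host, int(port)


def established_peers(netstat_output: str, local_port: int) -> Set:
    """Return remote addresses with an ESTABLISHED connection to local_port."""
    peers = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state.
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        _, port = split_host_port(fields[3])
        if port == local_port:
            remote_host, _ = split_host_port(fields[4])
            # Normalise so different spellings of one address compare equal.
            peers.add(ipaddress.ip_address(remote_host))
    return peers


def authorities_connected(netstat_output: str, local_port: int,
                          authority_addrs: Dict[str, str]) -> Dict[str, bool]:
    """Map each authority name to whether it is currently connected to us."""
    peers = established_peers(netstat_output, local_port)
    return {name: ipaddress.ip_address(addr) in peers
            for name, addr in authority_addrs.items()}


# First line is the connection quoted above; the second is made up.
SAMPLE = """\
tcp6 0 0 2a02:29d0:8008:c0de:bad:beef:::443 2001:638:a000:4140::ffff:189:41011 ESTABLISHED
tcp 0 0 192.0.2.10:443 198.51.100.7:50000 TIME_WAIT
"""

AUTHORITIES = {
    "gabelmoo": "2001:638:a000:4140::ffff:189",  # from the traceroute in this thread
    "exampleauth": "2001:db8::1",                # placeholder, not a real authority
}

status = authorities_connected(SAMPLE, 443, AUTHORITIES)
print(status)
```

A snapshot like this only proves connectivity at the moment netstat ran, so a transient outage would need the check repeated on a schedule and the results logged.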

There is something else happening here but I don't know what yet.

thanks

Wang

Mighty_Wang · October 29, 2021, 10:28pm

Hi Eddie

Yes I saw your post on the day it happened and guessed that we are suffering from exactly the same issue that started at exactly the same time.

I couldn’t correlate the loss of the Stable flag with anything in recent Tor server releases but I am going to recheck those; I am currently working my way back through the consensus voting lists in the run-up to the 14th October to try and understand where the problem started.

I’ll report back here.

thanks

W
