Browser Fingerprinting

trinity-1686a · December 27, 2023, 10:19pm

It’s surprising that coveryourtracks.eff.org finds you are unique. It tells me I’m not.
For AmIUnique, it’s a known issue that it considers TorBrowser unique, so unique in fact that it thinks you are unique again the 2nd time you visit. That’s because TorBrowser randomizes some pieces of information. The result is unique, but it also changes every time, so it’s not actually useful as a fingerprint. It’s hard to tell if AmIUnique sees something else until that issue is fixed, but the people behind AmIUnique are working on a new version of their platform which they haven’t open-sourced yet, so we can’t do much about it yet.

hobbitus · December 28, 2023, 11:53am

Thank you for the response.

I have compared the results from https://coveryourtracks.eff.org/ and the only difference between my desktop and laptop is the screen resolution, which is probably the reason why the fingerprints are unique.

At least I cannot find anything else. Or are there any other suggestions?

I appreciate that 1200x1600 of my desktop is not a common screen size, but surprisingly 1920x1200 of my laptop is also so uncommon that consequently [my] “browser fingerprint appears to be unique among the 188,439 tested in the past 45 days.”

Therefore, the result of ‘Protecting you from fingerprinting?’ is “Your browser has a unique fingerprint”.

Letterboxing (privacy.resistFingerprinting.letterboxing) does not make it any better; it makes the fingerprint even more unique. It would be great if it was possible to set letterboxing to 1920x1080, which is one of the most common screen resolutions, at least on my laptop.

thorin · December 28, 2023, 12:42pm

sites that give entropy figures are nonsense … let me count the ways

very limited datasets
data sets are heavily skewed by privacy-conscious users (FF is massively over-represented, for example)
data sets are further tainted by repeat visitors
data sets are even further tainted by repeat visitors changing settings and repeatedly visiting

Sites that claim to provide entropy figures are absolute snake oil. They may be good to see what is reported, but that’s it. EFF’s cover your tracks had a purpose, to show that fingerprinting was a real threat - they should add disclaimers about their BS figures

Stop making assumptions. Do you understand what and how it is being tested and how it is used to calculate anything?

Comparing tests is a waste of time (well, the entropy figures are nonsense for a start), as the tests and the purpose of each test can vary. For example, CYT detects some randomness and can thus return a static value for that test, such as “canvas: random”, but amiunique doesn’t do this, so it will always return a unique result for canvas, and thus an overall unique result
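To make the “canvas: random” idea concrete, here is a minimal sketch of how a test page could detect per-read randomization and collapse it into a static value. This is not CYT’s actual code: `read_canvas_randomized`, `read_canvas_stable`, `canvas_metric` and `naive_metric` are hypothetical names, and reading twice only catches values that change per execution, not ones that stay fixed for a whole session.

```python
import hashlib
import os


def read_canvas_randomized() -> bytes:
    """Hypothetical stand-in for a canvas readback that is randomized on every read."""
    return b"canvas-pixels" + os.urandom(4)


def read_canvas_stable() -> bytes:
    """Hypothetical stand-in for a canvas readback that never changes."""
    return b"canvas-pixels"


def canvas_metric(read) -> str:
    """Read twice; if the two reads differ, report a static label instead of a hash."""
    first, second = read(), read()
    if first != second:
        return "random"
    return hashlib.sha256(first).hexdigest()


def naive_metric(read) -> str:
    """Hash a single read, so a randomized value looks unique on every visit."""
    return hashlib.sha256(read()).hexdigest()


if __name__ == "__main__":
    print(canvas_metric(read_canvas_randomized))  # static label shared by all such users
    print(naive_metric(read_canvas_randomized) == naive_metric(read_canvas_randomized))
```

With the first approach every randomizing browser lands in the same bucket; with the second, each visit looks like a brand new unique browser, which is why two sites can disagree about the same browser.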

A fingerprint is just a snapshot in time, and can be manipulated after the fact - it is not incumbent on sites to TELL you what is used and what isn’t - and what can be bypassed or discarded in order to linkify other fingerprints. Always treat fingerprints as snapshots that can be fuzzed after the fact.

let’s look at this: Global Statistics - Am I Unique? - last 30 days

36% of users are using Firefox
- in reality we know that FF is about 3% worldwide share, or 6% on desktop
72% are requesting en-* (English)
- it’s a shame this is not broken down by locale
- this is simply not true. We’re talking users/profiles on the internet, not people in the world, so some languages will be under-represented, and a lot of users do use en-* as their second language. But almost three quarters of internet users being English is a stretch
22% are in timezone UTC0
- it’s a shame this is not broken down by actual timezone name instead of classifying everything as UTC-something
- again with internet users vs populations this is a bit vague - but 22% of users being in greenwich mean time is bollocks
and I could go on
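As a rough illustration of how much the skew matters, this sketch compares the bits a site would assign to “is Firefox” using its own visitors (36%) with the bits from the worldwide share quoted above (3%). The helper name `surprisal_bits` is mine, and the shares are just the ones from this post.

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information carried by a value seen in `share` of the data set."""
    return math.log2(1 / share)


site_share = 0.36  # Firefox share among the site's visitors (from the stats page)
real_share = 0.03  # Firefox share worldwide (figure used in this post)

over_representation = site_share / real_share

if __name__ == "__main__":
    print(f"Firefox is over-represented {over_representation:.0f}x in the site's data")
    print(f"bits according to the site's data: {surprisal_bits(site_share):.2f}")
    print(f"bits according to worldwide share: {surprisal_bits(real_share):.2f}")
```

The same metric gets a very different figure depending on whose visitors are in the data set, which is the point about tainted data.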

Let’s look at some more nonsense (but I get that these sites are using all visitors). On CYT using TB (en-US) for windows

userAgent: (FF115 windows 10 64 bit)
- says 1 in 3.45 browsers have this value
- reality says FF is 3% (call it 1 in 33) worldwide, windows is 80% (1 in 1.25), and ESR is about 10% (1 in 10), so the real figure is approx 1 in 413
- you also can’t hide the fact that you’re using TB or your OS, and TB has e.g. 1 million daily windows desktop users, so the entropy (as far as we’re concerned), within the barest of buckets (equivalency), is actually zero
this one might explain the zero entropy better
- says my timezone of UTC is entropy 2.35 bits
- ALL tor browser users report this value, so it’s NIL (for our set)
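The arithmetic in the userAgent and timezone examples above can be written out as a small sketch. It treats the three shares as independent, which is the same simplification the estimate makes, and the function names are only for illustration.

```python
import math


def combined_one_in(*one_in_values: float) -> float:
    """Multiply independent '1 in N' odds into a single '1 in N' figure."""
    return math.prod(one_in_values)


def bits(one_in: float) -> float:
    """Bits of information for a value shared by 1 in `one_in` users."""
    return math.log2(one_in)


# Firefox (1 in 33) x Windows (1 in 1.25) x ESR (1 in 10) = 412.5, rounded to 413 above
worldwide = combined_one_in(33, 1.25, 10)

if __name__ == "__main__":
    print(f"worldwide estimate: 1 in {worldwide:.1f} ({bits(worldwide):.2f} bits)")
    print(f"site's userAgent figure: 1 in 3.45 ({bits(3.45):.2f} bits)")
    print(f"site's timezone figure: 2.35 bits = 1 in {2 ** 2.35:.1f}")
    # Within the TB set every user reports UTC, i.e. 1 in 1
    print(f"timezone within the TB set: {bits(1):.2f} bits")
```

A value that every member of the set shares is “1 in 1”, and log2(1) is 0 bits, which is what zero entropy means here.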

The way we defeat fingerprinting linkability is to take each metric and reduce the entropy in it in our set (our set being TB users) - and there are some things you can’t lie about (such as requesting web content in a language - e.g. if you need arabic, then request arabic) or hide (version, os, fonts). So for lack of a better word, we call this equivalency. E.g. if you have windows fonts, that’s equivalency of being on windows (os). Or if you have certain default fonts, that’s equivalency of language, etc. We can randomize if we want (per execution or per session+eTLD+1) but ultimately all randomizing can be detected. So this is not some magic bullet - it only exposes that some sites/scripts are lazy. We assume advanced scripts. So we protect each metric one by one, making it harder and more costly for scripts, until they give up and it becomes prohibitive - but we must balance that with usability and compat
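For anyone wondering what “per session+eTLD+1” randomizing could look like, here is a toy sketch. It is an assumption for illustration only, not how TB or Firefox actually implement it: the noise is derived from a hypothetical per-session secret plus the site’s eTLD+1, so it is stable for one site within a session but differs across sites and across sessions.

```python
import hashlib
import hmac
import os


def new_session_key() -> bytes:
    """Hypothetical per-session secret, regenerated whenever a new session starts."""
    return os.urandom(32)


def noise_seed(session_key: bytes, etld_plus_one: str) -> bytes:
    """Derive a seed that is stable for one site within one session."""
    return hmac.new(session_key, etld_plus_one.encode("utf-8"), hashlib.sha256).digest()


def perturb(value: bytes, seed: bytes) -> bytes:
    """Toy noise: flip the low bit of each byte according to the seed."""
    return bytes(b ^ (seed[i % len(seed)] & 1) for i, b in enumerate(value))


if __name__ == "__main__":
    key = new_session_key()
    pixels = b"\x10\x20\x30\x40"
    same_site_a = perturb(pixels, noise_seed(key, "example.com"))
    same_site_b = perturb(pixels, noise_seed(key, "example.com"))
    print(same_site_a == same_site_b)  # stable for the same site in the same session
```

Per-execution randomizing would instead draw fresh noise on every read, which is even easier for a script to spot by reading twice.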

The way to determine how many values a metric may return is to test and collect + analyze the data (e.g. checking for equivalency or other external factors such as device pixel ratio), and then the only way to get any real world entropy is to do a large scale test collecting the data, one per profile (so as to not taint the data set)
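A minimal sketch of why “one per profile” matters, using made-up sample data: a single repeat visitor who keeps changing settings inflates the measured entropy, while keeping one sample per profile removes that. `dedupe_by_profile` and the profile ids are hypothetical; a real study would need its own way of telling profiles apart.

```python
import math
from collections import Counter


def dedupe_by_profile(samples):
    """Keep only the first reported value for each profile id."""
    first_seen = {}
    for profile_id, value in samples:
        first_seen.setdefault(profile_id, value)
    return list(first_seen.values())


def shannon_entropy_bits(values) -> float:
    """Shannon entropy (in bits) of the observed value distribution."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Made-up data: profile "a" visits three times with different window sizes
samples = [
    ("a", "1000x900"),
    ("a", "1100x900"),
    ("a", "1200x900"),
    ("b", "1000x900"),
]

if __name__ == "__main__":
    raw = shannon_entropy_bits([value for _, value in samples])
    clean = shannon_entropy_bits(dedupe_by_profile(samples))
    print(f"raw: {raw:.2f} bits, one per profile: {clean:.2f} bits")
```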

for example - collect TB115 only fingerprint data: this immediately removes all non-TB noise and e.g. UTC0 = all users = zero entropy (for us) - capisce?

tl;dr: stop comparing different sites’ results, stop using entropy figures from sites

I’m just going to stop here - I’m supposed to be writing this all up for some doc/blog

thorin · December 28, 2023, 12:46pm

wrong! LBing actually enforces a much smaller subset of inner window sizes (which we use to report outer window, screen + available screen) - without it, TB users who tile, manually resize, maximize, go full screen, or have inner windows off by ±1 pixel (due to rounding issues from device pixel ratios/system scaling) would create numerous (millions of) different potential sizes
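To illustrate the bucketing, here is a rough sketch that rounds inner window dimensions down to a fixed step and counts how many distinct sizes remain. The 100px step and the size ranges are assumptions chosen for the example; the real stepping used by the browser is not spelled out in this thread.

```python
def letterbox(dimension: int, step: int = 100) -> int:
    """Round a dimension down to a multiple of `step` (hypothetical step size)."""
    return max(step, dimension - dimension % step)


def distinct_sizes(widths, heights, step=None) -> int:
    """Count distinct (width, height) pairs, optionally after letterboxing."""
    if step is None:
        return len({(w, h) for w in widths for h in heights})
    return len({(letterbox(w, step), letterbox(h, step)) for w in widths for h in heights})


# Every inner size a user could end up with by resizing within these ranges
widths = range(1000, 1600)
heights = range(600, 1000)

if __name__ == "__main__":
    print("without letterboxing:", distinct_sizes(widths, heights))
    print("with letterboxing:", distinct_sizes(widths, heights, step=100))
```

In this toy range, hundreds of thousands of possible sizes collapse into a couple of dozen buckets, and an inner window that is off by a pixel lands in a shared bucket instead of becoming its own value.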

no one cares what the rest of the world is, it only matters what TB users are within our own set

thorin · December 29, 2023, 5:55am

actually, I got that a little mixed up … windows is 80% of desktop, but desktop is just under half of all users … so it’s double that … 1 in 826 - this is rather different from 1 in 3.45

edit: had it right the first time 3% already includes worldwide