What’s Brave Done For My Privacy Lately? Episode #4: Fingerprinting Defenses 2.0
Brave Fingerprinting Protections v2: Farbling for Greater Good
Brave is redesigning its browser fingerprinting defenses to build on the randomization-based techniques discussed in the previous post. These new defenses provide stronger and more web-compatible protections by default: for users who are willing to accept some broken sites for further privacy, they can opt into an even stronger set of defenses. This system is currently being developed, and parts of it can be used today in our Nightly builds.
Brave’s goal is to both be the best browser for protecting your privacy, and the best browser for day-to-day, full-featured Web use. This post describes new privacy features being developed in Brave to better protect user privacy, without breaking privacy-respecting, user-serving websites. These new features build on the randomization based defenses described in the previous entry in this series, and we detail how these randomization defenses will be applied to more of the Web API in Brave. We will cover three areas:
- Past Generation Fingerprinting Defenses
- Brave’s New Fingerprinting Defense System: Farbling
- A Possible Future of Fingerprinting Protections in Brave
1. Past and current Generation Fingerprinting Protections
Brave’s goal is to improve the Web by, among other things, providing a full featured, pleasant to use browser that protects your privacy without degrading the browsing experience. One difficult type of privacy-threat that Brave protects against is “browser fingerprinting”. This section briefly describes what browser fingerprinting is, how browsers (including Brave) have attempted to defend against it, and why current protections are insufficient.
What is Fingerprinting? (a brief refresher)
Browser fingerprinting is a technique for identifying and tracking people on the Web by combining multiple semi-identifiers (things that are slightly different about each person’s browser, such as the size of the browser window or computer hardware details) and combining them into a single, unique identifier.
The technique works like this: different people have different language preferences, use different operating systems, etc. In isolation, none of these differences is likely to be unique enough to identify you. A website can see that one user reads French, while another reads Malay, but there will be many French and Malay speakers (amongst others), so a site can’t track individuals based on whether they prefer French, Malay, or any other language. Similarly, some users use MacOS, others Windows and others Linux; there are many people in each category.
However, by combining a large number of these semi-identifiers, sites can identify individuals. A site might have a lot of French readers, and a lot of Linux users; only a much smaller number of people will be both using Linux and preferring French. By combining many such semi-identifiers, sites can often uniquely identify (and so track) a large percentage of their users.
How Do Most Browsers Currently Address Fingerprinting?
Most browsers try to prevent fingerprinting by reducing the number of possible values each of these semi-identifying features return. For example, instead of specifying that a user prefers “Australian English” or “British English”, the browser might just report “English” in order to reduce how identifying a language preference is.
While this can bring some privacy improvement at the margins, in practice these defenses are insufficient for several reasons. First, there are many semi-identifiers in the browser, so reducing the identifiability of just some isn’t sufficient to keep most users private (especially for people on uncommon hardware, or uncommon languages, etc). Second, privacy-through-similarity fingerprinting protections are only effective on popular sites; if a site doesn’t have large numbers of visitors that are similar to you, your browser will still look unique. Third, and most tricky, modifying values to be less identifying can break websites that rely on that value being accurate. This category of fingerprinting protection then creates a privacy/functionality trade-off.
How is Brave’s Current System Evolving to Further Prevent Fingerprinting?
Brave is transitioning between two systems for preventing fingerprinting; in some places, Brave removes or otherwise modifies browser features, to try to make different Brave users look similar. While useful, these defenses have all the weaknesses discussed above. Because of these weaknesses, and because these defenses often break websites, Brave applies these defenses only in third-party frames by default. Again, useful but not ideal.
However, Brave also ships some defenses that employ a very different strategy: instead of trying to make everyone look identical, these defenses try to make everyone look different, to each website, for each session. Since this category of defenses make Brave users look different to each site, sites cannot use fingerprinting to track users across sites.
These defenses were discussed in detail in the previous entry in this series, and are currently applied to the canvas and Web Audio APIs. The rest of this post describes both how Brave will improve its existing randomization-based defenses, and how Brave plans to apply these randomization-techniques to many more parts of the browser.
2. Fingerprinting Protections 2.0: Farbling for Great Good
Brave currently applies randomization-based protections to canvas and Web audio based fingerprinting. We call this privacy-through-randomization technique “farbling”. This section presents how Brave is building on existing farbling protections to further protect against fingerprinting.
What is Farbling?
Farbling is Brave’s term for slightly randomizing the output of semi-identifying browser features, in a way that’s difficult for websites to detect, but doesn’t break benign, user-serving websites1. These “farbled” values are deterministically generated using a per-session, per-eTLD+1 seed2 so that a site will get the exact same value each time it tries to fingerprint within the same session, but that different sites will get different values, and the same site will get different values on the next session. This technique has its roots in prior privacy research, including the PriVaricator (Nikiforakis et al, WWW 2015) and FPRandom (Laperdrix et al, ESSoS 2017) projects.
Brave’s farbling-based fingerprinting protections have three levels, each described in more detail in the following subsections:
- Off: no fingerprinting protections are applied
- Default: protections that prevent fingerprinting and have a low risk of breaking websites
- Maximum: protections designed to provide fundamental defenses against fingerprinting, even at the risk of breaking sites
This system is currently under development, with some parts shipping in Nightly builds, and others still being built. We anticipate the full system to be completed in the next few months, and deployed in our release builds shortly after.
These defenses are applied in both first and third-party contexts. Since it’s trivial and common for third-party frames to collude with the first-party when tracking users, we need to apply our defenses accordingly.
Farbling Level: Off
In this setting, Brave will apply no fingerprinting protection techniques. Our goal is for users to never apply this setting. The “off” setting is included for sites requiring a high level of trust, for developers testing functionality, or other uncommon cases.
Farbling Level: Default
The default setting for farbling-based fingerprinting is to add small amounts of randomness to semi-identifying endpoints; small enough that it’s not noticeable to humans, but sufficient to prevent sites from tracking you.
This setting will be the default configuration, and is designed as a balance between privacy protections and web compatibility. We will continue to tweak and improve as we see how online trackers respond.
The primary goal of these defenses is to provide strong protections against web-scale trackers and advertisers, who want to identify users, but can’t spend outsized effort on any one target. As a secondary goal, these defenses aim to make benign use look very different from malicious use, so as to leave room for further intervention opportunities if needed. And third, when the prior two goals aren’t achievable (because of the nature of the APIs), we aim to significantly increase the amount of work required by the attacker, by requiring attackers to enumerate devices they’re trying to fingerprint, or reduce the amount of material fingerprints can draw from.
We should note that the “default” category still uses values derived from the “true” values of the underlying features. Because of this, we expect that these defenses could be circumvented by motivated, targeted attacks. Protecting against these hypertargeted attacks is not Brave’s primary goal; Brave’s goal is to protect users from the kinds of online trackers and privacy violations that are (sadly) pervasive on the Web, which in turn depend on large-scale economic return on investment to the fingerprinting adversary. Users requiring protection against targeted attacks may be better served by using tools specifically designed to protect against those, such as the Tor Browser Bundle.
Farbling Level: Maximum
Finally, Brave’s new fingerprinting defenses will include a third, “maximum” protection setting that provides additional privacy protections, although it may also break sites. In this category, randomized values are returned without incorporating any “true” underlying input. Where the “default” setting will add subtle randomness to fingerprintable outputs, the “maximum” setting is only the random values.
Because, in this configuration, the returned values are unrelated to the “true”, fingerprintable values otherwise returned by the relevant APIs, we do not expect these defenses to be susceptible to the kinds of statistical or easily-distinguishable attacks that are possible in the “default” configuration. This, however, carries with it a heightened risk of breaking sites, which may not function correctly when given random values.
What Will Get Farbled (and How, and When)?
The previous subsections described the goals and levels of protection in Brave’s new farbling defenses. This section describes which browser features will be modified under the “default” and “maximum” configurations, and how.
The below table lists which features will be modified by our farbling-based fingerprinting defenses. The first column describes the interface being modified, and the second column lists the specific properties and functions being modified. The third column gives the link and issue number we use internally for tracking the implementation status of each change.
3. Fingerprinting Protections Still to Come
We’re excited to share our new farbling-focused fingerprinting defenses. Fingerprinting is an extremely difficult problem to solve without breaking the Web, so much so that browser implementers often pose the problem as “browser features or privacy, pick one”. Farbling is part of Brave’s effort to show that this is a false dilemma, and that the Web can be richly featured and privacy respecting.
However, while we’re confident farbling improves privacy on the Web, there is still more work to do.
Fingerprinting Protections v3?
The initial version of farbling-based defenses, currently being implemented and described in this post, cover the majority of APIs used in popular fingerprinting attacks today. However, there are additional semi-identifying features we plan to address too. For example, we have in-development plans to address font-enumeration based fingerprinting, and we’d like to share less identifying values about graphics-card hardware by default. We’ll share more about these plans as we complete development of the initial farbling system.
Fighting Fingerprinting By Improving Standards
Finally, the farbling-defenses described here will end up not being useful if new browser features are added to the Web that allow for new forms of tracking and identification. To prevent this from happening, Brave works in the W3C to make sure new browser features are privacy-preserving and human-respecting.
The main, though not only, way Brave works for privacy in standards is through PING, the W3C body responsible for reviewing new specifications and fighting for them to be privacy-respecting. This is ongoing, perpetual work, and we’re eager to work with privacy-minded partners in the W3C to improve the Web for everyone.
Brave currently leads the way in fingerprinting protection, and no other browser offers users the functionality that Brave presently features or is working on implementing. When users browse with Brave, they know they’re using a browser that puts their interests first, and that our novel fingerprinting protection techniques prevent them from being tracked by sites and third parties. We look forward to receiving feedback on our new techniques, and hope that others will choose to implement similar approaches to give users the privacy they deserve.
- We’re not sure where the term farbling comes from. Brendan Eich pointed to prior use of the term to mean “twiddling”, or “subtly modifying”. That phrase was more convenient to say than “privacy protections based on randomization” so we ran with it.
- Each time you start Brave, a unique, random session token is created. This is never exposed to websites in any form, and is regenerated when you restart Brave. This token is then mixed (i.e. HMAC256) with each first-party, top-frame domain you visit, to generate a new, session-lifetime token per domain. All farbled values are generated from these per-session, per-domain random seeds.
Continue reading for news on ad blocking, features, performance, privacy and Basic Attention Token related announcements.
Brave opposes FLoC, a recent Google proposal that would have your browser share your browsing behavior and interests by default with every site and advertiser with which you interact.
You can learn quite a bit about a browser from observing the requests it makes in its first moments with a new user profile. Often, a cursory examination will tell you a great deal about how the browser thinks about, and handles, user privacy and security.
This post presents “ephemeral site storage”, a new strategy for managing third-party storage in Brave, designed to improve Web compatibility, while maintaining the same level of privacy protection.