Brave, Fingerprinting, and Privacy Budgets
This work was written by Peter Snyder, Privacy researcher at Brave, and Dr. Ben Livshits, Brave’s Chief Scientist.
One of Brave’s core goals is to protect people’s privacy online, and to prevent online trackers from following you around the Web. Brave protects users against many different forms of tracking, including state-based tracking, IP tracking, and fingerprinting (or passive) tracking.
Other browsers have announced plans to protect user privacy, including against fingerprinting attacks. Recently, Google announced that part of their planned defenses against fingerprinting attacks would be a system called privacy budget, where sites would be able to access semi-identifying browser functionality up to some threshold, after which sites would be restricted to less identifying data. This approach is part of a lineage of well-explored dynamic-protection approaches.
In this post, we discuss why Brave is skeptical about such dynamic approaches to Web privacy, and outline our plans for future protections. The post first summarizes what browser fingerprinting is, and common defenses. Second, the post presents problems with “dynamic privacy approaches”, and why Brave is skeptical they are effective for protecting against fingerprinting. Third, the post presents Brave’s fingerprinting protections, current, upcoming and longer-term.
What is Browser Fingerprinting
Browser fingerprinting is a technique websites and advertisers use to track people without consent. The approach works by combining a large number of semi-identifiers into a single, highly identifying value that can be used to recognize a user.
Examples of semi-identifiers commonly used by trackers include browser version, plugins installed, and window size. Mozilla has compiled a more complete list of fingerprinting methods. The Electronic Frontier Foundation’s Panopticlick project is a good demonstration of how browser fingerprinting works.
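As a rough illustration of how semi-identifiers combine, the sketch below hashes a few attribute values into one identifier. The attribute names, the sample values, and the `fingerprint` function are assumptions made up for this example; real fingerprinting scripts read many more values from the browser.

```python
import hashlib
from typing import Mapping


def fingerprint(attributes: Mapping[str, str]) -> str:
    """Combine semi-identifying attributes into one short identifier."""
    # Sort the keys so the same attributes always serialize the same way.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical visitors: each value on its own is shared by many people.
visitor_a = {
    "browser_version": "Example/80.0",
    "plugins": "pdf-viewer",
    "window_size": "1280x720",
}
visitor_b = {
    "browser_version": "Example/80.0",
    "plugins": "pdf-viewer",
    "window_size": "1440x900",
}

# The two visitors differ in only one attribute, yet get unrelated identifiers.
print(fingerprint(visitor_a))
print(fingerprint(visitor_b))
```

No single value here identifies anyone, but the combination narrows the set of matching visitors with every attribute added.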
Fingerprinting is increasingly common, partially in response to privacy-focused browsers blocking cookie-based tracking methods, either across the entire Web (e.g. Brave and Safari), or for domains suspected of being tracking related (e.g. Firefox). Browser vendors must develop new protections to prevent fingerprinting-based tracking.
Approaches to Protecting Against Fingerprinting
Broadly, there are four ways to defend against fingerprinting:
1. Remove Functionality
The simplest option for dealing with privacy-harming functionality is to just remove the functionality. This might also take the form of changing the feature so that it always returns the same value. This approach is effective, but risks breaking websites.
2. Determine Access By Trust
Browsers can also restrict access to functionality based on trust, allowing sites the user trusts to access the functionality, and blocking all other sites from accessing it. The most common way browsers infer trust is by treating the first party (i.e. the domain of the URL that appears in the URL bar) as trusted, and third parties (i.e. any other domains that appear in the URLs of <iframe>s) as less trusted.
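A minimal sketch of the first-party heuristic follows. The function name `may_use_sensitive_api` and the URLs are hypothetical, and comparing exact hostnames is a simplification made for this example rather than a description of how any browser decides what counts as the same party.

```python
from urllib.parse import urlsplit


def may_use_sensitive_api(top_level_url: str, frame_url: str) -> bool:
    """Grant access only when the frame shares the host shown in the URL bar.

    Simplification: exact hostname equality stands in for the browser's
    real notion of "same party".
    """
    return urlsplit(top_level_url).hostname == urlsplit(frame_url).hostname


page = "https://news.example/article"
print(may_use_sensitive_api(page, "https://news.example/comments"))  # True
print(may_use_sensitive_api(page, "https://ads.example/banner"))     # False
```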
3. Add Randomness to Fingerprinting Functionality
Another approach is to add randomness to the outputs of semi-identifying functionality. For example, browsers might make small, frequent changes to the User Agent string, so that each time a site reads the value, it’s slightly different, preventing identification. This approach has so far been more popular in research than in shipping browsers (e.g. the PriVaricator and FPRandom papers). The approach is appealing because it exploits a quirk of popular fingerprinting libraries. Most popular libraries build identifiers by mixing multiple values. Randomizing one value would have the downstream effect of randomizing the entire identifier.
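The downstream effect described above can be sketched in a few lines. Everything here is an assumption for illustration: the `noisy_user_agent` helper, the made-up User Agent format, and the choice to vary a minor OS version number.

```python
import hashlib
import random
from typing import Sequence


def fingerprint(values: Sequence[str]) -> str:
    """Mix several values into one identifier, as fingerprinting libraries do."""
    return hashlib.sha256("|".join(values).encode("utf-8")).hexdigest()[:16]


def noisy_user_agent(rng: random.Random) -> str:
    # Hypothetical noise: vary a minor OS version number on every read.
    return f"Example/80.0 (ExampleOS 10.{rng.randint(0, 999)})"


rng = random.Random()
stable_values = ["1280x720", "pdf-viewer"]  # values the browser leaves untouched

# Only one input changes between reads, but the whole identifier changes with it.
first = fingerprint(stable_values + [noisy_user_agent(rng)])
second = fingerprint(stable_values + [noisy_user_agent(rng)])
print(first, second)
```

A tracker that hashes all of its inputs together cannot tell which input moved, so a single randomized value is enough to break the combined identifier.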
4. Determine Access By Threshold (e.g. Privacy Budgets)
A final approach to preventing fingerprinting is to restrict access to fingerprinting functionality based on how the website has behaved previously, specific to the user. This is different from the “determined by trust” approach, where access decisions are made using more general heuristics, and not user-by-user.
Google takes this approach in the “Privacy Budget” component of its Privacy Sandbox plan. As currently described, sites can access some identifying information, but once the site hits some threshold of “identifiability”, the site is prevented from accessing further related functionality.
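Since the specifics of the proposal have not been published, the following is only a sketch of the general threshold idea, not Google’s design. The `PrivacyBudget` class, the per-feature costs in bits, and the limit of 10 are all invented for this example.

```python
class PrivacyBudget:
    """Tracks how much identifying information one site has been given."""

    def __init__(self, limit_bits: float) -> None:
        self.limit_bits = limit_bits
        self.spent_bits = 0.0
        self.granted = set()

    def request(self, feature: str, cost_bits: float) -> bool:
        """Return True if the site may read the feature, charging it once."""
        if feature in self.granted:
            return True  # already paid for; re-reading reveals nothing new
        if self.spent_bits + cost_bits > self.limit_bits:
            return False  # threshold reached: access is refused
        self.spent_bits += cost_bits
        self.granted.add(feature)
        return True


# Hypothetical costs: how identifying each feature is assumed to be.
budget = PrivacyBudget(limit_bits=10.0)
print(budget.request("canvas", 6.0))  # True
print(budget.request("webgl", 5.0))   # False (6 + 5 exceeds the limit)
print(budget.request("fonts", 4.0))   # True (6 + 4 fits exactly)
```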
Dynamic Privacy Budgets, And Where They Fail
Brave Browser is a privacy-focused browser. Brave is built on Chromium, the same code base used in Google’s Chrome browser. As a result, we’ve been asked if we plan on adopting Google’s Privacy Budget approach to fingerprinting protections. While the specifics of the privacy budget proposal haven’t been presented yet, in this section we explain why Brave is in general skeptical of dynamic-budget approaches.
1. Privacy Budgets Are a Mismatch for Fingerprinting
Privacy budget approaches for privacy-preserving technologies have a long history in industry and production systems. Perhaps the most popular application is differential privacy, a category of data anonymization techniques that attempt to provide anonymity without sacrificing (too much) usefulness by (to simplify how it works) adding small amounts of randomness.
The designers of these systems anticipated that attackers could “reverse” this randomness by asking lots of questions. As a countermeasure, many differential privacy systems include a “privacy budget”, where people can query the data to a limit, after which more randomness is added, to prevent attackers from “reversing” the injected noise. An implication of the above is that, as a fixed data set is queried more and more, the “defender” / “anonymiser” needs to add more and more randomness to maintain privacy.
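The “reversing” attack can be sketched as follows: one noisy answer says little, but the average of many answers converges on the true value. The `NoisyCounter` class is hypothetical, Gaussian noise stands in for whatever distribution a real system uses, and the budget here simply refuses further queries, which is a simplification of the behavior described above.

```python
import random


class NoisyCounter:
    """Answers queries about a fixed value with random noise added."""

    def __init__(self, true_value: float, noise_scale: float,
                 max_queries: int, seed=None) -> None:
        self._true_value = true_value
        self._noise_scale = noise_scale
        self._remaining = max_queries
        self._rng = random.Random(seed)

    def query(self) -> float:
        if self._remaining <= 0:
            raise PermissionError("query budget exhausted")
        self._remaining -= 1
        return self._true_value + self._rng.gauss(0.0, self._noise_scale)


def estimate(counter: NoisyCounter, n_queries: int) -> float:
    """Attacker strategy: average repeated answers to cancel out the noise."""
    answers = [counter.query() for _ in range(n_queries)]
    return sum(answers) / len(answers)


# Without a meaningful limit, averaging recovers the hidden value closely.
generous = NoisyCounter(true_value=42.0, noise_scale=10.0, max_queries=10_000)
print(estimate(generous, 10_000))

# With a small budget, the attacker is left with only a few noisy answers.
strict = NoisyCounter(true_value=42.0, noise_scale=10.0, max_queries=5)
print(estimate(strict, 5))
```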
Dynamic budget approaches to fingerprinting protections follow a similar strategy: allow websites to learn about the visitor up to some “safe” threshold, after which websites are prevented from learning more.
However, two aspects of the Web prevent a differential-privacy privacy-budget approach from being successful. First, browsers can’t add ever greater “noise” to the user experience, in a way analogous to adding increasing noise to a dataset. The result would be a browser that performed strictly worse over time, as the client had to add more and more noise to prevent websites from learning about the fixed underlying, fingerprintable values.
Second, there are few-to-no mechanisms in place to prevent websites from repeatedly “querying” the browser for fingerprinting information. This point is discussed in more detail in the next section.
2. No Good Budget Scope to Protect Privacy
Budget-based approaches to fingerprinting protection also have the problem of determining the correct scope for the budget. There are, broadly speaking, two possible approaches here: budgets per origin (e.g. every domain gets its own budget), or budgets per document (every origin / frame on a page shares a single budget). Both of these approaches are unsatisfactory, with intermediate options having the problems of both extremes.
If each origin has its own budget, attackers are not meaningfully constrained. Domains are cheap, and so trackers can cheaply extend their budget. In practice, an attacker would register (say) 10 domains, inject <iframe>s that point to each of those domains, carry out 10% of the needed fingerprinting in each frame and postMessage the result back to the parent frame.
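The domain-splitting attack above can be modeled in a few lines. The `PerOriginBudget` class, the feature names, and the `.example` origins are assumptions for this sketch; the loop in `collect` stands in for frames that each read part of the fingerprint and pass it to the parent with postMessage.

```python
from collections import defaultdict

FEATURES = [f"feature_{i}" for i in range(10)]  # hypothetical fingerprint inputs


class PerOriginBudget:
    """Every origin gets its own allowance of distinct features."""

    def __init__(self, features_per_origin: int) -> None:
        self.limit = features_per_origin
        self.used = defaultdict(set)

    def request(self, origin: str, feature: str) -> bool:
        used = self.used[origin]
        if feature in used:
            return True
        if len(used) >= self.limit:
            return False
        used.add(feature)
        return True


def collect(browser: PerOriginBudget, origins: list) -> set:
    """Spread the feature reads across the origins and pool what is learned."""
    learned = set()
    for index, feature in enumerate(FEATURES):
        origin = origins[index % len(origins)]
        if browser.request(origin, feature):
            learned.add(feature)
    return learned


one_domain = collect(PerOriginBudget(1), ["tracker.example"])
ten_domains = collect(PerOriginBudget(1), [f"t{i}.example" for i in range(10)])
print(len(one_domain), len(ten_domains))  # 1 10
```

Each origin stays within its budget, yet the attacker ends up with the full set of values.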
The other approach is to scope the budget to the document (or, the first party origin). Under this approach, every frame in the page would share the same budget. While this approach would solve the problems in the previous approach, it has its own fatal problem: child frames can exhaust the budget of the parent frame, meaning that the page a user wishes to visit could be broken by the behavior of an unrelated, “isolated” third party frame. Such a policy would quickly undo an enormous amount of security- and privacy-focused work that’s gone into increasing the isolation between first and third parties; embedded sites would be able to affect, control, or break the functionality of the top level document.
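A small model of the shared-budget problem, under the assumption that the document allows three distinct features in total. The `DocumentBudget` class and the origins are hypothetical.

```python
class DocumentBudget:
    """One allowance shared by the top-level page and every frame in it."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = set()

    def request(self, frame_origin: str, feature: str) -> bool:
        # frame_origin is deliberately ignored: all frames draw on one pool.
        if feature in self.used:
            return True
        if len(self.used) >= self.limit:
            return False
        self.used.add(feature)
        return True


budget = DocumentBudget(limit=3)

# An embedded third-party frame runs first and spends the whole allowance.
for feature in ("canvas", "webgl", "audio"):
    budget.request("ads.example", feature)

# The page the user actually came for is now refused.
print(budget.request("news.example", "fonts"))  # False
```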
3. Budget Lifetimes Trade Off Against Web Compatibility
Similarly, budget approaches do not work on the Web because there is no workable lifetime for such budgets. If the budget is short (e.g. resets after page load or an hour or day or similar), then it provides little protection.
If budgets are long, then the approach will be unworkable from a web-compatibility perspective, as sites will be unable to update confidently. Consider this contrived but, we hope, demonstrative example: a privacy budget policy that resets after 3 days, and allows websites to access a single feature. On Monday, example.com uses feature X. Then on Tuesday, they push an update and use feature Y. In isolation this appears not to be a problem: using either X or Y is within the allowed budget.
However, if a user visits the site on Monday and Tuesday, they’ll find the site is broken, as both days are within the same budget, and using X and Y within the same site is prohibited. And while this is a contrived example, it demonstrates how a budget-based approach constrains even well-behaved sites. Pushing site updates becomes more like staging DNS changes, and methods such as A/B testing become all but impossible.
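The Monday and Tuesday example can be replayed with a small model. The `ExpiringBudget` class is hypothetical, days are plain integers (Monday is 0), and the budget window is assumed to start at the first request.

```python
class ExpiringBudget:
    """Allows a fixed number of distinct features per lifetime window."""

    def __init__(self, limit: int, lifetime_days: int) -> None:
        self.limit = limit
        self.lifetime_days = lifetime_days
        self.window_start = None
        self.used = set()

    def request(self, feature: str, day: int) -> bool:
        expired = (self.window_start is not None
                   and day - self.window_start >= self.lifetime_days)
        if self.window_start is None or expired:
            self.window_start = day  # start a fresh window
            self.used = set()
        if feature in self.used:
            return True
        if len(self.used) >= self.limit:
            return False
        self.used.add(feature)
        return True


# A returning visitor: the site worked on Monday and breaks on Tuesday.
returning = ExpiringBudget(limit=1, lifetime_days=3)
print(returning.request("X", day=0))  # True  (Monday)
print(returning.request("Y", day=1))  # False (Tuesday, same window)
print(returning.request("Y", day=3))  # True  (Thursday, window has reset)

# A first-time visitor on Tuesday sees the same site working fine.
newcomer = ExpiringBudget(limit=1, lifetime_days=3)
print(newcomer.request("Y", day=1))   # True
```

The same deployment succeeds or fails depending on each visitor’s history, which the site cannot observe in advance.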
4. Privacy Budgets Would Cause an Unpredictable, Terrible Developer Experience
A privacy budget approach for fingerprinting defense would also be harmful for developers, likely unworkably so. Programming on the web is difficult because application developers need to account for differences among browsers, versions of the same browser, rendering differences among operating systems, etc. Web standards efforts have done excellent work to improve this situation by defining a solid, cross-browser, and powerful set of functions that Web developers can target.
Dynamic privacy approaches would undo this work, and make Web programming difficult in a manner that would make “Netscape 4 vs. IE6” era developers wince. Instead of needing to account for differences between a small number of browsers, Web developers would need to account for environmental differences imposed by budgets that could expire at any time. The execution environment for a page would vary not just between browser and version, but based on the behavior of other scripts on the same page, and even the previous execution of the same script.
Instead of a consistent stable Web API, developers would need to account for an execution environment that could change underneath them at any time.
5. Ouroboros: Enabling New Forms of Fingerprinting
As a final, ironic consequence, budget-based approaches to fingerprinting protections may enable new forms of fingerprinting. At root, fingerprinting attacks leverage differences in browser environments that are detectable within the browser environment. A privacy budget introduces many such new page-detectable differences, and it’s easy to imagine ways pages could figure out how much budget remained, how previous budget was spent, and so on.
Put differently, a budget is new, persistent per-domain (or per-page) state, maintained by the browser, and exposed (directly or indirectly) to possible trackers. Trackers can modify this state in a way unique to each user (i.e. by exhausting the budget in a unique way for each user). This page-writable, page-readable, persistent state is the primitive that most tracking mechanisms are built on.
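To make the “page-writable, page-readable” point concrete, the sketch below stores a user ID in which features a tracker chooses to spend its budget on, then reads the ID back on a later visit by probing which features are still granted. The budget model (four distinct features out of eight, previously granted features stay readable) is an assumption for illustration, not a description of any proposal.

```python
from itertools import combinations

FEATURES = [f"f{i}" for i in range(8)]  # hypothetical budgeted features
LIMIT = 4
# Every way of choosing 4 of the 8 features encodes one ID (70 in total).
CODEBOOK = list(combinations(FEATURES, LIMIT))


class BrowserBudget:
    """Per-site state kept by the browser between visits."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.granted = set()

    def request(self, feature: str) -> bool:
        if feature in self.granted:
            return True
        if len(self.granted) >= self.limit:
            return False
        self.granted.add(feature)
        return True


def write_id(budget: BrowserBudget, user_id: int) -> None:
    """First visit: exhaust the budget in a pattern unique to this user."""
    for feature in CODEBOOK[user_id]:
        budget.request(feature)


def read_id(budget: BrowserBudget) -> int:
    """Later visit: probe every feature and see which ones are still allowed."""
    allowed = tuple(f for f in FEATURES if budget.request(f))
    return CODEBOOK.index(allowed)


budget = BrowserBudget(LIMIT)
write_id(budget, 37)
print(read_id(budget))  # 37
```

In this model the budget itself becomes an identifier that survives for as long as the budget does.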
Fingerprinting Protections in Brave (Present, Planned, and Under Consideration)
The previous sections described why Brave is skeptical of budget-based approaches to fingerprinting protections. This section presents Brave’s approach, a mixture of removing differences between browsers we don’t expect to be useful for users, adding noise to fingerprinting vectors, and improving our ability to determine which script should have access to fingerprinting functionality, and which scripts to block.
Current Fingerprinting Protections in Brave
Currently Brave protects against fingerprinting by preventing third party sites from accessing functionality frequently used to fingerprint users. This includes highly identifying parts of the Canvas, Web Audio and WebGL APIs, among others. These default settings can be changed through Brave’s Shields interface, where users can disable these protections if needed, or also extend them to the first party.
Brave prevents known fingerprinting scripts from being loaded, through the browser’s Shields protections. Brave both uses, and contributes to the development of, privacy-protecting filter lists like EasyList and EasyPrivacy.
Upcoming Fingerprinting Protections
While Brave’s fingerprinting protections are strong and do a good job of protecting user privacy (especially when combined with Brave’s blocking of cross-site tracking scripts and frames), we have plans to do even better, to keep ahead of even the most determined adversaries. We plan to implement protections against font-based fingerprinting, and identifying users based on hardware capabilities (e.g. hardware concurrency, device memory, etc.). Additionally, we will soon begin adding noise to values used in fingerprinting, such as varying OS version numbers in the user agent string.
Brave is also working with Web standards bodies to pursue privacy protections at a cross-browser level. Currently browser vendors need to deviate from standards to protect user privacy, a situation which is both a condemnation of the state of privacy on the Web, and puts privacy-focused browsers at a constant disadvantage. Through our work in the W3C, including co-chairing PING (the W3C’s primary privacy body) and reviewing specs for privacy issues, Brave is working to improve fingerprinting protections for all Web users.
Ongoing Research on Fingerprinting Protections
Browser fingerprinting is a difficult threat for the privacy community to defend against, and one that’s likely to get ever more challenging as browsers deploy protections against traditional, cookie-based tracking systems. Because there is so much privacy-harming technical debt in the Web platform, successful defenses will require addressing the problem at every level: resource blocking, fingerprint randomization, restricting which sites have access to risky functionality, and improving privacy in Web standards.
Approaches that attempt to maintain an “acceptable” amount of identification and tracking online, however well-meaning, are antithetical to the goal of a truly privacy-respecting Web. We expect that “budget”-based approaches to Web privacy will not be effective privacy protections, and that the steps described above will be far more effective at protecting user privacy. We’re excited to work with others in the privacy community to continue to improve privacy on the Web.