Grab bag: query stripping, referrer policy, and reporting API

Published Jul 20, 2020

Privacy updates

This post describes work done by Senior Product Security Engineer François Marier (@fmarier), Senior Security Engineer Pranjal Jumde (@pjumde), Senior Software Engineer Ivan Efremov, and Senior Privacy Researcher Peter Snyder.

In order to stay one step ahead of online trackers, Brave regularly releases new privacy features and improvements. This post discusses three recent changes in Brave, each of which helps make the web a more privacy-respecting and person-respecting platform. All of these are tweaks, subtle changes, or first steps toward a more private, more compatible Web.

1. Removing Known Tracking Parameters From URLs

First, Brave removes common tracking parameters from URLs¹ by default. These are parameters trackers use to collect your activity across the web. Common examples of these include the “Facebook Click Identifier” (fbclid), used by Facebook to record which sites you visit when you’re not on Facebook, the “Google Click Identifier” (gclid), used by Google to link the advertising and analytics data they have on you, and Microsoft’s equivalent (msclkid).

These values are added to URLs you click, so that advertisers can learn more about you and your behavior on the web. Brave currently removes these tracking-related query parameters from URLs, allowing you to visit the sites you want to visit, without being tracked.

What is Query Parameter Tracking (and why is it tricky to stop)?

Query parameter tracking (sometimes called “link decoration”) refers to trackers adding uniquely identifying query parameters (typically, what appears after the ? in the URL) to links when you leave a site, and then reading them back out of the URL when you land on a different, possibly unrelated site. This allows the tracker to connect your behavior between two different, independent websites.
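As a rough illustration of the mechanism described above, the following sketch shows one way a tracker could decorate an outgoing link and later read the identifier back on the landing page. The function names and the example URL are hypothetical, and the parameter name fbclid is reused only because it appears in this post; this is not any tracker's actual code.

```python
from typing import Optional
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def decorate_link(url: str, param: str, user_id: str) -> str:
    """Append an identifying query parameter to an outgoing link.

    This models what happens on the site the user is leaving.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, user_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_identifier(landing_url: str, param: str) -> Optional[str]:
    """Read the identifier back out of the URL on the destination site."""
    values = parse_qs(urlsplit(landing_url).query).get(param)
    return values[0] if values else None


# Hypothetical example: the benign "page" parameter travels alongside the identifier.
decorated = decorate_link("https://example.org/article?page=2", "fbclid", "uniquevalue")
recovered = read_identifier(decorated, "fbclid")
```

Because the identifier rides along in the same URL as the benign page parameter, both sites can associate the visit with the same value.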

Query parameter tracking is a particularly difficult form of tracking to block. As a result, most browsers do not have effective countermeasures to query parameter tracking². To understand why query parameter tracking is difficult to block, contrast it to how most tracking is done on the web.

Most online tracking depends on your browser fetching URLs you’d rather your browser didn’t fetch, either because of third-party cookies carried on that request, or because of the JavaScript returned in that request. As a result, browsers can (in principle, though not always in practice) prevent such tracking by either not sending cookies on third-party requests, or by blocking requests for tracking-related JavaScript.

And in fact most major browsers provide some combination of these defenses. Brave provides the strongest protections, by blocking third party cookies³, blocking third party storage, and blocking (or replacing) requests for tracking-related JavaScript. Safari blocks third-party cookies, and partitions third party storage by default. Firefox and Edge also restrict third-party cookies and storage for known trackers⁴ and block requests for social media trackers by default. Chrome is the only major browser that does not currently provide any of these protections.

Query parameter tracking is difficult to block, though, because it happens through URLs you want to visit; query parameter identifiers sit alongside benign, user-serving values in the URL. Blocking requests every time these identifiers appeared in the URL would prevent you from visiting many sites you wish to visit.

How Does Brave Prevent Query Parameter Tracking?

Whenever you visit a page with a known tracking query parameter in the URL, such as https://example.org?fbclid=uniquevalue, Brave removes the value from the URL before your browser makes the request. Because these query parameters are used for tracking, but not necessary for the site to work correctly, removing the query parameter prevents the tracking, but otherwise doesn’t affect the site’s behavior.
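The removal step can be sketched as follows. This is a minimal illustration, not Brave's implementation: the function name is hypothetical, and the parameter set contains only the three examples named in this post rather than Brave's full list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subset only; the real list lives in Brave's source code.
TRACKING_PARAMS = frozenset({"fbclid", "gclid", "msclkid"})


def strip_tracking_params(url: str) -> str:
    """Return the URL with known tracking parameters removed.

    Other query parameters are kept, in their original order.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_tracking_params("https://example.org?fbclid=uniquevalue")
mixed = strip_tracking_params("https://example.org/page?id=7&gclid=abc&lang=en")
```

Note that the sketch matches parameter names exactly and leaves every other part of the URL alone, which is what lets the page load normally after the identifier is dropped.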

Brave currently determines which parameters are tracking-related by maintaining a list of known tracking parameters that is built into the browser. These parameters are identified by reading the documentation provided by the trackers themselves, by drawing on existing crowdsourced lists, and by collecting parameters that Brave developers and users have encountered themselves. The current list of removed query parameters can be found in Brave’s source code, and the technique and its caveats are described further in our wiki.

Next Steps For Query Parameter Tracking Protections

While we’re excited to deploy and share this privacy protection, we’re also rolling it out carefully. We do not want to break websites that use the same query parameter names as trackers, but for completely benign, first-party purposes. Similarly, while Brave wants to prevent users from being tracked, we don’t want to break privacy-preserving types of campaign tracking, or query parameters that are not unique to individuals. Brave wants to prevent privacy-harming behavior, while preserving legitimate, privacy-respecting analytics systems.

Finally, Brave has several projects under development to further improve our query parameter tracking protections. We will soon have automated crawls to detect additional query parameter tracking, using our PageGraph instrumentation. We’re also evaluating ways of reducing possible false positives, and otherwise making sure that we remove tracking without affecting the privacy-respecting parts of the web. As we continue building out these protections, we’ll share more here.

2. Referrer Policy Changes

Note: As of 2021, this section is no longer current. The Android and Desktop versions of Brave browser now apply a “referer” policy of strict-origin-when-cross-origin by default. More information is available on the tracking issue.

Second, Brave has changed how it handles the referrer, or “referer” (sic),⁵ policy. Our previous approach frequently broke websites, requiring users to turn off Shields to use a site, and so lose all privacy protections. Our new approach greatly reduces the number of broken sites, while still aggressively protecting your privacy.

What is “Referer” Policy?

Referrer Policy is the system that browsers and websites use to inform destination websites about the source website from which a resource request is coming. For example, if you visit a website (say https://fake-cat-site.org), and it includes an image request for a cat (say, https://example.org/cat.png), referrer policy instructs your browser to tell example.org that the request for the image came from fake-cat-site.org. Similarly, when you click on a link (say, from a search results page), referrer policy typically instructs your browser to tell the site you’re visiting that you came from the search results page.⁶

While referer-policy can be useful to site owners to, for example, restrict which sites can reference their resources, referer-policy also poses a clear privacy harm to users. It tells sites you might not trust about your browsing behavior, what site led you to the site you’re viewing now, and all sorts of other history leaks. For this reason, referrer header information is frequently recorded by tracking scripts across the web.

Brave’s Previous “Referer” Policy

Previously, in an effort to best protect user privacy, Brave would exclude or “lie” about⁷ this referrer information. For cross-origin resource requests (e.g., the images loaded in a page), Brave would report that the request came from the very site receiving the request, to prevent that site from learning cross-origin browsing behavior. For cross-origin navigations (e.g., clicking on a link to visit a new site), Brave would omit the value altogether.
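A minimal sketch of that earlier behavior, under stated assumptions: the function names are hypothetical, origins are compared as scheme plus host and port, and same-origin requests are assumed to be left untouched (the description above only covers the cross-origin cases).

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def previous_referrer(source_url: str, target_url: str, is_navigation: bool) -> Optional[str]:
    """Model the old policy: omit on cross-origin navigation, spoof on sub-resources."""
    if origin_of(source_url) == origin_of(target_url):
        # Assumption: same-origin requests were not modified.
        return source_url
    if is_navigation:
        # Cross-origin navigation: no referrer value at all.
        return None
    # Cross-origin sub-resource: claim the request came from the target's own origin.
    return origin_of(target_url) + "/"


spoofed = previous_referrer("https://fake-cat-site.org/page", "https://example.org/cat.png", False)
omitted = previous_referrer("https://fake-cat-site.org/page", "https://example.org/", True)
```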

In the abstract this might seem the best way to protect user privacy. However, we’ve since realized that this isn’t the case. Many sites break with these referrer protections in place, requiring Brave users to turn off Shields to use whatever site is breaking, leaving Brave users with no protections at all. Ultimately, our extremely aggressive referrer policy ended up in the worst case harming user privacy, by breaking too many sites.

Brave’s New and Improved “Referer” Policy

Brave’s new referrer policy tries to achieve two goals: i) dramatically reduce the amount of cross-origin information sent in referer headers, and ii) dramatically reduce (possibly to zero) how often Shields need to be dropped to “unbreak” a site. The full details of the new policy are described in our wiki, but at a high level the rules are:

When navigating to a new site, never send a referer header.
When submitting forms and similar actions across sites, send only the origin, which is already present in the origin header.
When requesting cross-site sub-resources, send only the origin.
When requesting same-site sub-resources, send the full URL.

In some cases Brave will send less information than the above, but Brave never sends more.
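The four rules above can be sketched as a single decision function. This is an illustration under stated assumptions, not Brave's code: the function and constant names are hypothetical, "same site" is simplified to an exact hostname match (a real browser compares registrable domains), and same-site navigations and form submissions, which the summary above does not cover, are assumed to send the full URL.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

KINDS = {"navigation", "form", "subresource"}


def origin_only(url: str) -> str:
    """Return just the origin, with a trailing slash, as sent in a referer header."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def full_url(url: str) -> str:
    """Return the URL without its fragment (fragments are never sent as referrers)."""
    return urlunsplit(urlsplit(url)._replace(fragment=""))


def same_site(a: str, b: str) -> bool:
    # Simplification: exact hostname match stands in for a registrable-domain check.
    return urlsplit(a).hostname == urlsplit(b).hostname


def new_referrer(source_url: str, target_url: str, kind: str) -> Optional[str]:
    """Return the referer value to send, or None to send no header."""
    if kind not in KINDS:
        raise ValueError(f"unknown request kind: {kind}")
    if same_site(source_url, target_url):
        # Rule 4 for sub-resources; assumed for same-site navigations and forms.
        return full_url(source_url)
    if kind == "navigation":
        return None  # Rule 1: never send a referer when navigating to a new site.
    return origin_only(source_url)  # Rules 2 and 3: origin only across sites.


page = "https://fake-cat-site.org/page?x=1#top"
cross_image = new_referrer(page, "https://example.org/cat.png", "subresource")
cross_nav = new_referrer(page, "https://example.org/", "navigation")
same_css = new_referrer(page, "https://fake-cat-site.org/style.css", "subresource")
```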

Brave isn’t the only browser to reduce the amount of information in referer headers. Though Brave goes further than the others, Firefox, Safari, and Chrome have reduced referer information. Brave is also supportive of and participates in discussions in the W3C’s Privacy Community Group to push such privacy improvements into web standards.

3. Removing Reporting API

Finally, Brave has removed Reporting API from the browser until we’re confident that all of its uses are compatible with Brave’s user-first, privacy-respecting commitments. Reporting API is a Google authored spec proposal. While we may re-enable some or all of the proposal’s functionality in the future, we do not think it’s appropriate for the web in its current form.

What is the Reporting API?

The Reporting API is an under-development Google draft specification that defines ways a website can instruct your browser to send information to an arbitrary party, or to registered JavaScript code, when certain events of interest occur in the browser. These events include changes in network behavior, performance or privacy interventions the browser has carried out on behalf of the user (and potentially in opposition to the website), and crashing or out-of-memory events. The proposal is a generalization of the kinds of reporting first introduced around content-security-policy.
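To make the moving parts concrete, here is a rough sketch of the two pieces of data involved: the policy a site sends to register a reporting endpoint, and a report the browser later delivers to it. The field names follow our reading of the draft's Report-To header and report serialization at the time of writing and may differ in later revisions; the helper names, endpoint URL, report type, and user agent string are made up for illustration.

```python
import json
from typing import Any, Dict


def report_to_header(group: str, endpoint_url: str, max_age_seconds: int) -> str:
    """Build the JSON value of a Report-To response header (assumed draft shape)."""
    policy = {
        "group": group,
        "max_age": max_age_seconds,
        "endpoints": [{"url": endpoint_url}],
    }
    return json.dumps(policy)


def build_report(report_type: str, page_url: str, age_ms: int, body: Dict[str, Any]) -> Dict[str, Any]:
    """Build one report as the browser might serialize it (assumed draft shape)."""
    return {
        "type": report_type,
        "age": age_ms,  # time since the event occurred, in milliseconds
        "url": page_url,  # the page the event relates to
        "user_agent": "ExampleBrowser/1.0",  # placeholder value
        "body": body,
    }


# Hypothetical endpoint: note that it can be any party, not just the visited site.
header_value = report_to_header("default", "https://reports.example/collect", 86400)
report = build_report("example-report", "https://example.org/page", 1200, {"detail": "placeholder"})
```

The endpoint URL in the policy is chosen by the site, which is how reports can end up at an arbitrary third party.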

Reporting API and the associated report types are not W3C standards or proposals. They are currently in different stages of incubation for consideration by working and community groups in the W3C. Versions of the functionality have been implemented in Chromium, though, and enabled in Chrome by default⁸. Because Brave builds on and modifies Chromium, we needed to decide if we wanted to include the feature in Brave, or take the additional steps of removing it. The rest of this section describes our concerns with the Reporting API, why we removed it, and what we think might be a possible, user-and-privacy-friendly alternative way forward.

What’s Wrong With the Reporting API?

Brave disabled the Reporting API in our browser for several reasons. The following is not a comprehensive list; it is only a sample of the concerns we have with the proposal.

Exposing New Categories of Sensitive User Data

Several of the kinds of reporting would allow websites (or third parties) to collect new, sensitive categories of data that websites cannot easily capture now. For example, Reporting API (through the Network Error Logging system) would allow websites to learn about when your network conditions change.

For instance, the proposal would allow sites to get notified if a user receives a reporting policy while in a low-privacy configuration (e.g., browsing with Shields down, or using traditional DNS) and then moves into a high-privacy configuration (e.g., raising Shields, enabling a VPN that provides DNS-level filtering). In these situations, the Reporting API would cause the browser to send information to trackers about the user’s network / blocking configuration the user might not want to share.

In a similar way, these reports share exactly the same kind of cross-site information our referrer policy changes were trying to protect. For example, these reports allow third parties to learn when their resources were blocked on a different site (e.g., a resource, served from origin-Y, was blocked on URL Z, where URL Z has an origin that is not Y). These are just a few examples of privacy risks introduced by the Reporting API.

User Control and Transparency

Finally, several other features of the Reporting API add to the feature’s privacy risk. Reports are not tied to the page they occurred on; rather, they happen “out of band”, or through a process unrelated to the page’s tab. That means that if you think there is something shady with a page, simply closing the page may not prevent further communication between the page and the page’s server; records are queued and sent at an arbitrary time later on. This further erodes the control users have over their interactions with sites. Closing a page should give users certainty that the page cannot continue to communicate⁹.
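The "out of band" behavior can be modeled with a toy queue. This is a hypothetical model written for this post, not how any browser implements report delivery; the class and method names are invented.

```python
from typing import Any, Dict, List, Set


class ReportQueue:
    """Toy model of a browser-wide report queue that outlives the tabs feeding it."""

    def __init__(self) -> None:
        self.open_tabs: Set[int] = set()
        self._pending: List[Dict[str, Any]] = []

    def open_tab(self, tab_id: int) -> None:
        self.open_tabs.add(tab_id)

    def enqueue(self, tab_id: int, report: Dict[str, Any]) -> None:
        """Record a report generated while the given tab was open."""
        self._pending.append({"tab": tab_id, **report})

    def close_tab(self, tab_id: int) -> None:
        # Closing the tab does not touch the pending reports.
        self.open_tabs.discard(tab_id)

    def flush(self) -> List[Dict[str, Any]]:
        """Deliver everything queued so far, whether or not its tab still exists."""
        delivered, self._pending = self._pending, []
        return delivered


queue = ReportQueue()
queue.open_tab(1)
queue.enqueue(1, {"type": "example-report", "url": "https://example.org/"})
queue.close_tab(1)          # the user closes the suspicious page
delivered = queue.flush()   # the report is still sent later
```

In this model the report is delivered after the tab is gone, which is the loss of user control described above.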

Similarly, the Reporting API allows reports to be sent to third parties, and there are already companies with business strategies focused on being the single third party that collects reports about users from across the web. While we are sure that the specific people and companies working in this space have good intentions and want to help the Web, the history of the Web is full of examples of where amassing data about millions of people ended up as a privacy catastrophe (among many other unintended negative consequences). We should be extremely cautious about building more systems aggregating more data about users without their knowledge or consent.

Can We Improve the Reporting API?

As mentioned, there are aspects of the Reporting API that we think would be good for the web, if they could be severed from the privacy and user-control risks. It would be great if users who wished to could easily notify websites of problems they encountered, especially given that some of those problems can be very serious security and privacy vulnerabilities. Brave has been working with the proposal’s authors in W3C, raising the issues and making the suggestions discussed in this post. If the proposal were modified to require user consent, restrict reporting to first parties, remove privacy-risking report types, and stop making it harder for users to control when pages can (and cannot) communicate, we believe it could be a great addition to the web platform. Until then, though, we will continue protecting your control and autonomy by disabling the Reporting API in Brave.

Conclusion

In this post we shared three more ways Brave works to protect your privacy online. We’re working on many new features, big and small, that we’re excited to tell you about in the coming weeks. We’ll also be sharing more, shorter blog posts about Brave’s involvement in web standards, and the new proposals that have us concerned, excited or otherwise interested.
