Grab bag: query stripping, referrer policy, and reporting API
By the Brave Privacy Team
This post describes work done by Senior Product Security Engineer François Marier (@fmarier), Senior Security Engineer Pranjal Jumde (@pjumde), Senior Software Engineer Ivan Efremov, and Senior Privacy Researcher Peter Snyder.
To stay one step ahead of online trackers, Brave regularly releases new privacy features and improvements. This post discusses three recent changes in Brave that each help make the web a more privacy-respecting and person-respecting platform. Each is a tweak, a subtle change, or the first step of a new approach to a more private, more compatible Web.
1. Removing Known Tracking Parameters From URLs
First, Brave removes common tracking parameters from URLs[1] by default. These are parameters trackers use to collect your activity across the web. Common examples of these include the “Facebook Click Identifier” (fbclid), used by Facebook to record which sites you visit when you’re not on Facebook, the “Google Click Identifier” (gclid), used by Google to link the advertising and analytics data they have on you, and Microsoft’s equivalent (msclkid).
These values are added to URLs you click, so that advertisers can learn more about you and your behavior on the web. Brave currently removes these tracking-related query parameters from URLs, allowing you to visit the sites you want to visit, without being tracked.
What is Query Parameter Tracking (and why is it tricky to stop)?
Query parameter tracking (sometimes called “link decoration”) refers to trackers adding uniquely identifying query parameters (typically, what appears after the ? in the URL) to links when you leave a site, and then reading them back out of the URL when you land on a different, possibly unrelated site. This allows the tracker to connect your behavior between two different, independent websites.
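To make the mechanism concrete, here is a minimal sketch of both halves of link decoration: one function that appends an identifier to an outbound link, and one that reads it back on the landing page. The function names and the sample URL are hypothetical and chosen only for illustration; real trackers do this in page scripts or server-side redirects.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def decorate_link(url: str, param: str, identifier: str) -> str:
    """Append an identifying query parameter to an outbound link."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, identifier))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_identifier(url: str, param: str) -> Optional[str]:
    """Read the identifier back out of the landing-page URL, if present."""
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key == param:
            return value
    return None


# The identifier travels alongside the benign "page" parameter.
decorated = decorate_link("https://example.org/article?page=2", "fbclid", "uniquevalue")
```

Because the same identifier is visible on both the source and the destination site, whoever controls both ends can join the two visits into one profile.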
Query parameter tracking is a particularly difficult form of tracking to block. As a result, most browsers do not have effective countermeasures to query parameter tracking[2]. To understand why query parameter tracking is difficult to block, contrast it with how most tracking is done on the web.
Query parameter tracking is difficult to block because it happens through URLs you want to visit; query parameter identifiers sit alongside benign, user-serving values in the URL. Blocking requests every time these identifiers appear in the URL would prevent you from visiting many sites you wish to visit.
How Does Brave Prevent Query Parameter Tracking?
Whenever you visit a page with a known tracking query parameter in the URL, such as https://example.org?fbclid=uniquevalue, Brave removes the value from the URL before your browser makes the request. Because these query parameters are used for tracking, but not necessary for the site to work correctly, removing the query parameter prevents the tracking, but otherwise doesn’t affect the site’s behavior.
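The approach can be sketched in a few lines. This is an illustration of the general technique, not Brave’s implementation: the parameter set below contains only the three examples named in this post (Brave’s real list is longer), and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subset only: the three parameters named in this post.
KNOWN_TRACKING_PARAMS = frozenset({"fbclid", "gclid", "msclkid"})


def strip_tracking_params(url: str) -> str:
    """Return the URL with known tracking parameters removed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key not in KNOWN_TRACKING_PARAMS]
    if len(kept) == len(pairs):
        # Nothing to remove: return the URL untouched so benign
        # query strings are never re-encoded.
        return url
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_tracking_params("https://example.org/search?q=cats&gclid=abc&page=2")
```

Note that the benign parameters (here `q` and `page`) survive, which is what lets the destination site keep working while the identifier is dropped.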
Brave currently determines which parameters are tracking related by maintaining a list of known tracking parameters and building it into the browser. These parameters are identified by reading the documentation provided by the trackers themselves, drawing on existing crowdsourced lists, and collecting parameters that Brave developers and users have encountered themselves. The current list of removed query parameters can be found in Brave’s source code, and the technique and its caveats are described further in our wiki.
Next Steps For Query Parameter Tracking Protections
While we’re excited to deploy and share this privacy protection, we’re also rolling it out cautiously. We don’t want to break websites that use the same query parameter names as trackers, but for completely benign, first-party purposes. Similarly, while Brave wants to prevent users from being tracked, we don’t want to break privacy-preserving types of campaign tracking, or query parameters that are not unique to individuals. Brave wants to prevent privacy-harming behavior, while preserving legitimate, privacy-respecting analytics systems.
Finally, Brave has several projects under development to further improve our query parameter tracking protections. We will soon have automated crawls to detect additional query parameter tracking, using our PageGraph instrumentation. We’re also evaluating ways of reducing possible false positives, and otherwise making sure that we remove tracking without affecting the privacy-respecting parts of the web. As we continue building out these protections, we’ll share more here.
2. Referrer Policy Changes
Note: As of 2021, this section is no longer current. The Android and Desktop versions of Brave browser now apply a “referer” policy of strict-origin-when-cross-origin by default. More information is available on the tracking issue.
Second, Brave has changed how it handles the referrer, or “referer” (sic),[5] policy. Our previous approach frequently broke websites, requiring users to turn off Shields to use a site, and so lose all privacy protections. Our new approach greatly reduces the number of broken sites, while still aggressively protecting your privacy.
What is “Referer” Policy?
Referrer Policy is the system that browsers and websites use to inform destination websites about the source website from which a resource request is coming. For example, if you visit a website (say https://fake-cat-site.org), and it includes an image request for a cat (say, https://example.org/cat.png), referrer policy instructs your browser to tell example.org that the request for the image came from fake-cat-site.org. Similarly, when you click on a link (say, from a search results page), referrer policy typically instructs your browser to tell the site you’re visiting that you came from the search results page.[6]
While referrer information can be useful to site owners to, for example, restrict which sites can reference their resources, it also poses a clear privacy harm to users. It tells sites you might not trust about your browsing behavior, including what site led you to the site you’re viewing now, and causes all sorts of other history leaks. For this reason, referrer header information is frequently recorded by tracking scripts across the web.
Brave’s Previous “Referer” Policy
Previously, in an effort to best protect user privacy, Brave would exclude or “lie” about[7] this referrer information. For cross-origin resource requests (e.g., the images loaded in a page), Brave would report that the request came from the same site the resource was being requested from, to prevent the request-receiving site from learning cross-origin browsing behavior. For cross-origin navigations (e.g., clicking on a link to visit a new site), Brave would omit the value altogether.
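As a rough sketch of that earlier behavior, the function below picks the referrer value for a request. It is an illustration only: the function names are hypothetical, and leaving same-origin requests untouched is an assumption, since this post only describes the cross-origin cases.

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return the scheme://host[:port] part of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def legacy_referrer(source_url: str, target_url: str, is_navigation: bool) -> Optional[str]:
    """Referrer value under the previous policy; None means no header is sent."""
    if origin_of(source_url) == origin_of(target_url):
        # Assumption: same-origin requests were left alone.
        return source_url
    if is_navigation:
        # Cross-origin navigation: omit the value altogether.
        return None
    # Cross-origin sub-resource: "spoof" the referrer as the target's own origin.
    return origin_of(target_url) + "/"


spoofed = legacy_referrer("https://fake-cat-site.org/page", "https://example.org/cat.png", False)
```

With the example from earlier in this section, example.org would be told the image request came from example.org itself, so it learns nothing about fake-cat-site.org.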
In the abstract this might seem the best way to protect user privacy. However, we’ve since realized that this isn’t the case. Many sites break with these referrer protections in place, requiring Brave users to turn off Shields to use whatever site is breaking, leaving Brave users with no protections at all. Ultimately, our extremely aggressive referrer policy ended up in the worst case harming user privacy, by breaking too many sites.
Brave’s New and Improved “Referer” Policy
Brave’s new referrer policy tries to achieve two goals: i) dramatically reduce the amount of cross-origin information sent in referer headers, and ii) dramatically reduce (possibly to zero) how often Shields need to be dropped to “unbreak” a site. The full details of the new policy are in our wiki, but at a high level the rules are:
- When navigating to a new site, never send a referer header.
- When submitting forms and similar actions across sites, send only the origin, which is already present in the Origin header.
- When requesting cross-site sub-resources, send only the origin.
- When requesting same-site sub-resources, send the full URL.
In some cases Brave will send less information than the above, but Brave never sends more.
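The four rules above can be sketched as a single decision function. This is a simplified illustration, not Brave’s implementation: the function names and the `kind` labels are hypothetical, “same site” is approximated by comparing hostnames exactly (real browsers compare registrable domains), and sending the full URL for same-site navigations and form submissions is an assumption, since the list only covers same-site sub-resources.

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return the scheme://host[:port] part of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def same_site(a: str, b: str) -> bool:
    # Simplification: exact hostname match stands in for a
    # registrable-domain comparison.
    return urlsplit(a).hostname == urlsplit(b).hostname


def referrer_for(source_url: str, target_url: str, kind: str) -> Optional[str]:
    """kind is 'navigation', 'form', or 'subresource'. None means no header."""
    if kind not in ("navigation", "form", "subresource"):
        raise ValueError(f"unknown request kind: {kind}")
    if same_site(source_url, target_url):
        # Stated for sub-resources; assumed here for the other kinds.
        # Real browsers also strip fragments and credentials.
        return source_url
    if kind == "navigation":
        return None  # never send a referer when navigating to a new site
    return origin_of(source_url) + "/"  # cross-site form or sub-resource


image_referrer = referrer_for(
    "https://fake-cat-site.org/gallery?id=7", "https://example.org/cat.png", "subresource"
)
```

In the cat image example, example.org now learns that the request came from fake-cat-site.org, but not which page or query string was involved.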
Brave isn’t the only browser to reduce the amount of information in referer headers. Though Brave goes further than the others, Firefox, Safari, and Chrome have reduced referer information. Brave is also supportive of and participates in discussions in the W3C’s Privacy Community Group to push such privacy improvements into web standards.
3. Removing Reporting API
Finally, Brave has removed the Reporting API from the browser until we’re confident that all of its uses are compatible with Brave’s user-first, privacy-respecting commitments. The Reporting API is a Google-authored spec proposal. While we may re-enable some or all of the proposal’s functionality in the future, we do not think it’s appropriate for the web in its current form.
What is the Reporting API?
Reporting API and the associated report types are not W3C standards or proposals. They are currently in different stages of incubation for consideration by working and community groups in the W3C. Versions of the functionality have been implemented in Chromium, though, and enabled in Chrome by default[8]. Because Brave builds on and modifies Chromium, we needed to decide if we wanted to include the feature in Brave, or take the additional steps of removing it. The rest of this section describes our concerns with the Reporting API, why we removed it, and what we think might be a possible, user-and-privacy-friendly alternative way forward.
What’s Wrong With the Reporting API?
Brave disabled the Reporting API in our browser for several reasons. The following is not a comprehensive list; it is only a sample of the concerns we have with the proposal.
Exposing New Categories of Sensitive User Data
Several of the kinds of reporting would allow websites (or third parties) to collect new, sensitive categories of data that websites cannot easily capture now. For example, Reporting API (through the Network Error Logging system) would allow websites to learn about when your network conditions change.
For instance, the proposal would allow sites to get notified if a user receives a reporting policy while in a low-privacy configuration (e.g., browsing with Shields down, or using traditional DNS) and then moves into a high-privacy configuration (e.g., raising Shields, enabling a VPN that provides DNS-level filtering). In these situations, the Reporting API would cause the browser to send information to trackers about the user’s network / blocking configuration the user might not want to share.
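The persistence behind this scenario can be sketched as follows. The sketch assumes only that a stored reporting policy carries a lifetime in seconds (the Network Error Logging header’s `max_age` field works this way); the class, field names, and endpoint URL are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class StoredPolicy:
    """A reporting policy the browser remembers for an origin (hypothetical shape)."""
    origin: str
    report_endpoint: str
    received_at: float  # seconds since some reference time
    max_age: float      # lifetime in seconds

    def is_active(self, now: float) -> bool:
        # The policy keeps governing reports until its lifetime runs out,
        # regardless of how the user's configuration changes in between.
        return now < self.received_at + self.max_age


# Policy delivered while the user was in a low-privacy configuration.
policy = StoredPolicy(
    origin="https://example.org",
    report_endpoint="https://reports.example.net/upload",
    received_at=0.0,
    max_age=30 * SECONDS_PER_DAY,
)

# A week later the user has raised Shields or enabled a VPN, but the
# stored policy is still in force and would still trigger reports.
still_active_after_a_week = policy.is_active(7 * SECONDS_PER_DAY)
```

The point of the sketch is that the decision to report is made from state stored earlier, not from anything the user has consented to in their current configuration.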
In a similar way, these reports share exactly the same kind of cross-site information our referrer policy changes were trying to protect. For example, these reports allow third parties to learn when their resources were blocked on a different site (e.g., a resource, served from origin-Y, was blocked on URL Z, where URL Z has an origin that is not Y). These are just a few examples of privacy risks introduced by the Reporting API.
User Control and Transparency
Finally, several other features of the Reporting API add to the feature’s privacy risk. Reports are not tied to the page they occurred on; rather, they happen “out of band”, or through a process unrelated to the page’s tab. That means that if you think there is something shady with a page, simply closing the page may not prevent further communication between the page and the page’s server; records are queued and sent at an arbitrary time later on. This further erodes the control users have over their interactions with sites. Closing a page should give users certainty that the page cannot continue to communicate[9].
Similarly, the Reporting API allows reports to be sent to third parties, and there are already companies with business strategies focused on being the single third party that collects reports about users from across the web. While we are sure that the specific people and companies working in this space have good intentions and want to help the Web, the history of the Web is full of examples of where amassing data about millions of people ended up as a privacy catastrophe (among many other unintended negative consequences). We should be extremely cautious about building more systems aggregating more data about users without their knowledge or consent.
Can We Improve the Reporting API?
As mentioned, there are aspects of the Reporting API that we think would be good for the web, if they could be severed from the privacy and user-control risks. It would be great if users who wished to could easily notify websites of problems they encountered, especially given that some problems can be very serious security and privacy vulnerabilities. Brave has been working with the proposal’s authors in W3C, raising the issues and making the suggestions discussed in this post. If the proposal were modified to value and require user consent, restrict reporting to first parties, remove privacy-risking report types, and avoid making it harder for users to control when pages can (and cannot) communicate, we believe it could be a great addition to the web platform. Until then though, we will continue protecting your control and autonomy by disabling the Reporting API in Brave.
In this post we shared three more ways Brave works to protect your privacy online. We’re working on many new features, big and small, that we’re excited to tell you about in the coming weeks. We’ll also be sharing more, shorter blog posts about Brave’s involvement in web standards, and the new proposals that have us concerned, excited or otherwise interested.
[1] As is the case with all of the privacy protections described in this post, these protections are available only on desktop and Android versions of Brave, where Brave can deeply control how the browser operates. Browsers on iOS are restricted in the changes they can make to how websites work. As a result, Brave is limited in the kinds of privacy protections we can provide on iOS. However, we are constantly looking for ways of bringing more privacy improvements to Brave on iOS.
[2] Safari’s Intelligent Tracking Protection system, and Firefox’s Enhanced Tracking Protection system are notable, though partial, exceptions in this regard. Under certain situations Safari will remove all query parameters from the URL, to try to stymie trackers. Firefox will similarly remove the query and path portions of the referrer URL when it contains the Facebook Click Identifier.
[3] Brave makes a very small number of exceptions here, for web compatibility reasons.
[4] Firefox and Edge allow storage in tracking third-party frames up to a threshold, after which storage is restricted.
[5] Referrer was comically misspelled “referer” in the original standards text, and so will continue to be misspelled in the HTTP protocol spoken “on the wire”, until the end of time.
[6] Referrer policy also allows sites to instruct your browser not to send the referrer header, or to send only part of it, or to make other, similar modifications. Mozilla’s documentation on referer-policy does a great job explaining all the possible configurations.
[7] Or in polite web euphemism, “spoof”.