At Brave, we work hard to minimize the tracking of our users on the internet, and that includes interaction with services run by Brave itself. But we don’t expect you to take our word for it; we try to make our privacy features something you can verify.
Many of our features depend on services configured with various providers. We’ve created a view into how that configuration protects your privacy. It’s your data, so you deserve to know how it’s handled. You can follow the steps below to review our work.
When devices send or receive data over the internet, they use an Internet Protocol (IP) address to identify themselves. This is an intrinsic part of how the internet works. While addresses aren’t necessarily unique or long-lived, it’s inevitable that the sites you visit will see these identifiers. The web doesn’t work without them.
To address this potential exposure, we sometimes have visitors connect to our sites and services through a proxy. The proxy sees the visitor’s IP address, but when it forwards the request to our servers, we see only the proxy’s IP address, not the real address of the visitor. Traffic is encrypted so the proxy can’t learn anything about what the visitor has to say; that’s private between the visitor and us.
For example, if you enable Brave Rewards or use Brave News, the items the browser shows are decided locally, on your device—it’s none of our business which item the browser thought you’d be interested in. To protect your privacy, we download some of that content through a private content distribution network where requests go through a proxy.
Unfortunately, because user tracking is so prevalent, most proxy services have options to forward the original visitor’s IP address, decrypt the contents of the request, or both. That’s the opposite of what we want! So it’s important that we turn all those options off when we configure our sites to use a proxy.
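One widely used mechanism for forwarding a visitor's address is the PROXY protocol, in which the proxy prepends a short header to the connection ahead of the visitor's own bytes. The sketch below parses a version 1 header to illustrate what an origin server would learn if such an option were left on. The function name and sample addresses are illustrative assumptions (the addresses come from ranges reserved for documentation); this is not code Brave runs.

```python
def parse_proxy_v1(header: bytes) -> dict:
    """Parse a PROXY protocol v1 header line.

    Example input: b"PROXY TCP4 192.0.2.10 198.51.100.5 56324 443\r\n"
    """
    if not header.endswith(b"\r\n"):
        raise ValueError("a v1 header must end with CRLF")
    parts = header[:-2].decode("ascii").split(" ")
    if len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("not a PROXY protocol v1 header")
    if parts[1] == "UNKNOWN":
        # The proxy chose not to (or could not) report the addresses.
        return {"family": "UNKNOWN"}
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY protocol v1 header")
    family, src_ip, dst_ip, src_port, dst_port = parts[1:]
    return {
        "family": family,
        "client_ip": src_ip,        # the visitor's real address
        "proxy_ip": dst_ip,         # the address the visitor connected to
        "client_port": int(src_port),
        "proxy_port": int(dst_port),
    }


# Hypothetical header a proxy might send with address forwarding enabled.
sample = b"PROXY TCP4 192.0.2.10 198.51.100.5 56324 443\r\n"
print(parse_proxy_v1(sample)["client_ip"])
```

With forwarding switched off, no such header is sent, so the origin only ever sees the proxy's own address as the source of the connection.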
To check the current configuration of a Brave proxy, you can visit the audit page and examine the settings for each service. They’re listed by the domain name of the service where we’re using the proxy to separate the IP address from the contents of the request.
Correct configuration will look something like this:
Examine the logs for the fetch and display config step. You should see something like this:
{
  "domain": "pcdn.brave.com",
  "modified_on": "2020-06-25T23:16:38.918708Z",
  "protocol": "tcp/443",
  "proxy_protocol": "off",
  "traffic_type": "direct"
}
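This displays key values that control our configuration of Cloudflare’s Spectrum proxy for our private content distribution network. The protocol value shows the port the proxy listens on, proxy_protocol shows whether the visitor’s original IP address is passed on to our servers, and traffic_type shows how the proxy handles the traffic it forwards.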
By contrast, this is what it might look like if an endpoint were configured to make it possible to track visitors by IP address:
The above example uses the port for unencrypted HTTP and passes on the original client IP address. We’ve added an “X” and dotted underlines to mark some of the problems with this config so it’s easy to check at a glance.
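The comparison between a good and a bad configuration can also be done mechanically. The sketch below checks a reported config against the three values shown in the correct example. The function name, the choice to treat those exact values as the only acceptable ones, and the "bad" sample values are all assumptions made for illustration, not the audit page's actual logic.

```python
import json

# Values taken from the correct example; treating them as the only
# acceptable settings is an assumption of this sketch.
EXPECTED = {
    "protocol": "tcp/443",      # encrypted HTTPS port
    "proxy_protocol": "off",    # original client IP is not forwarded
    "traffic_type": "direct",   # as shown in the correct example
}


def audit(config: dict) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for key, want in EXPECTED.items():
        got = config.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    return problems


good = json.loads("""
{
  "domain": "pcdn.brave.com",
  "modified_on": "2020-06-25T23:16:38.918708Z",
  "protocol": "tcp/443",
  "proxy_protocol": "off",
  "traffic_type": "direct"
}
""")

# Hypothetical misconfiguration: unencrypted HTTP port, client IP forwarded.
bad = dict(good, protocol="tcp/80", proxy_protocol="v1")

print(audit(good))  # []
for problem in audit(bad):
    print(problem)
```

An empty list for the published config corresponds to the page showing no "X" marks; each entry in the second result corresponds to one flagged setting.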
By publishing these values, we offer some assurance that we’re not able to track clients connecting to these services. Since the configuration data comes directly from Cloudflare, it accurately represents the real proxy parameters. Because the query is processed by GitHub, we can’t have changed the values before reporting them.
More technical users may want to consult the GitHub Actions workflow log linked at the bottom of the page to verify the data was obtained correctly by a runner under GitHub’s control. You’ll need to be logged in to GitHub to access data on that page.
Happy checking!