What does a browser do when you first launch it?
One of the first things most of us do when sitting down at our computer is launch a web browser. By and large, we’re usually re-launching a browser that we were using hours before. As such, we’re picking up where we left off. The browser already has your history, bookmarks, open tabs, and more. It may even be storing (if you agreed) your passwords, payment details, and other highly sensitive bits of information. Browsers also record and adhere to user configurations, such as blocking popup windows, preventing fingerprinting, or blocking trackers and privacy-invasive ads that depend on those trackers. All of this information is part of your Profile in the browser.
What this post explores today is how browsers behave by default, on their first run, with no preexisting user profile. By default, Brave blocks third-party trackers (and the ads that rely on them). It also prevents fingerprinting, auto-play of media, crypto-mining, and access to media input devices. These aren’t always features offered by other browsers out of the box. Many users have developed something of a ritual of downloading their preferred browser, then beginning the sometimes arduous process of hunting through web stores for security and privacy extensions to enhance their experience. But what happens between the moment you open your browser, and when you finally have it configured and augmented to your liking?
Revelations from a short, informal exercise
I decided to explore this topic more thoroughly a couple of weeks ago. I reviewed several web browsers’ network activity on their first run, and shared commentary on and explanations of what I found via Twitter. Today, I’d like to cover the results as a whole, as well as talk a bit about how you can do a similar review.
Which browsers were included, and why?
Initially I only tested a couple of browsers, but people soon began to request others. In the end, the primary browsers tested were Brave 0.68.132, Chrome 76.0.3809.132, Chromium 78.0.3895.0, Edge (Chromium Beta) 220.127.116.11, Firefox 69, Opera 63.0.3368.71, Safari 12.1.2, and Vivaldi 2.7.1628.30. Lynx was also tested, but was predictably uneventful 😉
What were the results?
| Browser | Requests | Hosts |
| Edge (Cr Beta) | 90 | 28 |
I began each review by first removing any existing profile from my Windows machine. I then launched the browser and let it sit for 40 minutes without any user interaction. What I found were many commonalities between major web browsers. We’ll discuss those in just a moment.
Of the group, Brave issued the fewest requests (24) during its first run. It also stood out as the only browser to call out exclusively to a single, controlled domain. What I mean by that is Brave issued requests to brave.com properties alone, shielding me from third parties from the start.
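To make the "single, controlled domain" check concrete, here is a minimal sketch of how captured hostnames could be sorted into first-party and third-party. The function names are my own for illustration, and the host list is just the brave.com hosts named in this post, not a full capture.

```python
def is_first_party(host: str, domain: str) -> bool:
    """True when host is the domain itself or one of its subdomains."""
    host = host.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    # Require a dot boundary so "evilbrave.com" does not match "brave.com".
    return host == domain or host.endswith("." + domain)


def third_party_hosts(hosts, domain):
    """Return the sorted, de-duplicated hosts that fall outside the domain."""
    return sorted(
        {h.lower().rstrip(".") for h in hosts if not is_first_party(h, domain)}
    )


# Hosts mentioned in this post; a real capture would supply the full list.
observed = [
    "go-updater.brave.com",
    "componentupdater.brave.com",
    "brave-core-ext.s3.brave.com",
    "crlsets.brave.com",
    "safebrowsing.brave.com",
]
print(third_party_hosts(observed, "brave.com"))  # prints []
```

Running the same function against another browser’s capture, with that vendor’s domain, gives a quick list of every host that sits outside the vendor’s control.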
None of Brave’s first five requests actually returns an extension itself. Instead, the server replies with further instructions. The first 4 requests result in about 465 bytes of JSON telling Brave where it can find the requested extensions, as well as how to test them for authenticity. The 5th response is empty. Instead of JSON instructions, the server uses HTTP headers to inform the browser that it should request the CRLSet (Certificate Revocation List) from componentupdater.brave.com.
The call to componentupdater.brave.com results in 1.8 KB of JSON informing Brave where the CRLSet can be downloaded. All of the URLs provided come from Google, which Brave doesn’t call directly. Instead, Brave continues the practice of proxying all requests through the brave.com domain, shielding the user’s device from Google servers. The next 5 requests (to brave-core-ext.s3.brave.com and crlsets.brave.com) are the actual downloads taking place.
Once these have downloaded, Brave initiates the routine process of ensuring all extensions are up to date. All of those requests return with confirmations that these extensions do not need to be updated at this time. Brave will check later to ensure ongoing user privacy and security.
Next up is a request for /promo/custom-headers. This request returns 271 bytes of JSON that serves as a replacement for a unique user-agent string. This file instructs Brave to add a custom HTTP header to certain partner requests, enabling Brave users to anonymously enjoy free access to things like premium content on cheddar.com, and more.
We then see Brave issue 3 very peculiar requests to invalid host names. In this case, it’s to kgcemqlxlymf, jlejbuhy, and skxrkyaibq. You may spot similar requests being made by other Chromium-based browsers later. These requests exist to detect DNS hijacking. If at least 2 of these resolve to the same host, Brave will suspect the ISP or network of opportunistically inserting content into the user’s session (oftentimes the ISP may be showing ads).
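The idea behind those random host names can be sketched in a few lines. This is only an illustration of the "at least 2 resolve to the same host" rule described above, not Chromium’s or Brave’s actual implementation. The function names and the 7 to 15 character length range are assumptions of mine, and the sketch compares resolved addresses only.

```python
import random
import socket
import string
from collections import Counter


def random_hostname(rng, min_len=7, max_len=15):
    """Build a random lowercase label that should not exist in DNS."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def resolve(hostname):
    """Return the IPv4 address for hostname, or None when it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def looks_hijacked(answers):
    """True when at least two probes resolved to the same address."""
    counts = Counter(a for a in answers if a is not None)
    return any(n >= 2 for n in counts.values())


def probe(resolver=resolve, rng=None, probes=3):
    """Resolve a few random names and report whether the answers look hijacked."""
    rng = rng or random.Random()
    names = [random_hostname(rng) for _ in range(probes)]
    return looks_hijacked([resolver(name) for name in names])
```

On an honest network, calling `probe()` should find that none of the names resolve. The `resolver` parameter is there so the logic can be exercised without touching the network.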
Another set of familiar requests we’ll see in later reviews is for Google’s SafeBrowsing service. This is a large collection of URLs that are known to be harmful. There are 2 ways to interact with this service: the Lookup API, and the Update API. Brave uses the Update API, which downloads a list of URLs suspected to be harmful. This allows Brave to query a local resource rather than making numerous calls out to a third party. As we can see later, Brave continues to proxy even these requests through safebrowsing.brave.com.
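To show why a local list avoids per-page calls to a third party, here is a simplified sketch of a local lookup. As I understand the Update API, what the browser stores is short SHA-256 hash prefixes of URL expressions rather than the URLs themselves, and a full-hash check follows only when a prefix matches. The class name, the 4-byte prefix length, and the example URLs below are illustrative assumptions, and URL canonicalization is skipped entirely.

```python
import hashlib

PREFIX_LEN = 4  # bytes kept per entry; an assumption for this sketch


def url_prefix(expression: str) -> bytes:
    """Hash a URL expression and keep only the first few bytes."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:PREFIX_LEN]


class LocalThreatList:
    """A locally stored set of hash prefixes (hypothetical helper class)."""

    def __init__(self, prefixes):
        self._prefixes = set(prefixes)

    def might_be_unsafe(self, expression: str) -> bool:
        # A hit only means "possibly unsafe": different URLs can share a prefix,
        # so a real client would confirm against full hashes before warning.
        return url_prefix(expression) in self._prefixes


# Pretend this prefix arrived in a downloaded list update.
local = LocalThreatList([url_prefix("malware.example/bad.html")])
print(local.might_be_unsafe("malware.example/bad.html"))  # prints True
print(local.might_be_unsafe("brave.com/"))
```

Every check against `local` happens on the device, which is the point: the service never learns which pages are being visited unless a prefix matches.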
That covers nearly everything Brave does during its first run experience on a new profile. We do see another call to go-updater.brave.com/extensions, checking to make sure all of our installed extensions are up to date, and this returns a 1.7 KB confirmation response.
One thing we didn’t see here, but you may see in your own reviews, are referral pings. If you download the Brave browser via a referral link, Brave will attempt to give credit to the owner of that referral code. This means you will see a couple of additional requests. You can learn more about how we implemented the Referral System in Brave’s Use of Referral Codes on GitHub.
Chrome makes a total of 40 requests to 15 different hosts, all of which are to Google properties. The first request returns about 32 KB worth of flags and custom configurations for the browser. The next request is to retrieve an XML document containing download URLs for extensions. We saw something similar with Brave, and we’ll continue to see it with other browsers as well.
The third request is to Google Accounts and ID Administration (GAIA) at accounts.google.com. I believe this is an attempt to connect the new instance of Chrome with a user profile stored in the cloud. Fortunately, it was unable to link me with any existing account(s).
Chrome then proceeds to download translation tables and a few extensions. These extensions are for Google Docs, YouTube, and other commonly-used Google properties. Some of these will initiate further calls to accounts.google.com on their own.
Vanilla Chromium Results
Quite a few people were curious about Chromium, and how it differed (if at all) from Google Chrome. Chromium makes fewer requests to fewer hosts, and downloads much less data on its first run. The difference in download size, however, is largely due to Chromium not being able to use the SafeBrowsing Update API (these requests receive an HTTP 400 response). By default, Chromium lacks an API key, so Chromium users will not be able to retrieve updated threat lists.
Chromium does follow Chrome in one significant way—it too starts out its first few moments by issuing a request to GAIA, checking to see if a profile for the current user can be retrieved.
Edge Beta (Chromium version) Results
The Microsoft Edge Chromium Beta has the second-highest number of requests, to the highest number of hosts. We’ll see a similar pattern with Firefox as well, as both start with not only a default New Tab Page, but another tab opened to an external resource.
The first request Edge made was to the clients2.google.com host, requesting the Chrome Media Router extension, which is required for ChromeCast support. Like previous extension requests in other reviews, the server responds with another URL for Edge to call.
One surprise in the Edge logs was the set of calls to other Microsoft domains, such as speech.platform.bing.com, config.edge.skype.com, web.vortex.data.microsoft.com, dc.services.visualstudio.com, and activity.windows.com. While these types of calls aren’t necessarily harmful, they do lead to a bit of confusion on the part of the observer.
The second tab that Edge loads by default is microsoftedgeinsider.com (since Edge is currently in preview, and part of the Insider’s Program). This page, however, loads common trackers such as sb.scorecardresearch.com, which tells the tracker that I am launching the browser for the first time. As a result of this request, 2 cookies (UID and UIDR) are now on my device.
Firefox remains one of the chattiest browsers during a first run. At 117 requests, it led the pack in individual requests. It should be noted, however, that this isn’t the browser itself making all of these calls, but another page that is present during startup.
Like Edge, Firefox also launches with a web page opened in a second tab. While the New Tab Page is being displayed, Firefox is busy in the background issuing requests for mozilla.org/privacy/firefox and its resources.
Firefox also happens to download the most data during its first run. This is due in large part to SafeBrowsing and Widevine. Oddly enough, it appears Firefox downloaded the Widevine bits twice, with nearly 10 minutes between requests.
One thing that stood out about Firefox was its early telemetry feedback. Unlike other browsers I reviewed, Firefox seemed more eager to transmit data back regarding the new device. In fact, I saw no fewer than 4 connections to incoming.telemetry.mozilla.org during the 40-minute review. Some of these telemetry calls are not made directly by the Firefox process, but instead by Mozilla’s pingsender process.
Opera makes 51 requests to more than 20 hosts. In preparation for this write-up, I did another review of Opera to see what, if anything, has changed since the first review was done on Twitter. I’m pleased to see that Opera no longer issues nearly 20 requests to yandex.ru. Opera does, however, still call out to amazon.com, walmart.com, kayak.com, aliexpress.com, overstock.com, ebay.com, and booking.com.
Beyond the commercial requests, I was surprised to see calls to android.clients.google.com as well, considering I launched and tested Opera on a Windows device.
Safari was one of the more requested browsers. It issued 37 requests to 16 hosts.
When Safari first launches on a new profile, it begins to issue requests for thumbnails (or favicons) for popular websites. It starts by pinging apple.com, bing.com, yahoo.com, wikipedia.org, and more. Social networks are also queried, such as facebook.com, linkedin.com, and twitter.com. Requests are also made for yelp.com and tripadvisor.com.
Safari has a particular way of searching for icons/thumbnails. First it requests the landing page for the site; this results in quite a bit of data being downloaded. Once the HTML for the page has been retrieved, Safari scans the tags for Apple-specific directives. If it cannot find them, it will give up and start asking for specific filenames.
The first file Safari will seek is apple-touch-icon-precomposed.png. For the most part, none of the domains had this file, so that request fails. Safari then attempts to access apple-touch-icon.png on each domain. This, too, largely fails.
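The lookup order described above can be sketched roughly as follows. This is my reading of the observed traffic rather than Safari’s real code. I’m assuming the "Apple-specific directives" are `<link rel="apple-touch-icon">` style tags, and `icon_candidates` is a name I made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Filenames requested, in this order, when the page declares no icon.
FALLBACK_FILENAMES = ["apple-touch-icon-precomposed.png", "apple-touch-icon.png"]
ICON_RELS = {"apple-touch-icon", "apple-touch-icon-precomposed"}


class IconLinkParser(HTMLParser):
    """Collect href values from <link> tags that declare a touch icon."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = set((attributes.get("rel") or "").lower().split())
        href = attributes.get("href")
        if href and rels & ICON_RELS:
            self.hrefs.append(href)


def icon_candidates(page_url, html):
    """Return icon URLs to try: declared icons first, else the fallback names."""
    parser = IconLinkParser()
    parser.feed(html)
    if parser.hrefs:
        return [urljoin(page_url, href) for href in parser.hrefs]
    return [urljoin(page_url, "/" + name) for name in FALLBACK_FILENAMES]
```

For a page with no icon tags, `icon_candidates("https://example.com/", "<html></html>")` yields the two fallback URLs at the site root, which matches the pair of failing requests seen for most domains.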
I should note that while all of these requests can (and sometimes do) result in cookies on the user’s device, it is my understanding that Safari uses a special session to make these requests, and that all of the cookies are cleared immediately afterwards.
The last browser I’d like to cover also happened to be one of the first ones I originally reviewed on Twitter. Vivaldi issues 42 requests to 13 different hosts during its first run experience. This is up from the 31 or so I originally recorded. Most of the 42 requests are to vivaldi.com, loading a resource to discuss What’s New in the latest build. Vivaldi then goes on to download numerous resources (such as SafeBrowsing lists and extensions) from Google domains. Beyond that, Vivaldi’s traffic is very familiar to us at this point, largely resembling that of other Chromium-based projects.
The Brave Difference
When I set out to conduct these reviews, I wasn’t 100% positive on what I’d find. I had high hopes for Brave, and it delivered. But I wasn’t sure just how clean and controlled the experience would be in other major web browsers. Brave impressed with its 100% pass rate for calling out to official Brave properties. The degree to which other browsers fail to do the same left me quite bewildered. Even browsers that famously market themselves as private and secure by default made far more calls to third-party hosts than I would have expected.
How can others do these reviews in the future?
Once I began sharing the results of these reviews, many users asked what tooling was used, and how they too could do similar reviews on their own. I’m pleased to say that very little setup is required. Windows users can monitor network activity and more with Telerik Fiddler. macOS users should check out Charles Proxy (though Fiddler reportedly works with Mono on macOS). Fair bit of warning: Telerik will ask for information before downloading Fiddler, and Charles will only work for 30 minutes at a time as an unregistered product. There are quite a few tutorials and guides for Fiddler and Charles online, but if I can ever be of assistance, do feel free to message me on Twitter at @BraveSampson. Always happy to help.
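If you would rather count requests and hosts with a script than by eye, both tools can export a session as an HTTP Archive (HAR) file, which is plain JSON. Below is a minimal sketch for summarizing such an export. The function names and the sample entries are mine, and the sample URLs are illustrative rather than a real capture.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def summarize_har(har):
    """Count requests per host in a parsed HAR document."""
    hosts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        hosts[host or "(unknown)"] += 1
    return hosts


def load_har(path):
    """Read a HAR export from disk; utf-8-sig tolerates a leading BOM."""
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)


# A tiny in-memory sample standing in for a real export.
sample = {"log": {"entries": [
    {"request": {"url": "https://go-updater.brave.com/extensions"}},
    {"request": {"url": "https://go-updater.brave.com/extensions"}},
    {"request": {"url": "https://crlsets.brave.com/"}},
]}}

per_host = summarize_har(sample)
print(sum(per_host.values()), "requests to", len(per_host), "hosts")
# prints: 3 requests to 2 hosts
```

With a real export, `summarize_har(load_har("first-run.har"))` (the filename is a placeholder) gives the same request and host counts reported for each browser above.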