Deep Web
What is the Deep Web?
The Deep Web is the portion of the Internet not indexed by traditional search engines. It often requires a login (such as a username and password) or special software to access. The part of the Web that’s accessible using search engines—sometimes called the Surface Web—is actually just a small piece of the overall World Wide Web. The much larger, less visible, piece is called the Deep Web. Some estimates suggest that 90-95% of the overall World Wide Web is actually the Deep Web.
Some examples of things on the Deep Web include email, subscription content (like Netflix), online banking, databases, internal company networks, and non-public social media pages. Generally, any webpage or content that requires a login or custom tool to access is considered part of the Deep Web.
Note that the Deep Web is often confused with the Dark Web. The Dark Web is a fraction of the larger Deep Web. While the Dark Web requires a special browser to access, the rest of the Deep Web can be accessed with a standard browser (like Brave or Chrome), and is generally the portion of the Internet you simply need to log into to view. The majority of the Deep Web is similar to the Surface Web, just less public.
Dispelling some myths about the Deep Web
Myth: I’ve never surfed the Deep Web.
- Fact: Probably false. If you’ve ever logged in to your bank’s website or your doctor’s patient portal, or used a music subscription to listen to a favorite playlist, or even just logged in to your social media account, you’ve participated in the Deep Web. A lot of what we do online in our daily lives involves the Deep Web.
Myth: It’s illegal to surf the Deep Web.
- Fact: False. Except for censorship laws in some countries, it’s generally not illegal to visit (or log into) websites that aren’t indexed by search engines like Google or Brave Search. What you do on the website, however, may be illegal.
Myth: The Deep Web is full of illegal activity like drugs and weapons.
- Fact: Mostly false. The truth is there’s illegal activity on the Surface Web, Deep Web, and the Dark Web. But all three have countless legitimate uses too.
Myth: You need special software to access the Deep Web.
- Fact: Generally, no. Most of the Deep Web can be accessed through your regular browser, by either knowing the URL of the webpage, or simply logging in. If you want to access the Dark Web, then you’ll need a special browser like Tor.
What webpages are on the Deep Web?
To explain what webpages are on the Deep Web, it will help to first understand a bit about how search engines work.
A Web crawler (also called a search engine crawler) is a program that starts from a list of known webpages, methodically follows any links it finds, and builds an index of the words and phrases on the pages it reaches. This is how many search indexes are built. When you search google.com for “What is the deep web?”, the search results page returns Internet locations (i.e. webpages) that contain those words.
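The crawl-and-index loop described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the PAGES dictionary is a made-up, in-memory stand-in for the Web (a real crawler would fetch pages over HTTP), and the names LinkAndTextParser and crawl are hypothetical.

```python
import re
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. Nothing here touches the network.
PAGES = {
    "example.com": '<a href="example.com/about">About</a> Welcome to the deep web guide',
    "example.com/about": '<a href="example.com">Home</a> About this guide',
    "example.com/drafts": "Unlinked draft notes",  # no page links here
}


class LinkAndTextParser(HTMLParser):
    """Collects the link targets and the visible text of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(seed_urls, pages):
    """Follow links breadth-first from the seeds; return word -> set of URLs."""
    index = {}
    seen = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # page could not be reached
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        # Index every word found on the page.
        for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
            index.setdefault(word, set()).add(url)
        # Queue any links we have not visited yet.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl(["example.com"], PAGES)
```

In this sketch, looking up index["guide"] returns both linked pages, while the words on example.com/drafts never enter the index, because no link leads the crawler there.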
If the crawler can’t access a webpage to index its contents, the webpage does not “surface” and thus remains unindexed—this is the Deep Web. There are several reasons a webpage may not get indexed, including:
- The content is behind a login. The crawler isn’t able to log in and thus doesn’t reach the content.
- A CAPTCHA or other “robot blocker” blocks the web crawler from accessing the content.
- The webpage is dynamic (i.e. it changes in response to a user’s needs). Crawlers often can’t interact with the page in all possible configurations and so might miss some content.
- The page is unlinked and so “hidden” from crawlers. A crawler might be able to find a “root” site (e.g. example.com) but only find a specific page (like example.com/drafts) if there’s a link to it from either the root site or another page. If there’s no link to a page and the only way to access it is to know the page’s URL, the crawler will not find or index it.
- The website owner has added instructions (such as a robots.txt file or a “noindex” tag) that request that Web crawlers not crawl or index the site’s content, thus leaving otherwise “public” pages out of a search engine’s index.
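As an illustration of the last item, one common way for a site owner to make this request is a robots.txt file, and a well-behaved crawler checks it before fetching a page. The sketch below uses Python's standard urllib.robotparser module; the robots.txt contents, the "ExampleBot" user agent, and the allowed_by_robots helper are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of /drafts/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


blog_ok = allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/blog")
drafts_ok = allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/drafts/post")
```

Here blog_ok is True and drafts_ok is False. Note that robots.txt is only a request: the pages are still reachable by anyone who knows the URL, they are simply skipped by crawlers that honor the file.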
How do you access content on the Deep Web?
You can access most of the Deep Web using the same browser you use for the Surface Web. Additional steps depend on what type of Deep Web content you’re looking for.
Sometimes all you need are credentials like a login or other access privileges. Logging in is how you access the part of the Web that has your personal information, or the internal network you use at your job.
Within a website, you can use the website’s search feature to find content that the web crawler wasn’t able to index. A website may have lots of material that’s located by searching for a keyword (e.g. on the example.com blog). Web crawlers don’t enter words into input fields like search boxes, so they may never get to this material. This is particularly true of information-heavy sites, like government resources or online publication sites.
If you want to view archived content (pages or whole websites that are no longer “live” online), the Wayback Machine stores older versions of many websites. It also stores content like old news reports or social media posts, and preserves information that is otherwise rendered inaccessible.
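For readers who want to look up archived pages from a script, here is a minimal sketch. It assumes the Internet Archive's public availability endpoint (https://archive.org/wayback/available) and the JSON shape it is documented to return; the helper names and the sample response are illustrative, and the code only builds the query URL and parses a response rather than making a network request.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's availability API.
API = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot_url(response_text):
    """Return the closest archived snapshot URL from a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative response, not real data fetched from the service.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
})

query = availability_query("example.com", "20200101")
snapshot = closest_snapshot_url(SAMPLE_RESPONSE)
```

To use this against the live service, you would fetch the URL in query with an HTTP client of your choice and pass the response body to closest_snapshot_url.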
What are the safety issues on the Deep Web?
Excluding the fraction of the Deep Web referred to as the “Dark Web,” browsing the Deep Web is like browsing the Surface Web, and comes with the same concerns regarding your privacy and security. You should use similar precautions such as:
- Enable Safe Browsing in your Web browser. All major browsers support this feature, which can warn you if you’re about to visit a site that is known to host malware.
- Choose a privacy-focused browser, like Brave, with strong privacy protections.
- Any time you’re about to log in to a website, or enter any important information, look carefully at the website address in your browser’s address bar, and make sure it’s what you expect it to be.
- Only enter personal information on websites that use HTTPS (look for “https://” in your browser’s address bar), and avoid entering personal info in non-HTTPS sites.
- Always keep the software you use updated so it has the latest security fixes. This is especially important for your operating system (OS) and browser, which will often let you know when they need updating.
- On mobile devices, only install apps from the official app store. On desktop or laptop devices, only install apps and extensions from the official store, or from reputable companies.
- If you’re connected to a public or untrusted Wi-Fi network or ISP, or if you want to hide your IP address from the sites you visit, use a VPN.
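Two of the checks above (confirming the address is the one you expect, and confirming the site uses HTTPS) can also be expressed in code. The sketch below uses Python's standard urllib.parse module; the function name looks_safe_for_login and the bank.example.com hostnames are hypothetical, and passing this check does not by itself prove a site is trustworthy.

```python
from urllib.parse import urlsplit


def looks_safe_for_login(url, expected_host):
    """True only if url uses HTTPS and its hostname exactly matches expected_host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host == expected_host.lower()


good = looks_safe_for_login("https://bank.example.com/login", "bank.example.com")
no_https = looks_safe_for_login("http://bank.example.com/login", "bank.example.com")
lookalike = looks_safe_for_login("https://bank.example.com.evil.test/login", "bank.example.com")
```

Only the first URL passes. The third shows why it pays to read the whole address: it begins with the expected name, but the actual hostname is a different site.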
Finally, note that—despite some negative connotations with the name—it’s generally a good thing that much of the content we access online is on the Deep Web. Since crawlers can’t “see” behind logins, they can’t easily access our personal information, and thus our personal info can’t (or shouldn’t) appear in the results of a search engine like Google.