Requirement Analysis of Decentralized Virtual Private Networks (dVPNs)

Published May 23, 2019

Despite some advantages over VPNs, current solutions for dVPNs lack strong privacy and performance guarantees.

This research was conducted by Dr. Matteo Varvello, performance researcher at Brave, and Dr. Ben Livshits, Brave’s Chief Scientist.

In this blog post we explore the ecosystem of decentralized Virtual Private Networks (dVPNs), a new form of VPN with no central authority. In a dVPN, users act as both client and server: when they join, they also offer a portion of their upload bandwidth to carry traffic for other users. We first survey existing user-hosted dVPNs, both commercial and research solutions, and highlight their strengths and weaknesses. Next, we derive a set of requirements for deploying a trusted and reliable dVPN. Finally, we benchmark each existing dVPN against these requirements based on publicly available information. Interestingly, these requirements would also be a good baseline for classic, centralized VPNs, yet centralized VPNs do not always meet them today.

It should be noted that Brave does not recommend any particular VPN or dVPN given their limitations, which we outline in this analysis. However, such solutions may be useful to certain users seeking more security or anonymity, and our research is intended to let users make informed choices regarding various options. Last but not least, the list of dVPNs we have examined is not necessarily comprehensive. Our selection was based on a combination of popularity and publicly available information.

What is a Decentralized VPN (dVPN)?

Today, a VPN offers two major services: 1) encrypted traffic between the user (VPN client) and a VPN node, and 2) a new public IP address (the one of the selected VPN node). Users leverage these two services in different ways. Encryption is beneficial, for example, when connecting through a network that the user does not trust, e.g., a public WiFi or a potentially sketchy ISP. Obtaining a new IP address is mostly useful when the user wants to bypass censorship or geolocation blocks.

Source: https://www.cnet.com/best-vpn-services-directory/

Since privacy is a major concern of VPN users, today’s centralized VPNs have one potential flaw: the user must inherently trust the VPN provider not to interfere with or log any of their personal traffic. VPN providers are commercial entities that might in turn rely on other commercial entities, e.g., they could use multiple cloud services to obtain a worldwide footprint. It follows that even trusted and respectable vendors might unknowingly run into issues with a specific provider, ranging from surveillance to misconfiguration and even hacking. Any of these issues can compromise user privacy.

In [4], the authors actively investigate 62 commercial VPN providers and find unclear no-logging policies, some evidence of tampering with customer traffic, and mismatches between advertised VPN node locations and actual network locations. In many cases, this misbehavior was not intentional on the part of the VPN provider but was caused by misconfigurations. When contacted by the authors, all providers quickly reacted to fix the reported misconfigurations.

Motivated by the above issues, decentralized Virtual Private Networks (dVPNs) are a fairly new trend. In a dVPN, users are both client and server, in the sense that when they join a dVPN they also offer a portion of their upload bandwidth to carry traffic for other users. For example, if Alice (France) wants to access some content only available in the US, she can piggyback on Bob’s residential IP address (US) and avoid being geoblocked. A client discovers available dVPN nodes either via a central repository or via a distributed repository [15].

To the best of our knowledge, Hola [5] was the first dVPN. Hola is a freemium web and mobile application which offers a dVPN service through a peer-to-peer (P2P) network. When installing Hola, users agree either to pay a monthly premium or to offer part of their upload bandwidth to other Hola users. Hola has been quite successful, reaching tens of millions of nodes. At the same time, multiple incidents have been reported when people realized they were in fact carrying other people’s traffic. In addition, Hola’s organization spun off a new company (Luminati [22]) whose commercial proxy service was piggybacking on Hola users.

VPN Gate [7] is another interesting dVPN. Born as a research project, it has solid foundations that are summarized in a research paper [1]. The main motivation behind VPN Gate was to resist blocking by censorship firewalls such as the Great Firewall of China. Classic VPNs easily fail at this task because their limited and static network footprint can be easily blocked (IP blacklisting). The rationale of VPN Gate is to build a dVPN on top of volunteer machines and thus obtain a large set of dynamic IP addresses. The authors further inject innocent IP addresses into their public IP lists, which makes large-scale IP blacklisting harder. In addition, they let their VPN nodes cooperate to quickly identify a list of spies, i.e., computers used by censorship authorities to probe the volunteer dVPN nodes. VPN Gate was launched on March 8, 2013 and it currently counts 5,529 dVPN nodes which carry more than 1 TB of traffic daily.

With the recent rise of blockchain, a new form of dVPN has surfaced, in which users share their upload bandwidth in exchange for crypto tokens. Popular examples of such crypto dVPNs are Mysterium [2] and Sentinel [12]. Nymtech [19] and Substratum [17] are two broader approaches which are related to dVPNs.

Mysterium is an open source dVPN built entirely upon a P2P architecture. An immutable smart contract running on Ethereum will be used to make sure that the VPN service is paid adequately. Mysterium is currently in alpha testing, and incentives will be available soon.

Sentinel is a larger project of which a dVPN is just one of the use cases. The main idea here is to use the blockchain to store a ledger of data transactions with a ‘Proof of Traffic’. A working version of the client can be downloaded for testing [18] and it clearly states that the “liability of traffic at the exit node is also upon the host”.

Nymtech is a decentralized authentication and payment protocol built on a mixnet, a privacy-preserving network which improves upon Tor. The mixnet sends all network traffic through layers of mix-nodes using Sphinx [20], so that all packets of data are the same size and routing information is kept private. Each mix-node in the network delays messages and generates fake “dummy messages” to create a uniform traffic pattern, which hides the real patterns from adversaries observing the network.
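To make the idea of uniform packets more concrete, here is a toy Python sketch of the three ingredients mentioned above: fixed-size padding, dummy messages, and random delays. It is only an illustration and not the Sphinx packet format [20]; the constant PACKET_SIZE, the 2-byte length header, and all function names are assumptions made for this example. A real mixnet also encrypts every packet in layers, which this sketch omits.

```python
import os
import random
import struct

PACKET_SIZE = 1024            # assumed fixed packet size in bytes (not from Sphinx)
HEADER = struct.Struct(">H")  # assumed 2-byte big-endian payload length


def pad(payload: bytes) -> bytes:
    """Return a PACKET_SIZE-byte packet: length header, payload, random filler."""
    room = PACKET_SIZE - HEADER.size
    if len(payload) > room:
        raise ValueError("payload too large for one packet")
    filler = os.urandom(room - len(payload))
    return HEADER.pack(len(payload)) + payload + filler


def unpad(packet: bytes) -> bytes:
    """Recover the original payload from a padded packet."""
    (length,) = HEADER.unpack_from(packet)
    return packet[HEADER.size:HEADER.size + length]


def dummy() -> bytes:
    """A cover packet with an empty payload, the same size as a real one."""
    return pad(b"")


def mix_delay(mean_seconds: float = 0.5) -> float:
    """Random per-message delay (seconds) drawn from an exponential distribution."""
    return random.expovariate(1.0 / mean_seconds)
```

Since real and dummy packets have the same length, an observer who only sees packet sizes cannot tell them apart; in this unencrypted toy version the length header would of course still give them away.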

In a similar spirit, Substratum [17] aims at rewarding its users for sharing resources (bandwidth, CPU, etc.). Substratum aims at building a decentralized Web with no central entity, implying that anyone can host and serve content and be paid for it. While not directly a dVPN network, it is worth mentioning since it also promises privacy and censorship circumvention.

Requirement Analysis

Open Source

The client/server code of a dVPN is a critical piece of software, since it can potentially access very sensitive data. Although popular VPN tunneling protocols such as OpenVPN are considered secure (older ones such as PPTP have known weaknesses), misconfigurations and malicious code remain potential threats. It follows that a first requirement for dVPNs is to be open source, so that the community can monitor the evolution of the code and report suspicious activity, bugs, and misconfigurations that could jeopardize a user’s privacy.

Code Execution Guarantees

While open sourcing is a good first step, a dVPN should offer stronger guarantees with respect to code execution. A Trusted Execution Environment (TEE) is a secure area inside the main processor which guarantees the confidentiality and integrity of the code and data loaded into it. In [9], the authors show that it is indeed possible to run a VPN vantage point inside SGX, a popular TEE from Intel [10]. We are not aware of any centralized VPN offering such a service, likely due to the extra cost of the technology; however, as demonstrated in [9], it is feasible. The same does not hold for a dVPN, due to the strict hardware requirements of SGX, which cannot be assumed on every user’s machine.

IP Blacklisting

In order to be usable, a VPN (both centralized and distributed) needs to publish at least a portion of its vantage point list. It follows that it is relatively easy for a censorship entity or a geoblocking content provider to access such a list and simply blacklist all the vantage points of a VPN. Centralized VPNs constantly face this issue and can hardly solve it. For example, content providers applying intensive geoblocking (such as Netflix) currently deny access to all major VPNs.

For dVPNs, IP blacklisting becomes a more serious problem, since the IPs being banned are assigned to real users rather than to machines in a data center. At the same time, due to the potential sheer size of a dVPN, it can be hard for a censorship entity or a geoblocking content provider to identify such a dynamic set of IPs. This is because VPN nodes are regular Internet users who frequently change network locations and connect from behind Network Address Translators (NATs). In this case, blocking a NATed VPN node implies blocking the whole subnet, with a potentially massive service disruption. VPN Gate exploits this feature to its advantage, and it further implements defensive mechanisms to protect its volunteer IPs from being blocked. In [13], the authors propose a distributed HTTP(S) proxying system that leverages the same feature to protect against IP blacklisting.

QoS Guarantees

There are multiple ways to benchmark the Quality of Service (QoS) offered by a VPN service.

Networking performance: These are metrics like low latency, limited losses, and high bandwidth. While centralized VPNs do not always deliver on them [4], there is no intrinsic reason why QoS guarantees cannot be offered with respect to these metrics. For example, Cloudflare just announced Warp [8], a large-scale VPN-like system which promises both security and a faster web experience. Cloudflare’s approach is to route traffic through their overlay network composed of extremely fast and reliable links. This implies a fast and reliable lane for traffic where, for example, UDP can be used safely and effectively. The rationale behind Warp is the same as that of startups like Networknext [14] which, for instance, promises to improve its clients’ online gaming experience through its fast overlay network.

Offering high networking performance is much harder for dVPNs. Client churn and heterogeneous network conditions make it hard to guarantee any level of performance. This problem is not specific to dVPNs; it is a generic issue in distributed systems. In his seminal work [16], BitTorrent’s creator (Bram Cohen) discusses the famous tit-for-tat incentive mechanism used by BitTorrent to achieve a high level of robustness and resource utilization. While effective, this is still far from any sort of QoS guarantee.
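As a rough illustration of tit-for-tat, the Python sketch below picks which peers to upload to: those that recently gave us the best download rates, plus one randomly chosen peer (an “optimistic unchoke”) so that newcomers get a chance. It is a simplified reading of the mechanism described in [16], not BitTorrent’s actual code; the function name, the default of 4 slots, and the shape of the input are assumptions made for this example.

```python
import random


def choose_unchoked(download_rates, slots=4, rng=random):
    """Pick peers to upload to.

    download_rates maps a peer id to the rate at which that peer has
    recently been sending us data. The top (slots - 1) peers are rewarded,
    and one remaining peer is picked at random (optimistic unchoke).
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:max(slots - 1, 0)]
    rest = ranked[len(regular):]
    optimistic = [rng.choice(rest)] if rest and slots > 0 else []
    return regular + optimistic
```

Note that the rule only rewards peers relative to each other: a node learns nothing about the absolute bandwidth or latency it will get, which is why this kind of incentive falls short of a QoS guarantee.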

Network footprint: This is another important QoS metric, referring to how many unique locations a VPN can offer. As discussed in [1], VPN providers constantly battle to offer more vantage points, either by deploying new physical nodes or by playing tricks, e.g., introducing “virtual locations” based on the information that geo-IP databases hold about the physical locations of their vantage points. One limitation shared by centralized VPNs is the lack of residential IP addresses, since they mostly rely on data centers to deploy their nodes. By definition, dVPNs instead offer a large network footprint of residential IP addresses. This is one of the most attractive assets of a dVPN today.

Service availability: This refers to the percentage of time that a service is up and running correctly, e.g., the famous “five nines” availability (99.999%). On paper, the distributed design of a dVPN offers higher availability than a centralized VPN, which has a small number of points of failure (one or N, depending on how many providers it relies on). For example, an outage at one of the cloud providers used by a centralized VPN could damage the whole service. The large and heterogeneous footprint of a dVPN makes such an outage less likely. Nevertheless, serious VPN providers deploy DDoS protection, and we are not aware of any major report of prolonged downtime for centralized VPNs.
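The arithmetic behind these availability figures is easy to check. The Python sketch below converts an availability percentage into yearly downtime, and estimates the availability of a service that stays up as long as at least one of n nodes is up. The second formula assumes that nodes fail independently of each other, which is an idealization and not a measured property of any dVPN; the function names are made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability):
    """Expected yearly downtime for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def redundant_availability(node_availability, n):
    """Availability of a service that is up if at least one of n nodes is up.

    Assumes node failures are independent (an idealization).
    """
    return 1.0 - (1.0 - node_availability) ** n
```

For instance, downtime_minutes_per_year(0.99999) is about 5.26 minutes, and under the independence assumption three nodes that are each up 90% of the time give redundant_availability(0.9, 3) of about 0.999.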

No Logging

Privacy is a main service that should be offered by a VPN. This implies that, at no time, a VPN node should log user traffic. This covers both very sensitive data (e.g., the URLs accessed, or the actual content exchanged when HTTPS is not used) and less sensitive data like the number of bytes exchanged, the domain names contacted, etc. By definition, a VPN node needs visibility into the original traffic in order to forward it either to the client or to the target service, e.g., Netflix. The amount of data visible then depends on the protocol being used, e.g., in the case of HTTPS the actual content is not visible since it is encrypted.

Under these conditions, how does a centralized VPN offer a “no-logs” policy? In [4], the authors investigate the usage policy offered by several commercial VPNs on their website. They find that 25% (50) of the VPN services they studied do not have a link to their privacy policy. 42% (85) of the VPN providers also did not provide terms of service. When a privacy policy was available, only 45 VPN services explicitly claimed a “no-logs” policy. This analysis suggests that VPN providers today should do a better job in terms of transparency of their actions. However, it is important to notice that some of these no-logging policies have proven to hold even during an investigation from the FBI [21].

Clearly, for a dVPN we cannot rely on any sort of usage policy. Further, in such a heterogeneous environment an even stricter no-logs requirement is needed. For the reasons above, this is hard to achieve and Hola, for instance, has been previously shamed for this issue [6]. Logging might actually be needed by a dVPN to offer protection against IP blacklisting. This is the case for VPN Gate [1][7], where each VPN node keeps connection logs (and shares them with a central repository) in order to inform the other VPN servers of a potential censorship authority attempting to discover (and block) the current dVPN footprint.

Traffic Accounting

The founding idea of a dVPN is that users share their resources, i.e., they get credited (e.g., via crypto tokens) for the traffic they carry for other dVPN users. The dVPN therefore needs a system to account for such traffic and grant tokens accordingly. Crypto dVPNs tackle this issue by leveraging the blockchain to keep track of proofs of traffic. This can be challenging depending on which logging level is allowed or required, e.g., just byte counts or the actual domains visited (see the no-logging requirement above).
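The least invasive accounting level mentioned above, byte counts only, can be sketched in a few lines of Python. This is a hypothetical in-memory stand-in and not how Mysterium or Sentinel work: there is no blockchain and no proof that the traffic was really carried. The class name, the TOKENS_PER_GIB rate, and the node identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass, field

GIB = 1024 ** 3
TOKENS_PER_GIB = 1.0  # hypothetical reward rate, not taken from any real dVPN


@dataclass
class TrafficLedger:
    """Tracks only how many bytes each node carried, never what they contained."""

    carried_bytes: dict = field(default_factory=dict)

    def record(self, node_id, num_bytes):
        """Add num_bytes to the total carried by node_id."""
        if num_bytes < 0:
            raise ValueError("byte count cannot be negative")
        self.carried_bytes[node_id] = self.carried_bytes.get(node_id, 0) + num_bytes

    def tokens_owed(self, node_id):
        """Tokens earned by node_id at the assumed flat rate."""
        return self.carried_bytes.get(node_id, 0) / GIB * TOKENS_PER_GIB
```

Even this minimal ledger stores the number of bytes per node, which the no-logging requirement above already counts as (less sensitive) logged data.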

Traffic Blame

From a networking perspective, VPN nodes are the entities originating the traffic they carry. This means that serious offenses (child pornography, hate speech, drug smuggling), when investigated, will point the authorities to the entity running the VPN service. At this point, the above no-logs policy comes into play, and the VPN might (or might not) be able to offer extra information about who was actually originating such traffic. In a dVPN context, there is no legal entity the authorities can reach out to. Instead, they would reach a victim dVPN user whose network was used to carry such traffic. In such a situation, for which Hola has again been publicly shamed [6], it can be hard for private users to defend themselves against the authorities.

It is thus paramount that a dVPN implements a mechanism to avoid this kind of situation, while still guaranteeing a no-logs policy. This is challenging because, by definition, in order to block undesired traffic the system needs some knowledge of what the traffic is. For example, [13] implements selective proxying, a mechanism which gives clients full control and transparency over what they proxy.
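One simple way to picture this kind of control is a per-node allowlist of destination domains. The Python sketch below is a hypothetical illustration and not the mechanism implemented in [13]; the function name and the idea of matching on the destination hostname are assumptions made for this example.

```python
from urllib.parse import urlsplit


def is_allowed(url, allowed_domains):
    """Return True if the URL's host is an allowed domain or one of its subdomains."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    host = host.rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```

A node operator who sets allowed_domains to {"wikipedia.org"} would relay requests to en.wikipedia.org and refuse everything else. The check also shows the tension described above: to apply it, the node must at least see the destination of each request.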

The table below benchmarks the existing dVPN solutions with respect to the requirements above. In addition, the last column reports on classic centralized systems as a baseline. Note that this benchmarking was derived from the public information available about existing dVPNs.

Requirement	Hola	VPN Gate	Mysterium / Sentinel	Nymtech / Substratum	Classic
Open Source	✔	✔	✔	✔	❌
IP Blacklisting	❌	✔	✔	✔	❌
QoS Guarantees	❌	❌	❌	❌	✔
No Logging	❌	❌	❌	✔	➖
Trusted Code Execution	❌	❌	❌	❌	➖
Traffic Accounting	❌	❌	✔	✔	✔
Traffic Blame	❌	❌	❌	✔	✔

Conclusions

Today, the commercial VPN space is a crowded market characterized by a “race to the bottom”, i.e., competition is mostly driven by lowering the monthly fees. No players offer strong privacy and anonymity guarantees, and several incidents have been reported. Decentralized Virtual Private Networks (dVPNs) are a novel and intriguing alternative to classic VPNs characterized by a lack of central authority. Users voluntarily join the VPN and offer to carry other users’ traffic in exchange for micro-payments or simply to gain access to the VPN service.

While building such a solution is technically not hard, this post argues that it is very hard to build a safe and reliable dVPN. First, we have drafted a set of requirements to build such a dVPN. Next, we have examined multiple existing solutions and shown that they all fail most of these requirements.

References

[1] Nobori, D. and Shinjo, Y., 2014. VPN Gate: A Volunteer-Organized Public VPN Relay System with Blocking Resistance for Bypassing Government Censorship Firewalls. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14) (pp. 229-241).

[2] Mysterium Network. https://mysterium.network/

[3] Mysterium Network. Code. https://github.com/MysteriumNetwork

[4] Khan, M.T., DeBlasio, J., Voelker, G.M., Snoeren, A.C., Kanich, C. and Vallina-Rodriguez, N., 2018, October. An empirical analysis of the commercial VPN ecosystem. In Proceedings of the Internet Measurement Conference 2018 (pp. 443-456). ACM.

[5] Hola VPN. https://hola.org/

[6] http://adios-hola.org/ (if unavailable, check https://web.archive.org/web/20190510183022/http://adios-hola.org/)

[7] VPN Gate, https://www.vpngate.net/en/

[8] Cloudflare Warp. https://blog.cloudflare.com/1111-warp-better-vpn/

[9] Goltzsche, D., Rüsch, S., Nieke, M., Vaucher, S., Weichbrodt, N., Schiavoni, V., Aublin, P.L., Cosa, P., Fetzer, C., Felber, P. and Pietzuch, P., 2018, June. Endbox: scalable middlebox functions using client-side trusted execution. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (pp. 386-397). IEEE.

[10] Intel SGX. https://software.intel.com/en-us/sgx

[12] Sentinel. https://sentinel.co/

[13] Nasr, M., Zolfaghari, H. and Houmansadr, A., MassBrowser: Unblocking the Web for the Masses, By the Masses.

[14] NetworkNext. https://www.networknext.com/

[15] Wolinsky, D.I., Lee, K., Boykin, P.O. and Figueiredo, R., 2010, October. On the design of autonomic, decentralized vpns. In 6th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2010) (pp. 1-10). IEEE.

[16] Cohen, B., 2003, June. Incentives build robustness in BitTorrent. In Workshop on Economics of Peer-to-Peer systems (Vol. 6, pp. 68-72).

[17] Substratum. https://substratum.net/

[18] Sentinel. Code. https://github.com/sentinel-official/sentinel/blob/master/README.md

[19] Nymtech. https://nymtech.net/

[20] Sphinx. https://katzenpost.mixnetworks.org/docs/specs/sphinx.html

[21] VPN Provider’s No-Logging Claims Tested in FBI Case. https://torrentfreak.com/vpn-providers-no-logging-claims-tested-in-fbi-case-160312/

[22] Luminati. https://luminati.io/