THEMIS: Towards a Decentralized Ad Platform with Reporting Integrity (Part 1)

This post describes the work done by Gonçalo Pestana, Research Engineer, Iñigo Querejeta-Azurmendi, Cryptography Engineer, Dr. Panagiotis Papadopoulos, Security Researcher, and Dr. Ben Livshits, Chief Scientist; this post is also part of a series that focuses on further progressive decentralization for Brave ads.

Note: THEMIS is primarily a research effort for now and does not constitute a commitment regarding product plans around Brave Rewards.

The whitepaper introducing the Basic Attention Token (BAT) [1] was released mid 2017 and, since then, BAT has been used by millions of users, advertisers, and publishers, each using and earning BAT through the Brave Browser (Figure 1) [2].  It has been a long ride since 2017 and we’re very proud that BAT is acknowledged as one of the most successful use cases for decentralized ledgers and utility tokens.

The BAT token powers the BAT-based advertising ecosystem. The main goal of the BAT-based ad ecosystem is to provide the choice for users to value their attention, while keeping full control over their data and personal privacy. The main tenets of the BAT-based advertising ecosystem are to provide privacy by default, to restore control to users over their data, and to provide a decentralized marketplace where Brave Browser users are incentivized to watch ads and to contribute to creators. Through these principles, Brave’s vision is to fix the current online advertising industry [1], and get rid of widespread fraud schemes [3] [3.1], [4], market fragmentation [5] [6] and privacy issues [7] [8].

In line with these goals, Brave’s research team has been working on a decentralized and privacy-by-design protocol that further improves upon the current BAT-based ad ecosystem. In this first post in a series of blog posts, we present THEMIS: a novel privacy-by-design ad platform that requires zero trust from both users and advertisers alike. THEMIS provides auditability to all participants, rewards users for interacting with ads, and allows advertisers to verify the performance and billing reports of their ad campaigns. In this blog series, we describe the THEMIS protocol and its building blocks. In the next post, we will present a preliminary scalability evaluation of THEMIS in a deployment environment.

Figure 1. Example of an ad notification delivered through the Browser for Brave Ads users.

The current web advertising ecosystem

Digital advertising is the most popular way of funding websites. However, web advertising has fundamental flaws such as market fragmentation, rampant fraud, and unprecedented invasion of privacy. Further, web users are increasingly opting out of web advertising, costing publishers millions of dollars in ad revenues every year. A growing number of users (47% of internet users globally, as of today [13]) use ad-blockers.

Academia and industry have responded by designing new monetization systems. These systems generally emphasize properties such as user choice, privacy protection, fraud prevention, and performance improvements. Privad [11] and Adnostic [12] are examples of academic projects that focus on privacy-friendly advertising. Despite their contributions, these systems have significant shortcomings that have limited their adoption. They either (i) do not scale, (ii) require the user to trust central authorities within the system to process ad transactions, or (iii) do not allow advertisers to accurately gauge campaign performance.

To make matters worse, current advertising systems lack proper auditability: The ad network exclusively determines how much advertisers will be charged, as well as the revenue share that the publishers may get. Malicious ad networks can overcharge advertisers or underpay publishers. Another issue is non-repudiation, as ad networks do not generally prove that the claimed ad views/clicks occurred in reality.

Figure 1. A high-level visual overview of THEMIS. Ad distribution and ad interaction reporting activities. Users are rewarded for interacting with ads. In THEMIS, a campaign manager and advertisers agree on ad campaigns, which are encoded in a smart contract running on a side-chain. Using Brave Browser, users request rewards from a smart contract, which implements a cryptographic protocol that moves us towards decentralization, transparency, and privacy. 

Our Approach: THEMIS

In this blog post series, the Brave Research team presents THEMIS (Figure 1), a private-by-design ad platform that makes a significant step towards decentralizing the ad ecosystem by leveraging a side-chain and smart contracts to eliminate centralized ad network management. We believe in progressive decentralization, which means that the system presented in the first blog post is not yet fully decentralized; subsequent blog posts will discuss further decentralization steps. 

The current implementation of Brave Ads protects user privacy and anonymity through the use of privacy-preserving cryptographic protocols, client-side ad matching, and other anonymization techniques. For example, Brave servers cannot determine which ads a user has interacted with, and they do not receive any data concerning a specific user’s interests or browsing habits. 

The THEMIS protocol provides the same strong anonymity properties as Brave Ads, while making an important step toward progressive decentralization  of the Brave Ads ecosystem. THEMIS is highly relevant to the BAT Apollo mission [14]. As discussed in a BAT Community-run AMA [15], the main goals of the BAT Apollo mission are to improve transparency, to decrease transaction costs, and to further decentralize Brave Ads.

By combining the strong privacy properties with decentralization, THEMIS:

  • Effectively addresses the auditability and non-repudiation issues of the current ecosystem by requiring all participants to generate cryptographic proofs of correct behaviour. Every participant can verify that everybody is following the protocol correctly;
  • And provides the advertisers with the necessary feedback regarding the performance of their ad campaigns without compromising the end-user privacy. By guaranteeing the computational integrity of this reporting, advertisers can accurately learn how many users viewed and interacted with their ads without learning exactly which of them. 

Technical Background

In this section, we sketch a brief technical background regarding the mechanisms and building blocks used by THEMIS; we also describe why and how THEMIS leverages them.

Permissioned Blockchains

THEMIS relies on a blockchain with smart contract functionality to provide a decentralized ad platform. Smart contracts enable the business logic and payments to be performed without relying on a central authority. THEMIS could, for example, run on the Ethereum Mainnet. However, due to Ethereum’s low transaction throughput, the high gas costs, and the current scalability issues, THEMIS relies on a Permissioned Blockchain instead, more concretely on a Proof-of-Authority (PoA) blockchain. 

A PoA blockchain consists of a distributed ledger that relies on consensus achieved by a permissioned pool of validator nodes. PoA validators can rely on fast consensus protocols such as IBFT/IBFT2.0 and Clique, which mint blocks faster, so a PoA blockchain can reach higher transaction throughput than traditional PoW-based blockchains.

As opposed to traditional, permissionless blockchains (such as Bitcoin and Ethereum), the number of nodes participating in the consensus is relatively small and all nodes are authenticated. In our case publishers, and other industry entities, are potential participants of the pool of validators.

Cryptographic Tools

Confidentiality

THEMIS uses an additively homomorphic encryption scheme to calculate the ads payouts for each user, while keeping the user behavior (e.g. ad clicks) private. Given a public-private key-pair [[(\sk, \pk)]], we rely on four functions, two for encryption and two for signatures:

  • Encryption: given a public key and a message, outputs a ciphertext, [[\ctxt = \enc(\pk, \message)]];
  • Decryption: given a private key and a ciphertext, outputs the decrypted message, [[\message = \dec(\sk, \ctxt)]];
  • Sign: given a private key and a message, outputs a signature on the message, [[\signature = \sign(\sk, \message)]];
  • Verify: given a public key, a signature, and a message, outputs [[\bot]] if the signature fails to validate and [[\top]] if it validates, [[\signverify(\pk, \signature, \message)\in\{\bot, \top\}]].

The additive homomorphic property guarantees that the addition of two ciphertexts,

$$ \ctxt_{1} = \enc(\pk, \message_{1}), \ctxt_{2} = \enc(\pk, \message_{2}) $$

encrypted under the same key, results in an encryption of the sum of their messages, more precisely:

$$ \ctxt_{1} + \ctxt_{2} = \enc(\pk, \message_{1} + \message_{2}) $$

Examples of such encryption schemes are ElGamal [9] (in its exponential variant, where the message is placed in the exponent) and Paillier [10].
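
As a concrete instance, and only as a sketch (this post does not state which scheme THEMIS deploys), consider textbook Paillier with modulus [[n]] and generator [[n + 1]]. The symbols [[n]], [[r_1]], and [[r_2]] are introduced here for illustration. A message is encrypted with fresh randomness [[r]] as [[(1+n)^{\message} r^{n} \bmod n^{2}]], and the abstract "+" on ciphertexts is realised as multiplication modulo [[n^{2}]]:

$$ \ctxt_{1} \cdot \ctxt_{2} = (1+n)^{\message_{1}} r_{1}^{n} \cdot (1+n)^{\message_{2}} r_{2}^{n} = (1+n)^{\message_{1} + \message_{2}} (r_{1} r_{2})^{n} \bmod n^{2} $$

The right-hand side is a valid encryption of [[\message_{1} + \message_{2}]] under randomness [[r_{1} r_{2}]]. By the same argument, raising a ciphertext to a public constant [[k]] yields an encryption of [[k \cdot \message]], which is what makes it possible to weight an encrypted value by a known amount.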

Integrity

To prove correct decryption, THEMIS leverages Zero Knowledge Proofs (ZKPs), which allow an entity (i.e. the prover) to convince a different entity (i.e. the verifier) that a certain statement is true over a private input without disclosing any information about that input other than whether the statement is true or not. We denote proofs with \(\Pi\), and their verification as \(\verify(\Pi)\in\{\bot, \top\}\).
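
To illustrate what a proof of correct decryption can look like, here is a minimal sketch of a Chaum-Pedersen proof, made non-interactive with the Fiat-Shamir heuristic and applied to exponential ElGamal. This is an assumed instantiation: the post does not specify which proof system THEMIS uses. The toy group parameters and all function names are hypothetical and far too small for real use.

```python
import hashlib
import secrets

# Toy group (assumption): safe prime P = 2*Q + 1, G generates the subgroup of order Q.
P, Q, G = 1019, 509, 4

def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)

def enc(pk, m):
    """Exponential ElGamal: the message sits in the exponent of G."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), pow(G, m, P) * pow(pk, r, P) % P

def challenge(*values):
    """Fiat-Shamir challenge derived by hashing the transcript."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove_decryption(sk, pk, ctxt):
    """Return (d, proof): d = c1^sk, and proof shows log_G(pk) == log_c1(d)."""
    c1, _ = ctxt
    d = pow(c1, sk, P)
    w = secrets.randbelow(Q)
    a1, a2 = pow(G, w, P), pow(c1, w, P)
    e = challenge(pk, c1, d, a1, a2)
    z = (w + e * sk) % Q
    return d, (a1, a2, z)

def verify_decryption(pk, ctxt, m, d, proof):
    """Check the proof, then check that m is the plaintext implied by d."""
    c1, c2 = ctxt
    a1, a2, z = proof
    e = challenge(pk, c1, d, a1, a2)
    ok_key = pow(G, z, P) == a1 * pow(pk, e, P) % P
    ok_dec = pow(c1, z, P) == a2 * pow(d, e, P) % P
    ok_msg = pow(G, m, P) * d % P == c2
    return ok_key and ok_dec and ok_msg

sk, pk = keygen()
ctxt = enc(pk, 36)
d, proof = prove_decryption(sk, pk, ctxt)
print(verify_decryption(pk, ctxt, 36, d, proof))  # honest claim: True
print(verify_decryption(pk, ctxt, 37, d, proof))  # inflated claim: False
```

The verifier never sees the private key. It only learns that the same exponent links the public key to the value used for decryption, so a prover cannot claim a different plaintext for the same ciphertext.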

Distribution of trust

THEMIS distributes trust to generate a public-private key-pair for each ad campaign, under which the sensitive information is encrypted. For this, it uses a distributed key generation (DKG) protocol to share the knowledge of the secret. This allows a group of players to distributively generate the key-pair, [[(\sk_T, \pk_T)]], where each player has a share of the private key, [[\sk_{T_{i}}]], and no player ever gains knowledge of the full private key, [[\sk_{T}]].

Moreover, the resulting key-pair is a threshold key-pair which requires at least a well-defined number of participants – out of the peers that distributively generated the key – to interact during the decryption or signing operations.

We follow a DKG protocol similar to the one presented by Schindler et al. [11].
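
The threshold property can be illustrated with Shamir secret sharing, sketched below. This is not a DKG: here a single dealer knows the secret and hands out shares, whereas in a DKG no party ever learns the full key. The field modulus, the 3-of-5 setting, and the function names are assumptions chosen for illustration.

```python
import secrets

PRIME = 2**127 - 1  # field modulus (a Mersenne prime); illustrative choice

def split(secret, threshold, n_shares):
    """Split secret into points on a random polynomial of degree threshold - 1."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def f(x):
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME

    return [(x, f(x)) for x in range(1, n_shares + 1)]

def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                num = num * (-x_j) % PRIME
                den = den * (x_i - x_j) % PRIME
        secret = (secret + y_i * num * pow(den, -1, PRIME)) % PRIME
    return secret

sk_T = secrets.randbelow(PRIME)
shares = split(sk_T, threshold=3, n_shares=5)
print(reconstruct(shares[:3]) == sk_T)   # any 3 of the 5 shares suffice
print(reconstruct(shares[2:5]) == sk_T)
```

Fewer than `threshold` shares reveal nothing about the secret, which is the property the threshold key-pair relies on.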

In order to choose this selected group of key generation players in a distributed way, THEMIS leverages Verifiable Random Functions (VRFs). In general, VRFs enable users to generate a random number and prove its randomness. In THEMIS, we use VRFs to select a random pool of users and generate the distributed keys. Given a public-private key-pair, [[(\VRFsk, \VRFpk)]], VRFs are defined by a function which outputs a random number and a zero knowledge proof of correct generation.

System Properties and Guarantees

The main properties we focused on while designing THEMIS are privacy, decentralization and auditability, scalability, and integrity:

Privacy 

In the context of a sustainable ad ecosystem, we define privacy as the ability for users and advertisers to use our system without disclosing any critical information about themselves and their business:

  • For the user, privacy means being able to interact with ads without revealing their interests/preferences to advertisers, other protocol participants or eavesdroppers. In THEMIS, we preserve the privacy of the user not only when they are interacting with ads but also when they claim the corresponding rewards for these ads. 
  • Brave Ads currently protects advertiser privacy. For advertisers, privacy means that they are able to set up ad campaigns without revealing any policies (i.e. what is the reward of each of their ads) to the prying eyes of their competitors. THEMIS keeps these ad policies confidential throughout the whole process, while enabling users to claim rewards based on ad policies. 

Decentralization and auditability

Existing works require a central authority to manage and orchestrate the proper execution of the protocol, either in terms of user privacy or billing. What if this entity, which is assumed to be trusted, censors users by denying them rewards or by transferring an incorrect amount? What if it attempts to charge advertisers more than what they should pay based on users’ ad interactions? What if the advertising policies are not applied as agreed with the advertisers when setting up ad campaigns?

One of the primary goals of our system is to be decentralized and transparent. To achieve this, THEMIS leverages a permissioned blockchain with smart contract functionality.

Scalability

Ad platforms need to be able to scale seamlessly and serve millions of users. However, prominent proposed systems fail to achieve this. We consider scalability an important aspect affecting the practicability of the system. THEMIS needs not only to serve ads in a privacy-preserving way to millions of users but also to finalize the payments related to their ad rewards as quickly as possible.

Integrity

Contrary to existing works, THEMIS does not rely on a trusted central authority. Therefore, it needs to provide both the users and the advertisers with mechanisms to verify the authenticity of the statements and the performed operations. Achieving such integrity guarantees requires the use of zero-knowledge proofs to ensure every participant can prove and verify the correctness and validity of billing and reporting.

System Overview – A Strawman Approach

The remainder of this blog post outlines a straw-man approach that describes the basic principles and steps of THEMIS. In an upcoming blog post, we build on the straw-man approach and introduce decentralization into the system.

Our straw-man approach is the first step towards a privacy-preserving and decentralized online advertising system. Our goal at this stage is to provide a mechanism for advertisers to create ad campaigns and to be correctly charged, based on the user’s interactions with their ads. In addition, the system aims at keeping track of the ads viewed by users, so that (i) advertisers can have feedback about their ad campaigns and (ii) users can be rewarded for interacting with ads. All these goals should be achieved while preserving the privacy of the ad policies and the user behaviour. 

We assume three different roles in this straw-man approach: (i) the users, (ii) the advertisers, and (iii) an ad Campaign Manager (CM). The users are incentivized to view and interact with ads created by the advertisers. The CM is responsible (a) for orchestrating the protocol, (b) for handling the ad views reporting and finally (c) for calculating the rewards that need to be paid to users according to the policies defined by the advertisers.

Note that the straw-man approach assumes a semi-trusted Campaign Manager. This role will be removed in the full THEMIS protocol, which is described in the next blog post. For the sake of this initial introduction to THEMIS, relying on a CM entity allows us to simplify the explanation.

Privacy-preserving Ad Matching

In THEMIS – as in the current Brave Rewards architecture – the user downloads an updated version of the ad catalog, which includes ads and their metadata from all active ad-campaigns. The CM maintains and provides the ad catalog for users to download periodically. 

The ad-matching happens locally based on a pre-trained model and the user’s interests extracted from their web browsing history in a similar way as in Brave Rewards. In order to serve and match ads to the user interests, no data leaves the user’s device. This creates a walled garden of browsing data that is used for recommending the best matching ad while user privacy is guaranteed.

Incentives for Ad-viewing

User incentives to interact with ads are at the core of THEMIS. Each viewed/clicked ad yields an amount of BAT rewards. Different ads may provide different amounts of reward to the users. This amount is agreed by the corresponding ad creator (i.e. the advertiser) and the Campaign Manager. The user can claim rewards periodically (e.g. every week or every month). In Figure 4, we present an overview of the reward request generation and the steps to claim the ad rewards in the straw-man approach. 

The straw-man approach

We now outline the different phases of the straw-man version of THEMIS.

Phase 1: Defining Ad Rewards

In order for an advertiser to have their ad campaign included in the next version of the ad catalog, they first need to agree with the CM on the policies of the given campaign (i.e. rewards per ad, ad impressions per user, etc.) (step 1 in Figure 4). 

Once the advertiser agrees out-of-band with the CM on the ads that will be part of the campaign and their respective payouts, the CM encodes the agreed policy as a vector, [[\policyvector]], where each entry is the amount of tokens that an ad yields when viewed/clicked (e.g. Ad1: 0.4 BAT, Ad2: 2 BAT, Ad3: 1.2 BAT). The CM stores this vector privately and the advertiser needs to trust that the policies are respected (this will be addressed in the full THEMIS protocol; see the next blog post). The entries of the policy vector follow the same order as the corresponding ads in the ad catalog.

Figure 4. High-level overview of the user rewards claiming procedure of our straw-man approach. Advertisers can set how much they reward each ad click without disclosing that to competitors. The user can claim rewards without exposing which ads they interacted with. 

In addition to agreeing with the CM on the ad policies for the campaign, the advertiser also transfers the funds necessary to cover the campaign to an escrow account. At the end of the campaign, any unused funds (i.e. when users have not interacted with enough ads to use up all the escrowed funds) are released back to the advertiser.

For the sake of simplicity, throughout this section, we consider one advertiser who participates in our ad platform and runs multiple ad campaigns. In a real world scenario many advertisers can participate running many ad campaigns simultaneously. We also consider as agreed policies the amount of tokens an ad provides as reward to a clicking user. 

Phase 2: Claiming Ad Rewards

The user locally generates an interaction vector, which keeps track of the number of times each ad of the catalog was viewed/clicked (e.g. Ad1 was viewed 3 times, Ad2 was viewed 0 times, Ad3 was viewed 2 times).

In every payout period, the user encrypts the state of the interaction vector. More technically, let [[\adclicks]] (ac in Figure 4) be the interaction vector, where element [[i]] of [[\adclicks]] is the number of times the user viewed/clicked [[\ad_i]]. In every payout period, the user generates a new ephemeral key pair [[(\sk, \pk)]] to ensure the unlinkability of the payout requests. The user then encrypts each entry of [[\adclicks]] with the newly generated public key:

$$ \encryptedvector = \left[\enc(\pk, \nrinteractions_1), \ldots, \enc(\pk, \nrinteractions_{\nrads})\right] $$

where [[\nrinteractions_i]] is the number of interactions with ad [[i]], and [[\nrads]] is the total number of ads. The user then sends [[\encryptedvector]] to the Campaign Manager (step 2a in Figure 4).

Note that the CM cannot decrypt the received vector and thus cannot learn the user’s ad interactions (and consequently their interests). Instead, the CM leverages the additive homomorphic property of the underlying encryption scheme (as described in the Technical Background section) to calculate the sum of all payouts based on the interactions encoded in the encrypted vector [[\encryptedvector]] (step 2b in Figure 4).

More formally, the CM computes the aggregate payout for the user as follows:

$$ \aggrresult = \sum_{i=1}^{\nrads} \policyvector[i]\cdot\encryptedvector[i] $$

where [[\policyvector[i]]] is the ad policy associated with the ad in position [[i]] of the vector. The CM then signs the computed aggregate result:

$$ \signreward = \sign(\sk_{CM}, \aggrresult) $$

and sends the 2-tuple [[(\aggrresult, \signreward)]] back to the user.

Upon receiving this tuple (step 2c in Figure 4), the user verifies the signature of the result, [[\signverify(\pk_{CM}, \signreward, \aggrresult)]], and proceeds with decrypting the aggregate:

$$ \decryptedaggr = \dec(\sk, \aggrresult) $$

As a final step, the user proves the correctness of the decryption by creating a zero knowledge proof of correct decryption, [[\proofresult]] (i.e. proving that the decrypted value is, in fact, the plaintext of the encrypted aggregate).
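
To make Phase 2 concrete, the following is a minimal sketch of steps 2a to 2c using a toy Paillier implementation and the example numbers from above. Paillier is one of the example schemes named in the Technical Background section; the post does not say which scheme THEMIS actually uses. The fixed primes, the function names (`enc`, `dec`, `aggregate`), and the choice to express rewards in tenths of a BAT so that they are integers are all assumptions made for illustration. Signatures and the proof of correct decryption are left out.

```python
import math
import secrets

# Toy Paillier parameters (assumption): two well-known Mersenne primes.
# Far too small for real use; chosen only so that the example runs instantly.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                                           # public key
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                # private: LAM^-1 mod N

def enc(m):
    """Encrypt m under the public key N, using generator N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2

def dec(c):
    """Decrypt c with the private values LAM and MU."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def aggregate(policy, encrypted_clicks):
    """CM side: sum_i policy[i] * Enc(clicks[i]), computed on ciphertexts only.

    In Paillier, adding plaintexts means multiplying ciphertexts, and scaling
    a plaintext by k means raising the ciphertext to the k-th power.
    """
    acc = 1  # a trivial encryption of 0
    for p_i, c_i in zip(policy, encrypted_clicks):
        acc = acc * pow(c_i, p_i, N2) % N2
    return acc

policy = [4, 20, 12]    # Ad1: 0.4 BAT, Ad2: 2 BAT, Ad3: 1.2 BAT (tenths of a BAT)
ad_clicks = [3, 0, 2]   # never leaves the user's device in the clear

encrypted_vector = [enc(n_i) for n_i in ad_clicks]   # step 2a (user)
aggr_result = aggregate(policy, encrypted_vector)    # step 2b (CM)
decrypted_aggr = dec(aggr_result)                    # step 2c (user)
print(decrypted_aggr / 10, "BAT")  # 3*0.4 + 0*2 + 2*1.2 = 3.6 BAT
```

The CM only ever handles ciphertexts and its own policy vector, while the user only learns the total reward and not the per-ad policy.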

Phase 3: Payment Request

Finally, the user generates the payment request and sends the following 4-tuple to the CM (step 3a in Figure 4):

$$ (\decryptedaggr, \aggrresult, \signreward,\proofresult) $$

As a next step (step 3b in Figure 4), the CM verifies that the payment request is valid. More specifically, the CM rejects the user's payment request if

$$ \signverify(\pk_{CM}, \signreward, \aggrresult) = \bot $$

or

$$ \verify(\proofresult) = \bot $$

Otherwise, it proceeds with transferring the proper amount (equal to [[\decryptedaggr]]) of reward to the user. 

Reporting to Advertisers

THEMIS aims to provide advertisers with feedback about their ad campaigns. During the billing procedure, the advertisers need to be able to verify the integrity of the statistics reported by the Campaign Manager regarding the number of times an ad was viewed/clicked by the users.

To achieve this, whenever a new version of the ad catalog is online and retrieved by the users, a new key-pair, [[(\sk_{T}, \pk_{T})]], is generated. The public key [[\pk_{T}]] is used to encrypt a copy of the [[\adclicks]] vector that is sent to the CM (recall step 2a in Figure 4).

The key used in this step, [[\pk_{T}]], is a public threshold key generated in a distributed way. To generate such a key, a pool of multiple participating users, the consensus pool, is created (more details on how the consensus pool is created will be discussed in the next blog post). Users are incentivized to participate in this pool; how to orchestrate these incentives is outside the scope of this blog post. The consensus pool runs a distributed key generation algorithm, which results in a shared public key [[\pk_{T}]] and in each consensus pool participant owning a private key share [[\sk_{T_{i}}]]. The public key, [[\pk_{T}]], is sent to the CM, so that it can be shared with all users.

Hence, apart from [[\encryptedvector]], each user also sends [[\encryptedvector']] to the CM, where:

$$ \encryptedvector' = \left[\enc(\pk_{T}, \nrinteractions_{1}), \ldots, \enc(\pk_{T}, \nrinteractions_{\nrads})\right] $$

When the ad campaign is over, all the [[\encryptedvector']] generated by the users are processed to calculate how many rewards were paid per advertiser. By using the same additively homomorphic property used to calculate the payouts for the users, the CM can also calculate the payout per advertiser using all [[\encryptedvector']]. Thus, writing [[\encryptedvector'_{j}]] for the vector sent by user [[j]], the encrypted aggregate for the ad in position [[i]] can be calculated by the CM in the following way:

$$ \encadspayout_i = \sum_{j=1}^{\nrusers}\encryptedvector'_{j}[i] = \encryptedvector'_{1}[i] + \cdots + \encryptedvector'_{\nrusers}[i] $$

where [[\nrusers]] is the number of users. Each [[\encadspayout_{i}]] can be decrypted using the threshold key-pair, which requires a minimum number of pool participants to cooperate. The decrypted values are shared with the advertisers, which allows them to verify whether the funds used by the CM to pay the users are the correct ones, based on the users’ interactions with the ad campaign.
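
As a small worked example, assume (purely for illustration) a campaign with three ads and two users whose interaction vectors are [3, 0, 2] and [1, 2, 0]. The CM adds the encrypted vectors position by position, without being able to read any individual entry:

$$ \encadspayout_1 = \enc(\pk_{T}, 3) + \enc(\pk_{T}, 1) = \enc(\pk_{T}, 4) $$

$$ \encadspayout_2 = \enc(\pk_{T}, 0) + \enc(\pk_{T}, 2) = \enc(\pk_{T}, 2) $$

$$ \encadspayout_3 = \enc(\pk_{T}, 2) + \enc(\pk_{T}, 0) = \enc(\pk_{T}, 2) $$

After threshold decryption, the advertiser learns the per-ad totals 4, 2, and 2, but not which user contributed which interaction. With the example policy from Phase 1 (Ad1: 0.4 BAT, Ad2: 2 BAT, Ad3: 1.2 BAT), the advertiser can check that the campaign should have paid out

$$ 4 \cdot 0.4 + 2 \cdot 2 + 2 \cdot 1.2 = 8 \text{ BAT} $$

in total.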

Summary

In this first blog post, we presented the motivation and goals for THEMIS, a novel privacy-by-design ad platform designed and implemented by Brave’s Research team. Similarly to Brave Ads, THEMIS provides strong anonymity to users. In addition, it is decentralized and requires zero trust from users and advertisers. The THEMIS core protocol (i) provides auditability to all participants, (ii) rewards users for interacting with ads, and (iii) allows advertisers to verify the performance and billing reports of their ad campaigns.

In addition to introducing and motivating THEMIS, we outlined a simplified straw-man design of the core protocol, which guarantees that:

  • The user receives rewards they earned by interacting with ads. The same property holds as with Brave Ads: THEMIS does not disclose which ads users have interacted with to Brave or advertisers. 
  • The campaign manager is able to correctly apply the pricing policy of each ad without disclosing any information to users or potential competitors of the advertiser.

However, the straw-man approach does not cover all the properties we would like to achieve for THEMIS, particularly in terms of trust. In the straw-man approach, the campaign manager is responsible for orchestrating the protocol: it handles the user request for payouts and calculates the rewards. In addition, the CM stores the ad policies privately and both users and the advertisers need to trust that the policies are respected when the payouts are calculated. Finally, the straw-man system does not address the privacy-preserving payment mechanism for rewards.

In the upcoming blog post, we improve the simplified straw-man approach and present the end-to-end THEMIS protocol; we will also present  a scalability evaluation, which shows how THEMIS operates at scale.

References 

[1] BAT whitepaper
[2] Brave Rewards Stats & Token Activity
[3] N. Kshetri, “The Economics of Click Fraud,” in IEEE Security & Privacy, vol. 8, no. 3, pp. 45-53, May-June 2010.
[3.1] The Dark Alleys of Madison Avenue: Understanding Malicious Advertisements 
[4] Kumari, Shilpa, et al. “Demystifying ad fraud.” 2017 IEEE Frontiers in Education Conference (FIE). IEEE, 2017.
[5] Bashir, Arshad, et al. “Tracing Information Flows Between Ad Exchanges Using Retargeted Ads.” 25th USENIX Security Symposium (USENIX Security 16).
[6] Papadopoulos, Kourtellis and Markatos. “Cookie Synchronization: Everything You Always Wanted to Know But Were Afraid to Ask.”
[7] Speicher, T., Ali, M., Venkatadri, et al. (2018). “Potential for Discrimination in Online Targeted Advertising.” Proceedings of the 1st Conference on Fairness, Accountability and Transparency.
[8] Venkatadri, Athanasios, et al. (2018). “Privacy Risks with Facebook’s PII-Based Targeting: Auditing a Data Broker’s Advertising Interface.”
[9] El Gamal Encryption
[10] Paillier Cryptosystem
[11] Privad: practical privacy in online advertising
[12] Adnostic: Privacy Preserving Targeted Advertising
[13] Global Ad-Blocking Behaviors In 2019 – Stats & Consumer Trends (infographic)
[14] BAT roadmap
[15] BAT Apollo AMA with Marshall Rose
