Privacy glossary

Tokenization

What is tokenization?

Tokenization of data is the process of replacing a sensitive data element with a non-sensitive data equivalent, known as a token, that has no meaning or value outside of the specific application it was created for. Tokenization protects sensitive data from unwanted exposure both during transit and while stored, and is thus a well-regarded security measure, meeting compliance standards set by several governing bodies tasked with ensuring data security. Data tokenization has become widely popular, particularly for financial transactions.

Tokenization of data is similar to the pre-digital era practice of using subway tokens, or game play tokens in a video arcade, to replace actual money. In both cases, the tokens only had value within their particular system, and couldn’t be used anywhere else—a subway token could only be used to purchase a ride on that subway, and a video arcade token could only be used to play a video game in that arcade.

Similarly, data tokens can only be used within the context in which they are created—they have no value outside that application. A data token created to complete your monthly transaction with Netflix has no value outside of Netflix. However, whereas a subway token only purchases one ride, a data token can be single-use or used over and over again (such as to pay for a renewing subscription).

Where is tokenization used?

Tokenization is a broad term that covers many applications. While this article focuses on data tokenization as a method of securing sensitive data, note that you may also encounter the term in other contexts (for example, in blockchain, tokenization refers to representing ownership of physical or digital assets with tokens such as NFTs).

By far, the most common use of data tokenization is in payment processing, especially credit card payments. Exchanging tokenized payment data over the Internet reduces the risks of exchanging real credit card information. The token created for a payment is unique to the payer and payee, and allows the payment processor to identify who’s making a payment without exchanging any sensitive information like real name or credit card number.

In a similar way, tokenization is also the basis for mobile wallets like Apple Pay or Google Pay (formerly Android Pay). With mobile wallets, the phone only stores a token, not an actual credit card number. When the phone is held near the pay terminal, for instance, only the token is transmitted to complete payment. In addition to improving security during a transaction, storing only a token provides better security if the phone is lost or stolen.

While payment processing is where you most frequently see tokenized data, tokenization can protect a variety of data in different settings. For example, health care systems can convert an individual’s sensitive data (like name, date of birth, and medical ID number) into a token. This token can then be used to access the individual’s records. This provides both the individual and the caregivers verifiable and secure access to health data without unnecessarily exposing the person’s sensitive data. This sort of data tokenization to protect sensitive information is often employed to comply with standards set by laws like GDPR and HIPAA.

How does tokenization work?

In a typical situation that uses data tokenization, there are three parties involved:

  • The individual whose data needs protection.
  • The middle party that the individual interacts with (such as a doctor’s office or a merchant).
  • A third party that stores the data or processes the transaction (such as a healthcare system, credit card company, or an entity specializing in handling tokens). It’s this third party that stores all real data on a highly secure database, sometimes called a vault.

The token is created by the third party—the one tasked with storing the real data—in a few different ways. Most of the creation procedures—using a random number generator, hashing the original data, or assigning the next token on a preset list or index—result in a token that can’t be “decoded”. Occasionally a token is created using encryption, resulting in a token that can conceivably be decrypted. A token may also be partially masked. For example, using a random number to replace the first 12 digits of a credit card number leaves the last four digits unchanged. You see this in action when you view a list of credit cards to choose from to pay, but only the last four digits are actually displayed.
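
The creation methods described above can be illustrated with a minimal Python sketch. It covers only two of them: a fully random token, and a partially masked token that keeps the last four digits of a card number. The function names random_token and masked_card_token are hypothetical, chosen for illustration, and the card number is a commonly used test value rather than a real card.

```python
import secrets


def random_token(num_bytes: int = 16) -> str:
    """Return a random token with no mathematical link to the original data."""
    return secrets.token_hex(num_bytes)


def masked_card_token(card_number: str) -> str:
    """Replace every digit except the last four with random digits."""
    digits = card_number.replace(" ", "").replace("-", "")
    if not digits.isdigit() or len(digits) < 5:
        raise ValueError("expected a card number with at least five digits")
    random_part = "".join(
        secrets.choice("0123456789") for _ in range(len(digits) - 4)
    )
    return random_part + digits[-4:]


card = "4111 1111 1111 1111"  # test value for illustration, not a real card
token = random_token()
masked = masked_card_token(card)
print(token)   # 32 random hexadecimal characters
print(masked)  # 12 random digits followed by the original last four
```

Because both tokens come from a random source, neither can be "decoded" on its own. Recovering the original number requires a lookup in the vault that issued the token.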

The transaction is handled by the third party on behalf of the middle party; the middle party (merchant or health care provider) stores the resulting token in their data, but no actual sensitive data. When they want to do anything related to the data (such as view a patient’s records, refund a charge for returned items, or process a monthly charge on a subscription), they contact the third party and present the token. The third party looks up the real data in their token vault, and supplies the requested transaction data or processes the requested transaction, without sending back sensitive data like a credit card number.
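
As a rough sketch of this exchange, the following Python example models the third party as an in-memory vault. The TokenVault class, its method names, and the merchant_record structure are all assumptions made for illustration; a real vault would be a hardened, access-controlled service, and the charge step would involve a card network.

```python
import secrets


class TokenVault:
    """Hypothetical third-party vault that maps tokens to real data."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}  # token -> real card number

    def tokenize(self, card_number: str) -> str:
        """Store the real card number and hand back only a random token."""
        token = secrets.token_urlsafe(16)
        self._records[token] = card_number
        return token

    def charge(self, token: str, amount_cents: int) -> dict:
        """Look up the real data and return a result without the card number."""
        card_number = self._records.get(token)
        if card_number is None:
            return {"approved": False, "reason": "unknown token"}
        # A real processor would contact the card network at this point.
        return {
            "approved": True,
            "amount_cents": amount_cents,
            "last4": card_number[-4:],
        }


vault = TokenVault()

# The middle party (merchant) keeps only the token in its own records.
merchant_record = {
    "customer": "alice",
    "token": vault.tokenize("4111111111111111"),
}

# Later, the merchant presents the token to process a charge.
receipt = vault.charge(merchant_record["token"], 1299)
print(receipt["approved"], receipt["last4"])
```

The merchant's record never contains the card number, and the response it gets back carries only the last four digits.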

Example of tokenization in action

Let’s look at an example of data tokenization in which an online merchant stores a customer’s payment information for future purchases (this scenario usually occurs at the end of a transaction). Imagine you’ve placed an order and provided your method of payment. The merchant will then tell you that your payment has been confirmed, and (usually) offer to store your payment method for future use. Having payment information on file can streamline future transactions, but you want to ensure your payment information isn’t at risk of being exposed (such as through a data breach).

In this case, after the merchant submits the payment request to the payment processor via a secured “payment gateway” (an intermediary hired by the merchant to process transactions and deal with the final payment processor), the payment gateway will store the real transaction data, including the actual credit card number, in its secure token vault, along with a token it generates. Only this token is returned to the merchant. The payment gateway also sends the token to the payment processor, which will eventually use the token to complete the transaction.

The merchant then stores the token as a record of the transaction, but not any actual payment method data. The merchant can then offer to store the customer’s credit card for future use, even though what they’re really storing is the token. When the customer makes a future purchase, the merchant will forward the purchase information and the previously stored token through the payment gateway. There will be no exchange of actual credit card data.

Why is tokenization so secure?

Most tokens can’t be decoded or decrypted by anyone who intercepts them, so if a token is captured during transmission or exposed in a data breach, it can’t be unscrambled to reveal the original data. Tokens are also unique to the pairing of an individual and the entity using that individual’s data: a single credit card may be represented by multiple tokens, with a different token for each merchant. If a merchant’s data is breached, only that merchant’s token is exposed, and it can’t be used at other merchants.
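
To make the per-merchant idea concrete, here is a small Python sketch in which one card receives a separate token for each merchant. The ScopedVault class, its methods, and the merchant IDs are hypothetical; the sketch assumes the vault records which merchant each token was issued to and rejects a token presented by anyone else.

```python
import secrets


class ScopedVault:
    """Hypothetical vault that ties each token to a single merchant."""

    def __init__(self) -> None:
        self._by_token = {}  # token -> (merchant_id, card_number)

    def tokenize(self, merchant_id: str, card_number: str) -> str:
        """Issue a new random token for this merchant and card."""
        token = secrets.token_hex(16)
        self._by_token[token] = (merchant_id, card_number)
        return token

    def is_valid_for(self, token: str, merchant_id: str) -> bool:
        """Accept a token only from the merchant it was issued to."""
        record = self._by_token.get(token)
        return record is not None and record[0] == merchant_id

    def revoke(self, token: str) -> None:
        """Invalidate one token, for example after a merchant breach."""
        self._by_token.pop(token, None)


vault = ScopedVault()
card = "4111111111111111"  # test value for illustration, not a real card

token_a = vault.tokenize("merchant-a", card)
token_b = vault.tokenize("merchant-b", card)

# A token stolen from merchant A is rejected when presented by merchant B.
stolen_works_elsewhere = vault.is_valid_for(token_a, "merchant-b")
print(stolen_works_elsewhere)  # False

# Revoking the exposed token leaves the other merchant's token untouched.
vault.revoke(token_a)
print(vault.is_valid_for(token_b, "merchant-b"))  # True
```

Under these assumptions, a breach at one merchant can be contained by revoking a single token, without reissuing the underlying card.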

Data tokenization is considered secure enough to satisfy security requirements stipulated by HIPAA, GDPR, and PCI DSS (Payment Card Industry Data Security Standard). The PCI standards only allow merchants to store actual credit card information in their own data if they maintain strong (and expensive) encryption infrastructure. Small and mid-sized companies in particular benefit from using payment gateways and their highly secure token vaults, as doing so allows them to set up repeat payments and reusable payment forms without needing a complex in-house security system.

Limitations of tokenization

Data tokenization is ideal for some applications, as we’ve seen for credit card payments and account access details. However, data tokenization is somewhat limited because it works best for data that adheres to a common, predictable format (like a 16-digit credit card number or a Social Security number).

How tokenization benefits individuals

This method of securing sensitive information increases the security and privacy of an individual’s personal data, from financial data to health data and more. It also makes everyday tasks easier and less worrisome. For example, using your mobile phone and Apple Pay at a gas station is easier than swiping or tapping a credit card, and can be more secure.
