There is little question these days that tokenization is an effective way to secure sensitive data and potentially lower PCI compliance costs. What people are still debating is HOW you go about implementing a tokenization solution and what considerations must be made in doing so. What is the best token generation method? Should you build an in-house solution or work with a third-party vendor? Are the tokenization process and storage facility secure? Do tokens expire? Is it possible for collisions to occur between tokens, or between tokens and real PAN information?
As is frequently the case, the answer to all these questions is… it depends. It is safe to say that there is no “one size fits all” solution. To find the solution that works best, companies must review their environment and the size and type of their business, and match specific capabilities against their existing business processes.
Today we hope to “demystify” some common tokenization components and dispel the myths surrounding various implementation approaches. Let’s start by defining “tokenization.”
Simply put, tokenization is the process by which we replace a valuable piece of information with a meaningless number or token. While all types of sensitive data can be tokenized, for the purpose of this discussion, the data in question is the PAN or Primary Account Number.
MYTH #1
Randomly generated tokens are more secure than tokens generated using a sequential method.
According to the PCI Security Standards Council’s Tokenization Guidelines, the most important requirement in generating a token is that you cannot reverse engineer the token to derive the original PAN. Let’s look at an example.
Original Card Number
3872 3789 1620 3675
Token
3898 2783 2990 3675
In this example, another 16-digit number was created in which the first 2 and last 4 digits of the PAN were retained, so the card type is still identifiable within the newly generated token. (In payment applications it is often advantageous to retain the 16-digit format of the original card number because other systems that use the card numbers need not be altered to accommodate the tokens. This is called “format-preserving” tokenization.) The resulting token cannot be used as a financial instrument and has no value other than as a reference to the original transaction or real account. The 10 digits in the middle can be generated using a random number generator or as part of a sequential counter. Either way, what is important is that there is no direct mathematical relationship between the credit card data and the token. As for those who claim the random method is more secure, the probability of someone “cracking the code” when it comes to the sequential method is akin to getting struck by lightning while inside the president’s secret underground bunker. I could go into more detail and explanation, but I’d need more space than this forum allows (and fear I’d lose some readers along the way).
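To make the example above a little more concrete, here is a minimal Python sketch of format-preserving token generation. It is an illustration under stated assumptions, not any vendor’s actual method: the function names (`random_middle`, `sequential_middle`, `make_token`) are hypothetical, and the in-memory counter stands in for whatever persistent sequence a real system would use. The point it shows is that the 10 middle digits never depend on the PAN, whichever method produces them.

```python
import secrets
from itertools import count

DIGITS = "0123456789"

def random_middle(n_digits: int = 10) -> str:
    """Return n_digits decimal digits drawn from a cryptographic RNG."""
    return "".join(secrets.choice(DIGITS) for _ in range(n_digits))

# Hypothetical in-memory counter; a real system would persist and protect it.
_counter = count(1)

def sequential_middle(n_digits: int = 10) -> str:
    """Return the next counter value, zero-padded to n_digits."""
    return str(next(_counter) % 10**n_digits).zfill(n_digits)

def make_token(pan: str, middle) -> str:
    """Keep the first 2 and last 4 digits of a 16-digit PAN and replace
    the 10 middle digits with a value that is unrelated to the PAN."""
    digits = pan.replace(" ", "")
    if len(digits) != 16 or not digits.isdigit():
        raise ValueError("expected a 16-digit PAN")
    # middle() never sees the PAN, so the token cannot be reversed into it.
    return digits[:2] + middle() + digits[-4:]
```

Calling `make_token("3872 3789 1620 3675", random_middle)` returns a 16-digit string that starts with 38 and ends with 3675. The only way back to the original PAN is a lookup in whatever store recorded the pairing.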
MYTH #2
“Vaultless” tokenization is faster and more scalable than a “vaulted” solution.
There have been some articles suggesting that as the token vault grows, performance is affected and token “collisions” may occur. A token collision refers to a scenario where the same token could be generated for two different PANs. Another concern is a generated token that turns out to be the actual PAN of another cardholder.
From the perspective of a traditional database model, it’s not unreasonable to assume that as a token database grows, token generation or retrieval requests could lead to latency issues. However, vendors with vast experience and expertise in “vaulted” tokenization methodology have designed systems to account for growth over time. Their network architecture is well thought out, thoroughly tested, and secure. Their transactions flow in ways that allow for multiple processes to occur in tandem so transactions can be routed immediately for processor approval or funding.
The devil is really in the details, and rather than lead you down another rabbit hole discussion, suffice it to say that you shouldn’t believe every blanket claim you read. Ask prospective tokenization providers about their specific methodology and how they prevent latency and collisions.
Another important question to ask, particularly in a “vaultless” tokenization scenario, is how you will retrieve a PAN if you need it. Problems and issues sometimes occur, and it’s important that vendors are able to quickly and securely access information and offer support in resolving any problems. The system requesting the PAN should be a validated system authorized to perform the request. The use of multi-factor or certificate-based authentication can address this need. In addition, there should be a system of monitoring and alerts to ensure the request is from a valid source and to bring awareness to any abnormal activity.
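As a reference point for that conversation, here is a minimal Python sketch of how a vaulted design might rule out both kinds of collision described above. Everything in it is an assumption for illustration: the `TokenVault` class, its in-memory dictionaries, and the retry limit are hypothetical. Rejecting candidates that pass the Luhn checksum is one known way to keep a token from being mistaken for a real card number, but it is not something this article attributes to any particular provider.

```python
import secrets

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

class TokenVault:
    """Hypothetical in-memory vault; a real one would encrypt stored PANs."""

    def __init__(self):
        self._pan_by_token = {}
        self._token_by_pan = {}

    def tokenize(self, pan: str, max_attempts: int = 100) -> str:
        if pan in self._token_by_pan:           # same PAN, same token
            return self._token_by_pan[pan]
        for _ in range(max_attempts):
            middle = "".join(secrets.choice("0123456789") for _ in range(10))
            token = pan[:2] + middle + pan[-4:]
            if token in self._pan_by_token:     # token already issued
                continue
            if token in self._token_by_pan:     # equals a PAN we hold
                continue
            if luhn_valid(token):               # could pass as a real PAN
                continue
            self._pan_by_token[token] = pan
            self._token_by_pan[pan] = token
            return token
        raise RuntimeError("could not find a free token")

    def detokenize(self, token: str) -> str:
        return self._pan_by_token[token]
```

The uniqueness checks here are dictionary lookups, so their cost does not grow with the size of the vault in this sketch. Whether a given product behaves the same way at scale is exactly the kind of detail worth asking a provider about.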
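The retrieval controls described above can be sketched roughly as follows. This is a simplified illustration, not a production design: the allowlist, the alert threshold, and the `DetokenizationGateway` class are all hypothetical, and the sketch assumes the caller’s identity (`system_id`) has already been established upstream by multi-factor or certificate-based authentication.

```python
import logging
from collections import Counter

logger = logging.getLogger("detokenize_audit")

# Hypothetical values chosen for illustration only.
AUTHORIZED_SYSTEMS = {"billing-service"}
ALERT_THRESHOLD = 3   # denied requests from one source before alerting

class DetokenizationGateway:
    """Checks that a requesting system is authorized before returning a
    PAN, logs every request, and records an alert on repeated denials."""

    def __init__(self, vault: dict):
        self._vault = vault          # token -> PAN mapping (assumed)
        self._denied = Counter()
        self.alerts = []

    def detokenize(self, system_id: str, token: str) -> str:
        if system_id not in AUTHORIZED_SYSTEMS:
            self._denied[system_id] += 1
            logger.warning("denied detokenize request from %s", system_id)
            if self._denied[system_id] >= ALERT_THRESHOLD:
                self.alerts.append(
                    f"repeated denied requests from {system_id}")
            raise PermissionError(f"{system_id} may not retrieve PANs")
        # Log only the last 4 digits so the audit trail holds no full PAN.
        logger.info("detokenize by %s for token ending %s",
                    system_id, token[-4:])
        return self._vault[token]
```

In this sketch an unauthorized caller never reaches the vault lookup, and a third denied request from the same source adds an entry to `alerts` that a monitoring process could act on.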
MYTH #3
Homegrown or premises-based tokenization is better than using cloud-based or third-party vendor hosted tokenization.
As stated earlier, there is no “one size fits all” solution that works best in all circumstances. There are many factors to consider when selecting a tokenization solution that fits your business needs and your security and PCI goals. Homegrown and premises-based solutions offer you total control over tokenization implementation but require a great deal of expertise not typically found in the average IT department. “Vaultless” tokenization is effective for large data-to-token conversions and higher-volume merchants, but additional questions should be considered for the handling of recurring payments, credits, refunds and other business practices that require the recall of a specific transaction or card number. Token requests and retrieval of the original payment data can put the segments of the merchant network infrastructure involved back into the card data environment (CDE).
For merchants looking to reduce their PCI scope as much as possible, cloud-based or hosted tokenization is an attractive option. With a cloud-based solution, stored PAN data is completely removed from the local IT environment. The card data is stored in a secure off-site “vault”, safe from hackers attempting to gain access to sensitive information. Hosted tokenization allows merchants to run their business with less worry about data theft, with the added benefit of reduced PCI scope and costs.
Yes, there is much to consider when selecting a tokenization strategy, but the process shouldn’t require the average merchant to spend valuable time researching every component. By partnering with a reputable and well-established solution provider, understanding the basic concepts of tokenization and asking good questions, you can find a tokenization solution that fits both your security goals and your budget. We invite you to share your experiences, questions and comments below.