Special Mask Formats

Special mask formats are masks that you can apply to common types of data. With a special mask format, the Data Masking transformation returns a masked value that has a realistic format, but is not a valid value.
For example, when you mask an SSN, the Data Masking transformation returns an SSN that is the correct format but is not valid. You can configure repeatable masking for Social Security numbers.
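Configure special masks for the following types of data:
Credit card numbers
Email addresses
IP addresses
Phone numbers
Social Security numbers
URL addresses
Social Insurance numbers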
When the source data format or datatype is invalid for a mask, the Data Integration Service applies a default mask to the data. The Data Integration Service applies masked values from the default value file. You can edit the default value file to change the default values.

Credit Card Number Masking

The Data Masking transformation generates a logically valid credit card number when it masks a valid credit card number. The source credit card number must be between 13 and 19 digits long. The input credit card number must have a valid checksum based on credit card industry rules.
The source credit card number can contain numbers, spaces, and hyphens. If the credit card number contains incorrect characters or is the wrong length, the Data Integration Service writes an error to the session log. The Data Integration Service applies a default credit card number mask when the source data is invalid.
The Data Masking transformation does not mask the six-digit Bank Identification Number (BIN). For example, the Data Masking transformation might mask the credit card number 4539 1596 8210 2773 as 4539 1516 0556 7067. The Data Masking transformation creates a masked number that has a valid checksum.
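The rules above can be illustrated with a short Python sketch. It is not the transformation's implementation: it assumes that the "valid checksum" is the standard Luhn check, the function names `luhn_ok` and `mask_card` are hypothetical, and it raises an error for invalid input instead of applying a default mask.

```python
import random

def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def mask_card(number: str, rng: random.Random) -> str:
    """Mask a card number, keeping the six-digit BIN and the layout."""
    if any(c not in "0123456789 -" for c in number):
        raise ValueError("only digits, spaces, and hyphens are allowed")
    digits = [c for c in number if c in "0123456789"]
    if not 13 <= len(digits) <= 19:
        raise ValueError("card number must have 13 to 19 digits")
    if not luhn_ok("".join(digits)):
        raise ValueError("source card number has an invalid checksum")
    # Keep the BIN, randomize everything except the final check digit.
    body = digits[:6] + [str(rng.randrange(10)) for _ in digits[6:-1]]
    # Exactly one final digit makes the Luhn sum a multiple of 10.
    check = next(c for c in "0123456789" if luhn_ok("".join(body) + c))
    masked = iter(body + [check])
    # Put the masked digits back into the original spacing.
    return "".join(next(masked) if c in "0123456789" else c for c in number)

print(mask_card("4539 1596 8210 2773", random.Random()))
```

The output keeps the 4539 15 prefix and the grouping of the source value, and the remaining digits change on each run unless the random generator is seeded.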

Email Address Masking

Use the Data Masking transformation to mask an email address that contains a string value. The Data Masking transformation can mask an email address with random ASCII characters or replace the email address with a realistic email address.
You can apply the following types of masking to an email address:
Standard email masking
The Data Masking transformation returns random ASCII characters when it masks an email address. For example, the Data Masking transformation can mask Georgesmith@yahoo.com as KtrIupQAPyk@vdSKh.BIC. Default is standard.
Advanced email masking
The Data Masking transformation masks the email address with another realistic email address derived from the transformation output ports or dictionary columns.

Advanced Email Masking

With the advanced email masking type, you can mask the email address with another realistic email address. The Data Masking transformation creates the email address from the dictionary columns or from the transformation output ports.
You can create the local part of the email address from mapping output ports. Or, you can create the local part from columns in a relational table or a flat file.
The Data Masking transformation can create the domain name for the email address from a constant value or from a random value in the domain dictionary.
You can configure advanced email masking based on the following options:
Email Address Based on Dependent Ports
You can create an email address based on the Data Masking transformation output ports. Select the transformation output ports for the first name and the last name column. The Data Masking transformation masks the first name, the last name, or both names based on the values you specify for the first and last name length.
Email Address Based on a Dictionary
You can create an email address based on the columns from a dictionary. Select a reference table as the source for the dictionary.
Select the dictionary columns for the first name and the last name. The Data Masking transformation masks the first name, the last name, or both names based on the values you specify for the first and last name length.

Configuration Parameters for an Advanced Email Address Masking Type

Specify configuration parameters when you configure advanced email address masking.
You can specify the following configuration parameters:
Delimiter
You can select a delimiter, such as a dot, hyphen, or underscore, to separate the first name and last name in the email address. If you do not want to separate the first name and last name in the email address, leave the delimiter blank.
FirstName Column
Select a Data Masking transformation output port or a dictionary column to mask the first name in the email address.
LastName Column
Select a Data Masking transformation output port or a dictionary column to mask the last name in the email address.
Length for the FirstName or LastName columns
Restricts the character length to mask for the first name and the last name columns. For example, the input data is Timothy for the first name and Smith for the last name. Select 5 as the length of the first name column. Select 1 as the length of the last name column with a dot as the delimiter. The Data Masking transformation generates the following email address:
timot.s@<domain_name>
DomainName
You can use a constant value, such as gmail.com, for the domain name. Or, you can specify another dictionary file that contains a list of domain names. The domain dictionary can be a flat file or a relational table.
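As a rough illustration of how these parameters combine, the following Python sketch assembles an address from a first name, a last name, the two length limits, a delimiter, and a domain. The function names and the `example.com` domain are hypothetical. Lowercasing the local part is an assumption based on the timot.s example above.

```python
import random

def pick_domain(domains, rng):
    """Pick a random entry from a domain dictionary (here, a plain list)."""
    return rng.choice(domains)

def build_masked_email(first, last, first_len, last_len, delimiter, domain):
    """Build an address from truncated (already masked) name values."""
    first_part = first[:first_len].lower()
    last_part = last[:last_len].lower()
    # Skip an empty part so that a length of 0 leaves no dangling delimiter.
    parts = [p for p in (first_part, last_part) if p]
    return f"{delimiter.join(parts)}@{domain}"

# Constant domain value.
print(build_masked_email("Timothy", "Smith", 5, 1, ".", "example.com"))

# Random value from a domain dictionary.
domain = pick_domain(["example.com", "example.org"], random.Random())
print(build_masked_email("Timothy", "Smith", 5, 1, "_", domain))
```

With a blank delimiter, the two name parts are simply concatenated.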

IP Address Masking

The Data Masking transformation masks an IP address as another IP address by splitting it into four numbers, separated by periods. The first number is the network. The Data Masking transformation masks the network number within the network range.
The Data Masking transformation masks a Class A IP address as a Class A IP address and a 10.x.x.x address as a 10.x.x.x address. The Data Masking transformation does not mask the class and private network address. For example, the Data Masking transformation can mask 11.12.23.34 as 75.32.42.52 and 10.23.24.32 as 10.61.74.84.
Note: When you mask many IP addresses, the Data Masking transformation can return nonunique values because it does not mask the class or private network of the IP addresses.
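The class-preserving behavior can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's algorithm: it uses the conventional classful first-octet ranges (A: 1-126, B: 128-191, C: 192-223), treats only 10.x.x.x as a private network because that is the only one this section names, and leaves any other first octet unchanged. The name `mask_ip` is hypothetical.

```python
import random

# Conventional first-octet ranges for Class A, B, and C networks.
CLASS_RANGES = [(1, 126), (128, 191), (192, 223)]

def mask_ip(ip: str, rng: random.Random) -> str:
    """Mask an IPv4 address while keeping its class and the 10.x.x.x network."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("expected four numbers between 0 and 255")
    first = octets[0]
    new_first = first                      # default: leave the network as-is
    if first != 10:
        for low, high in CLASS_RANGES:
            if low <= first <= high:
                # Stay inside the class, but never move into 10.x.x.x.
                choices = [n for n in range(low, high + 1) if n != 10]
                new_first = rng.choice(choices)
                break
    rest = [rng.randint(0, 255) for _ in range(3)]
    return ".".join(str(o) for o in [new_first] + rest)

rng = random.Random()
print(mask_ip("11.12.23.34", rng))   # stays Class A
print(mask_ip("10.23.24.32", rng))   # stays 10.x.x.x
```

Because the first number is restricted to the original range, two different source addresses can produce the same masked address, which matches the note about nonunique values.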

Phone Number Masking

The Data Masking transformation masks a phone number without changing the format of the original phone number. For example, the Data Masking transformation can mask the phone number (408)382 0658 as (607)256 3106.
The source data can contain numbers, spaces, hyphens, and parentheses. The Data Integration Service does not mask alphabetic or special characters.
The Data Masking transformation can mask string, integer, and bigint data.
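A minimal Python sketch of format-preserving phone masking follows. It assumes that every digit is replaced independently with a random digit and that all other characters are copied through unchanged. The name `mask_phone` is hypothetical.

```python
import random

DIGITS = "0123456789"

def mask_phone(phone: str, rng: random.Random) -> str:
    """Replace each digit with a random digit; copy every other character."""
    return "".join(rng.choice(DIGITS) if c in DIGITS else c for c in phone)

print(mask_phone("(408)382 0658", random.Random()))
```

The parentheses and the space stay in the same positions as in the source value, so the masked number has the same layout.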

Social Security Number Masking

The Data Masking transformation generates a Social Security number that is not valid based on the latest High Group List from the Social Security Administration. The High Group List contains valid numbers that the Social Security Administration has issued.
The default High Group List is a text file in the following location:
<Installation Directory>\infa_shared\SrcFiles\highgroup.txt
To use the High Group List file in workflows, copy the text file to the source directory that you configure for the Data Integration Service.
The Data Masking transformation generates Social Security numbers that are not on the High Group List. The Social Security Administration updates the High Group List every month. Download the latest version of the list from the following location:
http://www.socialsecurity.gov/employer/ssns/highgroup.txt

Social Security Number Format

The Data Masking transformation accepts any SSN format that contains nine digits. The digits can be delimited by any set of characters. For example, the Data Masking transformation accepts the following format: +=54-*9944$#789-,*()

Area Code Requirement

The Data Masking transformation returns a Social Security number that is not valid and that has the same format as the source. The first three digits of the SSN define the area code. The Data Masking transformation does not mask the area code. It masks the group number and serial number. The source SSN must contain a valid area code. The Data Masking transformation locates the area code on the High Group List and determines a range of unused numbers that it can apply as masked data. If the SSN is not valid, the Data Masking transformation does not mask the source data.
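The format and area code rules can be sketched in Python as shown below. This sketch makes simplifying assumptions: it does not read the High Group List, so it neither validates the area code nor restricts the masked group and serial numbers to unused ranges, and it treats any value without exactly nine digits as not valid. The name `mask_ssn` is hypothetical.

```python
import random

DIGITS = "0123456789"

def mask_ssn(ssn: str, rng: random.Random) -> str:
    """Keep the area code and the delimiters; randomize the other six digits."""
    digits = [c for c in ssn if c in DIGITS]
    if len(digits) != 9:
        return ssn                      # not valid: return the source unmasked
    # First three digits are the area code and stay as they are.
    masked = digits[:3] + [rng.choice(DIGITS) for _ in range(6)]
    replacement = iter(masked)
    # Re-insert the masked digits between the original delimiters.
    return "".join(next(replacement) if c in DIGITS else c for c in ssn)

print(mask_ssn("+=54-*9944$#789-,*()", random.Random()))
```

In the sample value, the area code 549 is spread across two digit groups, and the sketch still preserves it because it works on the digit sequence rather than on the delimited groups.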

Repeatable Social Security Number Masking

With repeatable masking, the Data Masking transformation returns deterministic Social Security numbers. The Data Masking transformation cannot return a unique masked value for every Social Security number because it cannot return valid Social Security numbers that the Social Security Administration has issued.

URL Address Masking

The Data Masking transformation parses a URL by searching for the ‘://’ string and parsing the substring to the right of it. The source URL must contain the ‘://’ string. The source URL can contain numbers and alphabetic characters.
The Data Masking transformation does not mask the protocol of the URL. For example, if the URL is http://www.yahoo.com, the Data Masking transformation can return http://MgL.aHjCa.VsD/. The Data Masking transformation can generate a URL that is not valid.
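The parsing rule can be sketched in Python as follows. The sketch assumes that each letter or digit to the right of the '://' string is replaced with a random ASCII letter and that punctuation such as periods and slashes is kept. The name `mask_url` is hypothetical.

```python
import random
import string

def mask_url(url: str, rng: random.Random) -> str:
    """Keep the protocol and '://'; randomize letters and digits after it."""
    protocol, separator, rest = url.partition("://")
    if not separator:
        raise ValueError("source URL must contain '://'")
    masked = "".join(
        rng.choice(string.ascii_letters) if c.isalnum() else c for c in rest
    )
    return protocol + separator + masked

print(mask_url("http://www.yahoo.com", random.Random()))
```

As in the example above, the result keeps the http protocol and the dotted layout, and it is unlikely to resolve to a real host.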

Social Insurance Number Masking

The Data Masking transformation masks a Social Insurance number that is nine digits. The digits can be delimited by any set of characters.
If the number contains no delimiters, the masked number contains no delimiters. Otherwise the masked number has the following format:
xxx-xxx-xxx

Repeatable SIN Numbers

You can configure the Data Masking transformation to return repeatable SIN values. When you configure a port for repeatable SIN masking, the Data Masking transformation returns deterministic masked data each time the source SIN value and seed value are the same.
To return repeatable SINs, enable Repeatable Values and enter a seed number. The Data Masking transformation returns unique values for each SIN.

SIN Start Digit

You can define the first digit of the masked SIN.
Enable Start Digit and enter the digit. The Data Masking transformation creates masked SINs that start with the digit that you enter.
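The SIN output format, repeatable values, and start digit can be combined in one Python sketch. It is an illustration only: the name `mask_sin` and its parameters are hypothetical, repeatability is approximated by seeding a random generator with the seed and the source digits, and the sketch does not guarantee a unique masked value for each SIN.

```python
import random

DIGITS = "0123456789"

def mask_sin(sin, seed=None, start_digit=None):
    """Mask a nine-digit SIN, optionally repeatable and with a fixed first digit."""
    digits = "".join(c for c in sin if c in DIGITS)
    if len(digits) != 9:
        raise ValueError("a SIN must contain nine digits")
    if seed is None:
        rng = random.Random()
    else:
        # Same seed and same source digits give the same random sequence.
        rng = random.Random(f"{seed}:{digits}")
    masked = "".join(rng.choice(DIGITS) for _ in range(9))
    if start_digit is not None:
        if not 0 <= start_digit <= 9:
            raise ValueError("start digit must be a single digit")
        masked = str(start_digit) + masked[1:]
    if len(sin) == 9:
        return masked                   # no delimiters in, no delimiters out
    return f"{masked[:3]}-{masked[3:6]}-{masked[6:]}"

print(mask_sin("123456789", seed=42))
print(mask_sin("123 456 789", seed=42, start_digit=6))
```

Calling the function twice with the same source value and seed returns the same masked value, and any delimited input is returned in the xxx-xxx-xxx format.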