
Random Masking

Random masking produces random, non-repeatable results for the same source data and masking rules, so the results are non-deterministic. Random masking does not require a seed value. Use random masking to mask date, numeric, or string datatypes.
The following table describes the options that you can configure for random masking:
Option | Description
Range | A range of output values. The PowerCenter Integration Service returns data between the minimum and maximum values. You can configure a range for date, numeric, and string datatypes.
Blurring | A range of output values with a fixed or percent variance from the source data. Returns data that is close to the value of the source data. You can configure blurring for date and numeric datatypes.
Mask Format | The type of character to substitute for each character in the input data. You can limit each character to an alphabetic, numeric, or alphanumeric character type. You can configure a mask format for the string datatype.
Source String Characters | The characters in the source string that you want to mask. You can configure source string characters for the string datatype.
Result String Replacement Characters | The characters that replace characters in the target string. You can configure replacement characters for the string datatype.

Range Masking

Configure a range to constrain the output values for numeric, date, or string data.
When you define a range for numeric or date values, the PowerCenter Integration Service masks the source data with a value between the minimum and maximum values. When you configure a range for a string, you configure a range of string lengths.
Note: When you configure date random masking, the maximum datetime must be later than the minimum datetime.
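The range behavior described above can be sketched in Python. This is only an illustration, not the PowerCenter Integration Service implementation: the function names, the alphanumeric character pool, and the one-second date granularity are all assumptions made for the example.

```python
import random
import string
from datetime import datetime, timedelta


def mask_numeric_range(minimum, maximum, rng=random):
    """Return a random integer between minimum and maximum, inclusive."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return rng.randint(minimum, maximum)


def mask_date_range(minimum, maximum, rng=random):
    """Return a random datetime between minimum and maximum."""
    # Mirrors the note above: the maximum must be later than the minimum.
    if maximum <= minimum:
        raise ValueError("maximum datetime must be later than minimum datetime")
    span_seconds = int((maximum - minimum).total_seconds())
    return minimum + timedelta(seconds=rng.randint(0, span_seconds))


def mask_string_range(min_length, max_length, rng=random):
    """Return a random string whose length lies in the configured range."""
    # Assumption: letters and digits are used as the output character pool.
    pool = string.ascii_letters + string.digits
    length = rng.randint(min_length, max_length)
    return "".join(rng.choice(pool) for _ in range(length))
```

None of these hypothetical helpers take the source value as input, because a range constrains the output independently of the value being masked.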

Blurring

Configure blurring to return a random value that is close to the original value. For random masking of datetime or numeric data, blurring creates an output value within a fixed or percent variance from the source data value.

Date Blurring

To blur a datetime source value, select a unit of time to blur, a high bound, and a low bound. You can select year, month, day, or hour as the unit of time. By default, the blur unit is year.
For example, to restrict the masked date to a date within two years of the source date, select year as the unit and enter 2 as both the low bound and the high bound. If the source date is 02 February, 2006, the PowerCenter Integration Service returns a date between 02 February, 2004 and 02 February, 2008.
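The year-based example above can be reproduced with a short Python sketch. The helper names are hypothetical, only the year unit is covered, and the handling of 29 February (clamped to 28 February in non-leap years) is an assumption, since the guide does not say how that case is treated.

```python
import calendar
import random
from datetime import datetime, timedelta


def shift_years(value, years):
    """Shift a datetime by whole years, clamping the day to the month length."""
    target_year = value.year + years
    last_day = calendar.monthrange(target_year, value.month)[1]
    return value.replace(year=target_year, day=min(value.day, last_day))


def blur_year_bounds(source, low, high):
    """Return the earliest and latest dates allowed by the low and high bounds."""
    if low < 0 or high < 0:
        raise ValueError("bounds must be greater than or equal to zero")
    return shift_years(source, -low), shift_years(source, high)


def blur_date_by_year(source, low, high, rng=random):
    """Return a random datetime between the blurred bounds."""
    earliest, latest = blur_year_bounds(source, low, high)
    span_seconds = int((latest - earliest).total_seconds())
    return earliest + timedelta(seconds=rng.randint(0, span_seconds))
```

With a source date of `datetime(2006, 2, 2)` and both bounds set to 2, `blur_year_bounds` returns 02 February, 2004 and 02 February, 2008, which matches the example.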

Numeric Blurring

To blur a numeric source value, select a fixed or percent variance, a high bound, and a low bound. The high and low bounds must be greater than or equal to zero.
The following table lists the masking results for blurring range values when the input source value is 66:
Blurring Type | Low | High | Result
Fixed | 0 | 10 | Between 66 and 76
Fixed | 10 | 0 | Between 56 and 66
Fixed | 10 | 10 | Between 56 and 76
Percent | 0 | 50 | Between 66 and 99
Percent | 50 | 0 | Between 33 and 66
Percent | 50 | 50 | Between 33 and 99
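The arithmetic behind the table can be checked with a minimal Python sketch. The function names and the "fixed" and "percent" string labels are assumptions for illustration. The sketch treats a percent bound as a percentage of the source value, which is consistent with the rows above.

```python
import random


def blur_bounds(source, blur_type, low, high):
    """Return the (lower, upper) limits of the blurred output for a source value."""
    if low < 0 or high < 0:
        raise ValueError("bounds must be greater than or equal to zero")
    if blur_type == "fixed":
        # Fixed variance: subtract and add the bounds directly.
        return source - low, source + high
    if blur_type == "percent":
        # Percent variance: the bounds are percentages of the source value.
        magnitude = abs(source)
        return source - magnitude * low / 100, source + magnitude * high / 100
    raise ValueError("blur_type must be 'fixed' or 'percent'")


def blur_numeric(source, blur_type, low, high, rng=random):
    """Return a random value within the blurred limits."""
    lower, upper = blur_bounds(source, blur_type, low, high)
    return rng.uniform(lower, upper)
```

For a source value of 66, `blur_bounds(66, "percent", 50, 50)` gives limits of 33 and 99, and `blur_bounds(66, "fixed", 10, 0)` gives 56 and 66.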

Mask Format

Configure a mask format to limit each character in the output column to an alphabetic, numeric, or alphanumeric character.
Note: The mask format contains uppercase characters. When you enter a lowercase mask character, Test Data Manager converts the character to uppercase.
The following table describes mask format characters:
Character | Description
A | Alphabetical characters. For example, ASCII characters a to z and A to Z.
D | Digits 0 through 9.
N | Alphanumeric characters. For example, ASCII characters a to z, A to Z, and 0-9.
X | Any character. For example, alphanumeric or symbol.
+ | No masking.
R | Remaining characters. R specifies that the remaining characters in the string can be any character type. R must appear as the last character of the mask.
If you do not define a mask format, the PowerCenter Integration Service replaces each source character with any character. If the mask format is longer than the input string, the PowerCenter Integration Service ignores the extra characters in the mask format. If the mask format is shorter than the source string, the PowerCenter Integration Service does not mask the characters at the end of the source string.
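The mask format rules can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the product's algorithm: the function name is hypothetical, "any character" is taken to mean ASCII letters, digits, and punctuation, and R is interpreted as applying the "any character" rule from its position to the end of the string.

```python
import random
import string

# Assumed character pools for each mask format character.
CHARACTER_SETS = {
    "A": string.ascii_letters,
    "D": string.digits,
    "N": string.ascii_letters + string.digits,
    "X": string.ascii_letters + string.digits + string.punctuation,
}


def apply_mask_format(source, mask_format, rng=random):
    """Mask each source character according to the mask format."""
    mask_format = mask_format.upper()  # lowercase mask characters are converted
    if not mask_format:
        mask_format = "R"  # no mask format: replace every character with any character
    if "R" in mask_format[:-1]:
        raise ValueError("R must appear as the last character of the mask")
    ends_with_r = mask_format.endswith("R")
    result = []
    # Iterating over the source ignores any extra mask characters.
    for position, char in enumerate(source):
        if position < len(mask_format):
            code = mask_format[position]
        elif ends_with_r:
            code = "R"
        else:
            code = "+"  # mask shorter than source: trailing characters stay unmasked
        if code == "+":
            result.append(char)
        elif code == "R":
            result.append(rng.choice(CHARACTER_SETS["X"]))
        elif code in CHARACTER_SETS:
            result.append(rng.choice(CHARACTER_SETS[code]))
        else:
            raise ValueError("unknown mask format character: " + code)
    return "".join(result)
```

For example, `apply_mask_format("AB-1234xyz", "AA+DDDD")` replaces the first two characters with letters, keeps the hyphen, replaces the four digits with random digits, and leaves `xyz` unmasked because the mask is shorter than the source.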

Source String Characters

Source string characters are characters that you want to mask in the source. Configure source string characters if you want to mask a few of the characters in the input string.
For example, if you set the number sign (#) as a source string character, it is masked every time it occurs in the input data. The position of the characters in the source string does not matter, and you can configure any number of characters. If you do not configure source string characters, the masking replaces all the source characters in the column.
The source characters are case sensitive. The PowerCenter Integration Service does not always return unique data if the number of source string characters is fewer than the number of result string characters.
The following table describes the options that you can configure for source string characters:
Option | Description
Mask Only | Masks characters in the source that you configure as source string characters. For example, if you enter A and b as source string characters, every instance of A and b in the source data will change. A source character that is not an A or b will not change.
Mask all except | Masks all characters in the source except for source string characters. For example, if you enter "-" as the source string character, every character except for "-" will change.
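The two options can be compared with a small Python sketch. The function name, the mode labels, and the alphanumeric replacement pool are assumptions for illustration. The sketch never reuses the original character as its own replacement, so every masked character changes, as in the examples above.

```python
import random
import string


def mask_source_characters(source, source_chars, mode="mask_only", rng=random):
    """Mask characters selected by the source string characters and the mode."""
    if mode not in ("mask_only", "mask_all_except"):
        raise ValueError("mode must be 'mask_only' or 'mask_all_except'")
    pool = string.ascii_letters + string.digits  # assumed replacement pool
    selected = set(source_chars)  # set membership is case sensitive
    result = []
    for char in source:
        listed = char in selected
        should_mask = listed if mode == "mask_only" else not listed
        if should_mask:
            # Exclude the original character so the masked value always differs.
            candidates = [c for c in pool if c != char]
            result.append(rng.choice(candidates))
        else:
            result.append(char)
    return "".join(result)
```

With `mode="mask_only"` and source string characters `Ab`, the input `aB` is returned unchanged because matching is case sensitive. With `mode="mask_all_except"` and `-` as the source string character, every character in `555-1234` changes except the hyphen.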

Result String Replacement Characters

Result string replacement characters are a set of characters that the PowerCenter Integration Service can use to mask the source data. You can configure the masking rule to mask the source only from the set of characters, or you can configure the masking rule to mask the source with any character except the result string replacement characters.
The PowerCenter Integration Service replaces characters in the source string with the result string replacement characters. For example, enter the following characters to configure each mask to contain uppercase alphabetic characters A through F:
ABCDEF
To avoid generating the same output for different input values, configure a wide range of substitute characters, or mask only a few source characters. The position of each character in the string does not matter.
The following table describes the options for result string replacement characters:
Option | Description
Use only | Masks the source with only the characters you define as result string replacement characters. For example, if you enter the characters A, B, and c, the masking replaces every character in the source column with an A, B, or c. The word "horse" might be replaced with BAcBA.
Use all except | Masks the source with any characters except the characters you define as result string replacement characters. For example, if you enter A, B, and c as result string replacement characters, the masked data never has the characters A, B, or c.
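The two options can be sketched in Python as follows. The function name and mode labels are hypothetical, and the "any character" universe used for the second option is assumed to be ASCII letters and digits.

```python
import random
import string


def replace_with_result_characters(source, result_chars, mode="use_only", rng=random):
    """Replace every source character using the result string replacement characters."""
    universe = string.ascii_letters + string.digits  # assumed "any character" set
    if mode == "use_only":
        pool = list(result_chars)
    elif mode == "use_all_except":
        pool = [c for c in universe if c not in result_chars]
    else:
        raise ValueError("mode must be 'use_only' or 'use_all_except'")
    if not pool:
        raise ValueError("no replacement characters are available")
    # Each source character is replaced independently, so position does not matter.
    return "".join(rng.choice(pool) for _ in source)
```

Calling `replace_with_result_characters("horse", "ABc")` returns a five-character string built only from A, B, and c. A pool this small makes it likely that different inputs produce the same output, which is why a wide range of substitute characters is recommended.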

Date Random Masking Parameters

To mask datetime values with random masking, either configure a range of output dates or choose a variance.
When you configure a variance, choose a part of the date to blur. Choose the year, month, day, hour, minute, or second. The PowerCenter Integration Service returns a date that is within the range you configure.
The following table describes the parameters that you can configure for random masking of datetime values:
Parameter | Description
Range | The minimum and maximum values to return for the selected datetime value. The date range is a fixed variance.
Blurring | Masks a date based on a variance that you apply to a unit of the date. The PowerCenter Integration Service returns a date that is within the variance. You can blur the year, month, day, or hour. Choose a low and high variance to apply.

Numeric Random Masking Parameters

When you mask numeric data, you can configure a range of output values for a column.
The PowerCenter Integration Service returns a value between the minimum and maximum values of the range depending on column precision. To define the range, configure the minimum and maximum values or a blurring range based on a variance from the original source value.
The following table describes the parameters that you can configure for random masking of numeric data:
Parameter | Description
Range | A range of output values. The PowerCenter Integration Service returns numeric data between the minimum and maximum values.
Blurring Range | A range of output values that are within a fixed variance or a percent variance of the source data. The PowerCenter Integration Service returns numeric data that is close to the value of the source data. You can configure a range and a blurring range.

String Random Masking Parameters

Configure random masking to generate random output for string columns.
To configure limitations for each character in the output string, configure a mask format. Configure source string characters and result string replacement characters to define which source characters to mask and the characters to mask them with.
The following table describes the parameters that you can configure for random masking of string columns:
Parameter | Description
Range | The minimum and maximum string length. The PowerCenter Integration Service returns a string of random characters between the minimum and maximum string length.
Mask Format | The type of character to substitute for each character in the input data. You can limit each character to an alphabetic, numeric, or alphanumeric character type.
Source String Characters | The characters in the source string that you want to mask.
Result String Replacement Characters | The characters that replace characters in the target string.