You can use custom dictionaries when you perform substitution masking. Create relational or flat file dictionaries to mask data with values from dictionaries other than the default dictionaries.
Create a flat file or relational dictionary connection and add it to the masking task. For flat file dictionaries, you must create a flat file connection whose directory points to the dictionary files. Add the connection to the relational or flat file dictionary from the Configure | Connections view.
When you configure a masking task, you can use the flat file or relational dictionary connection to perform custom substitution masking.
You can substitute data with repeatable or nonrepeatable values. When you choose repeatable values, the masking task produces deterministic results for the same source data and seed value. You must configure a seed value to substitute data with deterministic results. You can substitute more than one column of data with masked values from the same dictionary row.
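The repeatable behavior described above can be illustrated with a minimal Python sketch. It is not the masking task's actual algorithm, which this page does not document. The hash-based selection, the function name `mask_repeatable_row`, and the sample dictionary rows are all assumptions made for illustration. The sketch shows that the same source value and seed always select the same dictionary row, and that several target columns can be filled from that one row.

```python
import hashlib

def mask_repeatable_row(value: str, dictionary: list[dict], seed: int = 190) -> dict:
    """Select one dictionary row deterministically from the source value and seed.

    Hypothetical helper: the hash-based choice is an assumption, not the product's method.
    """
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(dictionary)
    return dictionary[index]

# Illustrative dictionary rows (made-up data).
dictionary = [
    {"SNO": 1, "CITY": "Austin", "STATE": "TX"},
    {"SNO": 2, "CITY": "Boston", "STATE": "MA"},
    {"SNO": 3, "CITY": "Chicago", "STATE": "IL"},
]

row = mask_repeatable_row("Bangalore", dictionary, seed=190)
# Both masked columns come from the same dictionary row.
masked_city, masked_state = row["CITY"], row["STATE"]

# Running again with the same value and seed selects the same row.
assert mask_repeatable_row("Bangalore", dictionary, seed=190) == row
```

A different seed may select a different row for the same source value, which is why the seed must stay fixed when results need to match across runs.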
You can configure the custom substitution masking rule to replace the target column with unique masked values for every unique source column value. To configure unique substitution masking, you must create a storage connection for the storage tables. Storage tables contain the source to dictionary value mapping information required for unique substitution masking.
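As a rough illustration of unique substitution, the following Python sketch keeps a source-to-dictionary mapping in a plain `dict`. The `dict` stands in for the storage tables. The function name `mask_unique` and the first-available assignment order are assumptions, not the product's implementation. The sketch shows that each unique source value receives its own dictionary value, and that the operation fails when the source has more unique values than the dictionary.

```python
def mask_unique(values, dictionary, storage=None):
    """Map every unique source value to its own dictionary value.

    `storage` is a hypothetical stand-in for the storage tables: it holds the
    source-to-dictionary mapping so that later runs can reuse it.
    """
    storage = {} if storage is None else storage
    used = set(storage.values())
    # Dictionary values not yet assigned, de-duplicated, in dictionary order.
    available = list(dict.fromkeys(v for v in dictionary if v not in used))
    masked = []
    for value in values:
        if value not in storage:
            if not available:
                raise ValueError("more unique source values than dictionary values")
            storage[value] = available.pop(0)
        masked.append(storage[value])
    return masked, storage

masked, storage = mask_unique(["Pune", "Delhi", "Pune"], ["Austin", "Boston"])
print(masked)  # ['Austin', 'Boston', 'Austin']
```

Passing the returned `storage` into a later call keeps earlier assignments, which mirrors the role that the mapping information plays in the storage tables.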
When you configure the custom substitution masking rule, select the dictionary type, the connection, and then select the required dictionary file or table. You can then select the required column from the dictionary. To support non-English characters, you can use different code pages from a flat file connection.
The flat file connection code page and the Secure Agent system code page must be compatible for the masking task to work.
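To see why the code page matters, consider the following Python sketch, which reads one column from a flat file dictionary that has column headers. The comma-delimited layout, the function name `read_dictionary_column`, and the use of Python codec names in place of connection code pages are assumptions for illustration only. The sketch shows that decoding the file with a code page that does not match its contents fails before any masking can take place.

```python
import csv
import io

def read_dictionary_column(raw: bytes, column: str, code_page: str = "utf-8") -> list[str]:
    """Decode a flat file dictionary and return the values of one header-named column."""
    text = raw.decode(code_page)  # raises UnicodeDecodeError if the code page does not fit
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader]

# Illustrative dictionary file content, encoded as Latin-1 (made-up data).
raw = "SNO,CITY\n1,München\n2,Zürich\n".encode("latin-1")

cities = read_dictionary_column(raw, "CITY", code_page="latin-1")
print(cities)  # ['München', 'Zürich']

try:
    read_dictionary_column(raw, "CITY", code_page="utf-8")
except UnicodeDecodeError:
    print("code page does not match the dictionary file")
```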
Custom substitution masking parameters
To perform custom substitution masking, select a custom dictionary that you create.
The following table describes the parameters that you can configure for substitution masking:
Parameter
Description
Flat File Dictionary or Relational Dictionary
Choose the type of custom dictionary to use.
Dictionary
Select the required dictionary from the list.
You must have added the dictionary connection to the masking task.
The list includes relational or flat file dictionaries depending on what you choose.
For flat file dictionaries, the dictionary file must be present for all the Secure Agents in a runtime environment in the following location:
Dictionary Column
The output column from the custom dictionary. For a flat file dictionary, you can select a dictionary column if the flat file contains column headers.
Order By
Applicable for relational dictionaries. The dictionary column on which you want to sort entries. Specify a sort column to generate deterministic results even if the order of entries in the dictionary changes. For example, if you move a relational dictionary and the order of entries changes, sort on the serial number column to consistently mask the data.
Note: The column that you choose must contain unique values. Do not use columns that can contain duplicate values to sort the data.
Lookup Input Port
Optional. The source input column on which you perform a lookup operation with the dictionary.
Lookup Dictionary Port
Required if you enter a Lookup Input Port value. The dictionary column to compare with the input port. The source is replaced with values from the dictionary rows where the Lookup Input Port and Lookup Dictionary Port values match.
Lookup Error Constant
Optional. A constant value that you can configure when there are no matching values for the lookup condition from the dictionary. Default is an empty string.
Repeatable
Returns the same masked value when you run a task multiple times or when you generate masked values for a field that is in multiple tables.
Seed Value
A starting number to create repeatable output. Enter a number from 1 through 999. Default seed value is 190. You can enter the seed value as a parameter.
Optimize Dictionary Usage
Increases the usage of masked values from the dictionary. Available if you choose the Repeatable option. The property is not applicable if you enable unique substitution.
Is Unique
Applicable for repeatable substitution. Replaces the target column with unique dictionary values for every unique source column value. If there are more unique values in the source than in the dictionary file, the data masking operation fails. Default is nonunique substitution.
Preprocessing Expression
Configure an expression in the rule to convert characters before the masking rule runs. For example, you might want to convert all characters to the same case before masking.
Postprocessing Expression
Configure an expression in the rule to convert characters in the masked output before the data is copied to the target.
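The effect of the Order By and Preprocessing Expression parameters can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's implementation. The hash-based row selection, the helper names `pick_index` and `mask_with_order_by`, and the use of a Python callable in place of a preprocessing expression are all hypothetical. The sketch shows that sorting the dictionary on a unique column before selecting a row gives the same masked value even when the physical order of the entries changes.

```python
import hashlib

def pick_index(value: str, seed: int, size: int) -> int:
    """Hypothetical deterministic index derived from the source value and seed."""
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size

def mask_with_order_by(value, rows, order_by, column, seed=190, preprocess=None):
    """Sort the dictionary on a unique column, then select a row deterministically."""
    if preprocess is not None:
        value = preprocess(value)  # for example, convert to a single case first
    ordered = sorted(rows, key=lambda row: row[order_by])
    return ordered[pick_index(value, seed, len(ordered))][column]

# Illustrative dictionary rows (made-up data) and the same rows in another order.
rows = [
    {"SNO": 1, "CITY": "Austin"},
    {"SNO": 2, "CITY": "Boston"},
    {"SNO": 3, "CITY": "Chicago"},
]
reordered = [rows[2], rows[0], rows[1]]

# The physical order of the dictionary entries no longer affects the result.
assert mask_with_order_by("Bangalore", rows, "SNO", "CITY") == \
       mask_with_order_by("Bangalore", reordered, "SNO", "CITY")

# With a case-normalizing preprocessing step, differently cased inputs mask alike.
assert mask_with_order_by("bangalore", rows, "SNO", "CITY", preprocess=str.upper) == \
       mask_with_order_by("BANGALORE", rows, "SNO", "CITY", preprocess=str.upper)
```

If the sort column contained duplicate values, the relative order of the tied rows would still depend on the original order, which is why the column must be unique.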
Custom substitution lookup example
Suppose that you apply substitution masking to the S_City column and you select a dictionary file that contains city names, identification numbers, and serial numbers. Select CITY as the dictionary column. The lookup input port is Id and the lookup dictionary port is SNO. If there are no matching values between the Id and SNO columns, the task uses the error constant BANGALORE as the lookup value.
The following image shows the substitution parameters for masking with custom dictionaries:
Custom substitution dictionary lookup use cases
The task performs dictionary lookup in custom substitution masking in the following cases:
•Case 1. If the dictionary contains valid lookup records for all the corresponding source records, the task picks the values from the dictionary and writes them to the target.
•Case 2. If the dictionary contains multiple lookup values for some source records, the task picks one of the lookup values from the dictionary and substitutes it for the source value.
•Case 3. If some of the source values are the same as the lookup values in the dictionary, the target contains the same data as the source.
•Case 4. If the source records do not have a lookup value in a dictionary and if you specify a valid error constant, the task uses the error constant for all the failed lookup conditions.
•Case 5. If the source records do not have a lookup value in a dictionary and if you do not specify a valid error constant, the task fails and generates an exception.
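The lookup cases above can be sketched in Python with the Id, SNO, and CITY columns from the earlier example. The function name `lookup_substitute`, the sample rows, the choice of the first matching row in Case 2, and the treatment of an empty string as "no valid error constant" are assumptions made for illustration. The page does not state how the task chooses among multiple matches or what makes an error constant valid.

```python
def lookup_substitute(source_rows, dictionary_rows, input_port, dictionary_port,
                      dictionary_column, error_constant=""):
    """Replace each source row with the dictionary value whose lookup key matches."""
    matches = {}
    for row in dictionary_rows:
        # Case 2: several rows may share a key; this sketch keeps the first one.
        matches.setdefault(row[dictionary_port], row[dictionary_column])

    masked = []
    for row in source_rows:
        key = row[input_port]
        if key in matches:
            masked.append(matches[key])    # Cases 1 to 3: a lookup value exists
        elif error_constant:
            masked.append(error_constant)  # Case 4: fall back to the error constant
        else:
            # Case 5: no match and no usable error constant (assumed: empty string)
            raise ValueError(f"no dictionary match for {input_port}={key!r}")
    return masked

# Illustrative source and dictionary rows (made-up data).
source = [
    {"Id": 1, "S_City": "Pune"},
    {"Id": 2, "S_City": "Delhi"},
    {"Id": 9, "S_City": "Mumbai"},
]
dictionary = [
    {"SNO": 1, "CITY": "Austin"},
    {"SNO": 2, "CITY": "Boston"},
    {"SNO": 2, "CITY": "Chicago"},
]

print(lookup_substitute(source, dictionary, "Id", "SNO", "CITY",
                        error_constant="BANGALORE"))
# ['Austin', 'Boston', 'BANGALORE']
```

Calling the same function without an error constant raises an exception for the row with Id 9, which corresponds to Case 5.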