When you add a step to a labeler asset, Data Quality prompts you to define a step with a character set or with a dictionary.
The following image shows the options that you configure when you define a step with a character set:
The character set options include the following properties:
1. Labeler mode.
Indicates the type of labeling operations to perform on the input data.
2. Add Step option.
Adds a step to the asset. A step describes a labeling operation that a mapping can apply to an input data field.
3. Up and Down options.
Moves a step that you select up or down within the step sequence.
4. Step sequence.
Defines the order in which a mapping applies each step to the input field at run time. The mapping performs labeling operations in the order that you specify.
5. Options name.
Identifies the type of regular expression in the step.
6. Test input field.
Contains the string that the Secure Agent uses to test the steps in the labeler sequence.
7. Test output fields.
Contain the result of the test. The test output field contains a copy of the input field data in which any character that matches the content of a character set is replaced by the label that you specify.
8. Import file option.
Imports data to the test panel.
9. Step type.
Identifies the type of step to which the properties apply.
10. Character set type.
Specifies whether the step uses a built-in or custom character set.
A built-in character set is one that you select from a list that the asset provides. A custom character set is one that you define.
11. Label name.
Specifies the label that the step applies to the characters in an input string. You provide a single-character label for a custom character set.
12. Add Character Set option.
Adds a custom character set. You can specify individual characters or a range of characters. Select the characters from the Custom Character Set dialog box.
13. Ignore Text option.
Adds one or more strings to ignore during a labeling operation. The labeling operation does not assign the label to any input character that matches a character in the strings.
The Ignore Text option includes the following properties:
- Search term. Specifies the characters to ignore when you perform a labeling operation.
- Case sensitive. Determines whether the input characters must match the case of the search characters.
- Uppercase. Converts the characters in the input string that match the search term to uppercase.
- Start position. Specifies the character position in the input string at which the asset starts to analyze the characters.
- End position. Specifies the character position in the input string at which the asset ends the analysis.
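The following Python sketch illustrates the general idea of a character set labeling step: each input character that matches a character set is replaced by a single-character label, and characters listed as ignore text keep their original value. It is an illustration only and does not represent how Data Quality implements the operation. The function name `label_characters`, the `STEPS` list, and the parameter names are hypothetical. The sketch also assumes that the first matching step in the sequence determines the label for a character, and it omits the Uppercase, Start position, and End position properties.

```python
import string

# Hypothetical step sequence: each step pairs a character set with a label.
STEPS = [
    (set(string.digits), "9"),         # label digits as "9"
    (set(string.ascii_letters), "A"),  # label letters as "A"
]


def label_characters(text, steps, ignore_chars="", case_sensitive=True):
    """Return a copy of text in which matching characters are replaced by labels.

    Assumption: steps are tried in order and the first match wins.
    Characters in ignore_chars are never labeled.
    """
    if case_sensitive:
        ignore = set(ignore_chars)
    else:
        # Ignore both the lowercase and uppercase forms of each character.
        ignore = set(ignore_chars.lower()) | set(ignore_chars.upper())

    output = []
    for ch in text:
        labeled = ch
        if ch not in ignore:
            for charset, label in steps:
                if ch in charset:
                    labeled = label
                    break
        output.append(labeled)
    return "".join(output)


print(label_characters("AB-1234", STEPS))                    # AA-9999
print(label_characters("AB-1234", STEPS, ignore_chars="B"))  # AB-9999
```

In this sketch, the hyphen is unchanged because it does not belong to either character set, and the letter B is unchanged in the second call because it matches the ignore text.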