Labeler Transformation Strategies
Use labeling strategies to assign labels to input data. To configure a labeling strategy, edit the settings in the Strategies view of a Labeler transformation.
When you create a labeling strategy, you add one or more operations. Each operation implements a specific labeling task.
The Labeler transformation provides a wizard that you use to create strategies. When you create a labeling strategy, you choose either character labeling mode or token labeling mode. You then add operations specific to that labeling mode.
Character Labeling Operations
Use character labeling operations to create labels that describe the character patterns in your data.
You can add the following types of operations to a character labeling strategy:
- Label Characters using Character Sets: Label characters using predefined character sets, such as digits or alphabetic characters. You can select Unicode and non-Unicode character sets.
- Label Characters using Reference Table: Label characters with custom labels from a reference table.
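The idea behind character labeling can be illustrated with a minimal Python sketch. It is not the Labeler transformation's implementation. The character sets, the label symbols ("9", "A", "_"), and the rule that unmatched characters pass through unchanged are all assumptions chosen for illustration.

```python
import string

# Hypothetical character sets and labels, checked in order.
# These are illustrative and do not reflect the product's predefined sets.
CHARACTER_SETS = [
    (set(string.digits), "9"),
    (set(string.ascii_letters), "A"),
    (set(string.whitespace), "_"),
]


def label_characters(value):
    """Return a string in which each character is replaced by its label.

    Characters that belong to no character set are copied unchanged
    (an assumption made for this sketch).
    """
    labels = []
    for ch in value:
        for charset, label in CHARACTER_SETS:
            if ch in charset:
                labels.append(label)
                break
        else:
            # No character set matched: keep the original character.
            labels.append(ch)
    return "".join(labels)


print(label_characters("555-1212"))
print(label_characters("AB 12"))
```

In this sketch, the output describes the character pattern of the input rather than its content, so values with the same structure produce the same label string.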
Token Labeling Operations
Use token labeling operations to create labels that describe strings in your data.
The Labeler transformation can identify and label multiple tokens in an input string. For example, you can configure the Labeler transformation to use the US Phone Number and Email Addresses token sets. When the Labeler transformation processes the input string "555-555-1212 someone@somewhere.com", the output string is "USPHONE EMAIL".
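The phone number and email example can be approximated with a short Python sketch. It is a conceptual illustration only: the regular expressions stand in for the product's token sets, and the whitespace delimiter and the "UNKNOWN" default label are assumptions, not documented behavior.

```python
import re

# Hypothetical stand-ins for token sets: a label and a simplified pattern.
TOKEN_SETS = [
    ("USPHONE", re.compile(r"\d{3}-\d{3}-\d{4}")),
    ("EMAIL", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
]


def label_tokens(value, default="UNKNOWN"):
    """Split the input on whitespace and label each token.

    A token receives the label of the first pattern that matches it
    entirely, or the default label if no pattern matches (an assumption).
    """
    labels = []
    for token in value.split():
        for label, pattern in TOKEN_SETS:
            if pattern.fullmatch(token):
                labels.append(label)
                break
        else:
            labels.append(default)
    return " ".join(labels)


print(label_tokens("555-555-1212 someone@somewhere.com"))
```

Each token in the input yields one label in the output, which is why a two-token input string produces a two-label output string.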
You can add the following types of token labeling operations to a labeling strategy:
- Label with Reference Table: Label strings that match reference table entries.
- Label Tokens with Token Set: Label string patterns that match token set data or probabilistic model data.