Token Labeling Properties

Configure properties for token labeling operations on the Strategies view in the Labeler transformation.

General Properties

General properties apply to all token labeling operations you define in the strategy. You use the general properties to name the strategy, specify input and output ports, and specify whether the strategy enables probabilistic matching techniques.
The following table describes the general properties:
Property
Description
Name
Provides a name for the strategy.
Inputs
Identifies the input port that the strategy operations can read.
Outputs
Identifies the output port that the strategy operations can write to.
Description
Describes the strategy. The property is optional.
Use probabilistic matching techniques
Specifies that the strategy can use a probabilistic model to identify token types.
Reverse Enabled
Indicates that the strategy reads input data from right to left. This property is disabled for probabilistic matching.
Delimiters
Specifies the characters that the transformation uses to evaluate substrings in input data. Default is space. The property is disabled in probabilistic labeling.
Tokenized Output Field
Indicates that the strategy writes multiple labels to an output port. Select this field to create input data for pattern-based parsing in the Parser transformation.
Score Output Field
Identifies the field that contains the score values generated in probabilistic matching. Set the score output field when you select the option to use probabilistic matching techniques.
Output Delimiter
Specifies a character to separate data values on the output port. Default is colon.
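
The Labeler transformation is configured in the Developer tool, not in code, so the following Python is only a conceptual sketch of how the Delimiters and Output Delimiter properties relate to each other. The function names and the NUMBER, WORD, and OTHER labels are hypothetical and are not part of the product.

```python
import re

def split_tokens(text, delimiters=" "):
    """Split text on any of the delimiter characters, dropping empty tokens."""
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok for tok in re.split(pattern, text) if tok]

def simple_label(token):
    """Hypothetical labeling rule: classify a token by its characters."""
    if token.isdigit():
        return "NUMBER"
    if token.isalpha():
        return "WORD"
    return "OTHER"

def label_string(text, delimiters=" ", output_delimiter=":"):
    """Return one label per token, joined by the output delimiter."""
    tokens = split_tokens(text, delimiters)
    return output_delimiter.join(simple_label(t) for t in tokens)

print(label_string("123 Main St"))                                   # NUMBER:WORD:WORD
print(label_string("123-Main St", delimiters=" -", output_delimiter="|"))  # NUMBER|WORD|WORD
```

The defaults in the sketch mirror the documented defaults: a space as the input delimiter and a colon as the output delimiter.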

Token Set Properties

Token set properties apply when you configure a labeling operation to use token sets.
The following table describes the token set properties:
Property
Description
Select Token Sets
Specifies the token sets that the transformation uses to label strings.
Filter text
Filters the list of token sets or regular expressions. Use text characters and wildcard characters as a filter.
Add Token Set
Defines a custom token set.
Add Regular Expression
Defines a regular expression that matches an input pattern.
Edit
Edits the contents of a custom token set or a regular expression.
Import
Imports a nonreusable copy of a token set or regular expression from a folder in the Model repository. If you update the source object for the token set or regular expression, the Data Integration Service does not update the nonreusable copy.
Remove
Removes a custom token set or a regular expression.
Specify Execution Order
Sets the order in which the operation applies the token sets or regular expressions to the data. Use the Up and Down arrows to change the order.
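
The effect of Specify Execution Order can be pictured with a short Python sketch. It assumes, purely for illustration, that the first definition that matches a token supplies the label; the guide does not state how the transformation resolves overlapping matches. The rule names, patterns, and the UNKNOWN default are hypothetical.

```python
import re

def label_token(token, ordered_rules, default="UNKNOWN"):
    """Apply (label, regex) rules in order; the first full match wins (assumption)."""
    for label, pattern in ordered_rules:
        if re.fullmatch(pattern, token):
            return label
    return default

# Two hypothetical regular expressions that both match a five-digit string.
zip_rule = ("ZIP", r"\d{5}")
number_rule = ("NUMBER", r"\d+")

print(label_token("90210", [zip_rule, number_rule]))  # ZIP
print(label_token("90210", [number_rule, zip_rule]))  # NUMBER
```

Under that assumption, placing the more specific definition first keeps a broader one from claiming the token.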

Custom Label Properties

When you configure a token labeling operation, you can select the Custom Label view to create labels for specific search terms.
The following table describes the custom label properties:
Property
Description
Search Term
Identifies the string to search for.
Case Sensitive
Specifies whether the input data must match the case of the search term.
Custom Label
Identifies the custom label to apply.
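
As a rough illustration of how Search Term, Case Sensitive, and Custom Label interact, the Python sketch below compares a token with a search term and returns the custom label on a match. It assumes that the search term must equal the whole token, which the guide does not specify; the function name and sample values are hypothetical.

```python
def apply_custom_label(token, search_term, custom_label, case_sensitive=False):
    """Return custom_label if the token matches the search term, else None."""
    if case_sensitive:
        matched = token == search_term
    else:
        # casefold() gives a case-insensitive comparison.
        matched = token.casefold() == search_term.casefold()
    return custom_label if matched else None

print(apply_custom_label("inc", "INC", "COMPANY_SUFFIX"))                       # COMPANY_SUFFIX
print(apply_custom_label("inc", "INC", "COMPANY_SUFFIX", case_sensitive=True))  # None
```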

Probabilistic Matching Properties

When you select the option to use probabilistic matching techniques, you can add a probabilistic model to the labeling operation. You cannot add a probabilistic model to a strategy that uses a token set or reference table.
The following table describes the properties associated with probabilistic matching:
Property
Description
Name
Provides a name for the operation.
Filter Text
Uses characters or wildcards you enter to filter the list of probabilistic models in the repository.
Probabilistic Model
Identifies the probabilistic model to use in the operation.

Reference Table Properties

Reference table properties apply when you configure a labeling operation to use a reference table.
The following table describes the reference table properties:
Property
Description
Name
Provides a name for the operation.
Reference Table
Specifies the reference table that the operation uses to label tokens.
Label
Specifies the text that the operation writes to a new port when an input string matches a reference table entry.
Case Sensitive
Determines whether input strings must match the case of reference table entries.
Replace Matches with Valid Values
Replaces labeled strings with the entry from the Valid column in the reference table.
Mode
Determines the token labeling method. Select Inclusive to label input strings that match reference table entries. Select Exclusive to label input strings that do not match reference table entries.
Set Priority
Determines whether reference table labeling operations take precedence over token set labeling operations in a strategy. If you set this property, the transformation performs reference table labeling before token set labeling, and the token set analysis cannot overwrite the reference table label analysis.
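
The Mode, Case Sensitive, and Replace Matches with Valid Values properties can be pictured with the following Python sketch. It models a reference table as a dictionary that maps each entry to its value in the Valid column, which is an assumption made for illustration. The function name, the sample table, and the labels are hypothetical.

```python
def label_with_reference_table(token, table, label, mode="inclusive",
                               case_sensitive=False, replace_with_valid=False):
    """Return (output_token, label) if the token is labeled, else (token, None).

    table maps a reference entry to its valid value (illustrative model only).
    """
    if case_sensitive:
        key, lookup = token, table
    else:
        key = token.casefold()
        lookup = {entry.casefold(): valid for entry, valid in table.items()}

    found = key in lookup
    if mode == "inclusive":
        labeled = found        # label strings that match an entry
    elif mode == "exclusive":
        labeled = not found    # label strings that do not match any entry
    else:
        raise ValueError("mode must be 'inclusive' or 'exclusive'")

    if not labeled:
        return token, None
    if replace_with_valid and found:
        return lookup[key], label
    return token, label

suffixes = {"St": "Street", "Ave": "Avenue"}
print(label_with_reference_table("st", suffixes, "SUFFIX"))
# ('st', 'SUFFIX')
print(label_with_reference_table("st", suffixes, "SUFFIX", replace_with_valid=True))
# ('Street', 'SUFFIX')
print(label_with_reference_table("Main", suffixes, "NOT_SUFFIX", mode="exclusive"))
# ('Main', 'NOT_SUFFIX')
```

The sketch does not model Set Priority, which governs the ordering between reference table and token set operations within a strategy.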