Token Parsing Ports
Configure the token parsing ports with settings appropriate for your data.
A Parser transformation in token parsing mode has the following port types:
- Input
- Contains data that you pass to the Parser transformation. The transformation merges all input ports into a combined data string using the Input Join Character specified on the Strategies tab. If you do not specify an input join character, the transformation uses a space character by default.
- Parsed Output Ports
- User-defined output ports that contain successfully parsed strings. When multiple parsing strategies use the same output, the transformation merges the output into a combined data string using the Output Join Character specified on the Strategies tab. If you do not specify an output join character, the transformation uses a space character by default.
- Overflow
- Contains successfully parsed strings that do not fit into the number of outputs defined in the transformation. For example, if the transformation has only two "WORD" outputs, the string "John James Smith" results in an overflow output of "Smith". The Parser transformation creates an overflow port for each strategy that you add. When you select the Detailed Overflow option, the transformation creates an overflow port for each label in the model.
- Unparsed
- Contains strings that the transformation cannot parse successfully. The Parser transformation creates an unparsed port for each strategy that you add.
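The input join and overflow behavior described above can be illustrated with a short Python sketch. This is not Informatica code or an Informatica API: the function `parse_tokens` and its parameters are hypothetical, tokens are assumed to be whitespace-separated words, and overflow tokens are assumed to be joined with a space. It only mirrors the "John James Smith" example with two "WORD" outputs and does not model unparsed data.

```python
def parse_tokens(input_ports, num_word_outputs, input_join_char=" "):
    """Illustrative model of token parsing ports (hypothetical helper).

    input_ports      -- values from the input ports, in order
    num_word_outputs -- number of "WORD" outputs defined (assumed setting)
    input_join_char  -- stands in for the Input Join Character (default: space)
    """
    # Merge all input ports into one combined data string.
    combined = input_join_char.join(input_ports)

    # Assumption: every whitespace-separated word parses successfully.
    tokens = combined.split()

    # The first tokens fill the defined outputs; the rest go to overflow.
    parsed = tokens[:num_word_outputs]
    overflow = " ".join(tokens[num_word_outputs:])
    return parsed, overflow


parsed, overflow = parse_tokens(["John James", "Smith"], num_word_outputs=2)
print(parsed)    # ['John', 'James']
print(overflow)  # Smith
```

With a third "WORD" output, all three tokens would fit and the overflow string in this sketch would be empty.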
Output Ports in Probabilistic Matching
When you configure a parsing strategy to use probabilistic matching techniques, the Parser transformation adds a port to store the match scores for each output port.
The following table describes the ports that the transformation creates for each port type:
Port Type | Ports Created in Probabilistic Matching |
---|---|
Parsed output port | [label name] output, [label name] score output |
Overflow data port | [overflow data] output, [overflow data] score output |
Unparsed data port | [unparsed data] output, [unparsed data] score output |