Pattern-Based Parsing Mode

In pattern-based parsing mode, the Parser transformation parses patterns made of multiple strings.
You can use the following methods to define patterns in pattern-based parsing mode:
You can use the "+" and "*" wildcards to define a pattern. Use "*" to match any string, and use "+" to match one or more instances of the preceding string. For example, "WORD+" matches one or more consecutive word tokens, and "WORD *" matches a word token followed by one or more tokens of any type.
You can use multiple instances of these methods within the Parser transformation. The transformation uses the instances in the order in which they are listed on the Configuration view.
Note: In pattern-based parsing mode, the Parser transformation requires the output of a Labeler transformation that uses token labeling mode. Create and configure the Labeler transformation before creating a Parser transformation that uses pattern-based parsing mode.
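The wildcard rules and the list-order behavior described above can be illustrated outside the product with a minimal Python sketch. It is not the Parser transformation's implementation. It assumes that a pattern is a space-separated sequence of labels, that a standalone "*" consumes one or more tokens of any label (following the "WORD *" example), and that a pattern must match the whole label sequence. The function names and the NUMBER label are hypothetical.

```python
import re

def compile_label_pattern(pattern):
    """Translate a space-separated label pattern into a regular expression."""
    parts = []
    for element in pattern.split():
        if element == "*":
            # Assumption: a standalone "*" consumes one or more tokens of any label.
            parts.append(r"(?:\S+ )+")
        elif element.endswith("+") and len(element) > 1:
            # "LABEL+" matches one or more consecutive tokens with that label.
            parts.append("(?:" + re.escape(element[:-1]) + " )+")
        else:
            # A plain label matches exactly one token with that label.
            parts.append(re.escape(element) + " ")
    return re.compile("".join(parts))

def first_matching_pattern(patterns, labels):
    """Return the first pattern, in list order, that matches the whole label sequence."""
    # Each label is followed by a space so that label boundaries are unambiguous.
    text = "".join(label + " " for label in labels)
    for pattern in patterns:
        if compile_label_pattern(pattern).fullmatch(text):
            return pattern
    return None

patterns = ["WORD+", "WORD *"]
print(first_matching_pattern(patterns, ["WORD", "WORD", "WORD"]))    # WORD+
print(first_matching_pattern(patterns, ["WORD", "NUMBER", "WORD"]))  # WORD *
print(first_matching_pattern(patterns, ["NUMBER"]))                  # None
```

Because the patterns are tried in list order, reordering the list can change which pattern claims a record: with "WORD *" listed first, the label sequence WORD WORD is matched by "WORD *" rather than by "WORD+".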

Pattern-Based Parsing Ports

Configure the pattern-based parsing ports with settings appropriate for your data.
A Parser transformation that uses the pattern-based parsing mode has the following port types:
Label_Data
Connect this port to the Labeled_Output port of a Labeler transformation that uses the token labeling mode.
Tokenized_Data
Connect this port to the Tokenized_Data output port of a Labeler transformation that uses the token labeling mode.
Parse_Status
If a match is found for the input pattern, this port outputs the value Matched. If no match is found, it outputs Unmatched.
Overflow
Outputs successfully parsed strings that do not fit into the number of outputs defined in the transformation. For example, if only two "WORD" outputs are defined, the string "John James Smith" results in an overflow output of "Smith" by default.
Parsed
Outputs successfully parsed strings to the ports that you define.
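The overflow example from the port descriptions can be sketched in Python as follows. This is a simplified model of the default behavior only, assuming that matched tokens fill the defined outputs from left to right and that any remaining tokens are joined with a space. The function name is hypothetical and is not part of the product.

```python
def split_parsed_and_overflow(tokens, output_count):
    """Fill the defined outputs in order and send the remaining tokens to overflow."""
    parsed = tokens[:output_count]
    # Assumption: leftover tokens are joined with a single space.
    overflow = " ".join(tokens[output_count:])
    return parsed, overflow

parsed, overflow = split_parsed_and_overflow("John James Smith".split(), 2)
print(parsed)    # ['John', 'James']
print(overflow)  # Smith
```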