You can use a regular expression to find values that match a given character structure in an input field. Create a regular expression that matches the structure of the values that you want to find, or select a regular expression from the list of built-in expressions in the asset.
Use a regular expression in place of a dictionary when you cannot predict the content of every value or when the range of values that you will search for is too great to add to a dictionary.
At run time, the Parse transformation applies the regular expression logic to the values in the input field. When the transformation finds a value with a structure that matches the expression logic, the transformation writes the value to the output field that the step specifies.
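The run-time behavior can be pictured with a minimal Python sketch. It is not the Parse transformation's implementation: the function name `parse_field`, the sample pattern, and the choice to require a match on the whole value are assumptions made for illustration.

```python
import re

def parse_field(values, pattern):
    """Return one output per input: the value if it matches the pattern, else None."""
    compiled = re.compile(pattern)
    # fullmatch requires the entire value to fit the pattern (an assumption here).
    return [v if compiled.fullmatch(v) else None for v in values]

# Hypothetical pattern: two uppercase letters, a hyphen, and four digits.
output = parse_field(["AB-1234", "hello", "XY-0001"], r"[A-Z]{2}-\d{4}")
print(output)  # ['AB-1234', None, 'XY-0001']
```

Values that do not match are represented here as `None`. How a real step handles unmatched values depends on the asset configuration.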
Example: United States telephone numbers and Social Security numbers
A customer data set might include a column for telephone numbers. Over a period of time, many users incorrectly enter Social Security numbers into the column. You can configure a parse asset to find values that match both formats.
The following table shows sample values from the column and the format of each value:
Value             Format
212-555-1234      Telephone number
910-22-5555       Social Security number
(518)555-8466     Telephone number
(718) 555-2907    Telephone number
2125550987        Telephone number
922-823-5746      Social Security number
974-43-0202       Social Security number
212-555-3287      Telephone number
Create a step for each data format, and add a regular expression to each step.
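The one-step-per-format idea can be sketched in Python as follows. The patterns, the step names, and the `parse_value` function are hypothetical and written only for this illustration. They are not the built-in expressions that the parse asset provides.

```python
import re

# Hypothetical patterns for illustration; not the asset's built-in expressions.
STEPS = [
    # Accepts 212-555-1234, (518)555-8466, (718) 555-2907, and 2125550987.
    ("telephone_number", re.compile(r"(?:\(\d{3}\) ?|\d{3}-?)\d{3}-?\d{4}")),
    # Accepts the 3-2-4 digit layout, such as 910-22-5555.
    ("social_security_number", re.compile(r"\d{3}-\d{2}-\d{4}")),
]

def parse_value(value):
    """Return the name of the first step whose pattern matches the whole value."""
    for step_name, pattern in STEPS:
        if pattern.fullmatch(value):
            return step_name
    return None

for sample in ["212-555-1234", "910-22-5555", "(718) 555-2907", "2125550987"]:
    print(sample, "->", parse_value(sample))
```

Each step is tried in order, and the first matching step determines where the value is written. The two sample patterns cannot both match the same value, so the order does not matter in this sketch.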
For example, the parse asset contains the following built-in regular expression for United States telephone numbers: