Property | Description |
---|---|
Filter By | Filters rows in the target table based on criteria for a column in the table. Enter a filter clause to determine the criteria, such as STATE='California'. You can use any transformation language functions in the filter clause. The filter clause must evaluate to TRUE or FALSE. To set criteria for multiple columns, use the operators AND and OR. For example, use the operator AND to specify a filter clause such as STATE='California' AND STATUS='Single'. |
Drop Columns | Drops columns in the target table. Enter each column as a string and separate column names with a comma, or specify a regular expression. For example, if you enter COL1, the column COL1 will be dropped in the target table. If you enter a regular expression such as .*SSN.*, columns that have SSN in the column name will be dropped. |
Trim | Trims column values in the target table to remove spaces before and after the values. Enter each column as a string and separate column names with a comma, or specify a regular expression. For example, if you enter COL1, values in the column COL1 will be trimmed in the target table. If you enter a regular expression such as .*SSN.*, values in columns that have SSN in the column name will be trimmed. |
Convert to Uppercase | Converts column values in the target table to uppercase. Enter each column as a string and separate column names with a comma, or specify a regular expression. For example, if you enter COL1, values in the column COL1 will be converted to upper case in the target table. If you enter a regular expression such as .*SSN.*, values in columns that have SSN in the column name will be converted to upper case. |
Convert to Lowercase | Converts column values in the target table to lowercase. Enter each column as a string and separate column names with a comma, or specify a regular expression. For example, if you enter COL1, values in the column COL1 will be converted to lower case in the target table. If you enter a regular expression such as .*SSN.*, values in columns that have SSN in the column name will be converted to lower case. |
Replace Columns | Replaces column values in the target table. Enter each column as a string and separate column names with a comma, or specify a regular expression. For example, if you enter COL1, values in the column COL1 will be replaced in the target table. If you enter a regular expression such as .*SSN.*, values in columns that have SSN in the column name will be replaced. If you specify columns to replace, you must specify the replace criteria. |
Replace Criteria | Required if you specify columns to replace. Determines how to replace column values in the target table. You can select Pattern or Entire String. If you select Entire String, all values in the columns that you specify are replaced by the value that you configure. If you select Pattern, enter the pattern to be replaced. Then enter the value to replace the pattern. |
Pattern | Required if you configure the replace criteria to be a pattern. Determines the pattern to replace. The pattern must be a regular expression. For example, if you want to replace the values in columns that contain Social Security numbers, you can enter the pattern: ^\d(3)-?\d(2)-?\d(4)$ The pattern replaces all values that correspond to the pattern. In the column that contains Social Security numbers, the pattern replaces the entire Social Security number. To mask only the first five digits of the Social Security number, you can enter the pattern: ^\d(3)-?\d(2)$ |
Value | Required if you configure columns to replace. Replaces the pattern or the entire string according to the configured criteria. If you select Entire String for the replace criteria, the value that you enter replaces the values in all of the columns that you specify to replace. For example, if you replace COL1 and COL2 and enter the value XXX, all values in columns COL1 and COL2 are replaced with the value XXX. If you select Pattern for the replace criteria, the value that you enter replaces all values that correspond to the pattern. For example, you want to mask the entire Social Security number according to the pattern: ^\d(3)-?\d(2)-?\d(4)$ Enter the value XXX-XX-XXXX. All values in the SSN column will appear as XXX-XX-XXXX. You might also want to mask only the first five digits of the Social Security number according to the pattern ^\d(3)-?\d(2)$ Enter the value XXX-XX. The first five digits of every Social Security number will appear as XXX-XX. For example, if the original Social Security number is 123-45-6789, the replaced value is XXX-XX-6789. |
EMPLOYEEID | PHONENUMBER |
---|---|
607014 | (630)4468851 |
620368 | (904)3854084 |
698107 | (549)5694371 |
621861 | (904)9062721 |
EMPLOYEEID | PHONENUMBER |
---|---|
620368 | (904)3854084 |
621861 | (904)9062721 |
Metacharacter | Description |
---|---|
. | Matches any single character. |
[ ] | Indicates a character class. Matches any character inside the brackets. For example, [abc] matches “a,” “b,” and “c.” |
^ | If this metacharacter occurs at the start of a character class, it negates the character class. A negated character class matches any character except those inside the brackets. For example, [^abc] matches all characters except “a,” “b,” and “c.” If this metacharacter occurs at the beginning of the regular expression, it matches the beginning of the input. For example, ^[abc] matches the input that begins with “a,” “b,” or “c.” |
- | Indicates a range of characters in a character class. For example, [0-9] matches any of the digits “0” through “9.” |
? | Indicates that the preceding expression to this metacharacter is optional. It matches the preceding expression zero or one time. For example, [0-9][0-9]? matches “2” and “12.” |
+ | Indicates that the preceding expression matches one or more times. For example, [0-9]+ matches “1,” “13,” “666,” and similar combinations. |
* | Indicates that the preceding expression matches zero or more times. For example, the input <abc*> matches <abc>, <abc123>, and similar combinations that contains <abc> as the preceding expression. |
??, +?, *? | Modified versions of ?, +, and *. These match as little as possible, unlike the versions that match as much as possible. For example, the input “<abc><def>,” <.*?> matches “<abc>” and the input <.*> matches “<abc><def>.” |
( ) | Grouping operator. For example, (\d+,)*\d+ matches a list of numbers separated by commas such as “1” or “1,23,456.” |
{ } | Indicates a match group. |
\ | An escape character, which interprets the next metacharacter literally. For example, [0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by a plus character. Also used for abbreviations such as \a for any alphanumeric character. If \ is followed by a number n, it matches the nth match group, starting from 0. For example, <{.*?}>.*?</\0> matches “<head>Contents</head>”. In C++ string literals, two backslashes must be used: “\\+,” “\\a,” “<{.*?}>.*?</\\0>.” |
$ | At the end of a regular expression, this character matches the end of the input. For example, [0-9]$ matches a digit at the end of the input. |
| | Alternation operator that separates two expressions, one of which matches. For example, T|the matches “The” or “the.” |
! | Negation operator. The expression following ! does not match the input. For example, a!b matches “a” not followed by “b.” |
Abbreviation | Definition |
---|---|
\a | Any alphanumeric character, ([a-zA-Z0-9]). |
\b | White space (blank), ([ \\t]). |
\c | Any alphabetic character, ([a-zA-Z]). |
\d | Any decimal digit, ([0-9]). |
\h | Any hexadecimal digit, ([0-9a-fA-F]). |
\n | Newline, (\r|(\r?\n)). |
\q | Quoted string, (\”[^\”]*\”)|(\’[^\’]*\’). |
\w | Simple word, ([a-zA-Z]+). |
\z | Integer, ([0-9+]). |