Define the fields to be normalized on the Normalized Fields tab. You can also include other incoming fields that you want to use in the mapping.
When you define normalized fields, you can create fields manually or select fields from a list of incoming fields. When you create a normalized field, you can set the data type to String or Number, and then define the precision and scale. In advanced mode, you can use any primitive data type.
In advanced mode, the Normalizer transformation produces multiple output groups. Otherwise, the Normalizer transformation produces only one output group.
When incoming fields include multiple-occurring fields without a corresponding category field, you can create the category field to define the occurs value for the data. For example, to represent three fields with different types of income, you can create an Income category field and set the occurs value to 3.
Occurs configuration
Configure the occurs value for a normalized field to define the number of times the field occurs in incoming data.
To define a multiple-occurring field, set the occurs value for the field to an integer greater than one. When you set an occurs value greater than one, the Normalizer transformation creates a generated column ID field for the field. The Normalizer transformation also creates a generated key field for all normalized data.
The Normalizer transformation also uses the occurs value to create a corresponding set of output fields. The output fields display on the Field Mapping tab of the Normalizer transformation. The naming convention for the output fields is <occurs field name>_<occurs number>.
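The naming convention above can be illustrated with a short sketch. The helper name `output_field_names` is hypothetical and is not part of any Informatica API. The sketch assumes that occurs numbers start at 1, and it covers only multiple-occurring fields.

```python
def output_field_names(field_name: str, occurs: int) -> list[str]:
    """Build names that follow <occurs field name>_<occurs number>.

    Hypothetical helper for illustration only. Assumes numbering starts at 1.
    """
    if occurs < 2:
        # Single-occurring fields are outside the scope of this sketch.
        raise ValueError("occurs must be greater than one for a multiple-occurring field")
    return [f"{field_name}_{n}" for n in range(1, occurs + 1)]


# An Income field with an occurs value of 3:
print(output_field_names("Income", 3))  # ['Income_1', 'Income_2', 'Income_3']
```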
To define a single-occurring field, set the occurs value for the field to one. Define a single-occurring field when you want the normalized fields list to include an incoming field that does not need to be normalized.
Unmatched groups of multiple-occurring fields
You can normalize more than one group of multiple-occurring fields in a Normalizer transformation. When you include more than one group and the occurs values do not match, configure the mapping to avoid validation errors.
Use one of the following methods to process groups of multiple-occurring fields with different occurs values.
Write the normalized data to different targets
You can use multiple-occurring fields with different occurs values when you write the normalized data to different targets.
For example, the source data includes an Expenses field with four occurs and an Income field with three occurs. You can configure the mapping to write the normalized expense data to one target and to write the normalized income data to a different target.
Use the same occurs value for multiple-occurring fields
You can configure the multiple-occurring fields to use the same number of occurs, and then use the generated fields that you need. When you use the same number of occurs for multiple-occurring fields, you can write the normalized data to the same target.
For example, when the source data includes an Expenses field with four occurs and an Income field with three occurs, you can configure both fields to have four occurs.
When you configure the Normalizer field mappings, you can connect the four expense fields and the three income fields, leaving the unnecessary income output field unused. Then, you can configure the mapping to write all normalized data to the same target.
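As a rough illustration of this method, the following sketch normalizes an Expenses group and an Income group with a shared occurs value of 4. The function name `normalize_padded`, the `occurrence` key, and the sample values are hypothetical. The sketch also assumes that an unconnected output field yields an empty value (`None`), which the documentation does not specify.

```python
def normalize_padded(row: dict, groups: list[str], occurs: int) -> list[dict]:
    """Emit one record per occurrence for groups that share one occurs value.

    Hypothetical sketch: a missing input such as Income_4 stands in for an
    unconnected field and is assumed to produce None.
    """
    records = []
    for occurrence in range(1, occurs + 1):
        record = {"occurrence": occurrence}
        for name in groups:
            # Look up Expenses_1, Income_1, and so on; None if not connected.
            record[name] = row.get(f"{name}_{occurrence}")
        records.append(record)
    return records


source_row = {
    "Expenses_1": 10, "Expenses_2": 20, "Expenses_3": 30, "Expenses_4": 40,
    "Income_1": 100, "Income_2": 200, "Income_3": 300,  # no Income_4 in the source
}

for record in normalize_padded(source_row, ["Expenses", "Income"], occurs=4):
    print(record)
```

In this sketch all four records share one shape, so they can go to a single target. The fourth record carries an expense value and an empty income value.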
Generated keys
The Normalizer transformation generates key values for normalized data.
Generated key fields appear on the Normalized Fields tab when you configure a field to have more than one occurrence.
The mapping task generates the following fields for normalized data.
Generated Key
A key value that the task generates each time it processes an incoming row. When a task runs, it starts the generated key at one and increments the key by one for each processed row.
The Normalizer transformation uses one generated key field for all data to be normalized.
The naming convention for the Normalizer generated key is GK_<redefined_field_name>.
Note: The generated key is not applicable in advanced mode.
Generated Column ID
A column ID value that represents the instance of the multiple-occurring data. For example, if an Expenses field includes four occurs, the task uses values 1 through 4 to represent each type of occurring data.
The Normalizer transformation uses a generated column ID for each field configured to occur more than one time.
The naming convention for the Normalizer generated column ID is GCID_<redefined_field_name>.
An advanced cluster processes the generated column ID field as a bigint. The Data Integration Server processes the ID as an integer.
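The relationship between the two generated fields can be sketched as follows. This is an illustration only, not how the mapping task is implemented. The function name `normalize` and the sample data are hypothetical. The sketch assumes one output row per occurrence and models the behavior outside advanced mode, where the generated key applies.

```python
from typing import Iterable, Iterator


def normalize(rows: Iterable[dict], field_name: str, occurs: int) -> Iterator[dict]:
    """Yield one normalized row per occurrence of a multiple-occurring field.

    Hypothetical sketch: GK_<field> starts at 1 and increments by one for each
    incoming row; GCID_<field> runs from 1 through the occurs value.
    """
    for generated_key, row in enumerate(rows, start=1):
        for column_id in range(1, occurs + 1):
            yield {
                f"GK_{field_name}": generated_key,
                f"GCID_{field_name}": column_id,
                field_name: row[f"{field_name}_{column_id}"],
            }


incoming = [
    {"Expenses_1": 100, "Expenses_2": 50, "Expenses_3": 75, "Expenses_4": 20},
    {"Expenses_1": 110, "Expenses_2": 55, "Expenses_3": 80, "Expenses_4": 25},
]

for out_row in normalize(incoming, "Expenses", occurs=4):
    print(out_row)
```

Each incoming row in this sketch produces four output rows that share one generated key value, with the generated column ID identifying which of the four expense fields supplied the value.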