Multiple-Occurring Fields
When a field repeats multiple times in the source data, you can define it as a multiple-occurring field in the input row hierarchy. The Normalizer transformation can return a separate row for each occurrence of a multiple-occurring field or group of fields in a source.
A source row might contain four quarters of sales by store:
Store | Sales(1) | Sales(2) | Sales(3) | Sales(4) |
---|---|---|---|---|
Store1 | 100 | 300 | 500 | 700 |
Store2 | 250 | 450 | 650 | 850 |
When you define the Normalizer input hierarchy, you can combine the four Sales fields into one multiple-occurring field. Define a field name such as Qtr_Sales and configure it to occur four times in the source.
When the output group contains the store data and the sales data, the Normalizer transformation returns a row for each Store and Qtr_Sales combination. Each output row contains an index that identifies which instance of Qtr_Sales is in that row.
The transformation returns the following rows:
Store | Qtr_Sales | Qtr (GCID) |
---|---|---|
Store1 | 100 | 1 |
Store1 | 300 | 2 |
Store1 | 500 | 3 |
Store1 | 700 | 4 |
Store2 | 250 | 1 |
Store2 | 450 | 2 |
Store2 | 650 | 3 |
Store2 | 850 | 4 |
When an output group contains single-occurring columns and a multiple-occurring column, the Normalizer returns duplicate data for the single-occurring columns in each output row. For example, Store1 and Store2 repeat for each instance of Qtr_Sales.
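The row expansion described above can be sketched in plain Python. This is only an illustration of the input and output shapes, not how the Normalizer transformation is implemented. The list-based `Sales` layout, the `SOURCE_ROWS` name, and the `normalize` function are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Iterator

# Source rows from the example above: one row per store, with the four
# Sales occurrences held as a list (a hypothetical in-memory layout).
SOURCE_ROWS = [
    {"Store": "Store1", "Sales": [100, 300, 500, 700]},
    {"Store": "Store2", "Sales": [250, 450, 650, 850]},
]


def normalize(rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one output row per occurrence of the Sales field."""
    for row in rows:
        # The index starts at 1 and identifies which occurrence of the
        # multiple-occurring field the output row carries.
        for gcid, amount in enumerate(row["Sales"], start=1):
            # The single-occurring Store value repeats in every output row.
            yield {"Store": row["Store"], "Qtr_Sales": amount, "GCID": gcid}


if __name__ == "__main__":
    for out in normalize(SOURCE_ROWS):
        print(out["Store"], out["Qtr_Sales"], out["GCID"])
```

Each source row with four occurrences yields four output rows, so the two store rows produce eight rows in total, with the Store value duplicated across each group of four.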
A source row might contain more than one level of multiple-occurring data. You can configure the Normalizer transformation to return separate rows at each level based on how you define the input hierarchy.
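As a rough illustration of more than one level of multiple-occurring data, the following Python sketch uses a hypothetical record in which a store holds repeated Year groups and each Year group holds repeated Qtr_Sales values. The field names, the values, and the `Year_Index` and `Qtr_Index` columns are invented for the example. The actual output ports depend on how you define the input hierarchy.

```python
from typing import Any, Dict, Iterator

# Hypothetical record with two levels of multiple-occurring data:
# Years occurs twice, and Qtr_Sales occurs twice within each year.
RECORD = {
    "Store": "Store1",
    "Years": [
        {"Year": 2023, "Qtr_Sales": [100, 300]},
        {"Year": 2024, "Qtr_Sales": [500, 700]},
    ],
}


def rows_per_year(record: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Return one row per occurrence of the outer (Year) level."""
    for year_idx, year in enumerate(record["Years"], start=1):
        yield {
            "Store": record["Store"],
            "Year": year["Year"],
            "Year_Index": year_idx,
        }


def rows_per_quarter(record: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Return one row per occurrence of the inner (Qtr_Sales) level."""
    for year_idx, year in enumerate(record["Years"], start=1):
        for qtr_idx, amount in enumerate(year["Qtr_Sales"], start=1):
            yield {
                "Store": record["Store"],
                "Year": year["Year"],
                "Qtr_Sales": amount,
                "Year_Index": year_idx,
                "Qtr_Index": qtr_idx,
            }


if __name__ == "__main__":
    for out in rows_per_year(RECORD):
        print(out)
    for out in rows_per_quarter(RECORD):
        print(out)
```

In this sketch, the outer level yields one row per year, and the inner level yields one row per year and quarter combination, with the values from the enclosing levels repeated in each row.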
Generated Column ID
The Normalizer transformation returns a generated column ID (GCID) output port for each instance of a multiple-occurring field.
The generated column ID port is an index for the instance of the multiple-occurring data. For example, if a field occurs four times in a source record, the Developer tool returns a value of 1, 2, 3, or 4 in the generated column ID port based on which instance of the multiple-occurring data occurs in the row.