Input Hierarchy Definition
When you create a Normalizer transformation, you define an input hierarchy that describes records and fields in the source. Define the input hierarchy on the Normalizer view of the transformation.
The Developer tool creates the transformation input ports based on the fields that you define in the input hierarchy. Define the input group structure before you define the transformation output groups.
When you define an input hierarchy, you must define an input structure that corresponds to the structure of the source data. The source data might contain more than one group of multiple-occurring fields. To define the structure, you can configure a record that occurs at the same level as another record in the source. Or, you can define records that occur within other records.
Input Hierarchy Example
The following source row contains customer fields and an address record that occurs twice:
CustomerID
FirstName
LastName
Address
Street
City
State
Country
Address1
Street1
City1
State1
Country1
When you define the input structure in the Normalizer view, add CustomerID, FirstName, and LastName as fields. Define an Address record that includes the Street, City, State, and Country fields. Change the Occurs value of the Address record to 2.
The following image shows the input hierarchy in the Normalizer view:
The Occurs column in the Normalizer view identifies the number of instances of a field or record in a source row. Change the value in the Occurs column for multiple-occurring fields or records. In this example, the customer fields occur one time, and the Address record occurs twice.
The Level column in the Normalizer view indicates where a field or record appears in the input hierarchy. The customer fields are at level 1 in the hierarchy. The Address record is also level 1.
Normalizer Transformation Input Ports
The Developer tool creates the Normalizer transformation input ports when you define the input hierarchy in the Normalizer view. When you change fields in the input hierarchy, the Developer tool changes the input ports.
View the Normalizer transformation input ports in the Overview view. You can reorder the input ports in the Overview view. To change the input ports, update the input hierarchy in the Normalizer view.
When you define a field as multiple-occurring in the input hierarchy, the Developer tool creates one input port for each instance of the multiple-occurring field. When a record is multiple-occurring, the Developer tool creates an input port for each instance of the fields in the record.
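The port expansion described above can be sketched in Python. This is only an illustration, not the Developer tool's implementation. The `FieldDef` and `RecordDef` classes and the `expand_ports` function are hypothetical, and the suffix naming scheme (Street, Street1) is an assumption borrowed from the source row in the earlier example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FieldDef:
    """A field in the input hierarchy (hypothetical model)."""
    name: str
    occurs: int = 1


@dataclass
class RecordDef:
    """A record that groups fields or other records (hypothetical model)."""
    name: str
    occurs: int = 1
    children: List[Union[FieldDef, "RecordDef"]] = field(default_factory=list)


def expand_ports(nodes, suffix=""):
    """Return one port name per instance of every field in the hierarchy."""
    ports = []
    for node in nodes:
        for instance in range(node.occurs):
            # Assumed naming: the first instance keeps the plain name,
            # later instances get a numeric suffix (Street, Street1, ...).
            instance_suffix = suffix + (str(instance) if instance else "")
            if isinstance(node, RecordDef):
                ports.extend(expand_ports(node.children, instance_suffix))
            else:
                ports.append(node.name + instance_suffix)
    return ports


hierarchy = [
    FieldDef("CustomerID"),
    FieldDef("FirstName"),
    FieldDef("LastName"),
    RecordDef("Address", occurs=2, children=[
        FieldDef("Street"), FieldDef("City"),
        FieldDef("State"), FieldDef("Country"),
    ]),
]
ports = expand_ports(hierarchy)
print(ports)
```

With the customer hierarchy, the sketch yields three customer ports plus four address ports for each of the two Address instances, for eleven ports in total.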
Input Ports Example
The following image shows the input ports that the Developer tool creates for the customer data and the multiple-occurring address data:
Flatten Fields
You can flatten fields of a complex data type in mappings that run on the Spark engine. You flatten fields in the Normalizer view to modify hierarchical data that passes through a complex port.
The output of the flatten action depends on the complex data type. When you flatten an array or struct data type, the Normalizer transformation creates a row for each element in the complex data type. When you flatten a map data type, the Normalizer transformation creates two columns for the map key and map value elements.
The flatten action on a nested data type extracts elements at the first level. To flatten a nested data type at all levels, use the Flatten Complex Port hierarchical conversion wizard in the Developer tool. The Flatten All option extracts elements at each level and returns relational data of primitive data type. For more information about hierarchical conversion wizards, see the Informatica Big Data Management User Guide.
The flatten action changes the value of the Occurs column in the Normalizer view to Auto and adds a flatten icon next to the flattened field. The value Auto indicates that the transformation flattens all the elements of the complex data type.
The following image shows a struct that is flattened to a string field with a flatten icon next to it and the Occurs value as Auto:
You cannot flatten a multiple-occurring field. For example, you cannot flatten an array field with an Occurs value of 2.
The following image shows a multiple-occurring field of an array data type that you cannot flatten:
Flatten Array
The Normalizer transformation flattens a one-dimensional array to a primitive data type and an n-dimensional array to an (n-1)-dimensional array. The number of rows that the transformation creates is the same as the size of the array.
For example, if you flatten an array port with 10 string elements, the output returns 10 rows, each with a string value. If you flatten a 3-dimensional array, the output returns a row for each 2-dimensional array that it contains.
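The dimension rule can be illustrated with nested Python lists. This is a minimal sketch that assumes an array is modeled as a list. The `depth` and `flatten_one_level` helpers are hypothetical and are not part of any Informatica API.

```python
def depth(value):
    """Count how many list levels wrap the innermost element."""
    levels = 0
    while isinstance(value, list):
        levels += 1
        value = value[0] if value else None
    return levels


def flatten_one_level(array):
    """Emit one output row per top-level element of the array."""
    return [element for element in array]


# A 3-dimensional array with two top-level elements.
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
rows = flatten_one_level(cube)

print(depth(cube))                # 3
print(len(rows))                  # 2, the size of the array
print([depth(r) for r in rows])   # [2, 2]
```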
A table contains a string port Name and an array port Phones. You want to flatten the array port. The table contains the following values:
| Name | Phones |
| --- | --- |
| Adams | [205-128-6478, 722-515-2889, 650-213-4020] |
| Jane | [650-321-4506] |
When you flatten the array port, the output is as follows:
| Name | Phones | GCID_Phones |
| --- | --- | --- |
| Adams | 205-128-6478 | 1 |
| Adams | 722-515-2889 | 2 |
| Adams | 650-213-4020 | 3 |
| Jane | 650-321-4506 | 1 |
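The array flattening shown in the table can be reproduced with a short Python sketch. It assumes that each row is a dictionary and that the GCID column numbers the elements from 1 within each source row, as the table suggests. The `flatten_array` function is hypothetical and only mimics the behavior described here.

```python
def flatten_array(rows, port):
    """Create one output row per array element and add a GCID column."""
    output = []
    for row in rows:
        for gcid, element in enumerate(row[port], start=1):
            flat = dict(row)              # copy the other ports unchanged
            flat[port] = element          # replace the array with one element
            flat[f"GCID_{port}"] = gcid   # position of the element in the array
            output.append(flat)
    return output


source_rows = [
    {"Name": "Adams", "Phones": ["205-128-6478", "722-515-2889", "650-213-4020"]},
    {"Name": "Jane", "Phones": ["650-321-4506"]},
]
flat_rows = flatten_array(source_rows, "Phones")
for flat_row in flat_rows:
    print(flat_row)
```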
You can edit the Occurs value of a flattened field to extract a specific number of elements in the array. The value must be a positive integer greater than 1. The value determines the number of elements to extract. For example, you can change the value of Occurs to 2 to extract the first two elements of the array. The output is as follows:
| Name | Phones | GCID_Phones |
| --- | --- | --- |
| Adams | 205-128-6478 | 1 |
| Adams | 722-515-2889 | 2 |
| Jane | 650-321-4506 | 1 |
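The effect of an integer Occurs value can be sketched as a limit on the number of elements taken from each array. The `flatten_array_occurs` function below is a hypothetical illustration. It enforces the "greater than 1" rule stated above, and it assumes that an array shorter than the Occurs value simply yields fewer rows, as the Jane row suggests.

```python
def flatten_array_occurs(rows, port, occurs):
    """Flatten only the first `occurs` elements of each array."""
    if not isinstance(occurs, int) or occurs <= 1:
        raise ValueError("Occurs must be an integer greater than 1")
    output = []
    for row in rows:
        # Slicing never fails on short arrays; it returns what is available.
        for gcid, element in enumerate(row[port][:occurs], start=1):
            flat = dict(row)
            flat[port] = element
            flat[f"GCID_{port}"] = gcid
            output.append(flat)
    return output


source_rows = [
    {"Name": "Adams", "Phones": ["205-128-6478", "722-515-2889", "650-213-4020"]},
    {"Name": "Jane", "Phones": ["650-321-4506"]},
]
limited_rows = flatten_array_occurs(source_rows, "Phones", occurs=2)
for limited_row in limited_rows:
    print(limited_row)
```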
Flatten Struct
The transformation flattens a struct into a field of the data type of the elements in the struct. To flatten a struct data type, all the struct elements must be of the same data type. The transformation creates a row for each element in the struct data type.
For example, you want to flatten the following struct field:
customer_address{
city : string
state : string
zip : string
}
The table contains the following values:
| Name | customer_address |
| --- | --- |
| Clara | { New York NY 10032 } |
When you flatten the struct port, the output is as follows:
| Name | customer_address | GCID_customer_address |
| --- | --- | --- |
| Clara | New York | 1 |
| Clara | NY | 2 |
| Clara | 10032 | 3 |
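A struct flatten of this kind can be sketched in Python by modeling the struct as an ordered dictionary of element names to values. The `flatten_struct` function is hypothetical. It assumes that Python value types stand in for the struct element data types, and it rejects a struct whose elements do not share one type, mirroring the rule above.

```python
def flatten_struct(row, port):
    """Create one output row per struct element, all of the same type."""
    struct = row[port]
    element_types = {type(value) for value in struct.values()}
    if len(element_types) > 1:
        raise TypeError("all struct elements must be of the same data type")
    output = []
    for gcid, value in enumerate(struct.values(), start=1):
        flat = dict(row)
        flat[port] = value            # the element replaces the struct
        flat[f"GCID_{port}"] = gcid   # position of the element in the struct
        output.append(flat)
    return output


source_row = {
    "Name": "Clara",
    "customer_address": {"city": "New York", "state": "NY", "zip": "10032"},
}
struct_rows = flatten_struct(source_row, "customer_address")
for struct_row in struct_rows:
    print(struct_row)
```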
When the struct elements are of different data types and at least the first two elements are of the same data type, you can flatten the struct data for consecutive elements of the same data type. To extract consecutive struct elements of the same data type, edit the Occurs value. The value must be a positive integer greater than 1. For example, a struct emp_address contains the following elements:
emp_address{
city : string
state : string
zip : int
country : string
}
You can set the value of Occurs to 2 to extract the city and state struct elements. If you set the value to 3 or 4, the mapping validation fails because the third element, zip, is of a different data type.
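The validation rule for the Occurs value on a mixed-type struct can be expressed as a check on the leading elements. The sketch below is an assumption about how such a check might look, not the Developer tool's validation logic. The `is_valid_struct_occurs` function and the list-of-pairs schema representation are hypothetical.

```python
def is_valid_struct_occurs(schema, occurs):
    """Return True if the first `occurs` elements share one data type.

    `schema` is an ordered list of (element_name, data_type) pairs.
    """
    if not isinstance(occurs, int) or occurs <= 1:
        return False  # the value must be an integer greater than 1
    leading_types = {data_type for _, data_type in schema[:occurs]}
    return len(leading_types) == 1


emp_address = [
    ("city", "string"),
    ("state", "string"),
    ("zip", "int"),
    ("country", "string"),
]

for occurs in (2, 3, 4):
    print(occurs, is_valid_struct_occurs(emp_address, occurs))
```

For the emp_address struct, only the value 2 passes the check, because the first two elements are strings and the third is an int.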
Flatten Map
The transformation flattens a map into two fields for the key and value elements in the map. For a map field that you flattened, you cannot change the value of Occurs from Auto to an integer value.
For example, you want to flatten the following map field emp_sal with a string key and an array of integer values:
<emp_name -> [base_sal, bonus, commission]>
The following image shows the map field that you want to flatten in the Normalizer view:
The table contains the following values:
| emp_id | emp_sal |
| --- | --- |
| 12200 | <Greg -> [4000, 1000, 500]> |
| 12201 | <Patricia -> [3800, 1500, 1000]> |
When you flatten the map port, the output returns a string field for the map key and an array field for the map value as follows:
| emp_id | emp_sal_Key | emp_sal_Value | GCID_emp_sal |
| --- | --- | --- | --- |
| 12200 | Greg | [4000, 1000, 500] | 1 |
| 12201 | Patricia | [3800, 1500, 1000] | 1 |
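The map flatten can be sketched in Python by modeling the map as a dictionary. The `flatten_map` function is hypothetical. It assumes that the key and value columns are named with `_Key` and `_Value` suffixes as in the table, that the GCID column is named `GCID_<port>` as in the array and struct examples, and that a map with several entries would produce one row per entry. The source example shows only single-entry maps, so the last point is a guess.

```python
def flatten_map(rows, port):
    """Split a map port into a key column and a value column."""
    output = []
    for row in rows:
        for gcid, (key, value) in enumerate(row[port].items(), start=1):
            # Keep every port except the map itself.
            flat = {name: v for name, v in row.items() if name != port}
            flat[f"{port}_Key"] = key
            flat[f"{port}_Value"] = value   # the value stays an array
            flat[f"GCID_{port}"] = gcid
            output.append(flat)
    return output


source_rows = [
    {"emp_id": 12200, "emp_sal": {"Greg": [4000, 1000, 500]}},
    {"emp_id": 12201, "emp_sal": {"Patricia": [3800, 1500, 1000]}},
]
map_rows = flatten_map(source_rows, "emp_sal")
for map_row in map_rows:
    print(map_row)
```

The value column remains an array because the flatten action extracts elements at the first level only.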
The following image shows the map field that is flattened to a string key field and an array value field in the Normalizer view:
The following image shows the Output group in the Ports view:
Flattening Fields
Flatten fields of a complex data type to modify hierarchical data or to convert to relational data.
1. Click the Normalizer view.
2. Select the field of a complex data type.
The following image shows a field of type array with string elements:
3. Click the Flatten button.
The following image shows the Flatten button:
The flatten action replaces the field of a complex data type with a flattened field and changes the value of Occurs to Auto. The data type of the flattened field depends on the complex data type that you flatten.
The following image shows the flattened field of type string:
The following image shows the flattened string output port and the GCID output port in the Ports view: