
Input Hierarchy Definition

When you create a Normalizer transformation, you define an input hierarchy that describes records and fields in the source. Define the input hierarchy on the Normalizer view of the transformation.
The Developer tool creates the transformation input ports based on the fields that you define in the input hierarchy. Define the input group structure before you define the transformation output groups.
When you define an input hierarchy, you must define an input structure that corresponds to the structure of the source data. The source data might contain more than one group of multiple-occurring fields. To define the structure, you can configure a record that occurs at the same level as another record in the source. Or, you can define records that occur within other records.

Input Hierarchy Example

The following source row contains customer fields and an address record that occurs twice:
CustomerID
FirstName
LastName
Address
Street
City
State
Country
Address1
Street1
City1
State1
Country1
When you define the input structure in the Normalizer view, add CustomerID, FirstName, and LastName as fields. Define an Address record that contains the Street, City, State, and Country fields. Change the Occurs value of the Address record to 2.
The following image shows the input hierarchy in the Normalizer view:
The example Normalizer view shows the CustomerID, FirstName, and LastName fields. At the same level is an Address record. The Occurs value for Address is 2. Within Address are the Street, City, State, and Country fields. These fields are indented and their level is 2.
The Occurs column in the Normalizer view identifies the number of instances of a field or record in a source row. Change the value in the Occurs column for multiple-occurring fields or records. In this example, the customer fields occur one time, and the Address record occurs twice.
The Level column in the Normalizer view indicates where a field or record appears in the input hierarchy. The customer fields are at level 1 in the hierarchy. The Address record is also level 1.

Normalizer Transformation Input Ports

The Developer tool creates the Normalizer transformation input ports when you define the input hierarchy in the Normalizer view. When you change fields in the input hierarchy, the Developer tool changes the input ports.
View the Normalizer transformation input ports in the Overview view. You can reorder the input ports in the Overview view. To change the input ports, update the input hierarchy in the Normalizer view.
When you define a field as multiple-occurring in the input hierarchy, the Developer tool creates one input port for each instance of the multiple-occurring field. When a record is multiple-occurring, the Developer tool creates an input port for each instance of the fields in the record.

Input Ports Example

The following image shows the input ports that the Developer tool creates for the customer data and the multiple-occurring address data:
The Ports view shows the CustomerID, FirstName, LastName, Street, Street1, City, City1, State, State1, Country, and Country1 ports.
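The expansion of a multiple-occurring record into input ports can be sketched in a few lines of Python. This is only an illustration, not how the Developer tool is implemented: the `Field` and `Record` classes and the `port_names` function are hypothetical names, and the suffix scheme (no suffix for the first instance, then 1, 2, and so on) is inferred from the port names in the example above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Field:
    name: str
    occurs: int = 1


@dataclass
class Record:
    name: str
    occurs: int = 1
    children: List[Union["Field", "Record"]] = field(default_factory=list)


def port_names(nodes, suffixes=("",)):
    """Return one port name for each instance of each field in the hierarchy."""
    ports = []
    for node in nodes:
        # Assumed naming: the first instance has no suffix, later ones get 1, 2, ...
        tags = ["" if i == 0 else str(i) for i in range(node.occurs)]
        combined = [s + t for s in suffixes for t in tags]
        if isinstance(node, Record):
            # A record creates no port itself; its fields repeat per instance.
            ports.extend(port_names(node.children, combined))
        else:
            ports.extend(node.name + c for c in combined)
    return ports


# The customer hierarchy from the example: Address occurs twice.
hierarchy = [
    Field("CustomerID"),
    Field("FirstName"),
    Field("LastName"),
    Record("Address", occurs=2, children=[
        Field("Street"), Field("City"), Field("State"), Field("Country"),
    ]),
]

print(port_names(hierarchy))
```

For the example hierarchy, the sketch lists the three customer ports followed by two ports for each address field, in the same order as the Ports view description.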

Flatten Fields

You can flatten fields of a complex data type in mappings that run on the Spark engine. You flatten fields in the Normalizer view to modify hierarchical data that passes through a complex port.
The output of the flatten action depends on the complex data type. When you flatten an array or struct data type, the Normalizer transformation creates a row for each element in the complex data type. When you flatten a map data type, the Normalizer transformation creates two columns for the map key and map value elements.
The flatten action on a nested data type extracts elements at the first level only. To flatten a nested data type at all levels, use the Flatten Complex Port hierarchical conversion wizard in the Developer tool. The Flatten All option extracts elements at each level and returns relational data of primitive data types. For more information about hierarchical conversion wizards, see the Informatica Big Data Management User Guide.
The flatten action changes the value of the Occurs column in the Normalizer view to Auto and adds a flatten icon next to the flattened field. The value Auto indicates that the transformation flattens all the elements of the complex data type.
The following image shows a struct that is flattened to a string field with a flatten icon next to it and the Occurs value as Auto:
On the Properties tab of the Normalizer transformation, the Normalizer view shows the struct field StructEmp that is flattened. A flatten icon is displayed next to the flattened field. The value of Occurs for the flattened field is Auto.
You cannot flatten a multiple-occurring field. For example, you cannot flatten an array field that has an Occurs value of 2.
The following image shows a multiple-occurring field of an array data type that you cannot flatten:
The Normalizer view shows an array field Emp_Id_Name. The value of Occurs for the array field is 2.

Flatten Array

The Normalizer transformation flattens a one-dimensional array to a primitive data type and an n-dimensional array to an (n-1)-dimensional array. The number of rows that the transformation creates is the same as the size of the array.
For example, if you flatten an array port with 10 string elements, the output returns 10 rows with a string port. If you flatten a 3-dimensional array, the output returns a 2-dimensional array.
A table contains a string port Name and an array port Phones. You want to flatten the array port. The table contains the following values:
Name     Phones
Adams    [205-128-6478, 722-515-2889, 650-213-4020]
Jane     [650-321-4506]
When you flatten the array port, the output is as follows:
Name     Phones          GCID_Phones
Adams    205-128-6478    1
Adams    722-515-2889    2
Adams    650-213-4020    3
Jane     650-321-4506    1
You can edit the Occurs value of a flattened field to extract a specific number of elements in the array. The value must be a positive integer greater than 1. The value determines the number of elements to extract. For example, you can change the value of Occurs to 2 to extract the first two elements of the array. The output is as follows:
Name     Phones          GCID_Phones
Adams    205-128-6478    1
Adams    722-515-2889    2
Jane     650-321-4506    1
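The row-per-element behavior shown in the tables above can be approximated in plain Python. This is a minimal sketch of the semantics, not the Spark implementation: `flatten_array` is a hypothetical helper, rows are modeled as dictionaries, and the `occurs` parameter stands in for the Occurs value (with `None` playing the role of Auto).

```python
def flatten_array(rows, array_field, occurs=None):
    """Create one output row per array element, with a GCID_<field> index column."""
    out = []
    for row in rows:
        elements = row[array_field]
        if occurs is not None:
            # An integer Occurs value keeps only the first `occurs` elements.
            elements = elements[:occurs]
        for gcid, element in enumerate(elements, start=1):
            flat = dict(row)
            # For a 1-D array the element is a primitive; for an n-D array
            # it is an (n-1)-D array, which matches the rule stated above.
            flat[array_field] = element
            flat["GCID_" + array_field] = gcid
            out.append(flat)
    return out


rows = [
    {"Name": "Adams", "Phones": ["205-128-6478", "722-515-2889", "650-213-4020"]},
    {"Name": "Jane", "Phones": ["650-321-4506"]},
]

for flat_row in flatten_array(rows, "Phones", occurs=2):
    print(flat_row)
```

With `occurs=2`, the sketch keeps two rows for Adams and one row for Jane, because Jane's array has only one element.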

Flatten Struct

The transformation flattens a struct into a field of the data type of the elements in the struct. To flatten a struct data type, all the struct elements must be of the same data type. The transformation creates a row for each element in the struct data type.
For example, you want to flatten the following struct field:
customer_address{
city : string
state : string
zip : string
}
The table contains the following values:
Name     customer_address
Clara    {New York, NY, 10032}
When you flatten the struct port, the output is as follows:
Name     customer_address    GCID_customer_address
Clara    New York            1
Clara    NY                  2
Clara    10032               3
When the struct elements are of different data types and at least the first two elements are of the same data type, you can flatten the struct data for consecutive elements of the same data type. To extract consecutive struct elements of the same data type, edit the Occurs value. The value must be a positive integer greater than 1. For example, a struct emp_address contains the following elements:
emp_address{
city : string
state : string
zip : int
country : string
}
You can set the value of Occurs to 2 to extract the city and state struct elements. If you set the value to 3 or 4, the mapping validation fails because the zip element is not of the same data type as the elements before it.
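The same-data-type rule for structs can be sketched as a small validation step followed by the row expansion. This is an illustrative model only: `flatten_struct` is a hypothetical function, the schema is modeled as an ordered dictionary of element names to type names, and the `ValueError` stands in for a mapping validation failure.

```python
def flatten_struct(row, struct_field, schema, occurs=None):
    """Create one row per selected struct element, or fail if the types differ."""
    names = list(schema)
    # occurs=None models Auto: select every element in the struct.
    count = len(names) if occurs is None else occurs
    if count < 1 or count > len(names):
        raise ValueError("Occurs is out of range for the struct")
    selected = names[:count]
    if len({schema[name] for name in selected}) != 1:
        raise ValueError("selected struct elements must share one data type")
    out = []
    for gcid, name in enumerate(selected, start=1):
        flat = dict(row)
        flat[struct_field] = row[struct_field][name]
        flat["GCID_" + struct_field] = gcid
        out.append(flat)
    return out


emp_schema = {"city": "string", "state": "string", "zip": "int", "country": "string"}
emp_row = {
    "Name": "Clara",
    "emp_address": {"city": "New York", "state": "NY", "zip": 10032, "country": "USA"},
}

# Occurs = 2 extracts city and state; 3 or 4 would raise ValueError.
for flat_row in flatten_struct(emp_row, "emp_address", emp_schema, occurs=2):
    print(flat_row)
```

The sample values for `emp_address` are invented for the illustration; only the element names and types come from the example above.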

Flatten Map

The transformation flattens a map into two fields for the key and value elements in the map. For a map field that you flattened, you cannot change the value of Occurs from Auto to an integer value.
For example, you want to flatten the following map field emp_sal with a string key and an array of integer values:
<emp_name -> [base_sal, bonus, commission]>
The following image shows the map field that you want to flatten in the Normalizer view:
The Normalizer view shows a string field emp_id and a map field emp_sal. The value of Occurs for the string and map fields is 1.
The table contains the following values:
emp_id    emp_sal
12200     <Greg -> [4000, 1000, 500]>
12201     <Patricia -> [3800, 1500, 1000]>
When you flatten the map port, the output returns a string field for the map key and an array field for the map value as follows:
emp_id    emp_sal_Key    emp_sal_Value         GCID_emp_sal
12200     Greg           [4000, 1000, 500]     1
12201     Patricia       [3800, 1500, 1000]    1
The following image shows the map field that is flattened to a string key field and an array value field in the Normalizer view:
The Normalizer view shows the map field emp_sal that is flattened to a string field emp_sal_Key and an array field emp_sal_Value with a type configuration string []. A flatten icon is displayed next to the flattened fields. The value of Occurs for the flattened field emp_sal is Auto.
The following image shows the Output group in the Ports view:
The Ports view of the Normalizer transformation shows an input group with a string port emp_id and a map port emp_sal. The output group contains a string port emp_id, the flattened fields of the map port emp_sal, and a bigint port GCID_emp_sal. The flattened fields of the map port are emp_sal_Key of type string and emp_sal_Value of type array with a type configuration string [].
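The key and value split for a map field can be modeled with a short Python sketch. It is not the engine's implementation: `flatten_map` is a hypothetical helper and maps are modeled as dictionaries. The numbering of GCID for a map with more than one entry is an assumption, because the example above only shows maps with a single entry.

```python
def flatten_map(rows, map_field):
    """Replace a map field with <field>_Key and <field>_Value columns."""
    out = []
    for row in rows:
        # Assumption: each map entry becomes its own row, numbered from 1.
        for gcid, (key, value) in enumerate(row[map_field].items(), start=1):
            flat = {k: v for k, v in row.items() if k != map_field}
            flat[map_field + "_Key"] = key
            # The value is passed through unchanged, so an array value
            # stays an array, as in the emp_sal example.
            flat[map_field + "_Value"] = value
            flat["GCID_" + map_field] = gcid
            out.append(flat)
    return out


employees = [
    {"emp_id": "12200", "emp_sal": {"Greg": [4000, 1000, 500]}},
    {"emp_id": "12201", "emp_sal": {"Patricia": [3800, 1500, 1000]}},
]

for flat_row in flatten_map(employees, "emp_sal"):
    print(flat_row)
```

Because only the first level is flattened, the array in emp_sal_Value would need a further flatten action to produce one row per salary component.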

Flattening Fields

Flatten fields of a complex data type to modify hierarchical data or to convert to relational data.
    1. Click the Normalizer view.
    2. Select the field of a complex data type.
    The following image shows a field of type array with string elements:
    The Normalizer view shows a field EmpName of type string, Salary of type double, and arr_Dep_Emp_ID of type array with string elements.
    3. Click the Flatten button.
    The following image shows the Flatten button:
    The Normalizer view shows a field EmpName of type string, Salary of type double, and arr_Dep_Emp_ID of type array with string elements. The Flatten button appears at the top right of the Normalizer view.
    The flatten action replaces the field of a complex data type with a flattened field and changes the value of Occurs to Auto. The data type of the flattened field depends on the complex data type that you flatten.
    The following image shows the flattened field of type string:
    The Normalizer view shows a field EmpName of type string, Salary of type double, and arr_Dep_Emp_ID of type string. The arr_Dep_Emp_ID is the flattened field with a Flatten icon next to it and the value of Occurs is Auto.
    The following image shows the flattened string output port and the GCID output port in the Ports view:
    The Ports view of the Normalizer transformation shows an input group with a port EmpName of type string, Salary of type double, and arr_Dep_Emp_ID of type array with string elements. The output group contains a port EmpName of type string, Salary of type double, the flattened field arr_Dep_Emp_ID of type string, and a bigint port GCID_arr_Dep_Emp_ID.