Complex Ports
A complex port is a port that is assigned a complex data type. Based on the complex data type, you must specify the complex port properties. Use complex ports in transformations to pass or process hierarchical data in a mapping.
The following image shows the complex ports and complex port properties on the Ports tab for a transformation:
1. Port
2. Complex port
3. Type configuration
4. Type configuration for an array port
5. Type configuration for a map port
6. Type configuration for a struct port
7. Type configuration for a port of nested data type
Based on the data type, a transformation can include the following ports and port properties:
- Port. A port of a primitive data type that you can create in any transformation.
- Complex port. A port of a complex or nested data type that you can create in some transformations. Array, map, and struct ports are complex ports. Based on the complex data type, you specify the complex port properties in the type configuration column.
- Type configuration. A set of properties that you specify for the complex port. The type configuration determines the data type of the complex data type elements or the schema of the data. You specify the data type of the elements for array and map ports. You specify a complex data type definition for the struct port.
- Type configuration for an array port. Properties that determine the data type of the array elements. In the image, the array port emp_phone is a one-dimensional array with an ordered collection of string elements. An array with string elements is also called an array of strings.
- Type configuration for a map port. Properties that determine the data types of the key and the value in each map element. In the image, the map port emp_id_dept is an unordered collection of key-value pairs with integer keys and string values.
- Type configuration for a struct port. Properties that determine the schema of the data. To represent the schema, you create or import a complex data type definition. In the image, the struct port emp_address references a complex data type definition typedef_adrs.
- Type configuration for a port of nested data type. Properties that determine the nested data type. In the image, the array port emp_bonus is a one-dimensional array with an ordered collection of struct elements. The struct elements reference a complex data type definition typedef_bonus. An array with struct elements is also called an array of structs.
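As an informal illustration of the four complex port shapes described above, the following Python sketch models one row of data with the port names from the image. It is only an analogy in plain Python, not product syntax. The source does not list the elements of typedef_adrs or typedef_bonus, so the fields street, city, year, and amount, and all sample values, are assumptions made up for this example.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the complex data type definition typedef_adrs.
# The field names are assumptions; the source does not list them.
@dataclass
class TypedefAdrs:
    street: str
    city: str


# Hypothetical stand-in for the complex data type definition typedef_bonus.
@dataclass
class TypedefBonus:
    year: int
    amount: float


# One row of hierarchical data, keyed by the port names from the image.
row = {
    # Array port: ordered collection of string elements (array of strings).
    "emp_phone": ["555-0100", "555-0101"],
    # Map port: unordered key-value pairs, integer key and string value.
    "emp_id_dept": {101: "Sales"},
    # Struct port: the schema comes from a complex data type definition.
    "emp_address": TypedefAdrs(street="1 Main St", city="Springfield"),
    # Nested port: array whose elements are structs (array of structs).
    "emp_bonus": [TypedefBonus(2023, 500.0), TypedefBonus(2024, 750.0)],
}
```

In this sketch, the type configuration corresponds to the element, key-value, or struct type that each Python value carries.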
Complex Ports in Transformations
You can create complex ports in some transformations that are supported on the Spark engine. Read and Write transformations can represent ports that pass hierarchical data as complex data types.
You can create complex ports in the following transformations:
- Aggregator
- Expression
- Filter
- Java
- Joiner
- Lookup
- Normalizer
- Router
- Sorter
- Union
The Read and Write transformations can read and write hierarchical data in complex files. To read and write hierarchical data, the Read and Write transformations must meet the following requirements:
- The transformation must be based on a complex file data object.
- The data object read and write operations must project columns as complex data types.
Rules and Guidelines for Complex Ports
Consider the following rules and guidelines when you work with complex ports:
- Aggregator transformation. You cannot define a group by value as a complex port.
- Filter transformation. You cannot use the operators >, <, >=, and <= in a filter condition to compare data in complex ports.
- Joiner transformation. You cannot use the operators >, <, >=, and <= in a join condition to compare data in complex ports.
- Lookup transformation. You cannot use a complex port in a lookup condition.
- Rank transformation. You cannot define a group by or rank value as a complex port.
- Router transformation. You cannot use the operators >, <, >=, and <= in a group filter condition to compare data in complex ports.
- Sorter transformation. You cannot define a sort key value as a complex port.
- You can use complex operators to specify an element of a complex port that is of a primitive data type.
  For example, an array port "emp_names" contains string elements. You can define a group by value as emp_names[0], which is of type string.
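The emp_names[0] example can be sketched in plain Python to show the idea of grouping on a primitive element instead of on the whole array. This is an analogy only and does not use the product's expression language. The salary column, the sample rows, and the function name total_salary_by_first_name are hypothetical.

```python
from collections import defaultdict

# Sample rows: emp_names is an array of strings, salary is a primitive value.
# All values are invented for illustration.
rows = [
    {"emp_names": ["Ana", "Maria"], "salary": 100},
    {"emp_names": ["Ben"], "salary": 200},
    {"emp_names": ["Ana", "Lucia"], "salary": 300},
]


def total_salary_by_first_name(rows):
    """Sum salary per group, using the first array element as the group key."""
    totals = defaultdict(int)
    for row in rows:
        # The subscript selects one string element, so the group key is a
        # primitive value rather than the array itself.
        key = row["emp_names"][0]
        totals[key] += row["salary"]
    return dict(totals)


totals = total_salary_by_first_name(rows)
```

Here the two rows whose first name element is "Ana" fall into the same group, even though their arrays differ as a whole.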
Creating a Complex Port
Create complex ports in transformations to pass or process hierarchical data in mappings that run on the Spark engine.
1. Select the transformation in the mapping.
2. Create a port.
3. In the Type column for the port, select a complex data type.
The complex data type for the port appears in the Type column.
After you create a complex port, specify the type configuration for the complex port.
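The relationship between a complex port and its type configuration can be sketched as a small Python model. This is a hypothetical illustration, not the product's API or metadata format. The class Port, the method missing_configuration, and the property names element_type, key_type, value_type, and type_definition are all assumptions chosen for this example.

```python
from dataclasses import dataclass, field

# Assumed property names for each complex data type. They mirror the idea
# that array and map ports need element data types and that struct ports
# need a complex data type definition.
REQUIRED_CONFIG = {
    "array": {"element_type"},
    "map": {"key_type", "value_type"},
    "struct": {"type_definition"},
}


@dataclass
class Port:
    name: str
    type: str
    type_configuration: dict = field(default_factory=dict)

    def missing_configuration(self):
        """Return the type configuration properties that are not yet set."""
        # Primitive types have no entry, so nothing is required for them.
        required = REQUIRED_CONFIG.get(self.type, set())
        return sorted(required - set(self.type_configuration))


# A newly created array port has a complex type but no type configuration.
emp_phone = Port("emp_phone", "array")
before = emp_phone.missing_configuration()

# Specifying the element data type completes the type configuration.
emp_phone.type_configuration["element_type"] = "string"
after = emp_phone.missing_configuration()
```

In this model, before lists the property that still needs a value, and after is empty once the element data type is specified.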