Hierarchy Processor transformation overview

The Hierarchy Processor transformation offers several processing strategies. Choose a strategy based on the format of the source data and the output format that you want.

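The Hierarchy Processor transformation can operate in the following modes: hierarchical to relational, relational to hierarchical, hierarchical to hierarchical, and hierarchical to flattened.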

The data that you pass to or from the Hierarchy Processor transformation requires a Microsoft Azure Data Lake Store V2 or an Amazon S3 V2 connection.

Hierarchical to relational data processing

In a mapping that converts hierarchical data to relational output, you can process one hierarchical input group and write the data to multiple relational output groups. The output data can be written as normalized relational data or to delimited flat files.

For example, the data source in a mapping might be a complex file that contains customer and order data. The data flows into two relational files: one with customer data and one with order data.
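
The split from one hierarchical input group into two relational output groups can be sketched outside the product in plain Python. This is only an illustration of the data movement, not the transformation's implementation. The field names (customer_id, name, orders, order_id, total) and the helper to_relational are hypothetical.

```python
# Hypothetical hierarchical input: each customer record nests an array of orders.
customers = [
    {
        "customer_id": 1,
        "name": "Ana",
        "orders": [
            {"order_id": 100, "total": 25.0},
            {"order_id": 101, "total": 40.5},
        ],
    },
    {"customer_id": 2, "name": "Ben", "orders": []},
]


def to_relational(records):
    """Split nested records into two normalized row sets linked by customer_id."""
    customer_rows = []
    order_rows = []
    for rec in records:
        # Primitive fields of the parent become one row in the customer group.
        customer_rows.append({"customer_id": rec["customer_id"], "name": rec["name"]})
        # Each array element becomes one row in the order group and carries
        # the parent key so the two outputs can be joined again later.
        for order in rec.get("orders", []):
            order_rows.append({"customer_id": rec["customer_id"], **order})
    return customer_rows, order_rows


customer_rows, order_rows = to_relational(customers)
```

Carrying the parent key into the child rows is what keeps the two outputs normalized: the customer fields are stored once, and each order row refers back to its customer.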

Relational to hierarchical data processing

In a mapping that converts relational data to hierarchical output, you can have up to five relational input groups and write to one hierarchical output group. This transformation allows you to create structs and arrays. It also allows you to join data sources, group by and order by data fields, filter for specific information, and aggregate both the input and output data.

For example, the source input in a mapping might include three relational files: customer address data, purchase orders, and purchase order details. The data flows into one complex file that combines data from the three source files.
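
As a rough sketch of the same idea in plain Python, the three relational inputs can be joined on their keys and nested into a struct and arrays. The table contents, the key names (customer_id, po_id), and the helper to_hierarchical are assumptions made for illustration. They do not describe how the transformation works internally.

```python
from collections import defaultdict

# Hypothetical relational inputs, one list of rows per source file.
addresses = [{"customer_id": 1, "city": "Lisbon"}]
purchase_orders = [
    {"po_id": 11, "customer_id": 1},
    {"po_id": 10, "customer_id": 1},
]
po_details = [
    {"po_id": 10, "item": "pen", "qty": 2},
    {"po_id": 10, "item": "ink", "qty": 1},
    {"po_id": 11, "item": "pad", "qty": 5},
]


def to_hierarchical(addresses, purchase_orders, po_details):
    """Join three row sets and nest them into one hierarchical record per customer."""
    # Group detail rows by purchase order key (join plus group by).
    details_by_po = defaultdict(list)
    for row in po_details:
        details_by_po[row["po_id"]].append({"item": row["item"], "qty": row["qty"]})

    # Order the purchase orders by key, then group them by customer.
    orders_by_customer = defaultdict(list)
    for po in sorted(purchase_orders, key=lambda r: r["po_id"]):
        orders_by_customer[po["customer_id"]].append(
            {"po_id": po["po_id"], "details": details_by_po[po["po_id"]]}
        )

    # The address becomes a struct; the orders and details become nested arrays.
    return [
        {
            "customer_id": a["customer_id"],
            "address": {"city": a["city"]},
            "orders": orders_by_customer[a["customer_id"]],
        }
        for a in addresses
    ]


result = to_hierarchical(addresses, purchase_orders, po_details)
```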

Hierarchical to hierarchical data processing

In a mapping that converts hierarchical data to hierarchical data, you can read from one or more hierarchical input groups and write to one hierarchical output group.

You can convert hierarchical input from one schema to a different schema. You can read data from primitive fields, structs, and arrays and arrange the data in a different structure.

You can also transform the data that you are converting. You can join data sources, configure group by and order by fields, filter for specific information, and aggregate incoming and output data.

The following image shows an example of a mapping that uses a Hierarchy Processor transformation to convert hierarchical data to hierarchical data of a different structure:

In this mapping, the data source is a JSON file that contains orders and items data. The data flows into a different JSON file that contains order information. The Hierarchy Processor transformation is selected, and the Hierarchy Processor tab shows the structure of the incoming and output data.
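
The following plain Python sketch illustrates this kind of restructuring: it reads orders with nested items, aggregates and filters them, and writes a differently shaped hierarchy. The JSON layout, the output field names (order_info, skus, total), and the reshape helper are hypothetical and are not taken from the mapping described above.

```python
import json

# Hypothetical source document: orders, each with a nested array of items.
source = json.loads("""
{"orders": [
  {"id": 2, "customer": "Ben",
   "items": [{"sku": "A", "price": 2.0, "qty": 1}]},
  {"id": 1, "customer": "Ana",
   "items": [{"sku": "A", "price": 2.0, "qty": 3},
             {"sku": "B", "price": 5.0, "qty": 1}]}
]}
""")


def reshape(doc, min_total=0.0):
    """Rearrange orders into a new structure with an aggregated total per order."""
    out = []
    for order in doc["orders"]:
        # Aggregate over the nested array.
        total = sum(item["price"] * item["qty"] for item in order["items"])
        # Filter out orders below the threshold.
        if total < min_total:
            continue
        out.append({
            # Primitive fields move into a new struct.
            "order": {"id": order["id"], "customer": order["customer"]},
            # The array of structs becomes an array of primitives.
            "skus": [item["sku"] for item in order["items"]],
            "total": total,
        })
    # Order the output by order ID.
    out.sort(key=lambda o: o["order"]["id"])
    return {"order_info": out}


target = reshape(source, min_total=5.0)
```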

Hierarchical to flattened data processing

The Hierarchy Processor transformation includes a flattened option for output data. Use the flattened output format to convert hierarchical input into denormalized output.

In a mapping that converts hierarchical data to flattened data, you can read from one hierarchical input group and write to one flattened output group. You can read data from primitive fields, structs, and arrays and quickly create a fully denormalized output file. You can also flatten and denormalize only a portion of the incoming fields.

For data sources that contain sibling arrays, you can denormalize the output data without complex joins. Select the check box next to each incoming field that you want to include. The Hierarchy Processor transformation adds the field to the output and creates the expression automatically.

The following image shows a mapping that uses a Hierarchy Processor transformation to convert hierarchical data to flattened data:

In this mapping, the data source is a JSON file that contains personal and vehicle data. The data flows into a flattened file that contains vehicle information. The Hierarchy Processor transformation is selected, and the Hierarchy Processor tab shows the structure of the incoming and output data.
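
To make the idea of denormalized output concrete, here is a minimal Python sketch that flattens a struct and one array into rows. The record layout and the flatten helper are hypothetical. The sketch covers a single array only and drops parents whose array is empty, which is an assumption of this example and not a statement about how the transformation treats sibling arrays or empty arrays.

```python
# Hypothetical hierarchical input: personal data with a nested vehicles array.
people = [
    {
        "name": "Ana",
        "address": {"city": "Lisbon"},
        "vehicles": [
            {"make": "Fiat", "year": 2015},
            {"make": "Seat", "year": 2020},
        ],
    },
    {
        "name": "Ben",
        "address": {"city": "Porto"},
        "vehicles": [{"make": "Ford", "year": 2018}],
    },
]


def flatten(records):
    """Emit one flat row per array element, repeating the parent fields."""
    rows = []
    for person in records:
        for vehicle in person["vehicles"]:
            rows.append({
                # Parent primitive and struct fields repeat on every row.
                "name": person["name"],
                "city": person["address"]["city"],
                # Array element fields become ordinary columns.
                "make": vehicle["make"],
                "year": vehicle["year"],
            })
    return rows


flat_rows = flatten(people)
```

Leaving a field out of the row dictionary corresponds to flattening only a portion of the incoming fields.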

Configuration for multibyte hierarchical data

If a mapping includes a transformation that processes hierarchical data and the data uses multibyte characters, configure the Secure Agent machine to use UTF-8.

On Windows, create an environment variable named INFA_CODEPAGENAME with the value UTF-8 in Windows System Properties.

Field restrictions

Elastic mappings support up to 7,000,000 input and output fields. All mappings are subject to this limit. However, mappings that include the Hierarchy Processor transformation are more likely to approach the limit because of the complex data involved.

The more joins, child fields, nested fields, and flattened arrays that the Hierarchy Processor transformation contains, the more likely the mapping is to exceed the field limit.

If the mapping exceeds the limit, the following message appears in the mapping compilation log:

[LDTM_0502] The mapping [<mapping name>] failed because the number of fields in the compiled mapping exceeds the threshold: [7,000,000]. Number of fields: [<actual number>]. Create multiple mappings to process the data incrementally.

To resolve the error, create multiple mappings to process the complex data incrementally or reduce the size of the mapping.