Complex Data Type Definitions

A complex data type definition represents the schema of the data. Use complex data type definitions to define the schema of the data for a struct port. If the complex port is of type struct or contains elements of type struct, you must specify the type configuration to reference a complex data type definition.
You can create or import complex data type definitions. You import complex data type definitions from complex files. Complex data type definitions are stored in the type definition library, which is a Model repository object.
When you create a mapping or a mapplet, the Developer tool creates a type definition library with the name Type_Definition_Library. You can view and rename the type definition library and the complex data type definitions in the Outline view of a mapping. The complex data type definitions appear on the type definition library tab of the mapping editor. The tab name is based on the name of the type definition library. You create or import complex data type definitions on the type definition library tab. When a mapping uses one or more mapplets, rename the type definition libraries in the mapping and the mapplets to ensure that the names are unique.
The type definition library and its complex data type definitions are specific to a mapping or mapplet. A mapping or a mapplet can have one or more complex data type definitions. You can reference the same complex data type definition in one or more complex ports in the mapping or mapplet.

Complex Data Type Definition Example

The complex ports work_address and home_address of type struct have the following schema:
{street:string, city:string, state:string, zip:integer}
In the mapping, you create a complex data type definition typedef_adrs that defines the schema of the data. Then, you specify the type configuration of the struct ports work_address and home_address to use the typedef_adrs complex data type definition.
The following image shows a complex data type definition typedef_adrs:
The Type Definition Library tab of the mapping contains a complex data type definition typedef_adrs. The Types tab in the Properties view of the complex data type definition lists the complex data type definition elements and the data types of the elements. The complex data type definition contains the following elements: street, city, and state of type string and zip of type integer.
The following image shows two struct ports work_address and home_address that reference the complex data type definition typedef_adrs:
The Ports tab displays five ports. The type configuration column for the two struct ports work_address and home_address shows the complex data type definition typedef_adrs.
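
The relationship in this example can be sketched in a few lines of Python. This is only an illustrative model, not Developer tool code: the TypeDefinition class and the type_configuration dictionary are hypothetical names introduced here to show that two struct ports can share one complex data type definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeDefinition:
    """Hypothetical stand-in for a complex data type definition."""
    name: str
    elements: tuple  # ordered (element name, data type) pairs

    def schema(self) -> str:
        # Render the elements in the {name:type, ...} notation used in the example.
        body = ", ".join(f"{elem}:{dtype}" for elem, dtype in self.elements)
        return "{" + body + "}"


# The definition from the example: three string elements and one integer element.
typedef_adrs = TypeDefinition(
    name="typedef_adrs",
    elements=(
        ("street", "string"),
        ("city", "string"),
        ("state", "string"),
        ("zip", "integer"),
    ),
)

# Hypothetical model of the type configuration column:
# both struct ports reference the same definition.
type_configuration = {
    "work_address": typedef_adrs,
    "home_address": typedef_adrs,
}

print(typedef_adrs.schema())
```

Because both ports point at the same definition object, a change to typedef_adrs would apply to work_address and home_address alike, which is the reason to define the schema once and reference it.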

Nested Data Type Definitions

Elements of a complex data type definition can reference one or more complex data type definitions in the type definition library. Such complex data type definitions are called nested data type definitions.
The following image shows a nested data type definition Company on the type definition library tab:
The image shows a nested data type definition Company that references the following complex data type definitions: Employee, Address, and Contact.
Note: In a recursive data type definition, a complex data type definition at some level is the same as one of its parents. You cannot reference a recursive data type definition from a struct port or from a struct element in a complex port.
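
To make the idea of a recursive data type definition concrete, the following Python sketch walks the references between definitions and reports a path that returns to one of its own parents. It is an assumption-laden illustration: the dictionary layout, the find_recursion function, and the recursive_library sample (in which Employee refers back to Company) are hypothetical and do not describe how the Developer tool validates definitions.

```python
def find_recursion(library, root):
    """Return a reference path that revisits one of its own parents, or None.

    library maps a definition name to the names of the definitions
    that its elements reference (a hypothetical representation).
    """
    def visit(name, parents):
        if name in parents:
            # The definition is the same as one of its parents: recursive.
            return parents + [name]
        for child in library.get(name, []):
            cycle = visit(child, parents + [name])
            if cycle:
                return cycle
        return None

    return visit(root, [])


# Nested but not recursive: Company references Employee, Address, and Contact.
nested_library = {
    "Company": ["Employee", "Address", "Contact"],
    "Employee": [],
    "Address": [],
    "Contact": [],
}

# Hypothetical recursive variant: Employee references Company again.
recursive_library = {
    "Company": ["Employee"],
    "Employee": ["Company"],
}

print(find_recursion(nested_library, "Company"))
print(find_recursion(recursive_library, "Company"))
```

The first call finds no recursion. The second returns the path Company, Employee, Company, which is the kind of definition that a struct port cannot reference.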

Rules and Guidelines for Complex Data Type Definitions

Consider the following rules and guidelines when you work with complex data type definitions:

Creating a Complex Data Type Definition

A complex data type definition represents the schema of the data of type struct. You can create a complex data type definition for a mapping or a mapplet on the Type Definition Library tab of the mapping editor.
    1. In the Developer tool, open the mapping or mapplet where you want to create a complex data type definition.
    2. In the Outline view, select the Type Definition Library to view the Type_Definition_Library tab in the mapping editor.
    3. Right-click an empty area of the editor and click New Complex Data Type Definition.
    An empty complex data type definition with the name Complex_Data_Type_Definition appears on the Type Definition Library tab of the mapping editor. It contains a Name column and a Type column.
    4. Optionally, change the name of the complex data type definition.
    5. Click the New button to add elements to the data type definition with a name and a type.
    The element can be of a primitive data type or a complex data type.

Importing a Complex Data Type Definition

A complex data type definition represents the schema of the data of type struct. You can import the schema of the hierarchical data in the complex file to the type definition library as a complex data type definition.
Import complex data type definitions from an Avro, JSON, or Parquet schema file. For example, you can import an Avro schema from an .avsc file as a complex data type definition.
    1. In the Developer tool, open the mapping or mapplet where you want to import a complex data type definition.
    2. In the Outline view, select the Type Definition Library to view the Type_Definition_Library tab in the mapping editor.
    3. Right-click an empty area of the editor and click Import Complex Data Type Definitions.
    The Import Complex Data Type Definitions dialog box appears.
    4. Select a complex file format from the File Format list.
    5. Click Browse to select the complex file schema from which you want to import the complex data type definition.
    6. Navigate to the complex file schema and click Open.
    7. Click Next.
    The Choose Complex Data Type Definitions to Import page appears.
    8. Select one or more schemas from the list to import.
    9. Click Next.
    The Complex Data Type Definitions to Import page appears.
    10. Review the complex data type definitions that you want to import and click Finish.
    The complex data type definitions that you imported appear on the Type Definition Library tab of the mapping or the mapplet.
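
As a rough illustration of what an import from an Avro schema yields, the following Python sketch reads an Avro record schema of the kind stored in an .avsc file and renders it in the {name:type} notation used earlier in this section. It is a sketch under stated assumptions, not the import logic of the Developer tool: the to_struct function, the DISPLAY_NAMES mapping from the Avro type int to integer, and the sample schema text are all introduced here for illustration, and only primitive fields and nested records are handled.

```python
import json

# Assumed display mapping for this sketch only: Avro calls the type "int",
# while the example schema in this section writes "integer".
DISPLAY_NAMES = {"int": "integer"}


def to_struct(avro_type):
    """Render an Avro primitive name or record schema as a struct string."""
    if isinstance(avro_type, str):
        return DISPLAY_NAMES.get(avro_type, avro_type)
    if isinstance(avro_type, dict) and avro_type.get("type") == "record":
        fields = ", ".join(
            f"{field['name']}:{to_struct(field['type'])}"
            for field in avro_type["fields"]
        )
        return "{" + fields + "}"
    raise ValueError(f"Avro type not handled in this sketch: {avro_type!r}")


# Hypothetical contents of an .avsc file for the address schema.
avsc_text = """
{
  "type": "record",
  "name": "typedef_adrs",
  "fields": [
    {"name": "street", "type": "string"},
    {"name": "city", "type": "string"},
    {"name": "state", "type": "string"},
    {"name": "zip", "type": "int"}
  ]
}
"""

schema = json.loads(avsc_text)
print(schema["name"], to_struct(schema))
```

Each record in the schema file corresponds to one candidate complex data type definition, which is why the import wizard lets you select one or more schemas from the list.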