
Metadata Ingestion Overview

To ingest custom metadata into the catalog, you must enter the data from the custom data source into a template following the template structure. The custom resource uses the data provided in the template to extract and ingest metadata into the catalog.
After creating a custom resource type based on the custom model, you must perform the following steps to ingest metadata into the catalog:
  1. Export the custom resource type ZIP file.
  2. Extract the CSV files included in the ZIP file.
  3. Fill the details in the CSV files according to the structure.
  4. Include the updated CSV files in the ZIP file.
  5. Provide the ZIP file when you configure the custom resource.
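
The extraction step above can be scripted. The following is a minimal Python sketch, not part of the product: the function name extract_csv_templates and the paths passed to it are hypothetical, and it assumes only that the exported template is a standard ZIP archive that contains CSV files.

```python
import zipfile
from pathlib import Path


def extract_csv_templates(zip_path, out_dir):
    """Extract only the CSV members of the exported template ZIP into out_dir.

    Returns the sorted list of extracted member names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything that is not a CSV template file.
            if name.lower().endswith(".csv"):
                archive.extract(name, out)
                extracted.append(name)
    return sorted(extracted)
```

For example, extract_csv_templates("template.zip", "work") would place the CSV files in a work directory, where you can edit them before adding them back to the ZIP file.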

Exporting the Custom Resource Type Template

You can use the custom resource type template to add the object details of a specific class defined in the model. Enterprise Data Catalog uses the details entered in the template to ingest metadata into the catalog.
    1. Click Manage > Custom Resource Types.
    The Custom Resource Types page appears.
    2. Select the custom resource for which you want to add details from the Custom Resource Types section and click Export Template.
    The custom resource type template is downloaded to your machine as a ZIP file.
    3. Extract the CSV files in the ZIP file, such as objects.csv, links.csv, and lineage.csv, to your machine and enter the required details.
    4. Replace the CSV files in the ZIP file with the updated CSV files.
    You can include multiple objects.csv, links.csv, and lineage.csv files in the ZIP file. Ensure that you name the files in the following format: objects*.csv, links*.csv, and lineage*.csv
    Note: * indicates the custom text that you want to add after the mandatory name if you are uploading multiple files. For example, you can have multiple objects.csv files in the ZIP file such as objects1.csv, objectscritical.csv, and objects12July.csv.
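
The file naming rule and the replace step can be checked before you upload the ZIP file. The following Python sketch is illustrative only: the helper names is_valid_template_name and replace_csv_files are hypothetical, the name pattern is an assumption inferred from the examples in the note above, and the sketch assumes that any non-CSV members of the exported ZIP file should be carried over unchanged.

```python
import re
import zipfile
from pathlib import Path

# Assumed pattern: mandatory base name, optional custom text, .csv extension.
VALID_NAME = re.compile(r"^(objects|links|lineage).*\.csv$")


def is_valid_template_name(file_name):
    """Return True if file_name follows the assumed naming pattern."""
    return VALID_NAME.match(file_name) is not None


def replace_csv_files(template_zip, updated_csvs, output_zip):
    """Write a new ZIP with the updated CSV files in place of the old ones."""
    paths = [Path(p) for p in updated_csvs]
    bad = [p.name for p in paths if not is_valid_template_name(p.name)]
    if bad:
        raise ValueError(f"unexpected file names: {bad}")
    with zipfile.ZipFile(template_zip) as src, \
            zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        # Carry over every member that is not a CSV file.
        for info in src.infolist():
            if not info.filename.lower().endswith(".csv"):
                dst.writestr(info, src.read(info.filename))
        # Add the updated CSV files at the top level of the archive.
        for path in paths:
            dst.write(path, arcname=path.name)
```

Writing to a new output file instead of modifying the exported ZIP in place keeps the original template available if you need to start over.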

Entering Association Details

To enter the association details for objects in the custom data source, perform the following steps:
Note: Verify that the headers in the file do not contain white spaces.
    1. Open the links.csv file in any text editor.
    2. Enter the association details for the data in the custom data source in the following format:
    association, from object, to object
    See the following table for more information about the association details that you must enter:
    Field
    Description
    association
    Mandatory. Represents the name of the association in the data source. To enter object association data for objects in a single resource, verify that you enter the association entries in the following format: com.example.packageName.associationName
    You can use the custom lineage scanner to specify the association details with other resource types.
    fromObjectIdentity
    Mandatory. Unique identity of the object on the from side of the association. Verify that you enter the complete path to the object.
    Make sure that you specify the identity of the object in a parent-child format. Verify that the identity of the object matches the identity of the object provided in the objects.csv file.
    For example, to refer to a column, provide the identity with the full path that includes the schema, the table, and the column as shown: Schema/Table/Column
    Entering the complete path to the objects ensures that Enterprise Data Catalog displays the lineage for the objects.
    toObjectIdentity
    Mandatory. Unique identity of the object on the to side of the association.
    Make sure that you specify the identity of the object in a parent-child format. Verify that the identity of the object matches the identity of the object provided in the objects.csv file.
    For example, to refer to a column, provide the identity with the full path that includes the schema, the table, and the column as shown: Schema/Table/Column
    Entering the complete path to the objects ensures that Enterprise Data Catalog displays the lineage for the objects.
    Note: Verify that you enter each entry on a separate line.
    3. Save the CSV file with the updates.
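
Because every identity in links.csv must match an identity in objects.csv, a quick cross-check can catch mismatches before ingestion. The following Python sketch is a hedged illustration: the function name find_unknown_identities is hypothetical, and it assumes that the header row of links.csv uses the field names from the table above (association, fromObjectIdentity, toObjectIdentity) and that objects.csv has an identity column.

```python
import csv
import io


def find_unknown_identities(objects_csv_text, links_csv_text):
    """Return identities used in links.csv that are missing from objects.csv."""
    objects = csv.DictReader(io.StringIO(objects_csv_text))
    known = {row["identity"] for row in objects}
    unknown = []
    for row in csv.DictReader(io.StringIO(links_csv_text)):
        # Assumed header names, taken from the field table in this section.
        for column in ("fromObjectIdentity", "toObjectIdentity"):
            if row[column] not in known:
                unknown.append(row[column])
    return unknown
```

An empty result means that every from and to identity in the links file has a matching entry in the objects file.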

Entering Class Details

To enter the class details for the data in the custom data source, perform the following steps:
    1. Open the objects.csv file in any text editor.
    2. Enter the class details for the data in the custom data source in the following format:
    class, identity, core.name, core.description, attributes of the class
    See the following table for more information about the class details that you must enter:
    Field
    Description
    class
    Mandatory. Class type of the object in the following format: com.example.packageName.ClassName
    identity
    Mandatory. Unique identity of the class with the path to the class.
    For example, to refer to a column, provide the identity of the object in the parent/child format: Schema/Table/Column
    core.name
    Mandatory. Name of the object.
    core.description
    Optional. Description for the object.
    com.infa.ldm.etlcore.transformationType
    Optional. Define the type of transformation if you want to show the transformation type in the lineage diagram.
    attributes of the object
    Optional. Data for the attributes associated with the object, separated by commas.
    Note: Verify that you enter each entry on a separate line.
    3. Save the CSV file with the updates.
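
The objects.csv layout described above can also be generated programmatically. The following Python sketch is an assumption-laden example: the helper names make_identity and render_objects_csv are hypothetical, the header row is taken from the format line in step 2, and using the last path segment as core.name is an illustrative choice, not a product requirement.

```python
import csv
import io

# Header taken from the format line: class, identity, core.name, core.description
HEADER = ["class", "identity", "core.name", "core.description"]


def make_identity(*path_parts):
    """Join path segments into a parent/child identity such as Schema/Table/Column."""
    return "/".join(path_parts)


def render_objects_csv(objects):
    """Render (class_name, path_parts, description) tuples as objects.csv text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for class_name, path_parts, description in objects:
        # Each object is written on its own line, as the note above requires.
        writer.writerow([
            class_name,
            make_identity(*path_parts),
            path_parts[-1],  # assumption: use the last segment as core.name
            description,
        ])
    return buffer.getvalue()
```

Using the csv module rather than manual string joins ensures that descriptions that contain commas are quoted correctly.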

Entering Transformation Details for Custom ETL Resources

To enter the transformation details for a custom ETL resource, perform the following steps:
    1. Open the lineage.csv file in any text editor.
    2. Enter the transformation details in the following format:
    association, from connection, to connection, from object, to object
    See the following table for more information about the transformation details that you must enter:
    Field
    Description
    Association
    Mandatory. Represents the name of the dataflow association in the core model. To enter object association data for objects in a single resource, verify that you enter the association entries in the following format: com.example.packageName.associationName
    Note: Ensure that you provide only dataflow associations and not any other association types, such as parent-child associations.
    From Connection
    Optional. The name of the source for the transformation.
    To Connection
    Optional. The name of the target for the transformation.
    From Object
    Mandatory. Unique identity of the object on the from side of the association. Verify that the identity of the object matches the identity of the object provided in the objects.csv file.
    To Object
    Mandatory. Unique identity of the object on the to side of the association (the transformation is applied to the object in the From Object column). Verify that the identity of the object matches the identity of the object provided in the objects.csv file.
    Note: You can use the $etlRes variable at the start of the identity.
    com.infa.ldm.etl.ETLContext
    Optional. Use the following variables located in the Association column of the CSV file to view detailed lineage or summary lineage of the transformations:
    • core.DataSetDataFlow. Links the source and target objects at the data set level. For example, for a relational source, the data flow is at the table level.
    • core.DirectionalDataFlow. Links the source and target objects at a data element level. For example, for a relational source the data flow is at the column level.
    Add the variables in the Association column based on the source and target objects for which you want to view detailed or summary lineage.
    Enter True in the com.infa.ldm.etl.ETLContext column for the variables if you want to view detailed lineage for the transformations. Leave the option blank or enter False if you want to view the summary lineage for the transformations.
    In the following sample, detailed lineage is enabled for the source table SUPPLIER and the target table SALES:
    Sample 1
    core.DataSetDataFlow,Oracle_source,Oracle_wembley,SUPPLIER,SALES,TRUE
    In the following sample, detailed lineage is enabled for the source column ITEMSID in the SUPPLIER table and the target column SALESID in the SALES table:
    Sample 2
    core.DirectionalDataFlow,Oracle_source,Oracle_wembley,SUPPLIER/ITEMSID,SALES/SALESID,TRUE
    For the From Object and To Object fields, you can use the $etlRes:// variable at the start of the identity to represent the custom ETL resource that you created. Enterprise Data Catalog replaces $etlRes with the name of the custom ETL resource to form the complete path to the identity of the object. For example, you can specify the identity as $etlRes://SUPPLIER/QUANTITY. If you created a custom ETL resource named test_etl_res, Enterprise Data Catalog replaces the variable as shown in the following sample: test_etl_res://SUPPLIER/QUANTITY.
    Note: Verify that you enter each entry on a separate line.
    3. Save the CSV file with the updates.
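
To preview how a lineage.csv entry is interpreted, the following Python sketch parses one row and mimics the $etlRes substitution described above. It is an illustration only, not the catalog's implementation: the function names expand_etl_resource and parse_lineage_row and the dictionary keys are hypothetical, and the column order is assumed from the samples in this section.

```python
import csv

ETL_VARIABLE = "$etlRes"

# Assumed column order, based on Sample 1 and Sample 2 in this section.
COLUMNS = [
    "association", "from_connection", "to_connection",
    "from_object", "to_object", "etl_context",
]


def expand_etl_resource(identity, resource_name):
    """Replace a leading $etlRes with the custom ETL resource name."""
    if identity.startswith(ETL_VARIABLE):
        return resource_name + identity[len(ETL_VARIABLE):]
    return identity


def parse_lineage_row(line):
    """Parse one lineage.csv entry into a dict keyed by the assumed column names."""
    values = next(csv.reader([line]))
    row = dict(zip(COLUMNS, values))
    # A blank or False ETLContext value means summary lineage.
    row["detailed"] = row.get("etl_context", "").strip().lower() == "true"
    return row
```

For example, parsing Sample 2 yields SUPPLIER/ITEMSID as the from object and marks the row as detailed lineage, and expanding $etlRes://SUPPLIER/QUANTITY for a resource named test_etl_res gives test_etl_res://SUPPLIER/QUANTITY.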