Workflow for custom metadata integration

To ingest metadata from a source system, verify whether you can reuse a system model. For example, to ingest metadata from any relational source system, you can use the relational system model included in Metadata Command Center. If you cannot reuse a system model available in Metadata Command Center, perform the following steps to integrate custom metadata into the catalog.
This image shows the steps in the workflow for custom metadata integration.

Step 1. Create a custom model

When you create a custom model for custom metadata integration, you configure the metadata that you want to ingest in the catalog. You can also configure options to search, filter, and sort the custom metadata ingested into the catalog.
You can base the custom model on an existing model or on the sample template. To create a custom model, perform the following steps:
    1. On the Customize page, go to the Metadata Models tab.
       Image depicting the Metadata Models tab on the Customize page.
    2. To use an existing model that Metadata Command Center provides by default, hover over the model of your choice and click the Download Model icon on the far right.
       The model definition file in the JSON format is downloaded to your machine. The file contains the package name, properties, classes, attributes, and associations of the objects in the source system.
    3. Alternatively, click the Download Template menu and choose Model.
       The sample model file, sample.json, is downloaded to your machine. This file contains sample classes, attributes, and associations that you can adapt to the metadata that you want to extract from the custom source system.
After downloading the sample model or an existing model, update the details of the metadata following the template structure. Then, import the custom model and publish it in Metadata Command Center.

Step 2. Update the custom model definition file

In the model definition JSON file, update the package details, parent classes, subclasses, attributes and associations of the objects in the custom source system based on your business requirements.
Open the JSON file in a text editor of your choice and update the following components of the model file:

Package

The name of the custom model. For example, com.example.accessdb. The package represents a container for the classes in the model. Verify that the package name does not contain com.infa.odin, com.informatica or infa. These keywords are reserved for system models.
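The reserved-keyword rule can be checked before you import the model. The following Python sketch is illustrative only: the function names are hypothetical, and treating the rule as a case-insensitive substring match is an assumption, because the description above states only that the package name must not contain these keywords.

```python
from typing import List

# Keywords reserved for system models, as listed in the Package description.
RESERVED_KEYWORDS = ("com.infa.odin", "com.informatica", "infa")


def find_reserved_keywords(package_name: str) -> List[str]:
    """Return every reserved keyword that occurs in the package name."""
    # Assumption: the comparison ignores case.
    lowered = package_name.lower()
    return [keyword for keyword in RESERVED_KEYWORDS if keyword in lowered]


def is_valid_custom_package(package_name: str) -> bool:
    """Accept a non-empty package name that contains no reserved keyword."""
    return bool(package_name.strip()) and not find_reserved_keywords(package_name)


print(is_valid_custom_package("com.example.accessdb"))  # True
print(find_reserved_keywords("com.infa.odin.custom"))   # ['com.infa.odin', 'infa']
```

A check like this catches only the naming rule. It does not validate the rest of the model definition file.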

Required Packages

A comma-separated list of required or reused package names. Use the fully-qualified package names if you are reusing any components such as classes, attributes, or associations from other packages. Verify that you define the name of the package as core if you are reusing other packages. For example, core.

Classes

A class describes a group of objects that share the same characteristics. Each class has a set of attributes, both its own and those inherited from its superclasses. A class represents the metadata objects that you want to extract from the custom source system. For example, AccessSchema is the root class for a model that you define for the Microsoft Access Database source system. AccessTable and AccessView are the child classes of the AccessSchema class. For information about the class properties that you need to define in the model, see Add class properties.

Attributes

An attribute describes the characteristics of an object. Define the model attributes and how they apply to classes and relationships by creating attributeClass entries and attributeRelationship entries. For information about the attribute properties that you need to define in the model, see Add attribute properties.

Associations

An association represents the relationship between two objects in the catalog. You can create associations between the objects within the custom catalog source and to objects already synced to the catalog. For example, for a model that you define for the Microsoft Access Database source system, the columns in the AccessTable schema have an association with columns in an Oracle database. For information about the association properties that you need to define in the model, see Add association properties.

Data Type

The data type defines the types of values possible for an attribute. For example, string and integer. For information about the data types that you need to define in the model, see Add data types.

Add class properties

Define the following class properties in the classes component of the custom model JSON file.
| Property | Description |
| --- | --- |
| name | The name of the class. |
| label | The text displayed to the user to describe this entity. |
| externalLabel | Not in use.* |
| description | The description of the class displayed to the user. |
| isFirstClass | Use this property to prioritize the class in search. Set to true or false. If set to true, Metadata Command Center displays the object count of the classes after you run the catalog source. |
| IsAbstract | Use this property to specify whether the class is abstract or not. Set to true or false. |
| cdc | Use this property to specify if the class participates in change data capture. Change data capture is used to record the applied changes for audit purposes. Set to true or false. |
| superClasses | The list of classes with which the class has a semantic relationship. |
| deprecated | Use this property to specify whether the class is deprecated. Set to true or false. |
| indexType | Use this property to specify whether the class can be indexed or not. Specify any of the following values: FULL (indexed in both the elastic and graph stores), FULL_TEXT (indexed in elastic, but not in the graph store), or NONE (not indexed). |
| appendOnly | Not in use.* |
| extensions | Indicates extension classes. Extension classes are classes that are linked to other classes. Multiple classes can have the same extension class. An extension class contains common attributes linked to different master classes. |
| softDeleted | Not in use.* |
| system | Classes indicated as system do not inherit any property from the IClass abstract class. System classes are organized at the same level as IClass. |
* The properties that are not in use can be ignored.

Add attribute properties

Define the following attribute properties in the attributes component of the custom model JSON file.
| Property | Description |
| --- | --- |
| name | The name of the attribute. |
| label | The text displayed to the user to describe this attribute. |
| dataType | The data type of the attribute. |
| description | The description of the attribute displayed to the user. |
| multivalued | Use this property to specify if the attribute can have multiple values. Set to true or false. |
| deprecated | Use this property to specify whether the attribute is deprecated or not. Set to true or false. |
| derived | Use this property to specify if this attribute is derived from an existing attribute. |
| data | Use this property to specify whether the attribute's metadata is derived from data. This property is used to handle sensitive attributes like value frequencies or data patterns. Set to true or false. |
| custom | Use this property to specify if the class should be marked as searchable. Set to true or false. |
| reference | The reference attribute references a registered association and its target class as a native data type of the source class. Use this property to specify the association type to relate the class from which the attribute needs to be projected based on the projectionCondition attribute value. |
| referencedAssociationAttributes | Not in use.* |
| referencedAttributes | Use this property to specify projected attributes. |
| projectionType | Use this property to specify the attribute type to be projected to other classes. Specify any of the following values: PRIMITIVE (project a single attribute) or NESTED (project a nested attribute). Define the projected attribute in referencedAttributes. |
| projectionExpressions | The condition to be applied to project an attribute. |
| defaultValues | The value to be applied if the user does not provide values for ingestion. |
| isSystem | Use this property to specify whether system attributes should be added for every object or association that is ingested. Set to true or false. |
| embedded | Use this property to specify whether the reference data type is embedded into the same class as a struct or created as an extension table. Set to true or false. |
| searchConfiguration | Use this property to specify the search configuration for indexing before data is ingested. Specify any of the following values: AttributeType (indicates if the attribute is searchable or viewable; only searchable attributes are indexed and can be searched; note that you can change the attribute type from VIEWABLE to SEARCHABLE), FieldMappingTemplate (Elasticsearch uses dynamic mapping to infer the data type of each field and assigns a field type to store each field; use this value to map attributes in the field mapping template and assign a type to each field), Suggestable (suggests similar-looking terms based on the provided text), Aggregatable (enables aggregate functions on search operations), or Sortable (enables sorting on search results). |
| projectionCondition | The condition to project the attribute. You can define conditions in the projectionExpressions and expressionContext attributes. |
| customizations | Indicates whether you can customize the attribute or not. |
| deleted | Indicates if the attribute is deleted. You can't use deleted attributes. Use this property to track deleted attributes. |
* The properties that are not in use can be ignored.

Add association properties

Define the following association properties in the associations component of the custom model JSON file.
| Property | Description |
| --- | --- |
| name | The name of the association. |
| label | The text displayed to the user to describe the association. |
| description | The description of the association displayed to the user. |
| fromClass | The source class of the association. |
| toLabel | The text that describes the relationship to the target class. |
| fromLabel | The text that describes the relationship from the source class. |
| toClass | The target class of the association. |
| associationKinds | The type of relationship or association between objects in the source system. |
| deprecated | Use this property to specify whether the association is deprecated or not. Set to true or false. |
| cdc | Use this property to specify if the association participates in change data capture. Change data capture is used to record the applied changes for audit purposes. Set to true or false. |
| unidirectional | Indicates if the association is only valid for one direction. Set to true or false. |
| aggregate | Not in use.* |
| cardinality | Use this property to specify the cardinality. The value can be OneToOne or OneToMany. |
| index | Use this property to specify the index type for the association. Set to COLLECTION to configure a single elastic document as a collection for classes that are not prioritized using the isFirstClass property. |
| customizable | Indicates whether you can customize the association or not. Set to true or false. |
| custom | Indicates whether the association is customized at least once or not. Set to true or false. |
| deleted | Indicates if the association is deleted. You can't use deleted associations. Use this property to track deleted associations. |
* The properties that are not in use can be ignored.

Add data types

You can define the data types in a custom model using the following list of core types available in Metadata Command Center:
In the following example, the data type, curationStatusEnum, is defined using the core type, STRING. Similarly, you can define any data type in a model using the available core types.
{
  "name": "curationStatusEnum",
  "constraint": {
    "constraintType": "LIST_OF_VALUES",
    "values": [
      "AUTO_ACCEPTED",
      "ACCEPTED",
      "REJECTED",
      "NONE"
    ]
  },
  "coreType": "STRING",
  "deprecated": false
}
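To illustrate what a LIST_OF_VALUES constraint expresses, the following Python sketch checks a candidate value against the data type definition from the example above. It is a local pre-check under stated assumptions, not a description of how Metadata Command Center enforces the constraint. The function name is hypothetical, and accepting any value when no LIST_OF_VALUES constraint is present is an assumption.

```python
import json

# The data type definition from the example above.
CURATION_STATUS_ENUM = json.loads("""
{
  "name": "curationStatusEnum",
  "constraint": {
    "constraintType": "LIST_OF_VALUES",
    "values": ["AUTO_ACCEPTED", "ACCEPTED", "REJECTED", "NONE"]
  },
  "coreType": "STRING",
  "deprecated": false
}
""")


def is_allowed_value(data_type: dict, value: str) -> bool:
    """Check a value against the data type's LIST_OF_VALUES constraint."""
    constraint = data_type.get("constraint") or {}
    if constraint.get("constraintType") == "LIST_OF_VALUES":
        return value in constraint.get("values", [])
    # Assumption: without a LIST_OF_VALUES constraint, any value is accepted.
    return True


print(is_allowed_value(CURATION_STATUS_ENUM, "ACCEPTED"))  # True
print(is_allowed_value(CURATION_STATUS_ENUM, "PENDING"))   # False
```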

Step 3. Import and publish the custom model

After you have created and defined the model for the custom metadata in the JSON file, you can import and publish the custom model. Based on the custom model, Metadata Command Center organizes the storage of the custom metadata that is ingested. To import and publish the custom model, perform the following steps:
    1. Click New in the left navigation panel.
    2. In the New dialog box, select Customization from the list in the left pane, and click Metadata Model on the right pane.
       You can also click the plus icon on the Metadata Models page to create a new model.
       Image depicting the New Model dialog window
    3. In the New Model window, click Choose File to upload the JSON-based model definition file that you created for the custom metadata.
    4. Enter a package name for the model.
       Verify that the package name is the same as defined in the model definition JSON file.
    5. Click Create.
       This creates a draft version of the model that appears on the Metadata Models page.
    6. From the Metadata Models page, select the model that you imported. You can view the model details including the classes, attributes, and associations that you defined in the model.
       Image of the model details
    7. Verify that your model appears as you expected. To make any updates to the model, modify the JSON file on your machine and click Update to import the latest version of the model file.
       Note: You cannot modify the package name once you have imported the model.
    8. To publish the model, click Publish.
       The Model Publish job is triggered. You can click View Status to monitor the status of the job on the Job Monitoring Overview page of that job. After the model is published successfully, the lifecycle of the model changes from Draft to Published. You cannot import an updated model while the Model Publish job is in progress.
Note the following points about the lifecycle of a model:

Step 4. Create a custom catalog source type

The custom catalog source type represents the custom source system from which you want to ingest the metadata. By default, Metadata Command Center provides connections for a variety of source systems. To ingest metadata from source systems for which there are no predefined catalog source types available in Metadata Command Center, create a custom catalog source type based on which you can then create a custom catalog source.
To create a custom catalog source type, define appropriate roles and assign the Create, Read, Update, and Delete permissions for the Custom Catalog Source Type asset for that role in Administrator. For more information about asset permissions that the organization administrator can configure for user roles, see Asset permissions in the Administrator help.
To create a custom catalog source type, perform the following steps:
    1. Click New in the left navigation panel.
    2. In the New dialog box, select Customization from the list of asset types in the left pane, and click Custom Catalog Source Type on the right pane.
    3. Enter a name that describes the source system from which you want to extract metadata.
    4. Optionally, enter a description of the source system and click Save.
       The new custom catalog source type appears in the list of custom catalog source types on the Customize page.
       Image depicting the Custom Catalog Source Types tab on the Customize page
    5. To modify the name or description of the catalog source type, click the name of the catalog source type, update the name or description, and click Save.
You're now ready to create a custom catalog source based on the custom catalog source type.

Step 5. Prepare the custom metadata source

To load metadata from a custom source system, prepare the custom metadata source based on the metadata source type. You can choose to load metadata into the catalog using CSV files, Cloud Data Integration, or Java SDK.
If you choose to use CSV files as the custom metadata source, download and update the metadata definition files.
If you choose to use Cloud Data Integration as the custom metadata source, create and run a mapping task or a linear taskflow in Cloud Data Integration. CSV files that contain metadata are generated from a mapping task or a linear taskflow in Cloud Data Integration.
If you choose to use Java SDK as the custom metadata source, build a custom JAR, and copy it to any location on the Secure Agent machine.
Preview Notice: Effective in the July 2023 release, loading metadata into the catalog using Cloud Data Integration and Java SDK is available for preview. Preview functionality is supported for evaluation purposes but is unwarranted and is not supported in production environments or any environment that you plan to push to production. Informatica intends to include the preview functionality in an upcoming release for production use, but might choose not to in accordance with changing market or technical circumstances. For more information, contact Informatica Global Customer Support.

Load metadata into the catalog using CSV files

Before you create the custom catalog source, download the metadata definition files and enter the details of the metadata that you want to ingest from the custom source system into the metadata files.
Download and update the metadata files template. In the file, add the object details of the specific classes that you defined in the custom model for this source system. Metadata Command Center uses the details entered in the metadata files to load metadata from the custom source system into the catalog.
  1. Go to the Customize page and click the Metadata Models tab.
  2. Click the custom model that you created in Step 1. Create a custom model.
  3. Click Download Template > Metadata.
     This downloads the metadata template in the ZIP format to your machine. The ZIP file might contain multiple CSV files for the metadata that you have defined in the custom model.
  4. Extract the CSV files included in the ZIP file.
     The ZIP file contains the following CSV files:
     Open the CSV files in a text editor and enter the object, association, and lineage details.
  5. Enter the details of each object in separate CSV files. Each object corresponds to a specific class that you define in the custom model.
     The following table lists the details that you need to enter in the CSV file for each object of the source system:

     | Header Field | Description |
     | --- | --- |
     | core.externalId | Required. Unique identity of the object. The ID cannot contain a comma. |
     | core.reference | Optional. Set to true if you want to use this object as a reference asset. Reference assets are used in place of actual assets to view the data lineage among one or more sources that don't exist in the catalog. For more information about using reference assets for custom lineage, see Referenced catalog sources and Create custom lineage. |
     | core.assignable | Optional. Set to true if you want this object to be available for connection assignment. If set to true, you can assign connections to this object to view the complete lineage. |
     | core.name | Required. Name of the object. |
     | core.description | Optional. Description of the object. |
     | core.businessName | Optional. Specify the name of the business term that you want to associate with this object. |
     | core.businessDescription | Optional. Specify the description of the business term that you want to associate with this object. |

  6. Define associations between the objects in the source system in the links.csv file that is included in the template. Define all associations between the objects or classes that you defined in the custom model for this source system. The type of associations that you specify depends on the type of associations defined in the associationKinds property in the custom model. Create each entry on a separate line.
     The following table lists the association details that you need to enter in the links.csv file:

     | Header Field | Description |
     | --- | --- |
     | Source | Required. Unique external ID of the source object in the association. Specify the ID of the object in a parent-child format and verify that the ID of the object matches the external ID of the object provided in the objects CSV file. For example, if the source object is a view, then the identity has the full path to the view, that is, Schema/Table/Column/View. Note: $resource represents the catalog source. For this source object, the target object is the root class that is defined in the model. This creates a parent-child association between the specified catalog source and the root class or object in the source system. |
     | Target | Required. Unique external ID of the target object in the association. Specify the ID of the object in a parent-child format and verify that the ID of the object matches the external ID of the object provided in the objects CSV file. For example, if the target object is a table, then the identity has a full path to the table, that is, Schema/Table. |
     | Association | Required. The name of the association in the source system. Specify the association for objects in the <package name>.<association name> format. |

  7. Include all the CSV files and create a ZIP file.
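The CSV preparation steps above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the format that the downloaded template guarantees. The column order, the <package>.<class>.csv naming of the object files, the sample objects, and the association name AccessSchemaToAccessTable are all hypothetical. Use the downloaded template as the source of truth for file names and headers.

```python
import csv
import io
import zipfile

# Header fields from the tables above. The column order is an assumption.
OBJECT_FIELDS = [
    "core.externalId", "core.reference", "core.assignable", "core.name",
    "core.description", "core.businessName", "core.businessDescription",
]
LINK_FIELDS = ["Source", "Target", "Association"]


def rows_to_csv(fields, rows):
    """Serialize a list of dicts to CSV text; omitted optional fields stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_metadata_zip(package, objects_by_class, links):
    """Return the bytes of a ZIP file with one CSV per class plus links.csv."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for class_name, rows in objects_by_class.items():
            for row in rows:
                # The external ID cannot contain a comma.
                if "," in row["core.externalId"]:
                    raise ValueError("External ID contains a comma: " + row["core.externalId"])
            # Assumption: object files are named <package>.<class>.csv.
            zf.writestr(f"{package}.{class_name}.csv", rows_to_csv(OBJECT_FIELDS, rows))
        zf.writestr("links.csv", rows_to_csv(LINK_FIELDS, links))
    return archive.getvalue()


PACKAGE = "com.example.accessdb"
OBJECTS = {
    "AccessSchema": [{"core.externalId": "Sales", "core.name": "Sales"}],
    "AccessTable": [{"core.externalId": "Sales/Orders", "core.name": "Orders"}],
}
LINKS = [{
    "Source": "Sales",
    "Target": "Sales/Orders",
    # Hypothetical association in the <package name>.<association name> format.
    "Association": f"{PACKAGE}.AccessSchemaToAccessTable",
}]

ZIP_BYTES = build_metadata_zip(PACKAGE, OBJECTS, LINKS)
```

To produce a file for upload, write ZIP_BYTES to disk in binary mode. The external IDs follow the parent-child path format, so the table ID Sales/Orders starts with the ID of its schema.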

Load metadata into the catalog using Cloud Data Integration

To load metadata from Cloud Data Integration, create a mapping task in Cloud Data Integration, run the task, and verify that the mappings generate CSV files.
Verify that the name of the CSV file contains the package name of the metadata model and the component class in the following format: <Package name>.<Class name>.csv. For example, the CSV file generated from a metadata model with the package name com.infa.model.relational for the class Table is com.infa.model.relational.Table.csv.
Note: The header fields in the CSV files can contain an underscore (_).
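The file naming rule above can be expressed as a small helper. The following Python sketch is illustrative. The function names are hypothetical, and the reverse parsing assumes that the class name itself contains no period.

```python
from typing import Tuple


def metadata_csv_name(package_name: str, class_name: str) -> str:
    """Build a file name in the <Package name>.<Class name>.csv format."""
    return f"{package_name}.{class_name}.csv"


def split_metadata_csv_name(file_name: str) -> Tuple[str, str]:
    """Split a CSV file name back into (package name, class name)."""
    if not file_name.endswith(".csv"):
        raise ValueError("Expected a .csv file name: " + file_name)
    stem = file_name[: -len(".csv")]
    # Assumption: the class name is the text after the last period.
    package_name, _, class_name = stem.rpartition(".")
    if not package_name or not class_name:
        raise ValueError("Expected <Package name>.<Class name>.csv: " + file_name)
    return package_name, class_name


print(metadata_csv_name("com.infa.model.relational", "Table"))
# com.infa.model.relational.Table.csv
```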

Load metadata into the catalog using Java SDK

To load metadata into the catalog using Java SDK, build a custom JAR, and copy it to any location on the Secure Agent machine. Informatica provides an Oracle, a Swagger, and a Denodo sample project to help you build custom JARs.
To build a custom JAR, perform the following steps:
  1. Download a sample project ZIP file from the following location: <Informatica Secure Agent installation directory>/apps/Metadata_Foundation_Agent/<version>/data/scanner/custom/
     Note: Informatica provides several sample project ZIP files. Choose the ZIP file that is most suitable for your use case.
     Here, xxxx represents the bundle version.
  2. Copy and extract the ZIP file to a machine that has an integrated development environment (IDE).
  3. Import the sample project to a suitable IDE.
     To build the custom JAR with the Denodo sample project, copy the denodo-vdp-jdbcdriver.jar to the lib folder of the extracted ZIP file. The Denodo JAR is available in the following location: <Denodo installation directory>/tools/client-drivers/jdbc/
  4. Run the gradle clean build command.
     Note: If you want to use a build tool other than Gradle, use the custom-scanner-sdk-contract-xxxx.jar file present in the lib folder of the extracted ZIP file. Add the custom-scanner-sdk-contract-xxxx.jar file to the project class path.
  5. To build the custom JAR with a sample project, use the Java file located in the src folder of the extracted ZIP file.
     To build the custom JAR with the Oracle sample project, use the CustomOracleScanner.java file located in the src folder of the extracted ZIP file.
     To build the custom JAR with the Swagger sample project, use the CustomSwaggerScanner.java file located in the src folder of the extracted ZIP file.
     To build the custom JAR with the Denodo sample project, use the CustomDenodoScanner.java file located in the src folder of the extracted ZIP file.
  6. Implement the code in Java to ingest metadata from the custom source system.
  7. Build the JAR based on your requirements.
     You must specify the main class to be executed in the JAR. You can either include all the dependent classes in the JAR and build a fat JAR, or specify class paths in the manifest file to include any external libraries provided outside of the main JAR.
     Note: Use Java Standard Edition version 17, 11, or 8 to build JARs. Newer Java versions will be supported in future releases.
     Important: Informatica is not responsible for any security vulnerabilities that might be associated with the custom JAR.
  8. Copy the JAR to any location on the Secure Agent machine.
Consider the following points when you build custom JARs using the Swagger sample project:
Guidelines and best practices to build custom JARs
Consider the following rules and guidelines when you build custom JARs:

Step 6. Create the custom catalog source

After you import and publish a custom model and define the details of your metadata in the metadata files, create a custom catalog source based on the source type.
To create a custom catalog source, define appropriate roles and assign the Create, Read, Update, and Delete permissions for the Custom Catalog Source Type asset for that role in Informatica Intelligent Cloud Services Administrator. For more information about asset permissions that the organization administrator can configure for user roles, see Asset permissions in the Administration help.
    1. Click New in the left navigation panel and select Catalog Source from the list of asset types in the left pane.
    2. Expand Custom Catalog Source Type on the right pane and select the custom catalog source type that you created for your source system.
       Image depicting the New dialog box. The Custom Catalog Sources section is expanded.
    3. Click Create.
    4. On the Registration page, enter a name and an optional description for the custom catalog source.
    5. In the Connection Information area, configure the following options:
    6. Click Next to go to the Configuration page.
       On the Configuration page, the Metadata Extraction capability is enabled by default.
       If you selected Java SDK as the metadata source and want to pass parameters for your custom JAR, add configuration parameters on the Metadata Extraction tab. For example, you can add driver class, URL, username, and password as configuration parameters to connect to an Oracle database.
       To pass parameters for a custom JAR built using the Swagger sample project, configure the following options in the Configuration Parameters area:

       | Parameter | Description |
       | --- | --- |
       | Name | Enter the name as "Swagger Path". |
       | Value | Enter the path to a Swagger parent folder that contains the JSON Swagger files or to a JSON Swagger file on the Secure Agent machine. The custom catalog source can extract metadata from multiple JSON files placed in different folders. |

       To pass parameters for a custom JAR built using the Denodo sample project, configure the following options in the Configuration Parameters area:

       | Name | Value |
       | --- | --- |
       | Enter the name as "url". | Enter the URL to access the Denodo source system. Example: jdbc:denodo://<host name>:<port number>/admin. By default, the port number is 9999. |
       | Enter the name as "userName". | Enter the username to log in to the Denodo server. |
       | Enter the name as "password". | Enter the password associated with the username. |
       | Enter the name as "schema". | Enter the schema to connect to the Denodo server. |

       You can enable data classification and glossary association for the custom catalog source. For more information about enabling these capabilities, see Step 2. Configure a catalog source.
    7. Click Next to go to the Associations page.
       Assign permissions to users to access this custom catalog source. For more information about assigning roles and users to technical assets, see Step 3. Associate stakeholders and asset groups.
    8. Click Next to go to the Schedule page.
       Optionally, select schedules to run the catalog source job.
    9. Click Save.
After you save the custom catalog source, it appears in the list of catalog sources on the Explore page on the left navigation tab.
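For the Denodo sample project, the four configuration parameters described in the Configuration page step can be assembled as name and value pairs before you enter them. The following Python sketch is illustrative only. The function name and its arguments are hypothetical, the host name is a placeholder, and the database argument is simply a label for the trailing admin segment shown in the documented example URL.

```python
def denodo_configuration_parameters(host, user_name, password, schema,
                                    port=9999, database="admin"):
    """Return the name and value pairs for the Configuration Parameters area."""
    return {
        # URL pattern from the documented example; 9999 is the default port.
        "url": f"jdbc:denodo://{host}:{port}/{database}",
        "userName": user_name,
        "password": password,
        "schema": schema,
    }


# Placeholder values for illustration; do not hard-code real credentials.
params = denodo_configuration_parameters(
    host="denodo.example.com",
    user_name="catalog_reader",
    password="<password>",
    schema="sales",
)
print(params["url"])  # jdbc:denodo://denodo.example.com:9999/admin
```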

Step 7. Run the custom catalog source

When the catalog source runs, Metadata Command Center processes the metadata from the CSV files and ingests the metadata from the source system into the catalog.
You can run the custom catalog source in one of the following ways:
A job is triggered to run the catalog source. You can monitor the status of the job on the Overview page for that catalog source job.
When the catalog source job is successful, you can search for the custom catalog source in Data Governance and Catalog and view the ingested assets from the custom source system.