If a mapping task references another source system, you can perform connection assignment to view the complete data lineage. When the IDMC metadata jobs run for such a mapping task, a reference catalog source and a connection are created that point to the reference source system. A reference source system might be a database, such as Oracle. To view the complete lineage for the mapping task, assign the reference catalog source connection to the objects in the reference source system. You must first create and run an endpoint catalog source that connects to the reference source system.
Before you assign a connection, ensure that you have created and run an endpoint catalog source for each reference source system.
Important: The first job that runs is a connectionless scan and might result in partial or incomplete lineage. To perform a connection-aware scan, assign the connection after the first job completes. Then run the mapping task again in Data Integration, retry the mapping task job in Metadata Command Center, or use realtime connection assignment if the same reference catalog source connection applies to multiple mapping tasks.
1. On the Configure page, select the Lineage tab and then select the Assign Connections tab.
The Assign Connections panel displays a list of assigned and unassigned connections along with details for each connection.
2. Select the connection of the reference catalog source that you want to assign to objects in endpoint catalog sources and click Assign.
The following image shows the Connection Assignment tab with the Assign button and the list of connections:
Note: You can find the connection name on the Hierarchy tab of the mapping task in Data Governance and Catalog. The connection name is prefixed to the reference catalog source name.
The Assign Connection dialog box appears with a list of objects of the endpoint catalog sources.
3. Select one or more endpoint objects to assign to the selected connection and click Assign.
You can filter the list in the Assign Connection dialog box by name, type, or endpoint.
The following table lists the types of reference source systems that you can connect to and the class type that the endpoint objects must belong to:
Reference source system                  Endpoint object class type
Amazon Redshift                          Database
Amazon S3                                Bucket
Google BigQuery                          Database
IBM Db2 for LUW                          Database
Microsoft Azure Blob Storage             Container
JDBC                                     Database
Microsoft Azure Data Lake Storage Gen2   Container
Microsoft Azure Synapse                  Database
Oracle                                   Database
Microsoft SQL Server                     Database
PostgreSQL                               Database
SFTP File System                         File System
Snowflake                                Database
Teradata Database                        Database
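The table above can also be expressed as a lookup, which may help when you script a pre-check of endpoint objects before you assign a connection. The following Python sketch is illustrative only: the names REQUIRED_CLASS_TYPE, required_class_type, and can_assign are hypothetical and are not part of any Informatica API. Only the system names and class types come from the table.

```python
# Class types copied from the table above; all names here are hypothetical.
REQUIRED_CLASS_TYPE = {
    "Amazon Redshift": "Database",
    "Amazon S3": "Bucket",
    "Google BigQuery": "Database",
    "IBM Db2 for LUW": "Database",
    "Microsoft Azure Blob Storage": "Container",
    "JDBC": "Database",
    "Microsoft Azure Data Lake Storage Gen2": "Container",
    "Microsoft Azure Synapse": "Database",
    "Oracle": "Database",
    "Microsoft SQL Server": "Database",
    "PostgreSQL": "Database",
    "SFTP File System": "File System",
    "Snowflake": "Database",
    "Teradata Database": "Database",
}


def required_class_type(source_system: str) -> str:
    """Return the endpoint object class type listed for a source system."""
    try:
        return REQUIRED_CLASS_TYPE[source_system]
    except KeyError:
        raise ValueError(
            f"Unsupported reference source system: {source_system}"
        ) from None


def can_assign(source_system: str, endpoint_class_type: str) -> bool:
    """Return True if the endpoint object's class type matches the table."""
    return required_class_type(source_system) == endpoint_class_type
```

For example, can_assign("Amazon S3", "Bucket") returns True, while can_assign("Amazon S3", "Database") returns False.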
When you click Assign, Metadata Command Center creates links between matching objects in the connected catalog sources, and it calculates the percentage of matched and unmatched objects. The higher the percentage of matched objects, the more accurate the lineage that you view in Data Governance and Catalog.
The following image shows the Assign Connection dialog box:
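The matched and unmatched percentages described in this step can be illustrated with a small calculation. The following Python sketch assumes that objects match when their names are equal regardless of case. That matching rule, and the function name match_percentages, are assumptions made for illustration; the actual matching logic that Metadata Command Center uses is not described here.

```python
def match_percentages(reference_names, endpoint_names):
    """Return (matched %, unmatched %) for the reference object names.

    Assumption: two objects match when their names are equal,
    ignoring case. The product's real matching rules may differ.
    """
    refs = list(reference_names)
    if not refs:
        # Nothing to match, so report zero for both percentages.
        return 0.0, 0.0
    endpoint_keys = {name.lower() for name in endpoint_names}
    matched = sum(1 for name in refs if name.lower() in endpoint_keys)
    matched_pct = 100.0 * matched / len(refs)
    return matched_pct, 100.0 - matched_pct


# Hypothetical object names: three of the four reference objects match.
matched_pct, unmatched_pct = match_percentages(
    ["ORDERS", "CUSTOMERS", "ITEMS", "SALES"],
    ["orders", "customers", "items"],
)
```

In this example, three of four reference objects have a counterpart in the endpoint catalog source, so matched_pct is 75.0 and unmatched_pct is 25.0.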
4. After connection assignment, perform any of the following tasks:
- Run the mapping task again in Data Integration.
After the mapping task completes, a new mapping task job runs in Metadata Command Center. After the new mapping task job completes, a new mapping task instance appears on the Relationships tab of the mapping task in Data Governance and Catalog.
Note: The previous mapping task instance, which was created by the connectionless scan, remains in the catalog.
- Retry the mapping task job in Metadata Command Center.
On the IDMC Metadata tab of the Monitor page, hover the mouse over the mapping task job and click Retry from the Action menu.
After the new mapping task job completes, a new mapping task instance appears on the Relationships tab of the mapping task in Data Governance and Catalog.
Note: The previous mapping task instance, which was created by the connectionless scan, remains in the catalog.
- Use realtime connection assignment. If the reference catalog source connection used in a mapping task is assigned to an endpoint object, subsequent mapping task jobs that use the same connection run connection-aware scans.
After the mapping task job completes, a mapping task instance appears on the Relationships tab of the mapping task in Data Governance and Catalog. For the subsequent mapping task jobs, only one mapping task instance is generated.
To view the complete lineage of the mapping task, click the Lineage tab of the mapping task instance.
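The relationship between connection assignment and scan type can be summarized as a simple rule: a job runs a connection-aware scan only if its reference catalog source connection is already assigned. The following Python sketch models that rule. The function name scan_mode and the connection name oracle_ref_conn are hypothetical, and the sketch does not call any Informatica service.

```python
def scan_mode(connection_name, assigned_connections):
    """Return the scan type a mapping task job would use for a connection."""
    if connection_name in assigned_connections:
        return "connection-aware"
    return "connectionless"


assigned = set()

# First job: no assignment exists yet, so lineage might be incomplete.
first_run = scan_mode("oracle_ref_conn", assigned)

# Connection assignment is performed once for the connection.
assigned.add("oracle_ref_conn")

# Later job that uses the same connection, for this or another mapping task.
later_run = scan_mode("oracle_ref_conn", assigned)
```

Here first_run is "connectionless" and later_run is "connection-aware", which mirrors the behavior described for realtime connection assignment.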