Due to technological limitations or security constraints, you might not always see complete lineage after metadata extraction. You can use Metadata Command Center to link catalog sources and construct data lineage based on rules and other criteria. You can choose source and target catalog sources to link and create lineage. You can also choose source and target schemas to restrict lineage inference to specific subsets of data objects within the data sources.
The linked assets and generated lineage links are auto-accepted by default and appear on the Catalog Source Links page in Data Governance and Catalog. Stakeholders of the source and target catalog sources can reject the auto-accepted lineage links from the Action menu. If stakeholders initially reject the generated lineage links and later accept them, they are marked as accepted in Data Governance and Catalog. Stakeholders can also view the generated lineage on the Lineage tab of the asset.
Important: You can link only relational database source systems, such as Oracle, to generate lineage.
For example, on the Rule Definition tab, you can select Name Matching as the rule type.
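Conceptually, a Name Matching rule pairs a data object in the source catalog source with a data object of the same name in the target catalog source and proposes a lineage link between them. The following minimal sketch illustrates that idea only; it is not Metadata Command Center's actual matching logic, and all table names are hypothetical.

# Conceptual sketch of name-matching-based lineage linking between a source and
# a target catalog source. Illustration only; all table names are hypothetical.

def link_by_name(source_assets, target_assets):
    """Pair source and target assets whose names match, ignoring case."""
    targets_by_name = {name.lower(): name for name in target_assets}
    links = []
    for src in source_assets:
        tgt = targets_by_name.get(src.lower())
        if tgt is not None:
            links.append((src, tgt))    # proposed lineage link: src -> tgt
    return links

# Hypothetical tables from the selected source and target schemas.
oracle_tables = ["CUSTOMERS", "ORDERS", "STG_TEMP"]
snowflake_tables = ["customers", "orders", "payments"]

print(link_by_name(oracle_tables, snowflake_tables))
# [('CUSTOMERS', 'customers'), ('ORDERS', 'orders')]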
Databricks
This release includes the following enhancements:
- To resolve mount points during metadata extraction from Databricks sources, you can specify the environment initialization file path when you configure the Databricks catalog source. Specify the absolute path to the Python code file that defines the mount points and other environment properties related to the Databricks source. For an example of such a file, see the sketch after this list.
- You can extract metadata related to workflows from Run Job Tasks.
- You can add metadata extraction filters based on Delta Live Table pipelines.
- You can extract metadata from the following Delta Live Table objects:
▪ Live Tables
▪ Live Views
▪ Pipeline Definitions
▪ Streaming Tables
▪ Streaming Views
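The exact contents of the environment initialization file depend on how your workspace defines its mounts. The following minimal sketch assumes an Azure Data Lake Storage Gen2 account mounted with dbutils.fs.mount and OAuth credentials stored in a Databricks secret scope; the storage account, containers, secret scope, and tenant ID are all hypothetical placeholders, and the exact structure your file needs may differ.

# Hypothetical environment initialization file for a Databricks catalog source.
# Assumption: the file creates mount points with dbutils.fs.mount, the usual way
# mounts are defined on Databricks. dbutils is provided by the Databricks runtime,
# so this file runs as a notebook or job in the workspace, not locally.
# The storage account, containers, secret scope, and tenant ID are placeholders.

STORAGE_ACCOUNT = "examplestorageacct"      # hypothetical ADLS Gen2 account
CONTAINERS = ["raw", "curated"]             # hypothetical containers to mount

oauth_configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id":
        dbutils.secrets.get(scope="example-scope", key="client-id"),
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="example-scope", key="client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount each container under /mnt/<container> if it is not already mounted.
existing = {m.mountPoint for m in dbutils.fs.mounts()}
for container in CONTAINERS:
    mount_point = f"/mnt/{container}"
    if mount_point not in existing:
        dbutils.fs.mount(
            source=f"abfss://{container}@{STORAGE_ACCOUNT}.dfs.core.windows.net/",
            mount_point=mount_point,
            extra_configs=oauth_configs,
        )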
dbt
This release includes the following enhancements:
- You can extract metadata from tests.
- You can add metadata extraction filters based on tests.
- You can specify multiple manifest.json files from dbt projects.
- When you perform connection assignment, you can assign Databricks and Amazon Athena as endpoint catalog sources.
Greenplum
This release includes the following enhancements:
- You can extract metadata from stored procedures.
- You can add metadata extraction filters based on stored procedures.
IBM Db2 for LUW
You can extract metadata from an IBM Db2 for LUW database hosted on Amazon RDS for Db2.
IBM Netezza
This release includes the following enhancements:
- You can extract metadata from stored procedures.
- You can add metadata extraction filters based on stored procedures.
Informatica Intelligent Cloud Services
This release includes the following enhancements:
- You can extract metadata from Snowflake stored procedures included in a Data Integration mapping.
- You can extract metadata from Data Integration mappings that use the Access Policy transformation.
Microsoft Azure Blob Storage
You can configure partition pruning when you configure the catalog source. Partition pruning helps detect the latest partitions and schemas in source systems, and it improves the performance of the catalog source because updates to partitions and schemas are verified incrementally.
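As a conceptual illustration only, and not the actual implementation, the following sketch shows the idea behind incremental partition verification: only partitions added since the previous run are processed. The container layout and the dt=YYYY-MM-DD partition naming are assumptions.

# Conceptual sketch of partition pruning during an incremental run.
# Illustration only; the folder layout and partition names are assumptions.

def prune_partitions(listed_partitions, previously_cataloged):
    """Return only the partitions added since the last catalog source run."""
    return sorted(set(listed_partitions) - set(previously_cataloged))

# Partition folders listed in the Blob Storage container on the current run.
listed = [
    "sales/dt=2024-05-01/",
    "sales/dt=2024-05-02/",
    "sales/dt=2024-05-03/",    # added since the last run
]

# Partition folders that the previous run already cataloged.
cataloged = ["sales/dt=2024-05-01/", "sales/dt=2024-05-02/"]

print(prune_partitions(listed, cataloged))    # ['sales/dt=2024-05-03/']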
Microsoft Azure Data Factory
You can extract the AppendVariable activity. You can also extract supported activities that include variables with the Array data type.
Oracle
You can add metadata extraction filter conditions based on packages.
PostgreSQL
This release includes the following enhancements:
- You can extract metadata from stored procedures.
- You can add metadata extraction filters based on stored procedures.
Salesforce
You can view the lookup relationships between fields and objects.
SAP Enterprise Resource Planning (ERP)
You can extract subpackages and their assets when you extract metadata from a package.
SAP HANA Database
You can perform connection assignment to view the lineage between an SAP HANA Database table or view and an SAP BW/4HANA DataSource or SAP BW DataSource.
Tableau
This release includes the following enhancements:
- You can extract metadata from Tableau workbooks with stored procedures.
- When you perform connection assignment, you can assign PostgreSQL as an endpoint catalog source.
Extract group elements from hierarchical JSON files
You can extract group elements from hierarchical JSON files using the Extract Group Elements from Hierarchical Files property for the following catalog sources:
Choose the JDK version to load metadata using Java SDK
When you configure a custom catalog source to load metadata into the catalog using the Java SDK, you can choose to run the JAR file with either JDK version 17 or JDK version 11. The default is JDK version 17.
Build custom JAR files with libraries that are compatible with JDK version 17. If the libraries used in the custom JAR file are incompatible with JDK version 17, choose JDK version 11.