- You can extract metadata from the following Databricks Unity Catalog objects:
▪ AI models
▪ AI model versions
- You can extract table metadata from information_schema for Databricks Unity Catalog.
- You can use OAuth machine-to-machine authentication to connect to a Databricks source system.
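With OAuth machine-to-machine authentication, a service principal's client ID and secret are exchanged for a short-lived access token. As a hedged illustration only, the sketch below builds such a token request; the workspace URL, client ID, and secret are placeholders, and the `/oidc/v1/token` path and `all-apis` scope reflect Databricks' documented M2M flow, not Metadata Command Center internals:

```python
import base64

def build_m2m_token_request(workspace_url, client_id, client_secret):
    """Build the parts of a Databricks OAuth M2M token request
    (client_credentials grant). Hypothetical helper for illustration."""
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode()
    ).decode()
    return {
        "url": f"{workspace_url}/oidc/v1/token",
        "headers": {
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        # client_credentials is the standard OAuth grant for M2M flows
        "data": {"grant_type": "client_credentials", "scope": "all-apis"},
    }

request = build_m2m_token_request(
    "https://example.cloud.databricks.com", "my-client-id", "my-secret"
)
```

In the connection properties you supply only the client ID and secret; the token exchange itself happens behind the scenes.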
- When you extract metadata from Databricks notebooks, you can use the Python Default Variables Values property to specify values for Python default variables.
This release includes the following profiling enhancements:
Microsoft SQL Server
You can run a data profiling job on metadata extracted from any database or schema regardless of the database or schema name that you specified in the connection properties.
Oracle
You can run a data profiling job on metadata extracted from any schema regardless of the schema name that you specified in the connection properties.
Microsoft SQL Server and Oracle
You can profile columns with names up to 128 characters in length.
SAP ERP
You can run a data profiling job on a limited number of rows using the Limit N Rows sampling type.
Teradata Database
You can run profiles on metadata extracted from multiple databases.
If you want to create a workflow that is similar to an existing one, you can clone the existing workflow and modify the workflow name and other details as needed.
For more information about designing workflows, see Workflows.
Enable lineage discovery for catalog sources
Connection assignment can be a time-consuming task. To simplify it, you can now use CLAIRE to help build complete lineage for a catalog source by recommending the endpoint catalog source objects to assign to reference catalog source connections. To view CLAIRE recommendations, enable lineage discovery when you configure the catalog source. When you run the catalog source job, Metadata Command Center assigns the reference catalog source connections to the CLAIRE-recommended endpoint catalog source objects. You can then review the list of CLAIRE recommendations and accept or reject them.
For more information about lineage discovery, see Lineage discovery.
Define filters when you link catalog sources
When you link catalog sources to generate lineage automatically with CLAIRE, you can choose to define filters for both source and target catalog sources.
You can now run incremental metadata extraction jobs on the following catalog sources:
•Microsoft Fabric Data Lakehouse
•Microsoft Fabric Data Warehouse
A full metadata extraction extracts all objects from the source to the catalog. An incremental metadata extraction considers only the changed and new objects since the last successful catalog source job run. Incremental metadata extraction doesn’t remove deleted objects from the catalog and doesn’t extract metadata of code-based objects.
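The difference between full and incremental runs can be illustrated with a minimal sketch. The object records, timestamps, and filtering rule below are hypothetical; the actual job logic is internal to Metadata Command Center:

```python
from datetime import datetime

# Hypothetical source objects with last-modified timestamps.
source_objects = [
    {"name": "orders", "modified": datetime(2024, 5, 1)},
    {"name": "customers", "modified": datetime(2024, 6, 15)},  # changed since last run
    {"name": "invoices", "modified": datetime(2024, 6, 20)},   # added since last run
]

last_successful_run = datetime(2024, 6, 1)

# Full extraction: every object goes to the catalog.
full = [obj["name"] for obj in source_objects]

# Incremental extraction: only objects changed or added since the last
# successful run. Note that objects deleted from the source are not
# removed from the catalog by an incremental run.
incremental = [
    obj["name"] for obj in source_objects
    if obj["modified"] > last_successful_run
]
```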
Use abbreviations and synonyms for glossary association
You can choose to use the data in a lookup table as synonyms and abbreviations to associate glossary terms with technical assets. To use the data in a lookup table, enable the Glossary Association Synonyms option in the lookup table.
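Conceptually, glossary association with a lookup table behaves like a synonym dictionary: a technical asset name is matched against known abbreviations and synonyms to find its glossary term. A minimal sketch, with hypothetical lookup entries and a simplified matching rule:

```python
# Hypothetical lookup table mapping abbreviations and synonyms
# to glossary terms.
lookup = {
    "cust": "Customer",
    "custmr": "Customer",
    "acct": "Account",
    "acct_no": "Account Number",
}

def associate(column_name):
    """Return the glossary term for a technical column name, if any.
    Real matching is richer than this exact-match sketch."""
    return lookup.get(column_name.lower())

term = associate("CUST")  # matches the "cust" synonym
```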
System jobs and user jobs are deleted after a retention period: 30 days for system jobs and IDMC metadata jobs, and 90 days for user jobs.
For information about monitoring jobs, see Jobs in the Administration help.
Predefined data element classifications
You can import and use the following predefined data classifications to perform data classification on a source system:
•Indian Phone Number
•Indian City
•Indian District
•Indian PIN
•Indian State
•Indian Goods and Services Tax Identification Number (GSTIN)
•Indian EPIC Number
•Indian Passport Number
For more information, see the How-To Library article Predefined data element classifications in Cloud Data Governance and Catalog.
Runtime environment
When you choose a runtime environment, you can choose only from Secure Agents installed on an operating system that the catalog source supports.
You can detect partitions that use the epoch time format in the following source systems:
•Amazon S3
•Google Cloud Storage
•Hadoop Distributed File System
•Microsoft Azure Blob Storage
•Microsoft Azure Data Lake Storage Gen2
•Microsoft Fabric OneLake
•Oracle Cloud Object Storage
•SFTP File System
Epoch time is the number of seconds elapsed since midnight, January 1, 1970 UTC. For example, the epoch timestamp for 10/11/2021 12:04:41 GMT (MM/dd/yyyy HH:mm:ss) is 1633953881, and the timestamp in milliseconds is 1633953881000.
To detect partitions, define the custom partition in JSON format in the configuration file as: {"CustomPartitionPatterns": ["@"]}
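For reference, the epoch values in the example above can be reproduced with standard date arithmetic; this sketch only verifies the numbers, as partition detection itself is performed by the catalog source job:

```python
from datetime import datetime, timezone

# 10/11/2021 12:04:41 GMT in the MM/dd/yyyy HH:mm:ss format
# used in the example above.
ts = datetime(2021, 10, 11, 12, 4, 41, tzinfo=timezone.utc)

epoch_seconds = int(ts.timestamp())  # seconds since 1970-01-01 00:00:00 UTC
epoch_millis = epoch_seconds * 1000  # same instant in milliseconds
```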
Use reference data from Reference 360 in data classifications
You can use reference data from Reference 360 to look up values when you define data element classifications in Metadata Command Center.
For more information about using reference data to define data element classification, see Data classification.
Data element classification category
You can now create and define a classification category for a data element classification in Metadata Command Center. From the Asset Customization tab on the Customize page, you can create or edit values for a classification category attribute of a data element classification. Then, from the Explore page you can add multiple classification categories to a data classification.
For more information about creating or adding classification categories to a data element classification, see Data classification.
SAP transports
New SAP transports are available for SAP ERP catalog sources.