
Databricks

Databricks combines data warehouses and data lakes into an AI-driven Databricks Lakehouse platform.
You can run connection-aware scans on Databricks sources.
You can use a SQL warehouse or an all-purpose cluster to extract metadata.
Note: To improve wildcard lineage at the directory or file level for Databricks assets, perform connection assignment and then run the Databricks catalog source again. These wildcards can refer to files in Amazon S3 and Microsoft Azure Data Lake Storage Gen2.
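To make the idea of a wildcard that refers to files concrete, the following minimal Python sketch shows which stored files a directory-level wildcard would cover. It is an illustration only and does not describe how the catalog source resolves wildcards internally. The function name files_matching, the bucket and storage account names, and the use of shell-style patterns are all assumptions made for this example.

```python
from fnmatch import fnmatchcase


def files_matching(pattern: str, paths: list[str]) -> list[str]:
    """Return the paths covered by a shell-style wildcard pattern.

    Illustrative helper (hypothetical name). In fnmatch, '*' matches any
    run of characters, including '/', so a directory-level wildcard also
    covers files in nested folders.
    """
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical object paths; the bucket and account names are made up.
paths = [
    "s3://example-bucket/sales/2024/part-0001.parquet",
    "s3://example-bucket/sales/2024/part-0002.parquet",
    "s3://example-bucket/hr/2024/part-0001.parquet",
    "abfss://data@exampleaccount.dfs.core.windows.net/sales/part-0001.csv",
]

# Directory-level wildcard on Amazon S3.
matched = files_matching("s3://example-bucket/sales/*", paths)

# File-level wildcard on Azure Data Lake Storage Gen2.
matched_csv = files_matching("abfss://*/sales/*.csv", paths)
```

Here the first pattern covers the two files under the sales directory, and the second covers the single CSV file in the Azure container.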

Objects extracted

You can extract AI Model Core and AI Model Core Version objects from Databricks Unity Catalog source systems. Additionally, you can retrieve lineage captured by Databricks Unity Catalog.
Note: Databricks Unity Catalog retains lineage data for 90 days.
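Because lineage captured by Unity Catalog is kept for a limited period, it can help to check whether an event is still inside the retention window before you expect it to appear after a scan. The following Python sketch is one way to do that check. The function name within_retention and the sample dates are hypothetical, and treating an event that is exactly 90 days old as still retained is an assumption.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # retention period stated in the note above


def within_retention(event_time: datetime, now: datetime) -> bool:
    """Return True if a lineage event is still inside the retention window.

    Assumption: the boundary is inclusive, so an event that is exactly
    90 days old counts as retained.
    """
    return now - event_time <= timedelta(days=RETENTION_DAYS)


# Hypothetical scan time and lineage event times.
scan_time = datetime(2024, 6, 30, tzinfo=timezone.utc)
recent_event = datetime(2024, 4, 1, tzinfo=timezone.utc)   # 90 days before
expired_event = datetime(2024, 3, 31, tzinfo=timezone.utc)  # 91 days before

recent_ok = within_retention(recent_event, scan_time)
expired_ok = within_retention(expired_event, scan_time)
```

Under these assumptions, a scan on the sample date would still find the first event and would no longer find the second.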
You can extract metadata from Databricks notebooks if they use the following technologies:
You can extract the following objects from a Databricks workspace:
You can extract the following objects from Databricks Unity Catalog:
You can extract the following complex data type columns along with their nested fields from Databricks source systems:
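As an illustration of what a complex data type column and its nested fields look like, the following Python sketch walks a nested type description and lists every field path inside it. The dictionary layout is loosely modeled on the JSON form of a Spark schema. The struct, array, and map cases, the function name nested_field_paths, and the ".element", ".key", and ".value" path suffixes are assumptions made for this example, not the catalog's own naming or its list of supported types.

```python
def nested_field_paths(name, dtype):
    """Yield (path, type_name) for a column and each field nested in it.

    dtype is either a plain type name such as "string", or a dict that
    describes a struct, array, or map (assumed layout for this sketch).
    """
    if isinstance(dtype, str):
        # Primitive type: nothing nested below this point.
        yield name, dtype
        return

    kind = dtype["type"]
    yield name, kind

    if kind == "struct":
        for field in dtype["fields"]:
            yield from nested_field_paths(f"{name}.{field['name']}", field["type"])
    elif kind == "array":
        yield from nested_field_paths(f"{name}.element", dtype["elementType"])
    elif kind == "map":
        yield from nested_field_paths(f"{name}.key", dtype["keyType"])
        yield from nested_field_paths(f"{name}.value", dtype["valueType"])


# Hypothetical column: a struct with a string field and an array of strings.
address_type = {
    "type": "struct",
    "fields": [
        {"name": "city", "type": "string"},
        {"name": "phones", "type": {"type": "array", "elementType": "string"}},
    ],
}

paths = list(nested_field_paths("address", address_type))
```

For the sample column, the sketch lists the column itself, its two fields, and the element type of the array.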

Data profiling for Databricks objects

Configure data profiling to run profiles on the metadata extracted from Databricks Delta Lake source systems. You can run profiles on Databricks Delta tables created in all-purpose clusters or SQL warehouses. You can also run profiles on Databricks Unity Catalog objects.
You can run profiles on the following Databricks Delta Lake objects:
You can view the profiling statistics in Data Governance and Catalog. The data profiling task runs profiles on the following data types for Databricks Delta Lake objects:
Sampling type
Determine the sample rows on which you want to run the data profiling task. You can choose one of the following sampling types for a Databricks catalog source:

Data lineage

The following lineage data is available for Databricks assets:
You can extract lineage information from the following source systems:
You can extract lineage information from the following technologies, if available in Unity Catalog:
For more information about data lineage, see the Asset Discovery help.