You can use Metadata Command Center to extract metadata from a source system.
A source system is any system that contains data or metadata. For example, Microsoft Purview is a source system from which you can extract metadata through a Microsoft Purview catalog source with Metadata Command Center. A catalog source is an object that represents and contains metadata from the source system.
Before you extract metadata from a source system, you first create and register a catalog source that represents the source system. Then you configure capabilities for the catalog source. A capability is a task that Metadata Command Center can perform, such as metadata extraction, lineage discovery, data profiling, data classification, or glossary association.
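The relationship between a catalog source and its capabilities can be pictured as a small data model. The sketch below is purely illustrative: the `Capability` enum and the `CatalogSource` class are hypothetical names invented for this example and are not part of any Metadata Command Center API. You configure the real objects through the user interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class Capability(Enum):
    # Capability names taken from the paragraph above.
    METADATA_EXTRACTION = "metadata extraction"
    LINEAGE_DISCOVERY = "lineage discovery"
    DATA_PROFILING = "data profiling"
    DATA_CLASSIFICATION = "data classification"
    GLOSSARY_ASSOCIATION = "glossary association"


@dataclass
class CatalogSource:
    """Hypothetical stand-in for a catalog source object."""
    name: str
    source_system: str
    capabilities: set = field(default_factory=set)
    registered: bool = False

    def register(self):
        self.registered = True

    def enable(self, capability):
        # Mirrors the order in the text: register first, then configure capabilities.
        if not self.registered:
            raise RuntimeError("register the catalog source before configuring capabilities")
        self.capabilities.add(capability)


# Illustrative values only; the name is an assumption.
source = CatalogSource(name="purview_catalog", source_system="Microsoft Purview")
source.register()
source.enable(Capability.METADATA_EXTRACTION)
```

The guard in `enable` only encodes the ordering described above: a catalog source is registered before its capabilities are configured.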
When Metadata Command Center extracts metadata, Data Governance and Catalog displays the extracted metadata and its attributes as technical assets. You can then perform tasks such as analyzing the assets, viewing lineage, and creating links between those assets and their business context.
The following table describes the capabilities of the catalog source:
Capability | Description
Serverless Runtime Environment | A serverless runtime environment is an advanced serverless deployment solution that doesn't require downloading, installing, configuring, or maintaining a Secure Agent or Secure Agent group. You can use a serverless runtime environment in the same way that you use a Secure Agent when you configure a catalog source.
Lineage Discovery | Builds the complete lineage of a catalog source by recommending endpoint catalog source objects to assign to reference catalog source connections. When you run the catalog source job, Metadata Command Center assigns the reference catalog source connections to CLAIRE-recommended endpoint catalog source objects. You can then view the list of CLAIRE recommendations and accept or reject them.
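The accept-or-reject review of lineage recommendations can be thought of as a status change on each recommendation. The following is a minimal sketch under stated assumptions: the `Recommendation` class, the `review` function, and the connection and endpoint names are all hypothetical, and the actual review happens in the product interface rather than in code.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """Hypothetical record pairing a reference connection with a recommended endpoint."""
    reference_connection: str
    recommended_endpoint: str
    status: str = "pending"  # becomes "accepted" or "rejected" after review


def review(recommendations, accept):
    """Mark each recommendation accepted or rejected using the `accept` predicate."""
    for rec in recommendations:
        rec.status = "accepted" if accept(rec) else "rejected"
    return [rec for rec in recommendations if rec.status == "accepted"]


# Made-up example data, not real output from a catalog source job.
recs = [
    Recommendation("ref_conn_sales", "azure_sql_sales"),
    Recommendation("ref_conn_logs", "s3_archive"),
]
# Example reviewer rule (an assumption): keep only Azure SQL endpoints.
accepted = review(recs, accept=lambda r: r.recommended_endpoint.startswith("azure_sql"))
```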
Extraction and view process
To extract metadata from a source system, configure the catalog source and run the extraction job in Metadata Command Center. Then view the results in Data Governance and Catalog.
The following image shows the process to extract metadata from a source system:
After you verify prerequisites, perform the following tasks to extract metadata from Microsoft Purview:
1. Register a catalog source. Create a catalog source object, select Microsoft Purview, and specify values for connection properties.
2. Configure the catalog source. Specify the runtime environment and configure parameters for metadata extraction. Optionally, add filters to include or exclude source system assets from metadata extraction. You can also configure other capabilities such as data profiling and quality, data classification, or glossary association.
3. Optionally, associate stakeholders. Associate users with technical assets, giving the users permission to perform actions determined by their roles.
4. Run or schedule the catalog source job.
5. Optionally, if the catalog source job generates referenced asset objects, you can assign a connection to referenced source system assets.
You can view the lineage with object references without performing connection assignment. After connection assignment, you can view the objects.
After you run the catalog source job, you view the results in Data Governance and Catalog.
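The steps above, including the optional include and exclude filters, can be sketched as follows. This is an illustration only: `STEPS`, `plan`, and `is_extracted` are hypothetical names, and the filter semantics shown (glob-style patterns, with an exclude match overriding an include match) are an assumption made for the example, not documented product behavior.

```python
from fnmatch import fnmatchcase

# (step, required) pairs in the order given above.
STEPS = [
    ("register catalog source", True),
    ("configure catalog source", True),
    ("associate stakeholders", False),
    ("run or schedule job", True),
    ("assign connections to referenced assets", False),
]


def plan(skip_optional=False):
    """Return the step names to perform, optionally leaving out the optional ones."""
    return [name for name, required in STEPS if required or not skip_optional]


def is_extracted(asset_path, include=("*",), exclude=()):
    """Assumed rule: extract an asset if it matches an include pattern and no exclude pattern."""
    included = any(fnmatchcase(asset_path, pattern) for pattern in include)
    excluded = any(fnmatchcase(asset_path, pattern) for pattern in exclude)
    return included and not excluded
```

For example, `is_extracted("sales/orders", include=("sales/*",), exclude=("*/tmp_*",))` evaluates to `True`, while the same call with `"sales/tmp_orders"` evaluates to `False`.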
About the Microsoft Purview catalog source
You can use the Microsoft Purview catalog source to extract metadata from a Microsoft Purview source system.
Microsoft Purview is a centralized data governance solution that helps organizations manage and govern their on-premises, multicloud, and SaaS data.
Extracted metadata
The following sections describe the metadata that the Microsoft Purview catalog source extracts from a Microsoft Purview source.
Data stores
Metadata Command Center extracts metadata from the following data stores in Microsoft Purview:
• Database
- Apache Cassandra
- Google BigQuery
- Hive Metastore
- IBM Db2 for LUW
- Microsoft Azure Cosmos DB API for NoSQL
- Microsoft Azure Database for MySQL
- Microsoft Azure Database for PostgreSQL
- Microsoft Azure Dedicated SQL pool (SQL Data Warehouse)
- Microsoft Azure SQL Database
- Microsoft Azure SQL Managed Instance
- Microsoft SQL Server
- MongoDB
- MySQL
- Oracle
- PostgreSQL
- SAP HANA
- Snowflake
- Teradata
• File system
- Amazon S3
- Microsoft Azure Blob Storage
- Microsoft Azure Data Lake Storage Gen2
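If you need to check a source inventory against this list, the list can be captured as a simple lookup. The sketch below copies the data store names from the list above; the `SUPPORTED_DATA_STORES` mapping and the `data_store_category` helper are hypothetical conveniences for this example, not something the product provides.

```python
# Data store names copied from the list above, grouped by category.
SUPPORTED_DATA_STORES = {
    "Database": [
        "Apache Cassandra",
        "Google BigQuery",
        "Hive Metastore",
        "IBM Db2 for LUW",
        "Microsoft Azure Cosmos DB API for NoSQL",
        "Microsoft Azure Database for MySQL",
        "Microsoft Azure Database for PostgreSQL",
        "Microsoft Azure Dedicated SQL pool (SQL Data Warehouse)",
        "Microsoft Azure SQL Database",
        "Microsoft Azure SQL Managed Instance",
        "Microsoft SQL Server",
        "MongoDB",
        "MySQL",
        "Oracle",
        "PostgreSQL",
        "SAP HANA",
        "Snowflake",
        "Teradata",
    ],
    "File system": [
        "Amazon S3",
        "Microsoft Azure Blob Storage",
        "Microsoft Azure Data Lake Storage Gen2",
    ],
}


def data_store_category(name):
    """Return the category of a listed data store, or None if it is not listed."""
    for category, stores in SUPPORTED_DATA_STORES.items():
        if name in stores:
            return category
    return None
```

The lookup is an exact, case-sensitive match on the names as written in the list, so a store that is not listed returns `None`.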
Objects extracted
Metadata Command Center extracts collections from a Microsoft Purview source.
Metadata Command Center extracts the following objects from referenced source systems: