You can use the Apache Atlas catalog source to extract metadata from an Apache Atlas source system.
Apache Atlas is the governance and metadata framework for Hadoop. Apache Atlas has a scalable and extensible architecture that can be plugged into many Hadoop components to manage their metadata in a central repository.
Extracted metadata
You can extract specific metadata from source systems that you connect to with an Apache Atlas catalog source.
Objects extracted
Metadata Command Center extracts the following metadata from an Apache Atlas source system:
•Atlas Server
•Hive Process
•Sqoop Process
•Calculation
Note: Calculation objects are extracted when there is column-level lineage from one asset to another in Hive and Sqoop processes.
•Spark Application
•Spark Process
The Apache Atlas catalog source extracts data lineage from the following data sources:
•Oracle
•MySQL
•PostgreSQL
•Apache Hive
•Hadoop Distributed File System (HDFS)
•Apache HBase
Note: Metadata Command Center skips extraction of Hive processes and the associated lineage links for the following operation types:
•CREATETABLE
•CREATEVIEW
•CREATE_MATERIALIZED_VIEW
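The skip rule in the note above can be illustrated with a short Python sketch. It is not how Metadata Command Center implements the rule; it only shows which Hive process entities would be left out. It assumes that each process is a dictionary shaped like an Apache Atlas entity, with the operation type stored under attributes["operationType"]. The function names and the sample entities are hypothetical.

```python
# Operation types listed in the note as skipped during extraction.
SKIPPED_OPERATION_TYPES = {
    "CREATETABLE",
    "CREATEVIEW",
    "CREATE_MATERIALIZED_VIEW",
}


def is_skipped(process: dict) -> bool:
    """Return True if a Hive process entity has a skipped operation type."""
    # Assumption: the operation type is stored under attributes["operationType"].
    operation_type = process.get("attributes", {}).get("operationType") or ""
    return operation_type in SKIPPED_OPERATION_TYPES


def extractable_processes(processes: list) -> list:
    """Keep only the Hive processes that the skip rule does not exclude."""
    return [p for p in processes if not is_skipped(p)]


# Hypothetical sample entities, for illustration only.
sample = [
    {"typeName": "hive_process", "attributes": {"name": "p1", "operationType": "CREATETABLE"}},
    {"typeName": "hive_process", "attributes": {"name": "p2", "operationType": "CREATETABLE_AS_SELECT"}},
    {"typeName": "hive_process", "attributes": {"name": "p3", "operationType": "CREATEVIEW"}},
    {"typeName": "hive_process", "attributes": {"name": "p4", "operationType": "QUERY"}},
]

kept = [p["attributes"]["name"] for p in extractable_processes(sample)]
print(kept)
```

In this sketch the comparison is an exact match, so an operation type such as CREATETABLE_AS_SELECT is not treated as CREATETABLE and its process is kept.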
Metadata Command Center extracts folders as reference objects from Hadoop Distributed File System.
Metadata Command Center extracts the following objects as reference objects from Apache Hive:
•Schema
•Table
•View
•External Table
•Column
Field and column objects are extracted when there is column-level lineage from one asset to another in Apache Atlas.