Extraction and view process

To extract metadata from a source system, configure the catalog source and run the extraction job in Metadata Command Center. Then view the results in Data Governance and Catalog.
The metadata extraction process begins with prerequisite verification, continues with the creation of the catalog source, and ends with viewing the results.
After you verify prerequisites, perform the following tasks to extract metadata from Hadoop Distributed File System:
  1. Register a catalog source. Create a catalog source object, select Hadoop Distributed File System, and then select and test the connection.
  2. Configure the catalog source. Specify the runtime environment and configure parameters for metadata extraction. Optionally, add filters to include or exclude source system assets from metadata extraction. You can also configure other capabilities such as data profiling and quality, data classification, or glossary association.
  3. Optionally, associate stakeholders. Associate users with technical assets, giving the users permission to perform actions determined by their roles.
  4. Run or schedule the catalog source job.
After you run the catalog source job, you can view the results in Data Governance and Catalog.