Hadoop Distributed File System
Hadoop Distributed File System is a distributed file system that runs on commodity hardware and handles large data sets.
The Hadoop Distributed File System catalog source is available on the following clusters:
- Cloudera Data Platform (CDP)
- Amazon EMR
- Google Dataproc
- Azure HDInsight
Objects extracted
Hadoop Distributed File System supports the Azure HDInsight, Amazon EMR, Cloudera Data Platform, and Google Dataproc distributions.
Metadata Command Center extracts the following objects from a Hadoop Distributed File System source system:
- File System
- Folder
- File
- Flat File
- Hierarchical File
- Flat Field
- Hierarchical Field
- XML File
- XSD File
- Attribute
- Element
You can extract workbooks, worksheets, and columns from Microsoft Excel files.
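To get a rough sense of how paths in a Hadoop Distributed File System source might correspond to the file-level object types listed above, consider the following minimal Python sketch. It is an illustration only: the extension-to-type mapping, the `EXTENSION_TO_OBJECT_TYPE` table, and the `guess_object_type` function are hypothetical and do not describe how Metadata Command Center actually classifies files.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the object type names
# listed in this topic. The extensions are assumptions chosen for
# illustration, not the product's actual classification rules.
EXTENSION_TO_OBJECT_TYPE = {
    ".csv": "Flat File",
    ".json": "Hierarchical File",
    ".xml": "XML File",
    ".xsd": "XSD File",
}


def guess_object_type(path: str) -> str:
    """Return an illustrative object type for an HDFS-style path."""
    # Treat a trailing slash as a folder marker (an assumption of this sketch).
    if path.endswith("/"):
        return "Folder"
    suffix = PurePosixPath(path).suffix.lower()
    # Anything without a recognized extension falls back to the generic type.
    return EXTENSION_TO_OBJECT_TYPE.get(suffix, "File")


if __name__ == "__main__":
    for sample in ["/data/sales/", "/data/sales/2024.csv", "/data/schema.xsd"]:
        print(sample, "->", guess_object_type(sample))
```

A lookup like this can help when estimating which object types to expect from a given directory tree before you run a metadata extraction job.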