Complex File Data Objects for Semi-Structured Data Sources in HDFS
You can create and run a column profile on an Avro, JSON, Parquet, or XML file stored in HDFS. To read a JSON or XML file in HDFS, use a complex file reader to pass the JSON or XML input to the Data Processor transformation.
Complex File Data Object from a JSON or XML Data Source in HDFS
You can create a complex file data object from a JSON or XML file in HDFS. You can then create and run a column profile on the data object.
Create a connection to HDFS before you create the data objects for JSON or XML files in HDFS.
You can use one of the following methods to create a data object from a JSON or XML file in HDFS:
- Create a complex file data object on a JSON or XML file.
- Create a complex file data object on a folder that contains multiple JSON or multiple XML files.
After you create the data object, you can create and run a column profile on the data object.
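To make the folder-based option concrete, the following Python sketch treats every JSON file in a folder as one data set and computes a few per-column statistics. It is only an illustration under stated assumptions: it reads a local folder as a stand-in for an HDFS folder, it assumes each file holds one flat JSON object or a list of them, and the function name `profile_json_folder` and the chosen statistics (value count, null count, distinct count) are hypothetical. It does not represent how the product computes a column profile.

```python
import json
from collections import defaultdict
from pathlib import Path


def profile_json_folder(folder):
    """Profile every *.json file in a folder as a single data set.

    Assumption: each file contains one flat JSON object or a list of them.
    """
    stats = defaultdict(lambda: {"count": 0, "nulls": 0, "distinct": set()})
    rows = 0
    for path in sorted(Path(folder).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        records = data if isinstance(data, list) else [data]
        for record in records:
            rows += 1
            for column, value in record.items():
                col = stats[column]
                col["count"] += 1
                if value is None:
                    col["nulls"] += 1
                else:
                    # Serialize so unhashable values (lists, dicts) can be counted.
                    col["distinct"].add(json.dumps(value, sort_keys=True))
    return {
        "rows": rows,
        "columns": {
            name: {
                "count": col["count"],
                "nulls": col["nulls"],
                "distinct": len(col["distinct"]),
            }
            for name, col in stats.items()
        },
    }
```

Calling `profile_json_folder("some_folder")` returns a dictionary with the total row count and one entry per column, which mirrors the idea that a folder of files is profiled as one data object.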
Complex File Data Object from an Avro or Parquet Data Source in HDFS
You can create a complex file data object from an Avro or Parquet data source in HDFS. You can use the data object to create and run a column profile.
You can create the complex file data object from a single Avro or Parquet file or from a folder that contains multiple Avro or multiple Parquet files. When you create the data object, you can use the file or connection access type and the Binary, Avro, or Parquet resource format. Create an HDFS connection before you create a complex file data object from an Avro or Parquet data source.
You can choose one of the following options when you create a data object from Avro or Parquet files in HDFS:
- Select the access type as file and resource format as Binary.
- Select the access type as file and resource format as Avro or Parquet.
- Select the access type as connection and resource format as Avro or Parquet.
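The options above can be summarized as a small lookup of access type and resource format pairs. The Python sketch below is only an illustration: the lowercase labels and the names `SUPPORTED_OPTIONS` and `is_supported` are hypothetical and are not product identifiers. It encodes exactly the pairs listed above, so the connection access type combined with the Binary resource format is treated as unsupported because it does not appear in the list.

```python
# Access type and resource format pairs listed for Avro and Parquet sources in HDFS.
# The lowercase labels are illustrative, not product identifiers.
SUPPORTED_OPTIONS = {
    ("file", "binary"),
    ("file", "avro"),
    ("file", "parquet"),
    ("connection", "avro"),
    ("connection", "parquet"),
}


def is_supported(access_type, resource_format):
    """Return True if the pair matches one of the listed options."""
    key = (access_type.strip().lower(), resource_format.strip().lower())
    return key in SUPPORTED_OPTIONS
```

For example, `is_supported("File", "Binary")` evaluates to `True`, while `is_supported("Connection", "Binary")` evaluates to `False`.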