
Complex File Data Objects for Semi-Structured Data Sources in HDFS

You can create and run a column profile on an Avro, JSON, Parquet, or XML file stored in HDFS. To read a JSON or XML file in HDFS, use a complex file reader to pass the JSON or XML input to the Data Processor transformation.

Complex File Data Object from a JSON or XML Data Source in HDFS

You can create a complex file data object from a JSON or XML file. You can then create and run a column profile on the data object.
Create a connection to HDFS before you create the data objects for JSON or XML files in HDFS.
You can use one of the following methods to create a data object from a JSON or XML file in HDFS:
After you create the data object, you can create and run a column profile on the data object.
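
To make concrete what a column profile reports for semi-structured input, the following is a minimal sketch in plain Python. It is not Informatica's implementation and does not use any Informatica API. The function name `profile_columns` and the sample records are hypothetical, and the sample string stands in for JSON lines that would be read from HDFS. The sketch computes null counts, distinct-value counts, and observed value types for each top-level field.

```python
import json
from collections import Counter


def profile_columns(records):
    """Return per-column null count, distinct count, and observed types.

    Hypothetical helper for illustration only; a field that is missing
    from a record is counted as null.
    """
    # Collect every top-level field name, preserving first-seen order.
    columns = {}
    for record in records:
        for name in record:
            columns.setdefault(name, None)

    stats = {}
    for name in columns:
        values = [record.get(name) for record in records]
        non_null = [value for value in values if value is not None]
        stats[name] = {
            "nulls": len(values) - len(non_null),
            # Serialize values so nested lists and objects can be compared.
            "distinct": len({json.dumps(v, sort_keys=True) for v in non_null}),
            "types": dict(Counter(type(v).__name__ for v in non_null)),
        }
    return stats


# Hypothetical sample standing in for JSON lines read from a file in HDFS.
sample = (
    '{"id": 1, "city": "Pune"}\n'
    '{"id": 2, "city": null}\n'
    '{"id": 3, "city": "Pune"}'
)
records = [json.loads(line) for line in sample.splitlines()]
result = profile_columns(records)
print(result)
```

In this sketch, nested values are compared by their serialized form, so two records with the same nested object count as one distinct value.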

Complex File Data Object from an Avro or Parquet Data Source in HDFS

You can create a complex file data object from an Avro or Parquet data source in HDFS. You can use the data object to create and run a column profile.
You can create a complex file data object from a single Avro or Parquet file or from a folder that contains multiple Avro or multiple Parquet files. The data object can use the file or connection access type, with the resource format set to Binary, Avro, or Parquet. Create an HDFS connection before you create a complex file data object from an Avro or Parquet data source.
You can choose one of the following options when you create a data object from an Avro or Parquet file in HDFS: