Hadoop Multi-Part File Handler
Use the Hadoop Multi-Part File Handler resource to fetch lineage from a combined multi-part file to a relational target.
Prerequisites
Perform the following steps to complete the prerequisites:
- In the Data Integration Service where the data engineering mappings are deployed, set the HDFSRetainOriginialTargetFile custom property to True.
- Configure and run the relational and HDFS metadata sources that are used to create the data engineering mappings.
- Configure and run the Informatica Platform metadata resource.
- For the required relational and HDFS metadata sources, assign the connections using the Connection Assignment option in the Catalog.
Resource Connection Properties
The General tab includes the following properties:
Property | Description |
---|---|
HDFS Resource Name | Name of the HDFS resource. |
Source Relational Resource Name | Name of the relational source. |
Target Relational Resource Name | Name of the relational target. |
The Metadata Load Settings tab includes the following properties:
Property | Description |
---|---|
Enable Source Metadata | Extracts metadata from the data source. |
Memory | The memory required to run the scanner job. Select a value based on the size of the imported data set. Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal. |
JVM Options | JVM parameters that you can set to configure the scanner container. |
Use the following arguments to configure the JVM Options property:
- -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the scanner log level to DEBUG, INFO, or ERROR. Default value is INFO.
- -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
- -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the Yarn environment. Use a comma to separate the key-value pairs.
- -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
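As an illustration of how the JVM arguments above might be combined, the following minimal Python sketch assembles a value for the JVM Options property. The helper name `build_jvm_options`, its parameters, and the `JAVA_HOME` example value are hypothetical and not part of the product. The sketch also assumes that multiple arguments are separated by spaces, which the text above does not state.

```python
# Hypothetical helper: composes a JVM Options string from the arguments
# documented above. Space-separated output is an assumption.
ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "ERROR"}


def build_jvm_options(log_level="INFO", cores=None, yarn_env=None, pmem_ratio=None):
    """Return a JVM Options string built from the documented scanner arguments."""
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"log_level must be one of {sorted(ALLOWED_LOG_LEVELS)}")
    options = [f"-Dscannerloglevel={log_level}"]

    if cores is not None:
        cores = int(cores)  # the documentation says the value should be a number
        if cores < 1:
            raise ValueError("cores must be a positive number")
        options.append(f"-Dscanner.container.core={cores}")

    if yarn_env:
        # Key-value pairs are comma-separated, as described above.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")

    if pmem_ratio is not None:
        options.append(
            f"-Dscanner.pmem.enabled.container.memory.jvm.memory.ratio={pmem_ratio}"
        )

    return " ".join(options)


if __name__ == "__main__":
    # Example values are placeholders chosen for illustration only.
    print(build_jvm_options("DEBUG", cores=2, yarn_env={"JAVA_HOME": "/opt/java"}))
```

Validating the log level and core count before building the string catches typos early, before the value is pasted into the resource configuration.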