Hadoop Multi-Part File Handler
Use the Hadoop Multi-Part File Handler resource to fetch lineage from a combined multi-part file to a relational target.
Prerequisites
Complete the following prerequisites:
- In the Data Integration Service where the data engineering mappings are deployed, set the HDFSRetainOriginialTargetFile custom property to True.
- Configure and run the relational and HDFS metadata sources that are used to create the data engineering mappings.
- Configure and run the Informatica Platform metadata resource.
- For the required relational and HDFS metadata sources, assign the connections using the Connection Assignment option in the Catalog.
Basic Information
The General tab includes the following basic information about the resource:
Information | Description |
---|---|
Name | The name of the resource. |
Description | The description of the resource. |
Resource type | The type of the resource. |
Execute On | Where the resource runs: on the default catalog server or offline. |
Resource Connection Properties
The General tab includes the following properties:
Property | Description |
---|---|
HDFS Resource Name | Name of the HDFS resource. |
Source Relational Resource Name | Name of the relational source. |
Target Relational Resource Name | Name of the relational target resource. |
The Metadata Load Settings tab includes the following properties:
Property | Description |
---|---|
Enable Source Metadata | Extracts metadata from the data source. |
Memory | The memory required to run the scanner job. Select a value based on the size of the imported data set. Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal. |
Custom Options | JVM parameters that configure the scanner container. Use the following arguments to configure the parameters: |
- -Dscannerloglevel=<DEBUG/INFO/ERROR>. Sets the scanner log level to DEBUG, INFO, or ERROR. Default value is INFO.
- -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
- -Dscanner.yarn.app.environment=<key=value>. Key-value pairs to set in the YARN environment. Use a comma to separate the pairs.
- -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
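The Custom Options arguments above can be assembled and checked before you paste them into the field. The following is a minimal sketch, not part of any Informatica API: the helper name `build_custom_options` and its parameters are hypothetical, and it assumes that multiple arguments are separated by a single space, which the property description does not state.

```python
# Hypothetical helper for illustration only; it is not an Informatica API.
ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "ERROR"}
ALLOWED_PMEM_RATIOS = {"1.0", "2.0"}


def build_custom_options(log_level="INFO", cores=None, yarn_env=None, pmem_ratio=None):
    """Build a Custom Options string from the documented -D arguments."""
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level}")
    options = [f"-Dscannerloglevel={log_level}"]

    if cores is not None:
        # The property description only says the value should be a number.
        if not isinstance(cores, int) or cores < 1:
            raise ValueError("cores must be a positive integer")
        options.append(f"-Dscanner.container.core={cores}")

    if yarn_env:
        # Key-value pairs are comma-separated, as the property description states.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")

    if pmem_ratio is not None:
        if str(pmem_ratio) not in ALLOWED_PMEM_RATIOS:
            raise ValueError(f"unsupported pmem ratio: {pmem_ratio}")
        options.append(
            f"-Dscanner.pmem.enabled.container.memory.jvm.memory.ratio={pmem_ratio}"
        )

    # Assumption: arguments are separated by a single space.
    return " ".join(options)


if __name__ == "__main__":
    print(build_custom_options("DEBUG", cores=2))
    # -Dscannerloglevel=DEBUG -Dscanner.container.core=2
```

Validating the log level and pmem ratio against the documented values catches typos before the scanner job runs. Confirm the separator and accepted value formats against your Enterprise Data Catalog version.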