Java Transformation in the Hadoop Environment

The Java transformation processing in the Hadoop environment depends on the engine that runs the transformation.

Java Transformation Support on the Blaze Engine

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory on the Data Integration Service machine at the following location: <Informatica installation directory>/services/shared/jars. Then recycle the Data Integration Service.
  2. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    a. Copy external .jar files to a directory on the local machine.
    b. Edit the Java transformation to include an import statement pointing to the local .jar files, as in the sketch after this list.
    c. Update the classpath in the Java transformation.
    d. Compile the transformation.
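For example, if an external .jar file provides a parser class, the transformation code can call it like any other Java class. The package, class, and port names below are hypothetical placeholders; substitute the classes that your own .jar files provide.

    // Import statement for the external class (hypothetical package and class):
    import com.example.parsers.AddressParser;

    // In the transformation code, call the external class like any other Java class.
    // IN_ADDRESS and OUT_ADDRESS are hypothetical input and output port names.
    AddressParser parser = new AddressParser();
    OUT_ADDRESS = parser.normalize(IN_ADDRESS);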

Java Transformation Support on the Spark Engine

You can use complex data types to process hierarchical data.
Some processing rules for the Spark engine differ from the processing rules for the Data Integration Service.
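For example, hierarchical data such as an array of order records can be traversed with ordinary Java collection types. The following lines are an illustrative sketch only; the exact Java types that complex ports map to depend on the port configuration, and IN_ORDERS and OUT_TOTAL are hypothetical port names.

    // Illustrative sketch: treat an array of structs as a list of maps and aggregate
    // one field. The real Java types for complex ports depend on the port configuration.
    import java.util.List;
    import java.util.Map;

    double total = 0.0;
    for (Map<String, Object> order : (List<Map<String, Object>>) IN_ORDERS) {
        total += ((Number) order.get("amount")).doubleValue();
    }
    OUT_TOTAL = total;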

General Restrictions

The Java transformation is supported with the following restrictions on the Spark engine:

Partitioning

The Java transformation has the following restrictions when used with partitioning:

Mapping Validation

Mapping validation fails in the following situations:

Using External .jar Files

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory on the Data Integration Service machine at the following location:
     <Informatica installation directory>/services/shared/jars
  2. Recycle the Data Integration Service.
  3. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    a. Copy external .jar files to a directory on the local machine.
    b. Edit the Java transformation to include an import statement pointing to the local .jar files.
    c. Update the classpath in the Java transformation.
    d. Compile the transformation.

Setting the JDK Path

To use complex ports in the Java transformation and to run Java user code directly on the Spark engine, you must set the JDK path.
In the Administrator tool, configure the following execution option for the Data Integration Service:
Property: JDK Home Directory
Description: The JDK installation directory on the machine that runs the Data Integration Service. Changes take effect after you recycle the Data Integration Service. The JDK version that the Data Integration Service uses must be compatible with the JRE version on the cluster. For example, enter a value such as /usr/java/default. Default is blank.
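Before you set this property, you can confirm the installation directory and version of a candidate JDK with a quick check such as the following. This is a minimal sketch and not part of the product; compile and run it with the java launcher of the JDK that you plan to configure.

    // Prints the installation directory and version of the JVM that runs this class.
    // Run it with the JDK you intend to set as the JDK Home Directory.
    public class ShowJdk {
        public static void main(String[] args) {
            System.out.println("java.home    = " + System.getProperty("java.home"));
            System.out.println("java.version = " + System.getProperty("java.version"));
        }
    }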

Java Transformation Support on the Hive Engine

You can enable the Stateless advanced property when you run mappings in a Hadoop environment.
The Java code in the transformation cannot write output to standard output when you push transformation logic to Hadoop. It can write output to standard error, which appears in the log files.
Some processing rules for the Hive engine differ from the processing rules for the Data Integration Service.
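For example, a diagnostic message that the code would normally print to standard output can be written to standard error instead so that it appears in the log files. IN_CUSTOMER_ID below is a hypothetical input port name.

    // Standard output is not supported when the transformation logic runs on Hadoop.
    // Write diagnostics to standard error so that they appear in the log files.
    // System.out.println("Processing row for customer " + IN_CUSTOMER_ID);   // not supported
    System.err.println("Processing row for customer " + IN_CUSTOMER_ID);      // appears in the log files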

Partitioning

You can optimize the transformation for faster processing when you enable an input port as a partition key and sort key. The data is partitioned across the reducer tasks and the output is partially sorted.
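The following lines are a conceptual sketch of that behavior, not Informatica code: records that share a partition key value land in the same reducer task, and each reducer sorts only its own records by the sort key, so the combined output is sorted within each partition but not globally.

    // Conceptual sketch of partitioning and partial sorting (not Informatica code).
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class PartialSortSketch {
        public static void main(String[] args) {
            int numReducers = 3;
            List<List<String>> partitions = new ArrayList<>();
            for (int i = 0; i < numReducers; i++) {
                partitions.add(new ArrayList<>());
            }

            // The hash of the partition key decides which reducer receives each record.
            for (String key : List.of("delta", "alpha", "echo", "bravo", "charlie", "alpha")) {
                int reducer = Math.floorMod(key.hashCode(), numReducers);
                partitions.get(reducer).add(key);
            }

            // Each reducer sorts only its own records by the sort key, so the overall
            // output is ordered within each partition but not across partitions.
            for (List<String> partition : partitions) {
                partition.sort(Comparator.naturalOrder());
                System.out.println(partition);
            }
        }
    }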
The following restrictions apply to the Transformation Scope property:

Using External .jar Files

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory on the Data Integration Service machine at the following location: <Informatica installation directory>/services/shared/jars. Then recycle the Data Integration Service.
  2. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    a. Copy external .jar files to a directory on the local machine.
    b. Edit the Java transformation to include an import statement pointing to the local .jar files.
    c. Update the classpath in the Java transformation.
    d. Compile the transformation.