Python Transformation Setup Requirements

Before you use the Python transformation, prepare the Spark engine to process it.

Complete the following tasks:

•Install Python and Jep on the Data Integration Service machine.
•Configure Spark execution parameters in the Hadoop connection.

Install Python, Jep, and Third-Party Libraries

The Spark engine needs a Python installation to run the code in the Python transformation. Along with Python, you must install the Jep package. Optionally, you can install additional third-party libraries.

When you build Python, configure it with the --enable-shared option so that the Python shared library is built and Jep can access it.
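To confirm that an existing interpreter was built with --enable-shared, you can query its build configuration. The following is a minimal sketch that uses the standard sysconfig module. The function name is illustrative and not part of any Informatica tooling, and the check is meaningful on Linux and other POSIX builds only.

```python
import sysconfig


def built_with_shared_library():
    """Return True if this interpreter was configured with --enable-shared."""
    # CPython records the configure choice in the Py_ENABLE_SHARED build
    # variable: 1 when --enable-shared was used, 0 otherwise. The variable
    # may be missing (None) on non-POSIX builds, which is treated as False.
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


print("Shared library build:", built_with_shared_library())
```

Run the script with the interpreter that you plan to use for the Python transformation, because the result describes only the interpreter that executes it.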

The Python transformation supports the following Python versions:

•2.7
•3.3
•3.4
•3.5
•3.6
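As a quick sanity check, a script can compare the running interpreter against the versions listed above. This is a sketch only. The SUPPORTED_VERSIONS set and the is_supported function are illustrative names, and the set simply restates the list in this section.

```python
import sys

# Major and minor versions taken from the list of supported versions above.
SUPPORTED_VERSIONS = {(2, 7), (3, 3), (3, 4), (3, 5), (3, 6)}


def is_supported(version_info=None):
    """Return True if the (major, minor) version is in the supported list."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if not is_supported():
    print("Python %d.%d is not in the supported list" % tuple(sys.version_info[:2]))
```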

Install Jep in one of the following ways:

•Run pip install jep. Use this option if pip is available in the Python installation.
•Configure the Jep binaries manually. Ensure that Java classloaders can access jep.jar, Java can access the shared Jep library, and Python can access the Jep Python files.
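After either installation option, you might want to confirm where Jep ended up. The following sketch reports the Jep package directory that Python resolves and any jar files inside it. It assumes a pip-style layout in which the jar sits in the jep package directory, and it requires Python 3.4 or later because it uses importlib.util.find_spec. With a manual configuration the jar may live elsewhere, so an empty jar list is not necessarily an error. The function name is illustrative.

```python
import glob
import importlib.util
import os


def locate_jep():
    """Return the Jep package directory and jar files in it, or None."""
    # find_spec looks the package up on sys.path without importing it.
    spec = importlib.util.find_spec("jep")
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    # Assumption: the jar is stored in the package directory and its
    # file name starts with "jep".
    jars = sorted(glob.glob(os.path.join(package_dir, "jep*.jar")))
    return {"package_dir": package_dir, "jars": jars}


print(locate_jep())
```

A result of None means that Python cannot find the Jep Python files on its module search path.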

Optionally, you can install third-party libraries such as numpy, scikit-learn, and OpenCV (imported as cv2). Code in the Python transformation can then import these libraries.
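Before you copy the installation, it can help to verify that each library actually imports. The sketch below attempts each import and returns the names that fail. The function name and the module list are illustrative. Note that scikit-learn is imported under the name sklearn.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both an absent package and a failed dependency import.
            missing.append(name)
    return missing


# Import names for the example libraries mentioned above.
print(missing_modules(["numpy", "sklearn", "cv2"]))
```

An empty list means that every listed module imported successfully in the interpreter that ran the script.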

After you install Python, Jep, and any third-party libraries, copy the Python installation folder to the following location on the Data Integration Service machine:

<Informatica installation directory>/services/shared/spark/python
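The copy step can be scripted. The following is a minimal sketch, assuming that python_home is the root of the Python installation and infa_home is the Informatica installation directory. Both parameter names and the function name are placeholders. shutil.copytree raises an error if the target directory already exists, so the sketch does not overwrite an earlier copy.

```python
import os
import shutil


def copy_python_installation(python_home, infa_home):
    """Copy a Python installation to services/shared/spark/python."""
    target = os.path.join(infa_home, "services", "shared", "spark", "python")
    # symlinks=True keeps symbolic links as links, which preserves the
    # usual links between versioned shared library files.
    shutil.copytree(python_home, target, symlinks=True)
    return target
```

For example, copy_python_installation("/opt/python3.6", "/opt/informatica") would create /opt/informatica/services/shared/spark/python. Both paths in this call are hypothetical.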

Changes take effect after you restart the Data Integration Service.

Configure Spark Execution Parameters

To configure the Spark engine to run the Python transformation, configure the following Spark execution parameters:

•infaspark.pythontx.executorEnv.LD_PRELOAD
•infaspark.pythontx.submit.lib.JEP_HOME