
Big Data Management

This section describes the changes to Big Data Management in version 10.2.2.

Hive Connection

Effective in version 10.2.2, the following Hive connection properties are renamed:
  • The Observe Fine Grained Authorization property is renamed Fine Grained Authorization.
  • The User Name property is renamed LDAP username.
The following list describes the renamed properties:

Fine Grained Authorization
When you select the option to observe fine grained authorization in a Hive source, the mapping observes the following:
  • Row- and column-level restrictions. Applies to Hadoop clusters where Sentry or Ranger security modes are enabled.
  • Data masking rules. Applies to masking rules set on columns containing sensitive data by Dynamic Data Masking.
If you do not select the option, the Blaze and Spark engines ignore the restrictions and masking rules, and results include restricted or sensitive data.

LDAP username
LDAP user name of the user that the Data Integration Service impersonates to run mappings on a Hadoop cluster. The user name depends on the JDBC connection string that you specify in the Metadata Connection String or Data Access Connection String for the native environment.
If the Hadoop cluster uses Kerberos authentication, the principal name for the JDBC connection string and the user name must be the same. If the Hadoop cluster does not use Kerberos authentication, the user name depends on the behavior of the JDBC driver. With the Hive JDBC driver, you can specify a user name in several ways, and the user name can become a part of the JDBC URL.
If you do not specify a user name, the Hadoop cluster authenticates jobs based on the following criteria:
  • If the Hadoop cluster does not use Kerberos authentication, it authenticates jobs based on the operating system profile user name of the machine that runs the Data Integration Service.
  • If the Hadoop cluster uses Kerberos authentication, it authenticates jobs based on the SPN of the Data Integration Service, and the LDAP username is ignored.
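For example, a minimal sketch of a Kerberos-enabled Hive JDBC connection string, using hypothetical host, port, and realm values:

  jdbc:hive2://hiveserver.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM

In this sketch, the user name that you specify must match the principal name in the URL, as described above.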
For more information, see the Informatica Big Data Management 10.2.2 User Guide.

Mass Ingestion

Effective in version 10.2.2, deployed mass ingestion specifications run on the Spark engine. Mass ingestion specifications that were deployed before version 10.2.2 continue to run on the Blaze and Spark engines until you redeploy them.
For more information, see the Informatica Big Data Management 10.2.2 Mass Ingestion Guide.

Spark Monitoring

Effective in version 10.2.2, Spark monitoring is enabled by default.
Previously, Spark monitoring was disabled by default.
For more information about Spark monitoring, see the Informatica Big Data Management 10.2.2 User Guide.

Sqoop

Effective in version 10.2.2, the following changes apply to Sqoop:

Transformations in the Hadoop Environment

This section describes changes to transformations in the Hadoop environment in version 10.2.2.

Python Transformation

Effective in version 10.2.2, the Python transformation can process data more efficiently on the Spark engine than it could in version 10.2.1. Additionally, the Python transformation does not require you to install Jep, and you can use any version of Python to run the transformation.
Previously, the Python transformation supported only specific versions of Python that were compatible with Jep.
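As an illustration, here is a minimal sketch of Python transformation code, assuming hypothetical port names: an input port named salary and an output port named annual_salary. Input ports are read as variables, and assigning to an output port variable sets the value that the transformation writes:

  # Hypothetical ports: salary (input), annual_salary (output).
  # Multiply the monthly salary by 12 to produce the annual value.
  annual_salary = salary * 12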
Note: The improvements are available only for Big Data Management.
For information about installing Python, see the Informatica Big Data Management 10.2.2 Integration Guide.
For more information about the Python transformation, see the "Python Transformation" chapter in the Informatica 10.2.2 Developer Transformation Guide.

Write Transformation

Effective in version 10.2.2, the Create or Replace Target Tables advanced property in a Write transformation for relational, Netezza, and Teradata data objects is renamed to Target Schema Strategy.
When you configure a Write transformation, you can choose from the following target schema strategy options for the target data object:
  • RETAIN - Retain existing target schema
  • CREATE - Create or replace table at run time
Previously, you selected the Create or Replace Target Tables advanced property so that the Data Integration Service drops the target table at run time and replaces it with a table based on a target table that you identify. When you do not select the Create or Replace Target Tables advanced property, the Data Integration Service retains the existing schema for the target table.
In existing mappings where the Create or Replace Target Tables property was enabled, the Target Schema Strategy property is set to the CREATE - Create or replace table at run time option after the upgrade to version 10.2.2. In mappings where the Create or Replace Target Tables option was not selected, the Target Schema Strategy property is set to the RETAIN - Retain existing target schema option after the upgrade. If the upgrade does not select the correct target schema strategy option, manually select the required option from the Target Schema Strategy list, and then run the mapping.
For more information about configuring the target schema strategy, see the "Write Transformation" chapter in the Informatica 10.2.2 Developer Transformation Guide, or the "Dynamic Mappings" chapter in the Informatica 10.2.2 Developer Mapping Guide.