Big Data
This section describes new big data features in version 10.1.
Hadoop Ecosystem Support in Big Data Management 10.1
Effective in version 10.1, Informatica supports the following updated versions of Hadoop distributions:
- Azure HDInsight 3.3
- Cloudera CDH 5.5
- MapR 5.1
For the full list of Hadoop distributions that Big Data Management 10.1 supports, see the Informatica Big Data Management 10.1 Installation and Configuration Guide.
Hadoop Security Systems
Effective in version 10.1, Informatica supports the following security systems on the Hadoop ecosystem:
- Apache Knox
- Apache Ranger
- Apache Sentry
- HDFS Transparent Encryption
Limitations apply to some combinations of security system and Hadoop distribution platform. For more information on Informatica support for these technologies, see the Informatica Big Data Management 10.1 Security Guide.
Spark Runtime Engine
Effective in version 10.1, you can push mappings to the Apache Spark engine in the Hadoop environment.
Spark is an Apache project with a run-time engine that can run mappings on the Hadoop cluster. Configure the Hadoop connection properties that are specific to the Spark engine. After you create the mapping, you can validate it and view the execution plan in the same way as for the Blaze and Hive engines.
When you push mapping logic to the Spark engine, the Data Integration Service generates a Scala program and packages it into an application. It sends the application to the Spark executor, which submits it to the Resource Manager on the Hadoop cluster. The Resource Manager identifies resources to run the application. You can monitor the job in the Administrator tool.
For more information about using Spark to run mappings, see the Informatica Big Data Management 10.1 User Guide.
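The submission flow that the Data Integration Service automates resembles a standard Spark job submission to YARN. As a rough illustration only, a hand-run equivalent might look like the following sketch; the class name, JAR name, and executor count are placeholders, not values the product uses:

```shell
# Hypothetical sketch: submitting a packaged Spark application to YARN,
# analogous to what the Data Integration Service does on your behalf.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.GeneratedMapping \
  --num-executors 4 \
  generated-mapping.jar
# The YARN Resource Manager then allocates containers to run the application.
```

This is a conceptual aid; in Big Data Management you do not invoke spark-submit yourself.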
Sqoop Connectivity for Relational Sources and Targets
Effective in version 10.1, you can use Sqoop to process data between relational databases and HDFS through MapReduce programs. You can use Sqoop to import and export data. When you use Sqoop, you do not need to install the relational database client and software on any node in the Hadoop cluster.
To use Sqoop, you must configure Sqoop properties in a JDBC connection and run the mapping in the Hadoop environment. You can configure Sqoop connectivity for relational data objects, customized data objects, and logical data objects that are based on a JDBC-compliant database. For example, you can configure Sqoop connectivity for the following databases:
- Aurora
- IBM DB2
- IBM DB2 for z/OS
- Greenplum
- Microsoft SQL Server
- Netezza
- Oracle
- Teradata
You can also run a profile on data objects that use Sqoop in the Hive run-time environment.
For more information, see the Informatica Big Data Management 10.1 User Guide.
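Behind the JDBC connection properties, Sqoop moves data through generated MapReduce jobs. For orientation, a minimal standalone Sqoop import looks like the following sketch; the JDBC URL, credentials, table, and target directory are placeholder assumptions:

```shell
# Hypothetical sketch: importing an Oracle table into HDFS with Sqoop.
# The connection string, user, table, and paths are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMERS \
  --target-dir /data/customers \
  --num-mappers 4
# --num-mappers controls how many parallel MapReduce map tasks extract the data.
```

In Big Data Management you supply the equivalent settings as Sqoop properties on the JDBC connection rather than running the command yourself.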
Transformation Support on the Blaze Engine
Effective in version 10.1, the following transformations are supported on the Blaze engine:
- Address Validator
- Case Converter
- Comparison
- Consolidation
- Data Processor
- Decision
- Key Generator
- Labeler
- Match
- Merge
- Normalizer
- Parser
- Sequence Generator
- Standardizer
- Weighted Average
The Address Validator, Consolidation, Data Processor, Match, and Sequence Generator transformations are supported with restrictions.
Effective in version 10.1, the following transformations have additional support on the Blaze engine:
- Aggregator. Supports pass-through ports.
- Lookup. Supports the unconnected Lookup transformation.
For more information, see the "Mapping Objects in a Hadoop Environment" chapter in the Informatica Big Data Management 10.1 User Guide.