
Big Data

This section describes new big data features in version 10.1.1.

Blaze Engine

Effective in version 10.1.1, the Blaze engine has the following new features:

Hive Sources and Targets on the Blaze Engine

Effective in version 10.1.1, Hive sources and targets have the following additional support on the Blaze engine:
For more information, see the "Mapping Objects in the Hadoop Environment" chapter in the Informatica Big Data Management® 10.1.1 User Guide.

Transformation Support on the Blaze Engine

Effective in version 10.1.1, transformations have the following additional support on the Blaze engine:
For more information, see the "Mapping Objects in the Hadoop Environment" chapter in the Informatica Big Data Management 10.1.1 User Guide.

Blaze Engine Monitoring

Effective in version 10.1.1, more detailed statistics about mapping jobs are available in the Blaze Summary Report. In the Blaze Job Monitor, a green summary report button appears beside the names of successful grid tasks. Click the button to open the Blaze Summary Report.
The Blaze Summary Report contains the following information about a mapping job:
Note: The Blaze Summary Report is in beta. It contains most of the major features, but is not yet complete.

Blaze Engine Logs

Effective in version 10.1.1, the following error logging enhancements are available on the Blaze engine:
For more information, see the "Monitoring Mappings in a Hadoop Environment" chapter in the Informatica Big Data Management 10.1.1 User Guide.

Installation and Configuration

This section describes new features related to big data installation and configuration.

Address Reference Data Installation

Effective in version 10.1.1, the Informatica Big Data Management installation includes a shell script that you can use to install address reference data files. The script installs the reference data files on the compute nodes that you specify.
When you run an address validation mapping in a Hadoop environment, the reference data files must reside on each compute node on which the mapping runs. Use the script to install the reference data files on multiple nodes in a single operation.
The shell script name is copyRefDataToComputeNodes.sh.
Find the script in the following directory in the Informatica Big Data Management installation:
[Informatica installation directory]/tools/dq/av
When you run the script, you can enter the following information:
If you do not enter the information, the script uses a series of default values to identify the file locations and the user name.
For more information, see the Informatica Big Data Management 10.1.1 Installation and Configuration Guide.
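The idea of installing the reference data on several compute nodes in a single operation can be sketched as follows. This is a minimal illustration only and does not reproduce copyRefDataToComputeNodes.sh: the function name, the node names, the directories, and the user name are all hypothetical placeholders, and the use of scp is an assumption about one way to copy files between hosts. The sketch only builds the copy commands and does not run them.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class CopyPlan:
    """One planned copy of the reference data to a single compute node."""
    node: str
    command: List[str]


def plan_reference_data_copy(nodes, source_dir, target_dir, user):
    """Build one scp command per compute node without executing anything.

    Hypothetical helper: the parameters are illustrative and are not the
    inputs or defaults of the Informatica script.
    """
    plans = []
    for node in nodes:
        # scp -r copies a directory tree; the destination is user@node:path.
        command = ["scp", "-r", source_dir, f"{user}@{node}:{target_dir}"]
        plans.append(CopyPlan(node=node, command=command))
    return plans


if __name__ == "__main__":
    # Placeholder values chosen for illustration only.
    for plan in plan_reference_data_copy(
        nodes=["node1.example.com", "node2.example.com"],
        source_dir="/tmp/av_reference_data",
        target_dir="/opt/av_reference_data",
        user="hadoopuser",
    ):
        print(shlex.join(plan.command))
```

Building the commands before running them makes it possible to review the full list of target nodes first. The actual script and its inputs are described in the Installation and Configuration Guide.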

Hadoop Configuration Manager in Silent Mode

Effective in version 10.1.1, you can use the Hadoop Configuration Manager in silent mode to configure Big Data Management.
For more information about configuring Big Data Management in silent mode, see the Informatica Big Data Management 10.1.1 Installation and Configuration Guide.

Installation in an Ambari Stack

Effective in version 10.1.1, you can use the Ambari configuration manager to install Big Data Management as a service in an Ambari stack.
For more information about installing Big Data Management in an Ambari stack, see the Informatica Big Data Management 10.1.1 Installation and Configuration Guide.

Script to Populate HDFS in HDInsight Clusters

Effective in version 10.1.1, you can use a script to populate the HDFS file system on an Azure HDInsight cluster when you configure the cluster for Big Data Management.
For more information about using the script to populate the HDFS file system, see the Informatica Big Data Management 10.1.1 Installation and Configuration Guide.

Spark Engine

Effective in version 10.1.1, the Spark engine has the following new features:

Binary Data Types

Effective in version 10.1.1, the Spark engine supports the binary data type for the following functions:
Note: The Spark engine does not support the binary data type in join and lookup conditions.
For more information, see the "Function Reference" chapter in the Informatica Big Data Management 10.1.1 User Guide.

Transformation Support on the Spark Engine

Effective in version 10.1.1, transformations have the following additional support on the Spark engine:
For more information, see the "Mapping Objects in the Hadoop Environment" chapter in the Informatica Big Data Management 10.1.1 User Guide.

Run-time Statistics for Spark Engine Job Runs

Effective in version 10.1.1, you can view summary and detailed statistics for mapping jobs run on the Spark engine.
You can view the following Spark summary statistics in the Summary Statistics view:
The Detailed Statistics view displays a graph of the row counts for Spark engine job runs.
For more information, see the "Mapping Objects in the Hadoop Environment" chapter in the Informatica Big Data Management 10.1.1 User Guide.

Security

This section describes new big data security features in version 10.1.1.

Fine-Grained SQL Authorization Support for Hive Sources

Effective in version 10.1.1, you can configure a Hive connection to observe fine-grained SQL authorization when a Hive source table uses this level of authorization. Enable the Observe Fine Grained SQL Authorization option in the Hive connection to observe row-level and column-level restrictions that are configured for Hive tables and views.
For more information, see the Authorization section in the "Introduction to Big Data Management Security" chapter of the Informatica Big Data Management 10.1.1 Security Guide.
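The effect of row-level and column-level restrictions can be pictured with a small conceptual model. The sketch below is an illustration only: it does not show how Hive or the Hive connection enforces authorization, and the function name, the sample rows, and the policy are hypothetical.

```python
def apply_fine_grained_policy(rows, allowed_columns, row_filter):
    """Return only the rows and columns that a policy permits.

    Conceptual model only; real restrictions are configured on the
    Hive tables and views, not in mapping code.
    """
    visible = []
    for row in rows:
        if row_filter(row):  # row-level restriction
            # Column-level restriction: keep only the permitted columns.
            visible.append({c: row[c] for c in allowed_columns if c in row})
    return visible


# Hypothetical source rows and policy, for illustration only.
rows = [
    {"id": 1, "region": "EU", "salary": 50000},
    {"id": 2, "region": "US", "salary": 60000},
]
result = apply_fine_grained_policy(
    rows,
    allowed_columns=["id", "region"],
    row_filter=lambda r: r["region"] == "EU",
)
print(result)
```

In this model, a user restricted to EU rows and to the id and region columns never sees the salary column or the US row, which is the kind of outcome the connection option is meant to respect.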

Spark Engine Security Support

Effective in version 10.1.1, the Spark engine supports the following additional security systems:
For more information, see the "Introduction to Big Data Management Security" chapter in the Informatica Big Data Management 10.1.1 Security Guide.

Sqoop

Effective in version 10.1.1, you can use the following new features when you configure Sqoop:
For more information, see the Informatica Big Data Management 10.1.1 User Guide.