Installation Overview
When you install Big Data Management, you install Informatica binaries on the Hadoop cluster. You download an installation package based on the distribution in the Hadoop environment.
The following table lists the Hadoop distributions and the associated package types that you use to install Big Data Management:
Hadoop Distribution | Installation Package Description |
---|---|
Amazon EMR | The tar.gz file includes an RPM package and the binary files that you need to run the Big Data Management installation. |
Azure HDInsight | The tar.gz file includes a Debian package and the binary files that you need to run the Big Data Management installation. |
Cloudera CDH | The parcel.tar file includes a Cloudera parcel package and the binary files that you need to run the Big Data Management installation. |
Hortonworks HDP | The archive file includes Big Data Management libraries that are compatible with Ambari stack installation. |
IBM BigInsights | The tar.gz file includes an RPM package and the binary files that you need to run the Big Data Management installation. |
After you complete the installation, you configure the Informatica domain and the Hadoop cluster to enable Informatica mappings to run on the Hadoop cluster.
Informatica Big Data Management Installation Process
You can install Big Data Management in a single node or cluster environment.
Installing in a Single Node Environment
To install Big Data Management in a single node environment, complete the following steps:
1. Extract the Big Data Management tar.gz file to the machine.
2. Install Big Data Management by running the installation shell script in a Linux environment.
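The extraction in step 1 can also be scripted. The following is a minimal sketch, assuming Python 3 is available on the machine. The function name `extract_package` is hypothetical and is not part of the Big Data Management package. The sketch only unpacks the archive; running the installation shell script remains a separate step.

```python
import os
import tarfile


def extract_package(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive into dest_dir.

    Returns the list of member names found in the archive.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Newer Python versions support extraction filters; use the
        # conservative "data" filter when it is available.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(dest_dir, filter="data")
        else:
            archive.extractall(dest_dir)
    return names
```

Call it with the path to the downloaded tar.gz file and the directory you want to extract into. Both arguments are placeholders that depend on your environment.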
Installing in a Cluster Environment
To install Big Data Management in a cluster environment, complete the following steps:
1. Extract the Big Data Management tar.gz file to a machine on the cluster.
2. Install Big Data Management by running the installation shell script in a Linux environment. You can install Big Data Management from the primary name node or from any machine using the HadoopDataNodes file.
In the HadoopDataNodes file, add the IP address or host name of each node in the Hadoop cluster, one entry per line. During the Big Data Management installation, the installation shell script reads all of the nodes from the HadoopDataNodes file and copies the Big Data Management binary files to the /<BigDataManagementInstallationDirectory>/Informatica directory on each node.
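As an illustration of the one-entry-per-line layout, the following sketch generates the contents of a HadoopDataNodes file from a list of nodes. It assumes Python 3. The function names, the duplicate removal, and the whitespace check are assumptions made for this example rather than documented requirements of the installer, and the addresses in the usage note are placeholders.

```python
def build_data_nodes_text(nodes):
    """Return file content with one IP address or host name per line."""
    seen = set()
    lines = []
    for node in nodes:
        entry = node.strip()
        if not entry:
            continue  # skip blank entries
        if any(ch.isspace() for ch in entry):
            raise ValueError(f"node entry must not contain whitespace: {entry!r}")
        if entry not in seen:  # keep the first occurrence only
            seen.add(entry)
            lines.append(entry)
    return ("\n".join(lines) + "\n") if lines else ""


def write_data_nodes_file(nodes, path="HadoopDataNodes"):
    """Write the node list to the given path and return the written text."""
    text = build_data_nodes_text(nodes)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

For example, `write_data_nodes_file(["192.0.2.10", "node2.example.com"])` writes a file in the current directory with two lines, one for each node.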