Configure Hive Connector to download the distribution-specific Hive libraries
You must configure Hive Connector to download the distribution-specific Hive third-party libraries. The Informatica Hive third-party script and the Informatica Hive third-party properties files are available as part of the Hive Connector package in the Secure Agent installation.
Distributions applicable for Hive mappings
You can utilize the following distribution versions when you use Hive Connector to run mappings:
• Cloudera CDH 6.1
• Amazon EMR 5.20, 6.3, and 6.4
• Cloudera CDP 7.1 private cloud and Cloudera CDW 7.2 public cloud
• Azure HDInsight 4.0
• Hortonworks HDP 3.1
Distributions applicable for Hive mappings in advanced mode
You can utilize the following distribution versions when Hive Connector runs on the advanced cluster:
• Cloudera CDH 6.1
• Cloudera CDP 7.1 private cloud and Cloudera CDW 7.2 public cloud
• Azure HDInsight 4.0
• Amazon EMR 6.1, 6.2, and 6.3
Perform the following tasks to download the distribution-specific Hive third-party libraries before you use Hive Connector:
1. Run the script to copy the third-party libraries to the Secure Agent location. Ensure that you have full permissions to the directories where the Hive libraries are copied.
The script is interactive. When prompted, specify the job type and the Hadoop cluster that you want to use.
2. Add the runtime DTM property, INFA_HADOOP_DISTRO_NAME, and set its value to the distribution that you want to use.
3. Restart the Secure Agent.
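The permission requirement in step 1 can be checked before you run the script. The following is a minimal sketch, not part of the Informatica package: the function name has_full_permissions is hypothetical, and "full permissions" is assumed here to mean read, write, and execute access for the current user.

```python
import os


def has_full_permissions(directory: str) -> bool:
    """Return True if the directory exists and the current user can
    read, write, and traverse it.

    Hypothetical helper: treats "full permissions" as read + write +
    execute access, which is an assumption, not an Informatica rule.
    """
    return os.path.isdir(directory) and os.access(
        directory, os.R_OK | os.W_OK | os.X_OK
    )
```

For example, call has_full_permissions with the ext directory under your Secure Agent installation directory and run the script only if the result is True.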
Step 1. Run the script on a Linux system
The Hive Connector package that contains the Informatica Hive third-party script and the Informatica Hive third-party property files is part of the Secure Agent installation. When you run the Hive third-party script, you can specify the distribution that you want to use.
1. Go to the following Secure Agent installation directory where the Informatica Hive third-party script is located:
<Secure Agent installation directory>/apps/Data_Integration_Server/ext/deploy_to_main/distros/Parsers/<Hadoop distribution version>/lib
where the value of the Hadoop distribution version is based on the Hadoop distribution you specified.
4. If you copy the scripts folder to a machine where the Secure Agent is not installed, perform the following tasks:
a. Perform steps 3a and 3b.
The third-party libraries are copied to the following directories based on the option you selected in step 3b:
▪ For CDI:
<CurrentDirectory>/deploy_to_main/distros/Parsers/<Hadoop distribution version>/lib
Manually copy the deploy_to_main directory to the following Secure Agent location: <Secure Agent installation directory>/apps/Data_Integration_Server/ext, or replace the directory if it is already present.
▪ For CDI Advanced Mode: <CurrentDirectory>/informaticallc.hiveadapter/spark/lib
Manually perform the following tasks:
Copy the informaticallc.hiveadapter directory to the following Secure Agent location: <Secure Agent installation directory>/ext/connectors/thirdparty/
Copy the deploy_to_main directory to the following Secure Agent location: <Secure Agent installation directory>/apps/Data_Integration_Server/ext, or replace the directory if it is already present.
where the value of the Hadoop distribution version is based on the Hadoop distribution you specified.
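The manual copy tasks above can be sketched as a small helper. This is an illustration under stated assumptions, not an Informatica utility: the function name copy_to_agent and its parameters are hypothetical, current_dir stands for <CurrentDirectory>, and agent_home stands for <Secure Agent installation directory>. An existing target directory is removed first, which matches the "replace the directory if it is already present" instruction.

```python
import shutil
from pathlib import Path


def copy_to_agent(current_dir, agent_home, advanced_mode=False):
    """Copy the downloaded library directories into the Secure Agent.

    Hypothetical helper. Returns the list of target directories written.
    """
    current = Path(current_dir)
    agent = Path(agent_home)

    # deploy_to_main is copied for both CDI and CDI Advanced Mode.
    targets = [
        (
            current / "deploy_to_main",
            agent / "apps" / "Data_Integration_Server" / "ext" / "deploy_to_main",
        )
    ]
    # Advanced mode also needs the informaticallc.hiveadapter directory.
    if advanced_mode:
        targets.append(
            (
                current / "informaticallc.hiveadapter",
                agent / "ext" / "connectors" / "thirdparty"
                / "informaticallc.hiveadapter",
            )
        )

    # Check every source before touching any target directory.
    for src, _ in targets:
        if not src.is_dir():
            raise FileNotFoundError(f"Missing source directory: {src}")

    copied = []
    for src, dst in targets:
        if dst.exists():
            shutil.rmtree(dst)  # replace the directory if already present
        shutil.copytree(src, dst)  # creates missing parent directories
        copied.append(dst)
    return copied
```

Run it as the user that owns the Secure Agent installation so that the copied files keep usable ownership and permissions.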
Note: In mappings, the CDH_6.1 option applies to Cloudera CDH 6.1, Cloudera CDP 7.1 private cloud, and Cloudera CDW 7.2 public cloud. In mappings in advanced mode, the CDH_6.1 option applies only to Cloudera CDH 6.1. The EMR_5.20 option applies to Amazon EMR 5.20, 6.3, and 6.4 in mappings, and to Amazon EMR 6.1, 6.2, and 6.3 in mappings in advanced mode.
The Hadoop distribution directory created under deploy_to_main/distros/Parsers/ changes based on the distribution you select:
▪ If you select CDH_6.1, CDP_7.1, or CDW_7.2, the Hadoop distribution directory created is CDH_6.1.
▪ If you select EMR_5.20, EMR_6.1, EMR_6.2, or EMR_6.3, the Hadoop distribution directory created is EMR_5.20.
▪ If you select HDInsight_4.0, the Hadoop distribution directory created is HDInsight_4.0.
▪ If you select HDP_3.1, the Hadoop distribution directory created is HDP_3.1.
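The selection-to-directory rules above can be expressed as a lookup, which is handy for checking where the libraries should land after the script runs. This is a minimal sketch: PARSER_DIRECTORY and parser_lib_dir are hypothetical names, and agent_home stands for <Secure Agent installation directory>. The table contains only the selections listed above.

```python
from pathlib import Path

# Selected distribution -> directory created under
# deploy_to_main/distros/Parsers/ (taken from the list above).
PARSER_DIRECTORY = {
    "CDH_6.1": "CDH_6.1",
    "CDP_7.1": "CDH_6.1",
    "CDW_7.2": "CDH_6.1",
    "EMR_5.20": "EMR_5.20",
    "EMR_6.1": "EMR_5.20",
    "EMR_6.2": "EMR_5.20",
    "EMR_6.3": "EMR_5.20",
    "HDInsight_4.0": "HDInsight_4.0",
    "HDP_3.1": "HDP_3.1",
}


def parser_lib_dir(agent_home, selected):
    """Return the expected lib directory for a selected distribution.

    Hypothetical helper; raises KeyError for a selection not in the table.
    """
    return (
        Path(agent_home)
        / "apps" / "Data_Integration_Server" / "ext"
        / "deploy_to_main" / "distros" / "Parsers"
        / PARSER_DIRECTORY[selected] / "lib"
    )
```

Calling is_dir() on the returned path is a quick way to confirm that the script created the directory you expect.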
Step 2. Set the custom property for the Data Integration Service
Set the INFA_HADOOP_DISTRO_NAME property for the DTM in the Secure Agent properties, and set its value to the distribution version that you want to use.
1. Open Administrator and select Runtime Environments.
2. Select the Secure Agent for which you want to configure the DTM property.
3. In the upper-right corner of the page, click Edit.
4. Add the following DTM properties in the Custom Configuration section:
- Service: Data Integration Service
- Type: DTM
- Name: INFA_HADOOP_DISTRO_NAME
- Value: <distribution_version>
where the following values are applicable based on the distribution version you want to access:
- For CDH_6.1, CDP_7.1, and CDW_7.2, set the value as CDH_6.1.
- For EMR_5.20, EMR_6.1, EMR_6.2, EMR_6.3, and EMR_6.4, set the value as EMR_5.20.
- For HDInsight_4.0, set the value as HDInsight_4.0.
- For HDP_3.1, set the value as HDP_3.1.
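The value rules above can be captured in a short helper that returns the four fields to enter in the Custom Configuration section. This is a sketch for illustration only: the function name dtm_custom_property and the dictionary layout are hypothetical, and the helper does not call any Informatica API. You still enter the values in Administrator by hand.

```python
# Distribution you want to access -> value of INFA_HADOOP_DISTRO_NAME
# (taken from the list above).
DISTRO_VALUE = {
    "CDH_6.1": "CDH_6.1",
    "CDP_7.1": "CDH_6.1",
    "CDW_7.2": "CDH_6.1",
    "EMR_5.20": "EMR_5.20",
    "EMR_6.1": "EMR_5.20",
    "EMR_6.2": "EMR_5.20",
    "EMR_6.3": "EMR_5.20",
    "EMR_6.4": "EMR_5.20",
    "HDInsight_4.0": "HDInsight_4.0",
    "HDP_3.1": "HDP_3.1",
}


def dtm_custom_property(distribution: str) -> dict:
    """Return the custom property fields for the given distribution.

    Hypothetical helper; raises ValueError for an unlisted distribution.
    """
    if distribution not in DISTRO_VALUE:
        raise ValueError(f"Unsupported distribution: {distribution}")
    return {
        "Service": "Data Integration Service",
        "Type": "DTM",
        "Name": "INFA_HADOOP_DISTRO_NAME",
        "Value": DISTRO_VALUE[distribution],
    }
```

For example, dtm_custom_property("CDW_7.2") yields a Value of CDH_6.1, matching the first rule in the list.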
Step 3. Restart the Secure Agent
After you complete the configuration and set the properties, restart the Secure Agent for the changes to take effect.