Complex File Connection Properties
When you set up a Complex File connection, you must configure the connection properties.
The following table describes the Complex File connection properties:
Connection Property | Description |
---|---|
Connection Name | Name of the Complex File connection. |
Description | Description of the connection. The description cannot exceed 765 characters. |
Type | Type of connection. Select Complex File. |
Runtime Environment | The name of the runtime environment where you want to run the tasks. |
User Name | Required to read data from HDFS. Enter a user name that has access to the single-node HDFS location that you want to read data from or write data to. |
NameNode URI | The URI to access HDFS. Use the following format to specify the name node URI in Cloudera, Amazon EMR, and Hortonworks distributions: hdfs://<namenode>:<port>, where <namenode> is the host name or IP address of the name node and <port> is the port on which the name node listens for remote procedure calls (RPC). Specify either the name node URI or the local path. Do not specify the name node URI if you want to read data from or write data to a local file system path. |
Local Path | A local file system path to read data from or write data to. Do not specify a local path if you want to read data from or write data to HDFS. The following conditions apply: (1) You must enter NA in the local path if you specify the name node URI. If the local path does not contain NA, the name node URI does not work. (2) If you specify both the name node URI and a local path, the local path takes precedence and the connection uses the local path to run all tasks. (3) If you leave the local path blank, the agent configures the root directory (/) in the connection and the connection uses the local path to run all tasks. |
Hadoop Distribution | Hadoop distribution name. Enter CLOUDERA, EMR, or HDP based on the HDFS instance you want to use for the connection. You can use Kerberos authentication for the Cloudera CDH and Hortonworks HDP Hadoop distributions. By default, Complex File Connector uses the older versions of these distributions (Cloudera CDH 5.4 and Hortonworks HDP 2.3). For more information on using the Cloudera CDH 5.8 or Hortonworks HDP 2.5 distribution, see Using Cloudera CDH 5.8 or Hortonworks HDP 2.5 Hadoop Distributions. Note: Use all uppercase letters to specify the Hadoop distribution name. |
Keytab File | The file that contains encrypted keys and Kerberos principals to authenticate the machine. |
Principal Name | Users assigned to the superuser privilege can perform all the tasks that a user with the administrator privilege can perform. |
Impersonation Username | You can enable different users to run mappings in a Hadoop cluster that uses Kerberos authentication or connect to sources and targets that use Kerberos authentication. To enable different users to run mappings or connect to big data sources and targets, you must configure user impersonation. |
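
The rules in the table for NameNode URI, Local Path, and Hadoop Distribution can be checked before you enter the values. The following Python sketch is only an illustration and is not part of Complex File Connector: the function name check_connection_properties is hypothetical, and the host name and port in the usage note are placeholder values.

```python
from urllib.parse import urlsplit

# Distribution names listed in the Hadoop Distribution property (uppercase only).
VALID_DISTRIBUTIONS = {"CLOUDERA", "EMR", "HDP"}


def check_connection_properties(name_node_uri, local_path, distribution):
    """Return a list of problems; an empty list means the values look consistent.

    Hypothetical helper that mirrors the documented rules. It does not
    contact HDFS or the Secure Agent.
    """
    problems = []

    # The distribution name must be one of the documented uppercase values.
    if distribution not in VALID_DISTRIBUTIONS:
        problems.append("Hadoop Distribution must be CLOUDERA, EMR, or HDP (uppercase)")

    if name_node_uri:
        # Expected form: hdfs://<namenode>:<port>
        try:
            parts = urlsplit(name_node_uri)
            scheme, host, port = parts.scheme, parts.hostname, parts.port
        except ValueError:
            # Raised for malformed hosts or non-numeric / out-of-range ports.
            scheme, host, port = "", None, None
        if scheme != "hdfs" or not host or port is None:
            problems.append("NameNode URI must look like hdfs://<namenode>:<port>")
        # When a name node URI is specified, the local path must be NA.
        if local_path != "NA":
            problems.append("Local Path must be NA when NameNode URI is specified")
    elif local_path == "NA":
        # NA means "use HDFS", which requires a name node URI.
        problems.append("Specify either a NameNode URI or a local path")

    return problems
```

For example, check_connection_properties("hdfs://namenode.example.com:8020", "NA", "CLOUDERA") returns an empty list, while passing "/data" as the local path with the same URI reports that the local path must be NA.
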
Using Cloudera CDH 5.8 or Hortonworks HDP 2.5 Hadoop Distributions
Complex File Connector supports the following Cloudera CDH and Hortonworks HDP Hadoop distribution versions:
- Cloudera CDH 5.4 and 5.8
- Hortonworks HDP 2.3 and 2.5
By default, Complex File Connector uses the older versions of the Cloudera CDH and Hortonworks HDP Hadoop distributions (Cloudera CDH 5.4 and Hortonworks HDP 2.3). You can configure Complex File Connector to use the Cloudera CDH 5.8 or Hortonworks HDP 2.5 distribution instead.
To configure the Complex File Connector to use Cloudera CDH 5.8 or Hortonworks HDP 2.5 distribution, you must replace all the files and folders in the old version with the files and folders in the new version of the distribution.
Using Cloudera CDH 5.8 Distribution in Complex File Connector
To configure Complex File Connector to use the Cloudera CDH 5.8 distribution, perform the following steps:
1. Download Cloudera CDH 5.4 and 5.8 to the machine where the Secure Agent is installed.
2. Copy all the files and folders from <Secure Agent installation directory>/download/package-Cloudera_5_8.1/package/cloudera_cdh5u8 to <Secure Agent installation directory>/download/package-Cloudera_5_4.1/package/cloudera_cdh5u4 and replace the existing files.
3. Enter CLOUDERA as the value of the Hadoop Distribution field in the connection properties.
4. Click Test to evaluate the connection.
5. Click OK to save the connection.
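
The copy in step 2 can also be scripted. The following Python sketch shows one possible way to do it and is not an Informatica utility: the function name overlay_distribution and the /opt/infaagent directory in the commented example are assumptions, so substitute your actual Secure Agent installation directory.

```python
import shutil
from pathlib import Path


def overlay_distribution(source_dir, target_dir):
    """Copy everything under source_dir into target_dir, replacing same-named files.

    Hypothetical helper: files that exist only in target_dir are left in place.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {source}")
    if not target.is_dir():
        raise FileNotFoundError(f"target directory not found: {target}")
    # dirs_exist_ok=True (Python 3.8+) merges into the existing tree
    # and overwrites files that have the same relative path.
    shutil.copytree(source, target, dirs_exist_ok=True)


# Example call with a placeholder installation directory (an assumption):
# agent_dir = Path("/opt/infaagent")
# overlay_distribution(
#     agent_dir / "download/package-Cloudera_5_8.1/package/cloudera_cdh5u8",
#     agent_dir / "download/package-Cloudera_5_4.1/package/cloudera_cdh5u4",
# )
```

The function refuses to run if either directory is missing, so it cannot silently create a new target tree when a path is mistyped.
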
Using Hortonworks HDP 2.5 Distribution in Complex File Connector
To configure Complex File Connector to use the Hortonworks HDP 2.5 distribution, perform the following steps:
1. Download Hortonworks HDP 2.3 and 2.5 to the machine where the Secure Agent is installed.
2. Copy all the files and folders from <Secure Agent installation directory>/download/package-Hortonworks_2_5.1/package/hortonworks_2.5 to <Secure Agent installation directory>/download/package-Hortonworks_2_3.1/package/hortonworks_2.3 and replace the existing files.
3. Enter HDP as the value of the Hadoop Distribution field in the connection properties.
4. Click Test to evaluate the connection.
5. Click OK to save the connection.
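
Before you test the connection, it can help to confirm that the copy in step 2 replaced every file. The following Python sketch is an optional check and is not part of the product: the function name find_unreplaced_files is hypothetical. It lists the files in the new distribution directory that are missing from, or differ from, the directory that the connector reads.

```python
import filecmp
import os


def find_unreplaced_files(new_dir, old_dir):
    """Return relative paths of files in new_dir that are missing from, or differ in, old_dir."""
    unreplaced = []
    for root, _dirs, files in os.walk(new_dir):
        for name in files:
            new_file = os.path.join(root, name)
            relative = os.path.relpath(new_file, new_dir)
            old_file = os.path.join(old_dir, relative)
            # shallow=False compares file contents, not only size and timestamps.
            if not os.path.isfile(old_file) or not filecmp.cmp(new_file, old_file, shallow=False):
                unreplaced.append(relative)
    return sorted(unreplaced)
```

Call it with the hortonworks_2.5 directory as new_dir and the hortonworks_2.3 directory as old_dir. An empty list suggests that the copy completed, and any listed path points to a file that was not replaced.
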