Create a connection

Before you configure the Hadoop Distributed File System catalog source, create a connection object in Administrator.
Ensure that you have the required information to connect to the Hadoop Distributed File System.
Before you create a connection, configure the Hadoop Files V2 connector to download the Hadoop Distributed File System third-party libraries for the Cloudera CDP, Amazon EMR, Azure HDInsight, or Google Dataproc cluster. For more information about the Hadoop Files V2 connector, see the Data Integration Connectors help.
    1. In Administrator, select Connections.
    2. Click Add Connection.
    3. Search for and select Hadoop Files V2, and then click Next.
    4. Enter the following connection details:
        Connection Name: Unique name of the Hadoop Distributed File System connection that meets the following criteria:
            - Can contain alphanumeric characters, spaces, and the following special characters: _ . + -
            - Maximum length is 100 characters.
            - Is not case sensitive.
        Description: Optional description of the connection. The maximum permitted length is 255 characters.
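The naming criteria above can be checked before you enter the name. The following is a minimal Python sketch and is not part of Administrator. The names is_valid_connection_name and existing_names are hypothetical. The sketch assumes that "alphanumeric" means ASCII letters and digits, that a name contains at least one character, and that "not case sensitive" means two names that differ only in case count as the same name.

```python
import re

# Allowed: ASCII letters, digits, spaces, and the special characters _ . + -
# Length: 1 to 100 characters (the lower bound of 1 is an assumption).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _.+\-]{1,100}")


def is_valid_connection_name(name, existing_names=()):
    """Return True if name meets the documented criteria and does not
    collide, ignoring case, with any name in existing_names."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    taken = {existing.casefold() for existing in existing_names}
    return name.casefold() not in taken
```

For example, is_valid_connection_name("HDFS", ["hdfs"]) returns False under the case-insensitivity assumption, because the two names differ only in case.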
    5. To use Kerberos authentication to connect to the Hadoop Distributed File System source system, enter the following properties:
        Runtime Environment: The runtime environment to use, either an Informatica Cloud Secure Agent or a serverless runtime environment.
        NameNode URI: The access URI of the Hadoop Distributed File System instance.
        Configuration Files Path: The directory that contains the Hadoop Distributed File System configuration files for Kerberos authentication.
        Keytab File: The path and file name of the keytab file that contains the encrypted keys and Kerberos principals for Kerberos login.
        Principal Name: The principal name that you use to connect to the Hadoop Distributed File System with Kerberos authentication.
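Before you enter the Kerberos properties, it can help to confirm that the paths exist on the machine that hosts the runtime environment. The following Python sketch is an illustration only. The function name check_kerberos_inputs is hypothetical, and the sketch assumes that the principal follows the common Kerberos form primary[/instance]@REALM. The connector might accept other forms.

```python
import os
import re

# Assumed principal shape: primary[/instance]@REALM
_PRINCIPAL = re.compile(r"[^/@\s]+(/[^/@\s]+)?@[^/@\s]+")


def check_kerberos_inputs(config_dir, keytab_file, principal):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if not os.path.isdir(config_dir):
        problems.append(f"configuration directory not found: {config_dir}")
    if not os.path.isfile(keytab_file):
        problems.append(f"keytab file not found: {keytab_file}")
    elif not os.access(keytab_file, os.R_OK):
        problems.append(f"keytab file is not readable: {keytab_file}")
    if not _PRINCIPAL.fullmatch(principal):
        problems.append(f"principal does not look like primary[/instance]@REALM: {principal}")
    return problems
```

The check only inspects the local file system and the shape of the principal string. It does not contact the Kerberos key distribution center, so the Test button in Administrator remains the real verification.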
    6. To use non-Kerberos authentication with the configuration file to connect to the Hadoop Distributed File System source system, enter the following properties:
        Runtime Environment: The runtime environment to use, either an Informatica Cloud Secure Agent or a serverless runtime environment.
        User Name: Name of the user that connects to the Hadoop Distributed File System instance.
        NameNode URI: The access URI of the Hadoop Distributed File System instance in one of the following formats:
            - hdfs://<NameNodeURI>:<port>/
            - hdfs://<NameNodeURI>:<port>/<source directory>
            Note: If you don't enter <source directory>, you can include the directory in Metadata Command Center. In the Filters area, select Folder and include the source directory.
        Configuration Files Path: The directory that contains the Hadoop Distributed File System configuration files for non-Kerberos authentication.
    7. To use non-Kerberos authentication without the configuration file to connect to the Hadoop Distributed File System source system, enter the following properties:
        Runtime Environment: The runtime environment to use, either an Informatica Cloud Secure Agent or a serverless runtime environment.
        User Name: Name of the user that connects to the Hadoop Distributed File System instance.
        NameNode URI: The access URI of the Hadoop Distributed File System instance in one of the following formats:
            - hdfs://<NameNodeURI>:<port>/
            - hdfs://<NameNodeURI>:<port>/<source directory>
            Note: If you don't enter <source directory>, you can include the directory in Metadata Command Center. In the Filters area, select Folder and include the source directory.
    8. Click Test to test your connection to the source system.
    9. Click Save.
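
The two NameNode URI formats listed for non-Kerberos authentication can be checked before you click Test. The following Python sketch uses the standard urllib.parse module to split a URI into host, port, and optional source directory. The function name parse_namenode_uri and the sample host and port are hypothetical, and the connector might accept URIs that this sketch rejects.

```python
from urllib.parse import urlsplit


def parse_namenode_uri(uri):
    """Split hdfs://<host>:<port>/[<source directory>] into its parts.

    Returns (host, port, source_dir), where source_dir is None when the
    URI ends right after the slash that follows the port.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError("URI must start with hdfs://")
    if not parts.hostname:
        raise ValueError("URI must include a NameNode host")
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError as exc:
        raise ValueError("port must be a number between 0 and 65535") from exc
    if port is None:
        raise ValueError("URI must include a port")
    if not parts.path.startswith("/"):
        raise ValueError("URI must have a / after the port")
    source_dir = parts.path.rstrip("/") or None
    return parts.hostname, port, source_dir


# Sample values only; replace them with the details of your own cluster.
host, port, source_dir = parse_namenode_uri("hdfs://namenode.example.com:8020/data/landing")
```

When source_dir is None, the URI uses the first format, and the source directory can instead be included through the Folder filter in Metadata Command Center.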