Create the Cluster Configuration

Import the cluster information into the domain. When you import cluster information, you import values from *-site.xml files to create a domain object called a cluster configuration.
Choose one of the following options to import cluster properties:
Import from cluster
When you import directly from the cluster, you enter cluster connection information. The Service Manager uses the information to connect to the cluster and get cluster configuration properties.
Note: You can import directly from Azure HDInsight, Cloudera CDH, and Hortonworks HDP clusters.
Import from file
When you import from a file, you browse to an archive file that the Hadoop administrator created. Use this option if the Hadoop administrator requires you to do so.
Note: If you import from a MapR or Amazon EMR cluster, you must import from a file.
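
To make the idea of importing values from *-site.xml files concrete, the following minimal sketch reads name and value pairs from a document in the standard Hadoop configuration layout. It is not how the Service Manager performs the import. The sample XML, the host names, and the function name read_site_properties are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a Hadoop *-site.xml file; the values are illustrative.
SAMPLE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
</configuration>
"""


def read_site_properties(xml_text):
    """Return a dict of name/value pairs from one *-site.xml document."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        properties[name.strip()] = (prop.findtext("value") or "").strip()
    return properties


site_properties = read_site_properties(SAMPLE_SITE_XML)
for name, value in sorted(site_properties.items()):
    print(f"{name} = {value}")
```

A cluster configuration groups property sets like this one, one set per *-site.xml file that the import reads.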

Before You Import

Before you can import the cluster configuration, you must get information from the Hadoop administrator, based on the method of import.
If you import directly from the cluster, contact the Hadoop administrator to get cluster connection information. If you import from a file, get an archive file of exported cluster information.
For more information about required cluster information, see the Big Data Management Hadoop Integration Guide.
Note: To import from Amazon EMR or MapR, you must import from an archive file.
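
Before you import from a file, it can help to confirm that the archive contains the *-site.xml files you expect. The following sketch assumes that the archive is a .zip file, which this section does not state; confirm the format with the Hadoop administrator. The function name list_site_files and the member file names are hypothetical.

```python
import io
import zipfile


def list_site_files(archive):
    """Return sorted names of *-site.xml members in a .zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist() if name.endswith("-site.xml")
        )


# Build a small in-memory archive so the sketch runs without a real export.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for member in ("core-site.xml", "hdfs-site.xml", "README.txt"):
        zf.writestr(member, "<configuration/>")
buffer.seek(0)

site_files = list_site_files(buffer)
print(site_files)
```

With a real export, pass the path of the archive file to list_site_files instead of the in-memory buffer.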

Importing a Cluster Configuration from the Cluster

When you import the cluster configuration directly from the cluster, you provide information to connect to the cluster.
Get cluster connection information from the Hadoop administrator.
    1. From the Connections tab, click the ClusterConfigurations node in the Domain Navigator.
    2. From the Actions menu, select New > Cluster Configuration.
    The Cluster Configuration wizard opens.
    3. Configure the following General properties:
    Property
        Description
    Cluster configuration name
        Name of the cluster configuration.
    Description
        Optional description of the cluster configuration.
    Distribution type
        The cluster Hadoop distribution type.
    Distribution version
        Version of the Hadoop distribution.
        Each distribution type has a default version. The default version is the latest version of the Hadoop distribution that Big Data Management supports.
        Note: When the cluster version differs from the default version and Informatica supports more than one version, the cluster configuration import process populates the property with the most recent supported version that is not later than the cluster version. For example, if Informatica supports versions 5.10 and 5.13 and the cluster version is 5.12, the import process populates this property with 5.10, because 5.10 is the most recent supported version that is not later than 5.12.
        You can edit the property to choose any supported version. Restart the Data Integration Service for the changes to take effect.
    Method to import the cluster configuration
        Choose Import from cluster.
    Create connections
        Choose to create Hadoop, HDFS, Hive, and HBase connections.
        If you choose to create connections, the Cluster Configuration wizard associates the cluster configuration with each connection that it creates.
        If you do not choose to create connections, you must manually create them and associate the cluster configuration with them.
        Important: When the wizard creates the Hive connection, it populates the Metadata Connection String and the Data Access Connection String properties with the value from the hive.metastore.uris property. If the Hive metastore and HiveServer2 are running on different nodes, you must update the Metadata Connection String to point to the HiveServer2 host.
    The cluster properties appear.
    4. Configure the following properties:
    Property
        Description
    Host
        IP address of the cluster manager.
    Port
        Port of the cluster manager.
    User ID
        Cluster user ID.
    Password
        Password for the user.
    Cluster name
        Name of the cluster. Use the display name if the cluster manager manages multiple clusters. If you do not provide a cluster name, the wizard imports information based on the default cluster.
    5. Click Next and verify the cluster configuration information on the summary page.
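
The rule that populates the Distribution version property can be sketched as follows. This is an illustration of the behavior described in the 5.10, 5.13, and 5.12 example, not the import process itself. The function names are hypothetical, and the result for a cluster that is older than every supported version is an assumption, because this guide does not describe that case.

```python
def parse_version(text):
    """Turn '5.12' into (5, 12) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_distribution_version(cluster_version, supported_versions):
    """Pick the most recent supported version not later than the cluster version.

    Returns None when every supported version is later than the cluster
    version; that case is an assumption, not documented behavior.
    """
    target = parse_version(cluster_version)
    candidates = [v for v in supported_versions if parse_version(v) <= target]
    return max(candidates, key=parse_version) if candidates else None


chosen = pick_distribution_version("5.12", ["5.10", "5.13"])
print(chosen)
```

The sketch compares versions as tuples of integers rather than as strings, because a string comparison would sort 5.9 after 5.10.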

Importing a Cluster Configuration from a File

You can import properties from an archive file to create a cluster configuration.
Before you import from a file, you must get the archive file from the Hadoop administrator.
    1. From the Connections tab, click the ClusterConfigurations node in the Domain Navigator.
    2. From the Actions menu, select New > Cluster Configuration.
    The Cluster Configuration wizard opens.
    3. Configure the following properties:
    Property
        Description
    Cluster configuration name
        Name of the cluster configuration.
    Description
        Optional description of the cluster configuration.
    Distribution type
        The cluster Hadoop distribution type.
    Distribution version
        Version of the Hadoop distribution.
        Each distribution type has a default version. The default version is the latest version of the Hadoop distribution that Big Data Management supports.
        When the cluster version differs from the default version, the Cluster Configuration wizard populates this property with the most recent supported version that is not later than the cluster version. For example, if Informatica supports versions 5.10 and 5.13 and the cluster version is 5.12, the wizard populates the version with 5.10.
        You can edit the property to choose any supported version. Restart the Data Integration Service for the changes to take effect.
    Method to import the cluster configuration
        Choose Import from file to import properties from an archive file.
    Create connections
        Choose to create Hadoop, HDFS, Hive, and HBase connections.
        If you choose to create connections, the Cluster Configuration wizard associates the cluster configuration with each connection that it creates.
        If you do not choose to create connections, you must manually create them and associate the cluster configuration with them.
        Important: When the wizard creates the Hive connection, it populates the Metadata Connection String and the Data Access Connection String properties with the value from the hive.metastore.uris property. If the Hive metastore and HiveServer2 are running on different nodes, you must update the Metadata Connection String to point to the HiveServer2 host.
    4. Click Browse to select a file. Select the file and click Open.
    5. Click Next and verify the cluster configuration information on the summary page.
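
If you let the wizard create the Hive connection, you might want to check whether the Metadata Connection String needs the manual update described in the Important note. The following sketch compares the hosts in a hive.metastore.uris value with the HiveServer2 host and builds a HiveServer2 JDBC URL. The host names and function names are hypothetical, and port 10000 and the default database are common HiveServer2 defaults that might not match your cluster. Whether the connection property expects exactly this URL form is an assumption; see the Big Data Management Hadoop Integration Guide.

```python
from urllib.parse import urlparse


def metastore_hosts(metastore_uris):
    """Extract host names from a comma-separated hive.metastore.uris value."""
    hosts = []
    for uri in metastore_uris.split(","):
        host = urlparse(uri.strip()).hostname
        if host:
            hosts.append(host)
    return hosts


def needs_metadata_update(metastore_uris, hiveserver2_host):
    """Return True when HiveServer2 runs on a host that is not a metastore host."""
    return hiveserver2_host.lower() not in metastore_hosts(metastore_uris)


def hiveserver2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL; port and database are assumed defaults."""
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical values; replace them with the values from your cluster.
uris = "thrift://metastore.example.com:9083"
hs2_host = "hiveserver2.example.com"

update_needed = needs_metadata_update(uris, hs2_host)
jdbc_url = hiveserver2_jdbc_url(hs2_host)
print(update_needed, jdbc_url)
```

When update_needed is True, edit the Hive connection after the wizard finishes so that the Metadata Connection String points to the HiveServer2 host.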