Prepare to Create the Enterprise Data Lake Services

To create the Enterprise Data Lake services, the domain must be integrated with the Hadoop environment through a domain cluster configuration object.
The Enterprise Data Lake services require connections to the Hadoop environment. The connections are associated with the Hadoop environment through a cluster configuration. The process to integrate the environments and create the services can vary based on the type of installation you choose.

Install Enterprise Data Lake with Informatica Domain Services

If you install the Informatica domain services when you install Enterprise Data Lake and you want to create the Enterprise Data Lake services, you must provide the cluster information during the installation. The installer can import the cluster configuration from the Hadoop environment and create the connections required by the Enterprise Data Lake services.
Before you run the installer, you need to get import information from the Hadoop administrator. The Hadoop administrator can provide import information to you in one of the following formats:
- An archive file (.zip or .tar) that contains the *-site.xml files from the cluster
- Direct connection information for the cluster
Note: When the installation completes, you must fully integrate the domain with the Hadoop environment, including a task to refresh the cluster configuration. If you want to complete all integration tasks at one time, you can skip creating the services during installation and create them manually after you integrate the domain with the Hadoop environment.

Install Enterprise Data Lake on a Node with Enterprise Data Catalog

When you install Enterprise Data Lake on a node with Enterprise Data Catalog, you can choose to create the Enterprise Data Lake services. To create the services, the domain must be integrated with the Hadoop environment before you run the installer.
Before you run the installer, verify that the domain is integrated with the Hadoop environment and that the Hadoop, HDFS, and Hive connections are associated with the cluster configuration. For more information about integrating the domain with the Hadoop environment, see Informatica Big Data Management Hadoop Integration Guide.

Install Enterprise Data Lake and Enterprise Data Catalog on an Existing Node

When you install Enterprise Data Lake and Enterprise Data Catalog on a domain node, the installer installs the service binaries. The installer does not prompt for any configuration. You must manually create the services after the installation completes.
Before you create the services, verify that the domain is integrated with the Hadoop environment and that the Hadoop, HDFS, and Hive connections are associated with the cluster configuration.

Prepare for Archive File Import with a Full Installation

If you want to create the Enterprise Data Lake services when you perform a full installation, you must import properties from the *-site.xml files into the domain. The Hadoop administrator might choose to provide you with a .zip or .tar archive file instead of direct connection information.
If you are integrating with an Amazon EMR or MapR cluster, you must import the cluster configuration through an archive file.
Get an archive file that contains the following *-site.xml files from the cluster:
Note: Verify that the Hadoop administrator creates an archive file from all the listed *-site.xml files. Big Data Management might require files that Enterprise Data Lake does not.
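The packaging step can be sketched as follows. This is a minimal illustration, not the product's own tooling: the file names below are stand-ins created for the example, because the exact set of *-site.xml files depends on the cluster. On a real cluster, the Hadoop administrator copies the actual files from the cluster's configuration directories.

```shell
# Create stand-in *-site.xml files for the example; on a real cluster,
# these are copied from the cluster's configuration directories.
mkdir -p conf-archive
touch conf-archive/core-site.xml conf-archive/hdfs-site.xml conf-archive/hive-site.xml

# Package the files as a .tar archive (a .zip archive also works).
tar -cf cluster-conf.tar -C conf-archive \
  core-site.xml hdfs-site.xml hive-site.xml

# List the archive contents to confirm that every file is present.
tar -tf cluster-conf.tar
```

The files must sit at the top level of the archive so that the installer can find them when it imports the cluster configuration.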
After creating the archive file, the Hadoop administrator needs to edit it for the following distributions:
Azure HDInsight
Edit the Hortonworks Data Platform (HDP) version string wherever it appears in the archive file. Search for the string ${hdp.version} and replace all instances with the HDP version that HDInsight includes in the Hadoop distribution.
Hortonworks HDP
Edit the Hortonworks Data Platform (HDP) version string wherever it appears in the archive file. Search for the string ${hdp.version} and replace all instances with the HDP version that Hortonworks includes in the Hadoop distribution.
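The version-string replacement for both distributions can be sketched as below, assuming GNU sed. The version value and the sample file content are hypothetical; use the HDP version that your distribution actually includes, and run the replacement against every *-site.xml file extracted from the archive.

```shell
# Hypothetical HDP version; substitute the version your distribution includes.
HDP_VERSION="2.6.5.0-292"

# Sample file for the example; in practice, extract the archive and edit
# every *-site.xml file that it contains.
cat > mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/hdp/${hdp.version}/hadoop/*</value>
  </property>
</configuration>
EOF

# Replace all instances of ${hdp.version} in place (GNU sed -i).
sed -i "s/\${hdp\.version}/${HDP_VERSION}/g" ./*-site.xml
```

After the replacement, repackage the edited files into the archive before you run the installer.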

Prepare for Direct Import with a Full Installation

If you want to create the Enterprise Data Lake services when you perform a full installation, you must import properties from the *-site.xml files into the domain. To import the cluster configuration directly from the Hadoop cluster, get connection information from the Hadoop administrator.
You need the following information from the Hadoop administrator to create the cluster configuration directly from the cluster:
Host
IP address of the cluster manager.
Port
Port of the cluster manager.
User ID
Cluster user ID.
Password
Password for the user.
Cluster name
Name of the cluster. Use the display name if the cluster manager manages multiple clusters. If you do not provide a cluster name, the installer imports information based on the default cluster.
Note: To find the correct Cloudera cluster name when you have multiple clusters, the Hadoop administrator can append the string /api/v8/clusters to the Cloudera Manager URL and provide you with the cluster name from the response.
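The lookup in the note above can be sketched as follows. The JSON below is a hypothetical sample of the shape the Cloudera Manager /api/v8/clusters endpoint returns; the cluster names are invented for the example. In practice, the Hadoop administrator fetches the real response, for example with curl against the Cloudera Manager host.

```shell
# Hypothetical sample response from the Cloudera Manager /api/v8/clusters
# endpoint. In practice, fetch the real response with something like:
#   curl -u <user>:<password> http://<cluster-manager-host>:<port>/api/v8/clusters
cat > clusters.json <<'EOF'
{ "items": [
    { "name": "cluster", "displayName": "Cluster 1" },
    { "name": "cluster2", "displayName": "Cluster 2" }
] }
EOF

# Pull the cluster names out of the response.
grep -o '"name": *"[^"]*"' clusters.json
```

The names printed by the last command are the values you can supply as the cluster name when the installer prompts for it.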