
External Hadoop Cluster Deployment

You can deploy Enterprise Data Catalog on a Hadoop cluster that you have set up on Cloudera or Hortonworks. If your enterprise uses Kerberos authentication to authenticate users and services on the network, you can configure the Informatica domain to use Kerberos network authentication.
When you install Enterprise Data Catalog on an external Hadoop cluster in your enterprise, you need to configure the ZooKeeper, HDFS, and YARN specifications. The Catalog Service uses these specifications and launches the following services and components on the Hadoop cluster as a YARN application:

Prerequisites for the External Cluster

Before you install Enterprise Data Catalog to use an external Hadoop cluster, you must verify that the system environment meets the prerequisites required to deploy Enterprise Data Catalog.
Verify that the external Hadoop distribution meets the following prerequisites:

Preparing the External Hadoop Cluster Environment

You need to perform multiple validation checks before you install Enterprise Data Catalog on an external Hadoop cluster.
Perform the following steps before you install Enterprise Data Catalog to use an external cluster:

Kerberos and SSL Setup for an External Cluster

You can install Enterprise Data Catalog on an external cluster that uses Kerberos network authentication to authenticate users and services on a network. Enterprise Data Catalog also supports SSL authentication for secure communication in the cluster.
Kerberos is a network authentication protocol that uses tickets to authenticate access to services and nodes in a network. Kerberos uses a Key Distribution Center (KDC) to validate the identities of users and services and to grant tickets to authenticated user and service accounts. In the Kerberos protocol, users and services are known as principals. The KDC has a database of principals and their associated secret keys, which are used as proof of identity. Kerberos can use an LDAP directory service as a principal database.
Informatica does not support cross-realm or multi-realm Kerberos authentication. The server host, client machines, and Kerberos authentication server must be in the same realm.
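The single-realm requirement can be checked before installation with a short script. The following Python sketch is illustrative only and is not part of the Informatica tooling: the function names `realm_of` and `single_realm` and the sample principal names are assumptions. It takes a list of principal names in the usual `name@REALM` form and confirms that they all share one realm.

```python
def realm_of(principal: str) -> str:
    """Return the realm part of a Kerberos principal such as 'svc/host@EXAMPLE.COM'."""
    # The realm follows the last '@' in the principal name.
    name, sep, realm = principal.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"principal has no realm: {principal!r}")
    return realm


def single_realm(principals):
    """Return the shared realm, or raise ValueError unless exactly one realm appears."""
    realms = {realm_of(p) for p in principals}
    if len(realms) != 1:
        raise ValueError(f"expected exactly one realm, found: {sorted(realms)}")
    return realms.pop()


if __name__ == "__main__":
    # Hypothetical principals; replace them with the ones planned for your domain.
    planned = [
        "isp/node01.example.com@EXAMPLE.COM",
        "HTTP/node01.example.com@EXAMPLE.COM",
    ]
    print("All principals are in realm", single_realm(planned))
```

Running the check against the full list of planned user and service principals can surface a stray realm before you create any keytab files.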
The Informatica domain requires keytab files to authenticate nodes and services in the domain without transmitting passwords over the network. The keytab files contain the service principal names (SPN) and associated encrypted keys. Create the keytab files before you create nodes and services in the Informatica domain.
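Because the keytab files must exist before you create nodes and services, a quick pre-flight check can help. The following Python sketch is a minimal example under stated assumptions: the function name `check_keytab` and the file paths are hypothetical, and the check relies only on the MIT keytab file format, in which a file begins with the byte 0x05 followed by a version byte of 1 or 2. It does not verify which principals a keytab contains.

```python
from pathlib import Path

KEYTAB_MAGIC = 0x05          # first byte of an MIT-format keytab file
KEYTAB_VERSIONS = (1, 2)     # valid values for the second (version) byte


def check_keytab(path):
    """Return a list of problems found with one keytab file (empty if none)."""
    p = Path(path)
    if not p.is_file():
        return [f"{p}: file not found"]
    with p.open("rb") as handle:
        header = handle.read(2)  # only the two-byte version indicator is needed
    if len(header) < 2 or header[0] != KEYTAB_MAGIC or header[1] not in KEYTAB_VERSIONS:
        return [f"{p}: does not look like a keytab file"]
    return []


if __name__ == "__main__":
    # Hypothetical paths; substitute the keytab files created for your domain.
    for keytab in ["/path/to/node01.keytab", "/path/to/catalog_service.keytab"]:
        for problem in check_keytab(keytab):
            print(problem)
```

An empty result means only that the file is present and has a keytab header. To confirm the service principal names it contains, list the entries with your Kerberos client tools.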

Prerequisites for SSL Authentication

Before you enable SSL authentication in the cluster, verify that the external cluster meets the following requirements:

Prerequisites for Kerberos Authentication

Perform the following steps before you enable Kerberos authentication for the external cluster:
Note: Enterprise Data Catalog does not support deployment on a Hortonworks version 2.6 cluster where Kerberos is enabled.