Running Mappings with Kerberos Authentication Overview
You can run mappings on a Hadoop cluster that uses MIT or Microsoft Active Directory (AD) Kerberos authentication. Kerberos is a network authentication protocol that uses tickets to authenticate access to services and nodes in a network.
To run mappings on a Hadoop cluster that uses Kerberos authentication, you must configure the Informatica domain to enable mappings to run in the Hadoop cluster.
If the Informatica domain also uses Kerberos authentication, you must configure a one-way cross-realm trust so that the Hadoop cluster can communicate with the Informatica domain. In this configuration, the Informatica domain uses Kerberos authentication on an AD service, and the Hadoop cluster uses Kerberos authentication on an MIT service. The one-way cross-realm trust enables the MIT service to communicate with the AD service.
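As a rough illustration of what "one-way" means, the following Python sketch models the direction of a cross-realm trust and the name of the cross-realm ticket-granting principal. It is not Informatica or Kerberos tooling. The realm names AD.EXAMPLE.COM and HADOOP.EXAMPLE.COM and both helper functions are hypothetical, and the sketch assumes the common arrangement in which the MIT realm of the Hadoop cluster trusts the AD realm of the Informatica domain. In standard Kerberos naming, a client in realm A reaches services in realm B through the principal krbtgt/B@A.

```python
# Illustrative sketch only: models the naming and direction of a one-way
# cross-realm trust. Realm names are hypothetical placeholders.

def cross_realm_tgt_principal(client_realm: str, service_realm: str) -> str:
    """Name of the principal that lets clients in client_realm
    request tickets for services in service_realm."""
    return f"krbtgt/{service_realm}@{client_realm}"


def can_reach(client_realm: str, service_realm: str, trusts: set) -> bool:
    """True if a client can obtain service tickets in service_realm.

    `trusts` holds (client_realm, service_realm) pairs. A one-way trust
    contributes a single pair, so the reverse direction is not implied.
    """
    if client_realm == service_realm:
        return True
    return (client_realm, service_realm) in trusts


AD_REALM = "AD.EXAMPLE.COM"       # assumed realm of the Informatica domain
MIT_REALM = "HADOOP.EXAMPLE.COM"  # assumed realm of the Hadoop cluster

# One-way trust: AD principals may use services in the MIT realm.
TRUSTS = {(AD_REALM, MIT_REALM)}

print(cross_realm_tgt_principal(AD_REALM, MIT_REALM))
print(can_reach(AD_REALM, MIT_REALM, TRUSTS))   # AD user to Hadoop service
print(can_reach(MIT_REALM, AD_REALM, TRUSTS))   # reverse direction
```

Under these assumptions, the first call yields krbtgt/HADOOP.EXAMPLE.COM@AD.EXAMPLE.COM, the AD-to-Hadoop check succeeds, and the reverse check fails, which is the asymmetry that distinguishes a one-way trust from a two-way trust.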
Depending on whether the Informatica domain uses Kerberos authentication, you might need to perform the following tasks to run mappings on a Hadoop cluster that uses Kerberos authentication:
- If you run mappings in a Hadoop environment, you can choose to configure user impersonation to enable other users to run mappings on the Hadoop cluster. Otherwise, the Data Integration Service user can run mappings on the Hadoop cluster.
- If you run mappings in the native environment, you must configure the mappings to read and process data from Hive sources that use Kerberos authentication.
- If you run a mapping that has Hive sources or targets, you must enable user authentication for the mapping on the Hadoop cluster.
- If you want to run a mapping with the Blaze runtime engine, you configure settings on the Informatica domain. To run a mapping on a cluster with High Availability, you configure additional settings.
- If you import metadata from Hive, complex file sources, and HBase sources, you must configure the Developer tool to use Kerberos credentials to access the Hive, complex file, and HBase metadata.