Authentication and Authorization Overview
You can configure security for Big Data Management and the Hadoop cluster to protect against threats from inside and outside the network. Security for Big Data Management includes security for the Informatica domain and security for the Hadoop cluster.
Security for the Hadoop cluster includes the following areas:
- Authentication
- When the Informatica domain includes Big Data Management, user identities must be authenticated in the Informatica domain and the Hadoop cluster. Authentication for the Informatica domain is separate from authentication for the Hadoop cluster.
- By default, Hadoop does not verify the identity of users. To authenticate user identities, you can configure the following authentication protocols on the cluster:
- - Native authentication
- - Lightweight Directory Access Protocol (LDAP)
- - Kerberos, when the Hadoop distribution supports it
- - Apache Knox Gateway
- Big Data Management also supports Hadoop clusters that use a Microsoft Active Directory (AD) Key Distribution Center (KDC) or an MIT KDC.
- Authorization
- After a user is authenticated, the user must be authorized to perform actions. For example, to use data in a mapping, a user must have the correct permissions to access the directories where that data is stored.
- You can run mappings on a cluster that uses one of the following security management systems for authorization:
- - Cloudera Navigator Encrypt
- - HDFS permissions
- - User impersonation
- - Apache Ranger
- - Apache Sentry
- - HDFS Transparent Encryption
- Data and metadata management
- Data and metadata management involves managing data to track and audit data access, update metadata, and perform data lineage. Big Data Management supports Cloudera Navigator and Metadata Manager to manage metadata and perform data lineage.
- Data security
- Data security involves protecting sensitive data from unauthorized access. Big Data Management supports data masking with the Data Masking transformation in the Developer tool, Dynamic Data Masking, and Persistent Data Masking.
- Operating system profiles
- An operating system profile is a type of security that the Data Integration Service uses to run mappings. Use operating system profiles to increase security and to isolate the run-time environment for users. Big Data Management supports operating system profiles on all Hadoop distributions.
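The point above that Hadoop does not verify user identities by default can be checked against a cluster's configuration. The following is a minimal sketch, not an Informatica utility. It assumes the cluster exposes the standard Hadoop property `hadoop.security.authentication` in a `core-site.xml`-style file, where an unset value falls back to `simple` (no identity verification). The sample XML and the function names `read_hadoop_properties` and `cluster_authentication_mode` are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative configuration content; a real cluster file will contain
# many more properties and may use different values.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Return a dict of property name -> value from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def cluster_authentication_mode(xml_text):
    """Return the configured authentication mode.

    Assumption: when the property is absent, Hadoop uses 'simple',
    which means user identities are not verified.
    """
    props = read_hadoop_properties(xml_text)
    return props.get("hadoop.security.authentication", "simple")


if __name__ == "__main__":
    print(cluster_authentication_mode(SAMPLE_CORE_SITE))
    print(cluster_authentication_mode("<configuration></configuration>"))
```

A check like this only reports what the configuration file declares. It does not confirm that the KDC, keytabs, or principals are set up correctly.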
Support for Authentication Systems
Depending on the run-time engine that you use, you can run mappings on a Hadoop cluster that uses a supported security management system.
Hadoop clusters use a variety of security management systems for user authentication. The following table shows the run-time engines supported for the security management system installed on the Hadoop platform:
Hadoop Distribution | Apache Knox | Kerberos | LDAP |
---|---|---|---|
Amazon EMR | No support | Native, Blaze, Spark, Hive | Native, Blaze, Spark, Hive |
Azure HDInsight* | No support | Native, Blaze, Spark, Hive | No support |
Cloudera CDH | No support | Native, Blaze, Spark, Hive | Native, Blaze, Spark, Hive |
Hortonworks HDP | Native, Blaze, Spark, Hive | Native, Blaze, Spark, Hive | Native, Blaze, Spark, Hive |
MapR | No support | Native, Blaze, Spark, Hive | No support |
*Informatica supports Kerberos and SASL authentication on an Azure HDInsight cluster that uses WASB storage only.
Hadoop clusters with Kerberos authentication also support SASL.
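When you plan a deployment, it can help to encode the authentication support table as data and query it. The following is a minimal sketch that transcribes the table above. The names `AUTHENTICATION_SUPPORT`, `supported_engines`, and `is_supported` are hypothetical and are not part of any Informatica API.

```python
# Run-time engines as listed in the authentication support table.
ALL_ENGINES = frozenset({"Native", "Blaze", "Spark", "Hive"})
NO_SUPPORT = frozenset()

# Transcription of the table: distribution -> authentication system -> engines.
# Note: the Azure HDInsight entry applies to clusters that use WASB storage only.
AUTHENTICATION_SUPPORT = {
    "Amazon EMR": {
        "Apache Knox": NO_SUPPORT, "Kerberos": ALL_ENGINES, "LDAP": ALL_ENGINES,
    },
    "Azure HDInsight": {
        "Apache Knox": NO_SUPPORT, "Kerberos": ALL_ENGINES, "LDAP": NO_SUPPORT,
    },
    "Cloudera CDH": {
        "Apache Knox": NO_SUPPORT, "Kerberos": ALL_ENGINES, "LDAP": ALL_ENGINES,
    },
    "Hortonworks HDP": {
        "Apache Knox": ALL_ENGINES, "Kerberos": ALL_ENGINES, "LDAP": ALL_ENGINES,
    },
    "MapR": {
        "Apache Knox": NO_SUPPORT, "Kerberos": ALL_ENGINES, "LDAP": NO_SUPPORT,
    },
}


def supported_engines(distribution, system):
    """Return the sorted list of engines supported for a distribution and system."""
    return sorted(AUTHENTICATION_SUPPORT[distribution][system])


def is_supported(distribution, system, engine):
    """Return True if the engine is listed for the distribution and system."""
    return engine in AUTHENTICATION_SUPPORT[distribution][system]


if __name__ == "__main__":
    print(supported_engines("Hortonworks HDP", "Apache Knox"))
    print(is_supported("MapR", "LDAP", "Hive"))
```

An unknown distribution or system name raises `KeyError` in this sketch, which keeps typos from being silently read as "No support".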
Support for Authorization Systems
Depending on the run-time engine that you use, you can run mappings on a Hadoop cluster that uses a supported security management system.
Hadoop clusters use a variety of security management systems for user authorization. The following table shows the run-time engines supported for the security management system installed on the Hadoop platform:
Hadoop Distribution | Apache Ranger | Apache Sentry | HDFS Transparent Encryption | SSL/TLS | SQL Authorization |
---|---|---|---|---|---|
Amazon EMR | No support | No support | No support | No support | No support |
Azure HDInsight* | | No support | No support | No support | |
Cloudera CDH | No support | Native, Blaze, Spark, Hive | | Native, Blaze, Spark, Hive | |
Hortonworks HDP | Note: Also supports SQL authorization | No support | | Native, Blaze, Spark, Hive | |
MapR | No support | No support | No support | Native, Blaze, Spark, Hive | No support |
*Informatica supports Apache Ranger and SQL authorization on an Azure HDInsight cluster that uses WASB storage only.
The combination of Apache Ranger and SQL authorization is supported on Hortonworks HDP only.
The combination of Apache Sentry and SQL authorization is supported on Cloudera, Red Hat, and SUSE only.
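Several cells in the authorization table above are blank, so a lookup over it needs three outcomes instead of two. The following is a minimal sketch that transcribes the table and reports blank cells as "not listed" instead of guessing. The names `COLUMNS`, `ROWS`, and `support_status` are hypothetical and are not part of any Informatica API.

```python
ALL_ENGINES = ("Native", "Blaze", "Spark", "Hive")
NO_SUPPORT = ()
NOT_LISTED = None  # the cell lists no engines in the table above

COLUMNS = (
    "Apache Ranger",
    "Apache Sentry",
    "HDFS Transparent Encryption",
    "SSL/TLS",
    "SQL Authorization",
)

# One tuple per distribution, in the same order as COLUMNS.
# The Hortonworks HDP cell for Apache Ranger carries only a note about
# SQL authorization and lists no engines, so it is recorded as NOT_LISTED.
ROWS = {
    "Amazon EMR": (NO_SUPPORT, NO_SUPPORT, NO_SUPPORT, NO_SUPPORT, NO_SUPPORT),
    "Azure HDInsight": (NOT_LISTED, NO_SUPPORT, NO_SUPPORT, NO_SUPPORT, NOT_LISTED),
    "Cloudera CDH": (NO_SUPPORT, ALL_ENGINES, NOT_LISTED, ALL_ENGINES, NOT_LISTED),
    "Hortonworks HDP": (NOT_LISTED, NO_SUPPORT, NOT_LISTED, ALL_ENGINES, NOT_LISTED),
    "MapR": (NO_SUPPORT, NO_SUPPORT, NO_SUPPORT, ALL_ENGINES, NO_SUPPORT),
}


def support_status(distribution, system, engine):
    """Return 'supported', 'no support', or 'not listed' for one table cell."""
    cell = ROWS[distribution][COLUMNS.index(system)]
    if cell is NOT_LISTED:
        return "not listed"
    return "supported" if engine in cell else "no support"


if __name__ == "__main__":
    print(support_status("MapR", "SSL/TLS", "Blaze"))
    print(support_status("Cloudera CDH", "HDFS Transparent Encryption", "Spark"))
```

Keeping "not listed" separate from "no support" avoids reading a gap in the table as a statement that the combination is unsupported.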