Before you create an Open Table connection, complete the prerequisites.
Using AWS Glue Catalog and Amazon S3 Storage to interact with Apache Iceberg or Delta Lake tables
If you use an AWS Glue Catalog and Amazon S3 Storage to interact with Apache Iceberg or Delta Lake tables, you need to have access to the following AWS services that manage the tables on AWS:
•AWS Glue Catalog: AWS Glue Catalog manages the metadata associated with the Apache Iceberg or Delta Lake tables.
•Amazon S3 Storage: Amazon S3 stores the Apache Iceberg or Delta Lake tables containing actual records in columnar format, organized in partitioned directories.
•Amazon Athena JDBC driver: Amazon Athena JDBC driver connects to the AWS Glue Catalog to access Apache Iceberg or Delta Lake tables metadata and perform SQL queries on data stored in Amazon S3 storage.
You need to create separate policies to access these services.
Using Hive Metastore catalog and Microsoft Azure Delta Lake Storage Gen2 to interact with Apache Iceberg tables
If you use a Hive Metastore catalog and Microsoft Azure Delta Lake Storage Gen2 to interact with Apache Iceberg tables, you need to have access to the following services that manage the tables on Microsoft Azure Delta Lake Storage Gen2:
•Hive Metastore catalog: Hive Metastore catalog manages the metadata associated with the Apache Iceberg tables.
•Microsoft Azure Delta Lake Storage Gen2: Microsoft Azure Delta Lake Storage Gen2 stores the Apache Iceberg tables containing actual records in columnar format, organized in partitioned directories.
•Hive JDBC driver: Hive JDBC driver connects to the Hive Metastore catalog to access Apache Iceberg tables metadata and perform SQL queries on data stored in Microsoft Azure Delta Lake Storage Gen2.
Using Hive Metastore catalog and Amazon S3 storage to interact with Apache Iceberg tables
If you use a Hive Metastore catalog and Amazon S3 storage to interact with Apache Iceberg tables, you need to have access to the following services that manage the tables on Amazon S3 storage:
•Hive Metastore catalog: Hive Metastore catalog manages the metadata associated with the Apache Iceberg tables.
•Amazon S3 storage: Amazon S3 stores the Apache Iceberg tables containing actual records in columnar format, organized in partitioned directories.
•Hive JDBC driver: Hive JDBC driver connects to the Hive Metastore catalog to access Apache Iceberg tables metadata and perform SQL queries on data stored in Amazon S3 storage.
Using REST catalog and Amazon S3 to interact with Apache Iceberg tables
If you use a REST catalog such as Polaris catalog and Amazon S3 storage to interact with Apache Iceberg tables, you need to have access to the following services that manage the tables on Amazon S3 storage:
•REST catalog: REST catalog manages the metadata associated with the Apache Iceberg tables.
•Amazon S3 storage: Amazon S3 stores the Apache Iceberg tables containing actual records in columnar format, organized in partitioned directories.
Create minimal IAM policies
You need to create IAM policies with the minimum required permissions to interact with Apache Iceberg or Delta Lake tables managed by AWS Glue Catalog. For more information on configuring these policies, refer to the AWS documentation.
Minimum policy for Amazon Athena
The following sample policy shows the minimal Amazon IAM policy to access Amazon Athena:
Before you use Open Table Connector, you need to copy the Amazon Athena or Hive JDBC driver on the Linux machine where you installed the Secure Agent. You need to use the Amazon Athena driver for the AWS Glue Catalog and the Hive JDBC driver for the Hive Metastore catalog.
1Download the latest Amazon Athena or Hive JDBC driver from the website.
2Navigate to the following directory on the Secure Agent machine: <Secure Agent installation directory>/ext/connectors/thirdparty/
3Create the following folder: informatica.opentableformat/common
4Add the JDBC driver to the folder.
5Restart the Secure Agent.
Configure EC2 role to assume role
You can configure an EC2 role to assume an IAM role and generate temporary security credentials to connect to Amazon S3 from the same or different AWS accounts.
The EC2 role can assume another IAM role from the same or different AWS account without requiring a permanent access key and secret key.
When you configure EC2 role to assume role, ensure that you have the sts:AssumeRole permission and a trust relationship established within the AWS accounts to use the temporary security credentials. The trust relationship is defined in the trust policy of the IAM role when you create the role. The IAM role adds the EC2 role as a trusted entity allowing the EC2 role to use the temporary security credentials and access the AWS accounts.
When the trusted EC2 role requests for the temporary security credentials, the AWS Security Token Service (AWS STS) dynamically generates the temporary security credentials that are valid for a specified period and provides the credentials to the trusted EC2 role.
Before you use the EC2 Role to Assume Role authentication, consider the following prerequisites:
•Install the Secure Agent on the AWS EC2 instance.
•The EC2 role attached to the AWS EC2 instance must have permissions to assume another IAM role.
The following is a sample permission policy of EC2 role that is attached to the AWS EC2 instance:
Resource value must include the ARN of IAM role that the EC2 role needs to assume.
•The IAM role that the EC2 role needs to assume must have a permission policy and a trust policy attached to access AWS Glue Catalog, Amazon Athena, and Amazon S3.
You can also specify the external ID of your AWS account for a more secure access. The external ID must be a string.
The following sample shows the assumed IAM role's trust policy with the external ID:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::001234567890:root" //anyone in this account 001234567890 can assume this role, this can also be limited to one role. }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "aws_externalid" } } } ] }