Open Table connection properties

Create an Open Table connection to securely load data to Open Table formats available in a catalog.

Prerequisites

Before you create an Open Table connection, complete the prerequisites.

If you use an AWS Glue Catalog and Amazon S3 Storage to interact with Apache Iceberg, you need to have access to the following AWS services that manage the tables on AWS:

•AWS Glue Catalog: AWS Glue Catalog manages the metadata associated with the Apache Iceberg tables.
•Amazon S3 Storage: Amazon S3 stores the Apache Iceberg tables containing actual records in columnar format, organized in partitioned directories.
•Amazon Athena: Amazon Athena uses the AWS Glue Data Catalog to store metadata such as table and column names for your data stored in Amazon S3. Open Table Connector uses the Amazon Athena JDBC driver to connect to the AWS Glue Catalog to access Apache Iceberg table metadata.

You need to create separate policies to access these services.

Create minimal IAM policies

You need to create IAM policies with the minimum required permissions to interact with Apache Iceberg tables managed by AWS Glue Catalog. For more information on configuring these policies, refer to the AWS documentation.

Minimum policy for Amazon Athena: The following sample policy shows the minimal Amazon IAM policy to access Amazon Athena:
Minimum policy for AWS Glue: The following sample policy shows the minimal Amazon IAM policy to access AWS Glue Catalog:
Minimum policy for AWS S3: The following sample policy shows the minimal Amazon IAM policy to read from or write data to an Amazon S3 bucket:

Configure EC2 role to assume role

You can configure an EC2 role to assume an IAM role and generate temporary security credentials to connect to Amazon S3 from the same or different AWS accounts.

When you configure an EC2 role to assume a role, ensure that you have the sts:AssumeRole permission and that a trust relationship is established between the AWS accounts so that the temporary security credentials can be used. The trust relationship is defined in the trust policy of the IAM role when you create the role. The IAM role adds the EC2 role as a trusted entity, which allows the EC2 role to use the temporary security credentials and access the AWS accounts.

When the trusted EC2 role requests temporary security credentials, the AWS Security Token Service (AWS STS) dynamically generates credentials that are valid for a specified period and provides them to the trusted EC2 role.

Before you use the EC2 Role to Assume Role authentication, consider the following prerequisites:

•Install the Secure Agent on the AWS EC2 instance.
•The EC2 role attached to the AWS EC2 instance must have permissions to assume another IAM role.

The following is a sample permission policy for the EC2 role that is attached to the AWS EC2 instance:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::001234567890:role/open_table_rolearn"
}
]
}

The Resource value must include the ARN of the IAM role that the EC2 role needs to assume.

•The IAM role that the EC2 role needs to assume must have a permission policy and a trust policy attached to access AWS Glue Catalog, Amazon Athena, and Amazon S3.

You can also specify the external ID of your AWS account for more secure access. The external ID must be a string.

The following sample shows the assumed IAM role's trust policy with the external ID. The root principal allows any identity in account 001234567890 to assume the role. You can also limit the principal to a single role.

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::001234567890:root"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "aws_externalid"
}
}
}
]
}

For more information about the minimum permission policies, see Create minimal IAM policies.
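The two sample policies above differ only in the ARNs and the optional external ID, so they can be generated rather than edited by hand. The following is a minimal Python sketch under that assumption. The function names are hypothetical and are not part of the connector or of any AWS SDK. The output is plain JSON that you would still need to attach to the roles through the AWS console or your usual tooling.

```python
import json
from typing import Optional


def build_ec2_permission_policy(assumed_role_arn: str) -> dict:
    """Permission policy for the EC2 role: allow it to assume one IAM role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                # Must be the ARN of the IAM role that the EC2 role assumes.
                "Resource": assumed_role_arn,
            }
        ],
    }


def build_trust_policy(trusted_principal_arn: str,
                       external_id: Optional[str] = None) -> dict:
    """Trust policy for the assumed IAM role, with an optional external ID."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": trusted_principal_arn},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # The caller must supply this exact string when it assumes the role.
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    permission = build_ec2_permission_policy(
        "arn:aws:iam::001234567890:role/open_table_rolearn"
    )
    trust = build_trust_policy(
        "arn:aws:iam::001234567890:root", external_id="aws_externalid"
    )
    print(json.dumps(permission, indent=2))
    print(json.dumps(trust, indent=2))
```

Passing a role ARN instead of the account root ARN as the trusted principal limits the trust relationship to that single role.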

Connect to Open Table

Let's configure the Open Table connection properties to connect to AWS Glue Catalog.

Before you begin

Before you get started, you will need to create the minimal IAM policies to interact with Apache Iceberg tables managed by AWS Glue Catalog. You also need to configure the authentication-specific prerequisites to connect to Amazon S3.

Permanent IAM Credentials authentication for Amazon S3 requires the access key and secret key values of the IAM user. EC2 Role to Assume Role authentication for Amazon S3 requires the ARN of the IAM role that the EC2 role assumes to generate temporary security credentials.

Check out Prerequisites to learn more about how to configure the policies and roles to access Apache Iceberg tables.

Connection details

The following table describes the Open Table connection properties:

Property	Description
Connection Name	Name of the connection. Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + - Maximum length is 255 characters.
Description	Description of the connection. Maximum length is 4000 characters.
Use Secret Vault	Stores sensitive credentials for this connection in the secrets manager that is configured for your organization. This property appears only if secrets manager is set up for your organization. This property is not supported by Data Ingestion and Replication and the Data Access Management services. When you enable the secret vault in the connection, you can select which credentials the Secure Agent retrieves from the secrets manager. If you don't enable this option, the credentials are stored in the repository or on a local Secure Agent, depending on how your organization is configured. Note: If you're using this connection to apply data access policies through pushdown or proxy services, you cannot use the Secret Vault configuration option. For information about how to configure and use a secrets manager, see Secrets manager configuration.
Runtime Environment	The name of the runtime environment where you want to run tasks. You cannot run a database ingestion task on a Hosted Agent or in a serverless runtime environment.
Open Table Format	The Open Table format that you want to use to read from or write data to a catalog. Select Apache Iceberg from the list.
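The naming rules in the table can be checked before you create a connection, for example in a script that provisions connections in bulk. The following Python sketch is an illustration only. The function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the uniqueness rule is not checked because it depends on the connections that already exist in your organization.

```python
import re

# Letters, digits, spaces, and the special characters _ . + -
# Assumption: "alphanumeric" means ASCII letters and digits only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _.+\-]+")

MAX_NAME_LENGTH = 255
MAX_DESCRIPTION_LENGTH = 4000


def is_valid_connection_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and fits the limit."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if the description fits the documented length limit."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


if __name__ == "__main__":
    for candidate in ["Iceberg Glue-prod_1.0", "bad/name", "x" * 256]:
        print(repr(candidate[:30]), is_valid_connection_name(candidate))
```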

Catalog types

You can select AWS Glue Catalog as the catalog type to manage the metadata of the Open Table format that you selected.

Select the catalog type that your Open Table format uses and then configure the catalog specific parameters.

AWS Glue Catalog

If the Apache Iceberg Open Table format uses AWS Glue Catalog as the catalog type, configure the properties specific to AWS Glue Catalog.

The following table describes the properties to configure AWS Glue Catalog:

Property	Description
Athena JDBC URL	Enter the JDBC URL in the following format: jdbc:athena://Region=<AWS_Region>;OutputLocation=<S3_Location> For example, jdbc:athena://Region=us-west-1;OutputLocation=s3://working/dir
Catalog Authentication Type	The authentication method to connect to the catalog. Select one of the following options: - None. Connects to AWS Glue Catalog without any authentication credentials. - OAuth 2.0 Client Credentials. Connects to a REST catalog using a Client ID and Client Secret to obtain an access token from the authorization server.
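The Athena JDBC URL follows a fixed pattern, so it can be assembled from its two parts. The following Python sketch shows one way to do that. The function name is hypothetical, and the checks are simple sanity checks for this illustration, not the validation that the connector performs.

```python
def build_athena_jdbc_url(region: str, output_location: str) -> str:
    """Assemble a URL in the form jdbc:athena://Region=...;OutputLocation=..."""
    if not region:
        raise ValueError("region must not be empty")
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// location")
    if ";" in region or ";" in output_location:
        # A semicolon separates the properties in the URL.
        raise ValueError("values must not contain ';'")
    return f"jdbc:athena://Region={region};OutputLocation={output_location}"


if __name__ == "__main__":
    print(build_athena_jdbc_url("us-west-1", "s3://working/dir"))
```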

Storage types

You can choose Amazon S3 as the storage type to store the Open Table format tables.

Select the storage type and configure the storage specific authentication parameters.

Amazon S3

If you use AWS Glue Catalog as the catalog type, configure the properties specific to Amazon S3 storage.

Permanent IAM Credentials authentication

You can use Permanent IAM Credentials authentication for Amazon S3 storage when you connect to an AWS Glue Catalog.

The following table describes the properties to configure Permanent IAM Credentials authentication:

Property	Description
Access Key	The key to access the AWS Glue Catalog.
Secret Key	The secret key to access the AWS Glue Catalog. The secret key is associated with the access key and uniquely identifies the account.

EC2 Role to Assume Role authentication

You can use EC2 Role to Assume Role authentication for Amazon S3 storage only when you read Apache Iceberg tables from AWS Glue Catalog.

The following table describes the properties to configure EC2 Role to Assume Role authentication:

Property	Description
IAM Role ARN	The ARN of the IAM role assumed by the EC2 role to generate the temporary session credentials.
External ID	A unique, user-defined string value that the IAM role requires the EC2 role to provide when calling the sts:AssumeRole API.
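To see how the IAM Role ARN and External ID values are used, it can help to look at the underlying sts:AssumeRole request. The following Python sketch illustrates the flow and is not the connector's implementation. It assumes that you pass in an STS client such as the one returned by boto3.client("sts"). The function name and the default session name are hypothetical.

```python
from typing import Any, Dict, Optional


def request_temporary_credentials(
    sts_client: Any,
    role_arn: str,
    external_id: Optional[str] = None,
    session_name: str = "open-table-session",  # hypothetical session name
    duration_seconds: int = 3600,
) -> Dict[str, Any]:
    """Call sts:AssumeRole and return the temporary session credentials."""
    params: Dict[str, Any] = {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "DurationSeconds": duration_seconds,
    }
    if external_id:
        # Required only when the trust policy has an sts:ExternalId condition.
        params["ExternalId"] = external_id
    response = sts_client.assume_role(**params)
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

On an EC2 instance with the EC2 role attached, a call such as request_temporary_credentials(boto3.client("sts"), "arn:aws:iam::001234567890:role/open_table_rolearn", external_id="aws_externalid") would return credentials that expire after the requested duration, provided that the trust policy of the IAM role allows the EC2 role to assume it.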