Amazon Athena

Amazon Athena is an interactive query service that you can use to analyze data in Amazon S3 using standard SQL.

Objects extracted

The Metadata Command Center service extracts the following objects from an Amazon Athena source system:

Prerequisites for configuring an Amazon Athena catalog source

Use the Amazon Athena connector to connect to the Amazon Athena source system. For information about configuring a connection in Administrator, see Connections in the Cloud Common Services help.

Configure permissions or access to Amazon Athena

Permissions to extract metadata
This section addresses permissions for configuring an Amazon Athena connection. Amazon Athena uses Amazon S3 buckets to store query results.
Grant the following Identity and Access Management (IAM) permissions to the user for the INFORMATION_SCHEMA database and all user-defined databases that you want to scan:
glue:GetDatabases
glue:GetDatabase
glue:GetTables
glue:GetTable
Grant the following IAM permissions to the user to create, manage, execute, and delete prepared statements in Amazon Athena:
athena:CreatePreparedStatement
athena:StartQueryExecution
athena:GetQueryResultsStream
athena:GetQueryResults
athena:GetDatabase
athena:GetDataCatalog
athena:DeletePreparedStatement
athena:GetPreparedStatement
athena:ListDatabases
athena:StopQueryExecution
athena:GetQueryExecution
athena:ListDataCatalogs
Grant the following IAM permissions to the user to perform operations on Amazon S3 buckets:
s3:PutObject
s3:GetObject
s3:GetBucketLocation
Grant permissions that allow you to perform the following operations:
Permissions to run data profiles
You do not need additional permissions to run data profiles.
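The metadata extraction permissions listed above can be collected into a single IAM policy document. The following Python sketch is illustrative only: the action names are copied from the lists in this section, but the statement IDs, the build_policy helper, the example-athena-results bucket name, and the resource scoping ("*" for the AWS Glue and Athena actions, one bucket for the Amazon S3 actions) are assumptions that you would adjust to your own account and security requirements.

```python
import json

# Actions copied from the permission lists in this section.
GLUE_ACTIONS = [
    "glue:GetDatabases",
    "glue:GetDatabase",
    "glue:GetTables",
    "glue:GetTable",
]
ATHENA_ACTIONS = [
    "athena:CreatePreparedStatement",
    "athena:StartQueryExecution",
    "athena:GetQueryResultsStream",
    "athena:GetQueryResults",
    "athena:GetDatabase",
    "athena:GetDataCatalog",
    "athena:DeletePreparedStatement",
    "athena:GetPreparedStatement",
    "athena:ListDatabases",
    "athena:StopQueryExecution",
    "athena:GetQueryExecution",
    "athena:ListDataCatalogs",
]
S3_ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:GetBucketLocation",
]


def build_policy(results_bucket):
    """Return an IAM policy document as a dict.

    results_bucket is a hypothetical name for the S3 bucket that stores
    Athena query results. Resource scoping here is an assumption, not a
    documented requirement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueMetadataRead",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": GLUE_ACTIONS,
                "Resource": "*",  # assumption: narrow to specific catalogs/databases
            },
            {
                "Sid": "AthenaQueries",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ATHENA_ACTIONS,
                "Resource": "*",  # assumption: narrow to specific workgroups/catalogs
            },
            {
                "Sid": "AthenaResultsBucket",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": S3_ACTIONS,
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",    # bucket-level actions
                    f"arn:aws:s3:::{results_bucket}/*",  # object-level actions
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so that it can be reviewed before use.
    print(json.dumps(build_policy("example-athena-results"), indent=2))
```

Keeping the three action groups in separate statements makes it easier to tighten the resource scope of each group independently.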

Data profiling for Amazon Athena

Configure data profiling to run profiles on the metadata extracted from an Amazon Athena source system. You can run data profiles on the following Amazon Athena objects:
You can view the profiling statistics in Data Governance and Catalog. The data profiling task runs profiles on the following data types for Amazon Athena objects:
Sampling type
Determine the sample rows on which you want to run the data profiling task. You can choose one of the following sampling types for an Amazon Athena catalog source:
Note: You can run data quality only on views and external tables that are created in Amazon Athena.

Create a connection to Amazon Athena

When you configure a connection to the Amazon Athena source system in Administrator, you can view the connection properties for that connection on the Registration page in Metadata Command Center.
The following table describes the Amazon Athena connection properties:
Property
Description
Connection Name
Name of the connection.
Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -,
Maximum length is 255 characters.
Description
Description of the connection. Maximum length is 4000 characters.
Type
Amazon Athena
Use Secret Vault
Stores sensitive credentials for this connection in the secrets manager that is configured for your organization.
This property appears only if secrets manager is set up for your organization.
When you enable the secret vault in the connection, you can select which credentials the Secure Agent retrieves from the secrets manager. If you don't enable this option, the credentials are stored in the repository or on a local Secure Agent, depending on how your organization is configured.
For information about how to configure and use a secrets manager, see Secrets manager configuration.
Runtime Environment
The name of the runtime environment where you want to run tasks.
Authentication Type
The authentication mechanism to connect to Amazon Athena. Select Permanent IAM Credentials or EC2 instance profile. Permanent IAM Credentials is the default authentication mechanism and requires an access key and a secret key to connect to Amazon Athena. Use EC2 instance profile when the Secure Agent is installed on an Amazon Elastic Compute Cloud (EC2) system. With this option, you can configure AWS Identity and Access Management (IAM) authentication to connect to Amazon Athena.
JDBC URL
The URL of the Amazon Athena connection.
Enter the JDBC URL in the following format:
jdbc:awsathena://AwsRegion=<region_name>;S3OutputLocation=<S3_Output_Location>;
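As an illustration of the format above, the following Python sketch assembles the JDBC URL from a region name and an S3 output location. The build_athena_jdbc_url helper and the sample values (us-east-1 and s3://example-athena-results/output/) are hypothetical and are not part of the product. The check for semicolons is an assumption based on the fact that the format uses semicolons to separate properties.

```python
def build_athena_jdbc_url(region_name, s3_output_location):
    """Build a JDBC URL in the format shown above.

    Hypothetical helper for illustration only.
    """
    for label, value in (("region_name", region_name),
                         ("s3_output_location", s3_output_location)):
        if not value:
            raise ValueError(f"{label} must not be empty")
        # Semicolons separate properties in the URL, so a value that
        # contains one would break the format.
        if ";" in value:
            raise ValueError(f"{label} must not contain ';'")
    return (
        f"jdbc:awsathena://AwsRegion={region_name};"
        f"S3OutputLocation={s3_output_location};"
    )


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    print(build_athena_jdbc_url("us-east-1", "s3://example-athena-results/output/"))
```

With the sample values, the helper prints jdbc:awsathena://AwsRegion=us-east-1;S3OutputLocation=s3://example-athena-results/output/;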