Before you begin

Before you can extract catalog source metadata, get information from the AWS Glue administrator.
Perform the following prerequisite tasks:

Verify permissions and privileges

To extract AWS Glue metadata, you need account access and permissions to the AWS Glue and Amazon Athena source systems.
Verify that the administrator performs the following tasks:

Create a connection

Use the Amazon Athena connection to connect to the Amazon Athena source system and create a schema to use in AWS Glue. Create the Amazon Athena connection object in Administrator.
    1. In Administrator, select Connections.
    2. Click New Connection.
    3. Enter the following connection details:

        Property
            Description
        Connection Name
            Name of the connection.
            Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -
            Maximum length is 255 characters.
        Description
            Description of the connection. Maximum length is 4000 characters.
        Type
            The Amazon Athena connection type.
        Runtime Environment
            Name of the runtime environment where you want to run the tasks.
        Authentication Type
            The authentication mechanism to connect to Amazon Athena. Select Permanent IAM Credentials or EC2 instance profile.
            Permanent IAM credentials is the default authentication mechanism. It requires an access key and a secret key to connect to Amazon Athena.
            Use the EC2 instance profile when the Secure Agent is installed on an Amazon Elastic Compute Cloud (EC2) system. With this option, you can configure AWS Identity and Access Management (IAM) authentication to connect to Amazon Athena.
            For more information about authentication, see Prepare for authentication.
        Access Key
            Optional. The access key to connect to Amazon Athena.
        Secret Key
            Optional. The secret key to connect to Amazon Athena.
        JDBC URL
            The URL of the Amazon Athena connection.
            Enter the JDBC URL in the following format:
            jdbc:awsathena://AwsRegion=<region_name>;S3OutputLocation=<S3_Output_Location>;
            By default, streaming is enabled. Streaming improves performance and fetches the Amazon Athena query results faster. When you use streaming, ensure that port 444 is open.
            To fetch the query results with pagination instead, set the UseResultsetStreaming property to 0 in the following format:
            jdbc:awsathena://AwsRegion=<region_name>;S3OutputLocation=<S3_Output_Location>;UseResultsetStreaming=0;
        Customer Master Key ID
            Optional. Specify the customer master key ID generated by AWS Key Management Service (AWS KMS) or the Amazon Resource Name (ARN) of your custom key for cross-account access.
            You must generate the customer master key ID for the same region where your Amazon S3 bucket resides. You can specify either the customer-generated customer master key ID or the default customer master key ID.

    4. Click Test Connection.
    5. Click Save.
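
Before you enter the connection details, it can help to check the values offline. The following Python sketch is illustrative only and is not part of the product. It checks a connection name against the character and length rules in the table and assembles a JDBC URL in the documented format. The function names, the example bucket, and the choice to treat "alphanumeric" as ASCII letters and digits are assumptions. The sketch cannot check that a name is unique within the organization.

```python
import re

# Allowed characters per the Connection Name row: alphanumerics, spaces, and _ . + -
# Assumption: "alphanumeric" means ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _.+\-]+")
MAX_NAME_LENGTH = 255


def is_valid_connection_name(name: str) -> bool:
    """Check a name against the documented character and length rules."""
    if not 0 < len(name) <= MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


def build_athena_jdbc_url(region: str, s3_output_location: str,
                          use_streaming: bool = True) -> str:
    """Assemble the JDBC URL in the documented format.

    Streaming is enabled by default, so UseResultsetStreaming=0 is
    appended only when pagination is requested.
    """
    if not region or not s3_output_location:
        raise ValueError("region and s3_output_location are required")
    url = f"jdbc:awsathena://AwsRegion={region};S3OutputLocation={s3_output_location};"
    if not use_streaming:
        url += "UseResultsetStreaming=0;"
    return url


if __name__ == "__main__":
    # Hypothetical region and bucket, for illustration only.
    print(build_athena_jdbc_url("us-east-1", "s3://example-bucket/athena-results/",
                                use_streaming=False))
```

Paste the resulting string into the JDBC URL property. If you keep streaming enabled, the port 444 requirement still applies.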

Get AWS Glue source information

Ask the AWS Glue administrator for the values of the connection properties that you need to configure.
Note: You don't need to create a connection object for AWS Glue. You provide this information when you configure the catalog source.
The following table describes the properties that you need:
Property
    Description
Athena Connection
    The Amazon Athena connection object.
Access Key
    The access key of the Amazon Web Services account.
Security Key
    The secret key of the Amazon Web Services account.
Region
    The Amazon Web Services region from where you want to run the catalog source job.
Use EC2 Role to Assume Role
    Enable the EC2 role to assume the role specified in the IAM Role ARN property.
    Note: Verify that the administrator granted the minimum user permission to access the AWS Glue and Amazon Athena source systems.
IAM Role ARN
    The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that the user assumes to use dynamically generated temporary security credentials.
    This property is required only if you select Yes for the Use EC2 Role to Assume Role property.
    For more information about how to get the ARN of an IAM role, see the AWS documentation.
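
The dependency between the Use EC2 Role to Assume Role and IAM Role ARN properties can be checked before you configure the catalog source. The following Python sketch is a hypothetical checklist helper, not a product API. The class name, field names, and function name are assumptions. The ARN pattern reflects the general shape of an IAM role ARN (arn:<partition>:iam::<12-digit account ID>:role/<role name>) and is only a plausibility check.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# General shape of an IAM role ARN; a plausibility check, not full validation.
_ROLE_ARN = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")


@dataclass
class GlueSourceSettings:
    """Values collected from the AWS Glue administrator (hypothetical container)."""
    athena_connection: str
    region: str
    access_key: Optional[str] = None
    security_key: Optional[str] = None
    use_ec2_role_to_assume_role: bool = False
    iam_role_arn: Optional[str] = None


def find_problems(settings: GlueSourceSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not settings.athena_connection:
        problems.append("Athena Connection is missing")
    if not settings.region:
        problems.append("Region is missing")
    # IAM Role ARN is required only when Use EC2 Role to Assume Role is Yes.
    if settings.use_ec2_role_to_assume_role:
        if not settings.iam_role_arn:
            problems.append("IAM Role ARN is required when Use EC2 Role to Assume Role is Yes")
        elif not _ROLE_ARN.fullmatch(settings.iam_role_arn):
            problems.append("IAM Role ARN does not look like an IAM role ARN")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = GlueSourceSettings("athena_conn", "us-east-1",
                                 use_ec2_role_to_assume_role=True)
    print(find_problems(example))
```

The sketch does not check the Access Key or Security Key values, because this section does not state when they are required.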