Before you begin

Before you create a catalog source, ensure that you have the information required to connect to the source system.
Perform the following tasks:

Verify permissions

To extract metadata and to configure other capabilities that a catalog source might include, you need account access and permissions on the source system. The permissions required might vary depending on the capability.

Permissions to extract metadata

Ensure that you have the required permissions to enable metadata extraction.
Configure the following permissions:
Optionally, to obtain more detailed results, grant permissions that allow you to perform the following operation:

Permissions to run data profiles

Ensure that you have the required permissions to run profiles.
To perform data profiling, you need to unload data to the Amazon Redshift source system.
To unload data, configure the following connector permissions:
Grant permissions to perform the following operations:

Permissions to perform data classification

You can perform data classification with the permissions required to perform metadata extraction.

Permissions to perform relationship discovery

You can perform relationship discovery with the permissions required to perform metadata extraction.

Permissions to perform glossary association

You can perform glossary association with the permissions required to perform metadata extraction.

Create a connection

Create an Amazon Redshift connection object in Administrator with the connection details of the Amazon Redshift source system.
    1. In Administrator, select Connections.
    2. Click New Connection.
    3. In the Connection Details section, enter the following connection details:
    Connection property
    Description
    Connection Name
    Name of the connection.
    Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -,
    Maximum length is 255 characters.
    Description
    Description of the connection. Maximum length is 4000 characters.
    4. Select the Amazon Redshift V2 connection type.
    5. Enter properties specific to the Amazon Redshift connection:
    Property
    Description
    Use Secret Vault
    Stores sensitive credentials for this connection in the secrets manager that is configured for your organization.
    This property appears only if secrets manager is set up for your organization.
    When you enable the secret vault in the connection, you can select which credentials the Secure Agent retrieves from the secrets manager. If you don't enable this option, the credentials are stored in the repository or on a local Secure Agent, depending on how your organization is configured.
    Note: If you’re using this connection to apply data access policies through pushdown or proxy services, you cannot use the Secret Vault configuration option.
    For information about how to configure and use a secrets manager, see Secrets manager configuration.
    Runtime Environment
    Name of the runtime environment where you want to run tasks.
    Select a Secure Agent, Hosted Agent, or serverless runtime environment.
    JDBC URL
    The JDBC URL to connect to the Amazon Redshift cluster.
    You can get the JDBC URL from your Amazon AWS Redshift cluster configuration page.
    Enter the JDBC URL in the following format:
    jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>, where the endpoint includes the Redshift cluster name and region.
    For example, jdbc:redshift://infa-rs-cluster.abc.us-west-2.redshift.amazonaws.com:5439/rsdb
    In the example,
    • infa-rs-cluster is the name of the Redshift cluster.
    • us-west-2.redshift.amazonaws.com is the part of the Redshift cluster endpoint that identifies the US West (Oregon) region.
    • 5439 is the port number for the Redshift cluster.
    • rsdb is the specific database instance in the Redshift cluster to which you want to connect.
    6. Select the authentication type to connect to Amazon Redshift and enter the required properties.
    You can use the following authentication types:
    • Default authentication
    • Redshift IAM AssumeRole authentication
    7. Click Test Connection.
    8. Click Save.
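
As a rough illustration of the JDBC URL format from step 5, the following Python sketch splits a URL into its endpoint, port, and database name so that you can check it before you enter it in the connection. The function name parse_redshift_jdbc_url and the regular expression are assumptions made for this example and are not part of the product. The pattern covers only the simple format shown above, without driver options after the database name, and the cluster name and region are read by position on the assumption that the endpoint follows the layout of the documented example.

```python
import re
from typing import NamedTuple


class RedshiftJdbcUrl(NamedTuple):
    host: str       # full cluster endpoint
    port: int       # port number of the cluster
    database: str   # database name


# Hypothetical pattern for jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>
_JDBC_PATTERN = re.compile(
    r"^jdbc:redshift://(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<database>[^/?;\s]+)$"
)


def parse_redshift_jdbc_url(url: str) -> RedshiftJdbcUrl:
    """Split a Redshift JDBC URL into endpoint, port, and database name."""
    match = _JDBC_PATTERN.match(url.strip())
    if match is None:
        raise ValueError(f"Not a Redshift JDBC URL in the expected format: {url!r}")
    return RedshiftJdbcUrl(match["host"], int(match["port"]), match["database"])


parsed = parse_redshift_jdbc_url(
    "jdbc:redshift://infa-rs-cluster.abc.us-west-2.redshift.amazonaws.com:5439/rsdb"
)

# Assumption: the endpoint has the layout <cluster>.<id>.<region>.redshift.amazonaws.com
endpoint_labels = parsed.host.split(".")
cluster_name = endpoint_labels[0]
region = endpoint_labels[2]

print(cluster_name, region, parsed.port, parsed.database)
```

A check like this catches a missing port or database name early, but it does not confirm that the cluster is reachable. Use Test Connection for that.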

Default authentication

Default authentication uses the user name and password to connect to Amazon Redshift.
The following table describes the basic connection properties for default authentication:
Properties
Description
JDBC URL
The JDBC URL to connect to the Amazon Redshift cluster.
You can get the JDBC URL from your Amazon AWS Redshift cluster configuration page.
Enter the JDBC URL in the following format:
jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>, where the endpoint includes the Redshift cluster name and region.
For example, jdbc:redshift://infa-rs-cluster.abc.us-west-2.redshift.amazonaws.com:5439/rsdb
In the example,
  • infa-rs-cluster is the name of the Redshift cluster.
  • us-west-2.redshift.amazonaws.com is the part of the Redshift cluster endpoint that identifies the US West (Oregon) region.
  • 5439 is the port number for the Redshift cluster.
  • rsdb is the specific database instance in the Redshift cluster to which you want to connect.
Username
User name for your database instance in the Amazon Redshift cluster.
Password
Password of the Amazon Redshift database user.
Use EC2 Role to Assume Role
Enables the EC2 instance to assume an S3 IAM role and use temporary security credentials to access the S3 resources for staging data.
The EC2 role must have a policy attached with permissions to assume an S3 IAM role. The S3 IAM role and the EC2 instance can be in the same AWS account or in different AWS accounts.
Select the check box to enable the EC2 role to assume an S3 IAM role specified in the S3 IAM Role ARN option to access the S3 resources for staging data.
S3 IAM Role ARN
The Amazon Resource Name (ARN) of the IAM role assumed by the IAM user or EC2 to use the dynamically generated temporary security credentials to stage data in Amazon S3.
This property applies when you want to generate temporary security credentials to access the S3 staging buckets by using either the EC2 instance or the IAM user who assumes the S3 IAM role.
Specify the S3 IAM role name to use the temporary security credentials to access the Amazon S3 staging bucket.
For more information about how to get the ARN of the S3 IAM role, see the AWS documentation.
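
The relationship between the EC2 role and the S3 IAM role can be sketched as two IAM policy documents. The following Python example builds them as dictionaries and prints them as JSON. The role ARNs and account IDs are hypothetical placeholders, and the policies show only the assume-role wiring, not the S3 permissions that the staging role itself needs. Treat it as a minimal sketch and confirm the details against the AWS documentation for your accounts.

```python
import json

# Hypothetical ARNs used only for illustration
EC2_ROLE_ARN = "arn:aws:iam::111111111111:role/example-ec2-role"
S3_ROLE_ARN = "arn:aws:iam::222222222222:role/example-s3-staging-role"

# Permissions policy attached to the EC2 role:
# allows the EC2 role to assume the S3 IAM role.
ec2_role_permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": S3_ROLE_ARN,
        }
    ],
}

# Trust policy on the S3 IAM role:
# names the EC2 role as a principal that may assume it.
s3_role_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": EC2_ROLE_ARN},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(ec2_role_permissions_policy, indent=2))
print(json.dumps(s3_role_trust_policy, indent=2))
```

Because the two ARNs carry different account IDs, the same pair of policies also covers the case where the S3 IAM role and the EC2 instance are in different AWS accounts.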

Advanced settings

The following table describes the advanced connection properties for default authentication:
Properties
Description
S3 Access Key ID
Access key ID of the IAM user to access the Amazon S3 staging bucket.
Enter the access key ID when you use the following methods for S3 staging:
  • When the IAM user has access to S3 staging.
  • When the IAM user who assumes the S3 IAM role uses the temporary security credentials to access S3.
The S3 access key ID is only validated at runtime, so verify its accuracy before saving the connection to prevent runtime errors.
You do not need to enter the S3 access key ID if you use IAM authentication or the assume role for EC2 to access S3.
Note: If you use the connection for application ingestion and replication or database ingestion and replication tasks that use key-based authentication, provide the access key value.
S3 Secret Access Key
Secret access key to access the Amazon S3 staging bucket.
The secret access key is associated with the access key ID and uniquely identifies the account.
Enter the secret access key value when you use the following methods for S3 staging:
  • When the IAM user has access to S3 staging.
  • When the IAM user who assumes the S3 IAM role uses the temporary security credentials to access S3.
The S3 secret access key is only validated at runtime, so verify its accuracy before saving the connection to prevent runtime errors.
You do not need to enter the S3 secret access key if you use IAM authentication or the assume role for EC2 to access S3.
S3 VPC Endpoint Type
The type of Amazon Virtual Private Cloud endpoint for Amazon S3.
You can use a VPC endpoint to enable private communication with Amazon S3.
Select one of the following options:
  • Default. Select if you do not want to use a VPC endpoint.
  • Interface Endpoint. Select to establish private communication with Amazon S3 through an interface endpoint, which uses a private IP address from the IP address range of your subnet. It serves as an entry point for traffic destined to an AWS service.
Endpoint DNS Name for Amazon S3
The DNS name for the Amazon S3 interface endpoint.
Replace the asterisk symbol with the bucket keyword in the DNS name.
Enter the DNS name in the following format:
bucket.<DNS name of the interface endpoint>
For example, bucket.vpce-s3.us-west-2.vpce.amazonaws.com
STS VPC Endpoint Type
The type of Amazon Virtual Private Cloud endpoint for AWS Security Token Service.
You can use a VPC endpoint to enable private communication with AWS Security Token Service.
Select one of the following options:
  • Default. Select if you do not want to use a VPC endpoint.
  • Interface Endpoint. Select to establish private communication with AWS Security Token Service through an interface endpoint, which uses a private IP address from the IP address range of your subnet.
Endpoint DNS Name for AWS STS
The DNS name for the AWS STS interface endpoint.
For example, vpce-01f22cc14558c241f-s8039x4c.sts.us-west-2.vpce.amazonaws.com
KMS VPC Endpoint Type
The type of Amazon Virtual Private Cloud endpoint for AWS Key Management Service.
You can use a VPC endpoint to enable private communication with AWS Key Management Service.
Select one of the following options:
  • Default. Select if you do not want to use a VPC endpoint.
  • Interface Endpoint. Select to establish private communication with AWS Key Management Service through an interface endpoint, which uses a private IP address from the IP address range of your subnet.
Endpoint DNS Name for AWS KMS
The DNS name for the AWS KMS interface endpoint.
For example, vpce-0e722f5c721e19232-g2pkm2r7.kms.us-west-2.vpce.amazonaws.com
External ID
The external ID associated with the IAM role.
You can specify the external ID if you want to provide more secure access to the Amazon S3 bucket. The Amazon S3 staging bucket and the IAM role can be in the same AWS account or in different AWS accounts.
If required, you also have the option to specify the external ID in the AssumeRole request to the AWS Security Token Service (STS) using an external ID condition in the assumed IAM role's trust policy.
For more information about using an external ID, see External ID when granting access to your AWS resources.
Cluster Region
The AWS cluster region in which the Redshift cluster resides.
Select the cluster region from the list if you provide a custom JDBC URL with a cluster region different from the one specified in the JDBC URL property. To continue to use the cluster region name specified in the JDBC URL property, select None as the cluster region in this property.
You can only read data from or write data to the cluster regions supported by the AWS SDK.
Select one of the following cluster regions:
None
Asia Pacific(Mumbai)
Asia Pacific(Seoul)
Asia Pacific(Singapore)
Asia Pacific(Sydney)
Asia Pacific(Tokyo)
Asia Pacific(Hong Kong)
AWS GovCloud (US)
AWS GovCloud (US-East)
Canada(Central)
China(Beijing)
China(Ningxia)
EU(Ireland)
EU(Frankfurt)
EU(Paris)
EU(Stockholm)
South America(Sao Paulo)
Middle East(Bahrain)
US East(N. Virginia)
US East(Ohio)
US West(N. California)
US West(Oregon)
Default is None.
Connection Environment SQL
The SQL statement to set up the database environment that applies for the entire session.
Separate multiple values with a semicolon (;).
Specify only the configurations for the database environment in the SQL statement. Do not specify any DDL or DML commands in the SQL statement.
Master Symmetric Key
Amazon Redshift V2 Connector does not support the client-side encryption type. If you specify the master symmetric key, it is ignored.
Customer Master Key ID
The customer master key ID generated by AWS Key Management Service (AWS KMS) or the ARN of your custom key for cross-account access when you stage data in Amazon S3. The customer master key serves to encrypt your data at the destination before it is saved in Amazon S3.
You can either enter the customer-generated customer master key ID or the default customer master key ID.
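
For the Endpoint DNS Name for Amazon S3 property described earlier in this table, the asterisk replacement can be sketched in Python as follows. The function name s3_endpoint_dns_for_connection is hypothetical, and the sketch assumes that the DNS name you copy for the interface endpoint starts with an asterisk label, such as *.vpce-s3.us-west-2.vpce.amazonaws.com.

```python
def s3_endpoint_dns_for_connection(endpoint_dns: str) -> str:
    """Return the DNS name in the bucket.<DNS name of the interface endpoint> format."""
    dns = endpoint_dns.strip()
    if dns.startswith("*."):
        # Replace the leading asterisk label with the bucket keyword.
        return "bucket." + dns[2:]
    if dns.startswith("bucket."):
        # Already in the expected format; return it unchanged.
        return dns
    raise ValueError(f"Expected a name that starts with '*.' or 'bucket.': {endpoint_dns!r}")


print(s3_endpoint_dns_for_connection("*.vpce-s3.us-west-2.vpce.amazonaws.com"))
```

Rejecting any other prefix is a deliberate choice in this sketch, so that a name copied from the wrong endpoint type fails early instead of at runtime.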

Redshift IAM AssumeRole authentication

The Redshift AssumeRole authentication enables the user to assume an IAM role or define an EC2 role configured with required trust policies to generate temporary security credentials to access Amazon Redshift.
The following table describes the basic connection properties for Redshift IAM AssumeRole authentication:
Properties
Description
JDBC URL
The JDBC URL to connect to the Amazon Redshift cluster.
You can get the JDBC URL from your Amazon AWS Redshift cluster configuration page.
Enter the JDBC URL in the following format:
jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>, where the endpoint includes the Redshift cluster name and region.
For example, jdbc:redshift://infa-rs-cluster.abc.us-west-2.redshift.amazonaws.com:5439/rsdb
In the example,
  • infa-rs-cluster is the name of the Redshift cluster.
  • us-west-2.redshift.amazonaws.com is the part of the Redshift cluster endpoint that identifies the US West (Oregon) region.
  • 5439 is the port number for the Redshift cluster.
  • rsdb is the specific database instance in the Redshift cluster to which you want to connect.
Username
User name for your database instance in the Amazon Redshift cluster.
Cluster Identifier
The unique identifier of the cluster that hosts Amazon Redshift.
Specify the Amazon Redshift cluster name.
Database Name
Name of the Amazon Redshift database where the tables that you want to access are stored.
Redshift IAM Role ARN
The Amazon Resource Name (ARN) of the IAM role assumed by EC2 to use the dynamically generated temporary security credentials to access Amazon Redshift.
Enter the Redshift IAM role ARN to access the Amazon Redshift cluster.
Use EC2 Role to Assume Role
Enables the EC2 role to assume an IAM role, either to connect to Redshift or to stage data using the temporary security credentials:
Connect to Redshift with IAM authentication using the EC2 role
Select the check box to enable the EC2 role that assumes a Redshift IAM role specified in the Redshift IAM Role ARN field to access Amazon Redshift.
The EC2 role must have a policy attached with permissions to assume a Redshift IAM role from the same or different account.
Access S3 resources to stage data
Select the check box to enable the EC2 role to assume an S3 IAM role specified in the S3 IAM Role ARN field and dynamically generate the temporary security credentials to access the S3 staging buckets.
The EC2 role must have a policy attached with permissions to assume an S3 IAM role from the same or different AWS account.
S3 IAM Role ARN
The Amazon Resource Name (ARN) of the S3 IAM role assumed by the IAM user or EC2 to use the dynamically generated temporary security credentials to stage data in Amazon S3.
This property applies when you want to generate the temporary security credentials to access the S3 staging buckets by using either the EC2 instance or the IAM user who assumes the S3 IAM role.
Specify the S3 IAM role name to use the temporary security credentials to access the Amazon S3 staging bucket.
For more information about how to get the ARN of the IAM role, see the AWS documentation.
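
Both the Redshift IAM Role ARN and the S3 IAM Role ARN properties expect a full IAM role ARN. As a rough aid, the following Python sketch checks that a value has the general shape arn:<partition>:iam::<account_id>:role/<role_name> before you enter it. The function name parse_iam_role_arn and the example ARN are hypothetical. The sketch validates the format only and cannot tell whether the role exists or can be assumed.

```python
from typing import NamedTuple


class IamRoleArn(NamedTuple):
    partition: str   # for example aws, aws-cn, or aws-us-gov
    account_id: str  # 12-digit AWS account ID
    role: str        # role name, including any path prefix


def parse_iam_role_arn(arn: str) -> IamRoleArn:
    """Split an IAM role ARN into partition, account ID, and role."""
    parts = arn.strip().split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "iam":
        raise ValueError(f"Not an IAM ARN: {arn!r}")
    partition, region, account_id, resource = parts[1], parts[3], parts[4], parts[5]
    if region != "":
        # IAM is a global service, so the region field of the ARN is empty.
        raise ValueError(f"IAM ARNs do not include a region: {arn!r}")
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"Account ID must be 12 digits: {arn!r}")
    if not resource.startswith("role/") or resource == "role/":
        raise ValueError(f"ARN does not identify a role: {arn!r}")
    return IamRoleArn(partition, account_id, resource[len("role/"):])


parsed_arn = parse_iam_role_arn("arn:aws:iam::123456789012:role/example-redshift-role")
print(parsed_arn)
```

The partition field is worth checking when the cluster is in an AWS GovCloud or China region, because role ARNs in those regions use a partition other than aws.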

Advanced settings

The following table describes the advanced connection properties for Redshift IAM AssumeRole authentication:
Properties
Description
Redshift Access Key ID
The access key of the IAM user that has permissions to assume the IAM role specified in the Redshift IAM Role ARN property.
This property doesn't apply to Amazon Redshift AssumeRole authentication with EC2 role.
Redshift Secret Access Key
The secret access key of the IAM user that has permissions to assume the IAM role specified in the Redshift IAM Role ARN property.
This property doesn't apply to Amazon Redshift AssumeRole authentication with EC2 role.

Create endpoint catalog sources for connection assignment

An endpoint catalog source represents a source system that the catalog source references. Before you perform connection assignment, create endpoint catalog sources and run the catalog source jobs.
You can then perform connection assignment to reference source systems to view complete lineage with source system objects.

Import a relationship inference model

Import a relationship inference model if you want to configure the relationship discovery capability. You can either import a predefined relationship inference model, or import a model file from your local machine.
    1. In Metadata Command Center, click Explore on the navigation panel.
    2. Expand the menu and select Relationship Inference Model. The Explore page shows the Relationship Inference Model menu and the Import Predefined Content options.
    3. Select one of the following options:
    The imported models appear in the list of relationship inference models on the Relationship Discovery tab.