Before you begin

Before you create a catalog source, ensure that you have the information required to connect to the source system.

Perform the following tasks:

•Assign the required permissions.
•Configure authentication.
•Configure a connection to the Amazon Redshift source system in Administrator.
•Create endpoint catalog sources for connection assignment.
•Optionally, if you want to identify pairs of similar columns and relationships between tables within a catalog source, import a relationship inference model.

Verify permissions

To extract metadata and to configure other capabilities that a catalog source might include, you need account access and permissions on the source system. The permissions required might vary depending on the capability.

Permissions to extract metadata

Ensure that you have the required permissions to enable metadata extraction.

Configure the following permissions:

•Read permission on the Amazon Redshift external source.
•Permissions that allow you to perform the following operations:

- select on pg_catalog.PG_ATTRIBUTE
- select on pg_catalog.PG_CLASS
- select on pg_catalog.PG_CONSTRAINT
- select on pg_catalog.PG_DESCRIPTION
- select on pg_catalog.PG_LANGUAGE
- select on pg_catalog.PG_NAMESPACE
- select on pg_catalog.PG_PROC
- select on pg_catalog.PG_TYPE
- select on pg_catalog.PG_VIEWS
- select on information_schema.COLUMNS
- select on information_schema.TABLES
- select on pg_catalog.PG_TABLES
- select on pg_catalog.PG_CLASS_INFO
- select on pg_catalog.PG_PROC_INFO
- select on pg_catalog.SVV_EXTERNAL_TABLES
- select on pg_catalog.SVV_EXTERNAL_COLUMNS
- select on pg_get_late_binding_view_cols() cols(view_schema name, view_name name, col_name name, col_type varchar, col_num int)

•Permissions to run the SHOW EXTERNAL TABLE operation on the tables that you want to process.
•Permissions to access tables from a specific schema:

- GRANT USAGE ON SCHEMA <Schema name> TO <User>;
- GRANT SELECT ON ALL TABLES IN SCHEMA <Schema name> TO <User>;

Optionally, to obtain more detailed results, grant permissions that allow you to perform the following operation:

•select on pg_catalog.PG_DATABASE
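One way to check the catalog permissions listed above before you run a metadata extraction job is to run a trivial probe query against each object as the catalog source user. The following Python sketch only builds the probe statements and does not connect to Amazon Redshift. The function name `build_probe_queries` and the probe-query approach are illustrative assumptions, not part of the product. The pg_get_late_binding_view_cols() entry is omitted from the sketch.

```python
# Catalog objects copied from the permission list above.
CATALOG_OBJECTS = (
    "pg_catalog.PG_ATTRIBUTE",
    "pg_catalog.PG_CLASS",
    "pg_catalog.PG_CONSTRAINT",
    "pg_catalog.PG_DESCRIPTION",
    "pg_catalog.PG_LANGUAGE",
    "pg_catalog.PG_NAMESPACE",
    "pg_catalog.PG_PROC",
    "pg_catalog.PG_TYPE",
    "pg_catalog.PG_VIEWS",
    "information_schema.COLUMNS",
    "information_schema.TABLES",
    "pg_catalog.PG_TABLES",
    "pg_catalog.PG_CLASS_INFO",
    "pg_catalog.PG_PROC_INFO",
    "pg_catalog.SVV_EXTERNAL_TABLES",
    "pg_catalog.SVV_EXTERNAL_COLUMNS",
)

# Optional object that the text says yields more detailed results.
OPTIONAL_OBJECT = "pg_catalog.PG_DATABASE"


def build_probe_queries(include_optional=False):
    """Return one trivial SELECT statement per catalog object.

    Hypothetical helper: running each statement as the catalog source
    user should fail with a permission error if SELECT is missing.
    """
    targets = list(CATALOG_OBJECTS)
    if include_optional:
        targets.append(OPTIONAL_OBJECT)
    return [f"SELECT 1 FROM {name} LIMIT 1;" for name in targets]


if __name__ == "__main__":
    for statement in build_probe_queries(include_optional=True):
        print(statement)
```

You can paste the printed statements into any SQL client that is connected as the catalog source user.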

Permissions to run data profiles

Ensure that you have the required permissions to run profiles.

To perform data profiling, you need to unload data to the Amazon Redshift source system.

To unload data, configure the following connector permissions:

•ListBucket. Required to view objects from Amazon S3 buckets.
•GetBucketPolicy. Required to get the IAM policy information for access privilege details on Amazon S3 buckets or folders.
•GetObject. Required to read objects from Amazon S3 buckets.
•PutObject. Required to process staging data for Avro and Parquet files.
•DeleteObject. Required to delete staging data of Avro and Parquet files.
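As an illustration, the permissions above could be expressed as an IAM policy document for the staging bucket. The following Python sketch assumes that the permission names map to the `s3:` actions of the same name and that a single bucket holds the staging data. The bucket name `example-staging-bucket` and the function name `build_staging_policy` are hypothetical.

```python
import json


def build_staging_policy(bucket):
    """Build an IAM policy dict covering the five listed permissions.

    Assumption: the standard "aws" partition and one staging bucket.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions are granted on the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketPolicy"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions are granted on the keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-staging-bucket" is a placeholder, not a real bucket.
    print(json.dumps(build_staging_policy("example-staging-bucket"), indent=2))
```

Splitting the statement by resource type keeps bucket-level and object-level actions on the ARNs that they apply to.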

Grant permissions to perform the following operations:

•Usage permission on the schemas to profile.

GRANT USAGE ON SCHEMA <Schema name> TO <User name>;

•Select permission on all tables or specific tables in the schema.

GRANT SELECT ON ALL TABLES IN SCHEMA <Schema name> TO <User name>;

GRANT SELECT ON <Table name> TO <User name>;
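If you profile several schemas, it can help to generate the GRANT statements rather than type them. The following Python sketch produces the statements shown above for one schema and user. The function name `profiling_grants`, the identifier check, and the choice to schema-qualify table names are assumptions made for the example.

```python
import re

# Conservative check for unquoted identifiers (an assumption for this
# sketch; quoted or non-ASCII identifiers are not handled).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _check(name):
    """Return name unchanged, or raise ValueError if it looks unsafe."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def profiling_grants(schema, user, tables=None):
    """Return the GRANT statements needed to profile one schema.

    With tables=None, SELECT is granted on all tables in the schema;
    otherwise only on the listed tables.
    """
    schema, user = _check(schema), _check(user)
    statements = [f"GRANT USAGE ON SCHEMA {schema} TO {user};"]
    if tables is None:
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {user};"
        )
    else:
        for table in tables:
            statements.append(
                f"GRANT SELECT ON {schema}.{_check(table)} TO {user};"
            )
    return statements


if __name__ == "__main__":
    # "sales", "catalog_user", and "orders" are placeholder names.
    for statement in profiling_grants("sales", "catalog_user", ["orders"]):
        print(statement)
```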

Permissions to perform data classification

You can perform data classification with the permissions required to perform metadata extraction.

Permissions to perform relationship discovery

You can perform relationship discovery with the permissions required to perform metadata extraction.

Permissions to perform glossary association

You can perform glossary association with the permissions required to perform metadata extraction.

Create a connection

Create an Amazon Redshift connection object in Administrator with the connection details of the Amazon Redshift source system.

1. In Administrator, select Connections.

2. Click New Connection.

3. In the Connection Details section, enter the following connection details:

Connection property	Description
Connection Name	Name of the connection. Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -. Maximum length is 255 characters.
Description	Description of the connection. Maximum length is 4000 characters.
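The naming rules in the table can be checked before you enter the values. The following Python sketch is a minimal local check, assuming that "alphanumeric" means ASCII letters and digits. The function name `validate_connection` is hypothetical, and the sketch cannot verify that the name is unique within the organization.

```python
import re

# Letters, digits, spaces, and the special characters _ . + -
# Assumption: ASCII only; 1 to 255 characters.
_NAME = re.compile(r"[A-Za-z0-9 _.+\-]{1,255}")

MAX_DESCRIPTION_LENGTH = 4000


def validate_connection(name, description=""):
    """Return a list of problems; an empty list means both values pass."""
    errors = []
    if not _NAME.fullmatch(name):
        errors.append(
            "name must be 1-255 characters: letters, digits, spaces, _ . + -"
        )
    if len(description) > MAX_DESCRIPTION_LENGTH:
        errors.append("description must be at most 4000 characters")
    return errors


if __name__ == "__main__":
    # Placeholder values for illustration.
    print(validate_connection("Redshift prod-1.0", "Production cluster"))
    print(validate_connection("redshift/prod"))
```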

4. Select the Amazon Redshift V2 connection type.

5. Enter properties specific to the Amazon Redshift connection:

Property	Description
Use Secret Vault	Stores sensitive credentials for this connection in the secrets manager that is configured for your organization. This property appears only if a secrets manager is set up for your organization. When you enable the secret vault in the connection, you can select which credentials the Secure Agent retrieves from the secrets manager. If you don't enable this option, the credentials are stored in the repository or on a local Secure Agent, depending on how your organization is configured. For information about how to configure and use a secrets manager, see Secrets manager configuration.
Runtime Environment	Name of the runtime environment where you want to run tasks. Select a Secure Agent, Hosted Agent, or serverless runtime environment.
JDBC URL	The JDBC URL to connect to the Amazon Redshift cluster. You can get the JDBC URL from the configuration page of your Amazon Redshift cluster. Enter the JDBC URL in the following format: jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>, where the endpoint includes the Redshift cluster name and region. For example, in the URL jdbc:redshift://infa-rs-cluster.abc.us-west-2.redshift.amazonaws.com:5439/rsdb the components are as follows: infa-rs-cluster is the name of the Redshift cluster; us-west-2.redshift.amazonaws.com is the part of the cluster endpoint that identifies the US West (Oregon) region; 5439 is the port number for the Redshift cluster; rsdb is the database in the Redshift cluster to which you want to connect.
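To make the URL format concrete, the following Python sketch builds and parses a URL of the form jdbc:redshift://<cluster_endpoint>:<port_number>/<database_name>. The names `RedshiftJdbcUrl`, `build_jdbc_url`, and `parse_jdbc_url` are hypothetical, and the sketch assumes a URL without optional driver parameters.

```python
import re
from typing import NamedTuple


class RedshiftJdbcUrl(NamedTuple):
    host: str      # cluster endpoint, which includes cluster name and region
    port: int      # port number of the cluster
    database: str  # database to connect to


# Assumption: no query string or driver parameters after the database name.
_PATTERN = re.compile(
    r"jdbc:redshift://(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<database>[^/?;\s]+)"
)


def build_jdbc_url(host, port, database):
    """Assemble a URL in the format described in the table."""
    return f"jdbc:redshift://{host}:{port}/{database}"


def parse_jdbc_url(url):
    """Split a URL into its parts, or raise ValueError if it is malformed."""
    match = _PATTERN.fullmatch(url.strip())
    if match is None:
        raise ValueError(f"not a Redshift JDBC URL: {url!r}")
    return RedshiftJdbcUrl(
        match["host"], int(match["port"]), match["database"]
    )


if __name__ == "__main__":
    parts = parse_jdbc_url(
        "jdbc:redshift://infa-rs-cluster.abc.us-west-2"
        ".redshift.amazonaws.com:5439/rsdb"
    )
    # The first label of the endpoint is the cluster name.
    print(parts.host.split(".")[0], parts.port, parts.database)
```

Parsing the URL before you paste it into the connection form catches a missing port or database name early.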

6. Select the Default authentication type.

7. Enter the following connection details:

Property	Description
Username	User name of the database user in the Amazon Redshift cluster.
Password	Password of the Amazon Redshift database user.

8. Click Test Connection.

9. Click Save.

Create endpoint catalog sources for connection assignment

An endpoint catalog source represents a source system that the catalog source references. Before you perform connection assignment, create endpoint catalog sources and run the catalog source jobs.

You can then perform connection assignment to reference the source systems and view complete lineage that includes source system objects.

Import a relationship inference model

Import a relationship inference model if you want to configure the relationship discovery capability. You can either import a predefined relationship inference model, or import a model file from your local machine.

1. In Metadata Command Center, click Explore on the navigation panel.

2. Expand the menu and select Relationship Inference Model.

[Image: the Explore page with the Relationship Inference Model menu and the Import Predefined Content options.]

3. Select one of the following options:

- Import Predefined Content. Imports a predefined relationship inference model called Column Similarity Model v1.0.
- Import. Imports a relationship inference model file from your local machine. Select this option if you previously saved predefined content to your local machine and the model file is stored there.

To import a file, click Choose File in the Import Relationship Inference Model window and navigate to the model file on your local machine. You can also drag and drop the file.

The imported models appear in the list of relationship inference models on the Relationship Discovery tab.