Perform the following tasks before you create a mapping:
General prerequisites
Ensure that you have access to the Secure Agent directory that contains the success and error files. The directory path must be the same on each Secure Agent machine in the runtime environment.
IAM authentication
If you use IAM authentication, you must create a Redshift Role Amazon Resource Name (ARN), add the minimal Amazon IAM policy to the Redshift Role ARN, and add the Redshift Role ARN to the Redshift cluster. Provide the Redshift Role ARN in the AWS_IAM_ROLE option of the UNLOAD and COPY commands when you create a task.
If you specify both the access key ID and secret access key in the connection properties and AWS_IAM_ROLE in the UNLOAD and COPY commands, AWS_IAM_ROLE takes precedence.
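For reference, the following is a minimal sketch of COPY and UNLOAD commands that authenticate with a Redshift Role ARN, assuming standard Amazon Redshift syntax in which the role is passed as the aws_iam_role credential string. The account ID, role, bucket, and table names are placeholders:

    -- Load data from Amazon S3 into a Redshift table with a role ARN.
    COPY public.orders
    FROM 's3://my-staging-bucket/orders/'
    CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRedshiftRole'
    CSV;

    -- Unload a Redshift table to Amazon S3 with the same role ARN.
    UNLOAD ('SELECT * FROM public.orders')
    TO 's3://my-staging-bucket/unload/orders_'
    CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRedshiftRole'
    DELIMITER ',';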
Temporary security credentials
Consider the following guidelines when you use temporary security credentials. An example of how the credentials appear in a COPY command follows the list:
•Before you run a task, ensure that the temporary security credentials remain valid long enough for the task to complete. You cannot extend the duration of the temporary security credentials for an ongoing task. For example, if the temporary security credentials expire while a task reads from or writes to Amazon Redshift, the task fails.
•After the temporary security credentials expire, AWS does not authorize the IAM users or IAM roles to access resources with those credentials. Request new temporary security credentials for a mapping before the previous credentials expire.
•For mappings in advanced mode, the temporary security credentials do not expire even after the time configured in the Temporary Credential Duration advanced source and target properties elapses.
•When you create an Amazon Redshift V2 connection with the IAM Role ARN and use SSE-KMS encryption, you must specify AWS_IAM_ROLE as the unload option in the Amazon Redshift V2 advanced source properties.
•If both the source and target in a mapping point to the same Amazon S3 bucket, use the same Amazon S3 connection in the Source and Target transformations. If you use two different Amazon S3 connections, configure the same values in the connection properties for both connections.
•If the source and target in a mapping point to different Amazon S3 buckets, you can use two different Amazon S3 connections.
You can configure different values in the connection properties for the two connections. However, you must select the Use EC2 Role to Assume Role check box in both connection properties. You must also specify the same value for the Temporary Credential Duration field in the source and target properties.
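For context, temporary security credentials consist of a temporary access key ID, a secret access key, and a session token. The following sketch shows how such credentials appear in a standard Amazon Redshift COPY command; all values are placeholders, and the command fails if the session token expires before the load completes:

    -- Load with temporary security credentials; the session token must
    -- remain valid for the full duration of the load.
    COPY public.orders
    FROM 's3://my-staging-bucket/orders/'
    ACCESS_KEY_ID '<temporary-access-key-id>'
    SECRET_ACCESS_KEY '<temporary-secret-access-key>'
    SESSION_TOKEN '<temporary-session-token>'
    CSV;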
CDC sources
To create a mapping with a CDC source, ensure that you have the PowerExchangeClient and CDC licenses. Configure a CDC source if you want to create a mapping to capture changed data from the CDC source, and then run the associated mapping tasks to write the changed data to an Amazon Redshift target.
Mappings in advanced mode
If you configure a mapping to run in advanced mode, ensure that the Redshift cluster and the advanced cluster reside in the same virtual private cloud (VPC).
Create an external schema and table for Amazon Redshift Spectrum
To use Amazon Redshift Spectrum, you must create an external table within an external schema that references a database in an external data catalog. You can create the external table for the Avro, ORC, Parquet, RCFile, SequenceFile, and Textfile file formats.
The metadata of the external database and external table are stored in the external data catalog. You must provide Amazon Redshift authorization to access the data catalog and the data files in Amazon S3.
You can create an external database in Amazon Redshift. You can read data from a single external table, from multiple external tables, or from a standard Amazon Redshift table that is joined to an external table.
Multiple Amazon Redshift clusters can contain multiple external tables. You can run a query for the same data on Amazon S3 from any Amazon Redshift cluster in the same region. When you update the data in Amazon S3, the data is immediately available in all the Amazon Redshift clusters.
When you create an external table, you must specify the Amazon S3 location from where you want to read the data. You can create the external tables by defining the structure of the Amazon S3 data files and registering the external tables in the external data catalog. Then, you can run queries or join the external tables.
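As an illustration, the following sketch creates an external schema that references a database in the external data catalog, registers an external table over Parquet files in Amazon S3, and joins the external table to a standard Amazon Redshift table. The schema, table, bucket, and role ARN are placeholders:

    -- Create an external schema that references the external data catalog.
    CREATE EXTERNAL SCHEMA spectrum
    FROM DATA CATALOG
    DATABASE 'spectrumdb'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole'
    CREATE EXTERNAL DATABASE IF NOT EXISTS;

    -- Register an external table over Parquet files in Amazon S3.
    CREATE EXTERNAL TABLE spectrum.sales (
        salesid    INTEGER,
        customerid INTEGER,
        saledate   DATE,
        price      DECIMAL(8,2)
    )
    STORED AS PARQUET
    LOCATION 's3://my-spectrum-bucket/sales/';

    -- Join the external table to a standard Amazon Redshift table.
    SELECT c.customername, SUM(s.price) AS total
    FROM spectrum.sales s
    JOIN public.customers c ON s.customerid = c.customerid
    GROUP BY c.customername;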
When you add an external table as source and create a mapping, the external table name is displayed in the spectrum_schemaname format in the Select Source Object dialog box.
When you create an external table using Athena or Glue data catalogs, ensure that you create the external tables using the data types that Amazon Redshift V2 Connector supports.
The following list shows the data types that Amazon Redshift V2 Connector supports when you create an external table; a sample table definition follows the list:
•Bigint (INT8)
•Boolean (BOOL)
•Char (CHARACTER)
•Date
Note: Applicable when you create an external table for the ORC, Parquet, and Textfile file formats.
•Decimal (NUMERIC)
•Double Precision (FLOAT8)
•Integer (INT, INT4)
•Real (FLOAT4)
•Smallint (INT2)
•Timestamp
•Varchar (CHARACTER VARYING)
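As an example, the following table definition exercises the supported data types. Per the note above, the Date column is valid only for the ORC, Parquet, and Textfile file formats, so the sketch uses Parquet; the schema name and Amazon S3 location are placeholders:

    CREATE EXTERNAL TABLE spectrum.type_demo (
        col_bigint    BIGINT,
        col_boolean   BOOLEAN,
        col_char      CHAR(10),
        col_date      DATE,          -- ORC, Parquet, and Textfile only
        col_decimal   DECIMAL(18,4),
        col_double    DOUBLE PRECISION,
        col_integer   INTEGER,
        col_real      REAL,
        col_smallint  SMALLINT,
        col_timestamp TIMESTAMP,
        col_varchar   VARCHAR(256)
    )
    STORED AS PARQUET
    LOCATION 's3://my-spectrum-bucket/type_demo/';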
Rules and guidelines for external tables
Consider the following rules and guidelines for external tables:
•You can only read data from the Amazon Redshift Spectrum external table. You cannot insert or update data in the Amazon Redshift Spectrum external table.
•The Secure Agent does not remove the external table names from the list of target objects available in the Target transformation.
•You cannot use pre-SQL and post-SQL commands to perform target operations on an external table.
For more information on how to create an external table, see the AWS documentation.