
Amazon Redshift V2 objects in mappings

When you create a mapping, you can configure a Source or Target transformation to represent an Amazon Redshift V2 object. Additionally, you can configure Lookups for mappings and mapping tasks.
Consider the following rules and guidelines when you configure Amazon Redshift V2 objects in mappings:

Amazon Redshift V2 sources in mappings

In a mapping, you can configure a Source transformation to represent an Amazon Redshift V2 source.
The following table describes the Amazon Redshift V2 source properties that you can configure in a Source transformation:
Property
Description
Connection
Name of the source connection. Select a source connection, or click New Parameter to define a new parameter for the source connection.
Source type
Type of the source object.
Select one of the following source objects:
  • - Single Object
  • - Multiple Objects. You can use implicit joins and advanced relationships with multiple objects.
  • - Query. When you select the source type as query, you must map all the fields selected in the query in the Field Mapping tab.
  • - Parameter
Note: You cannot override the source query object and multiple objects at runtime using parameter files in a mapping.
When you select the source type as query, the boolean values are written as 0 or false to the target.
The query that you specify must not end with a semicolon (;).
Object
Name of the source object.
You can select single or multiple source objects.
Parameter
Select an existing parameter for the source object or click New Parameter to define a new parameter for the source object. The Parameter property appears only if you select Parameter as the source type. If you want to overwrite the parameter at runtime, select the Overwrite Parameter option.
Filter
Filters records based on the filter condition.
You can specify a simple filter or an advanced filter.
Sort
Sorts records based on the conditions you specify. You can specify the following sort conditions:
  • - Not parameterized. Select the fields and type of sorting to use.
  • - Parameterized. Use a parameter to specify the sort option.
The following table describes the Amazon Redshift V2 advanced source properties that you can configure in a Source transformation:
Property
Description
Read Mode
Specifies the read mode to read data from the Amazon Redshift source.
You can select one of the following read modes:
  • - Direct1. Reads data directly from the Amazon Redshift source without staging the data in Amazon S3.
  • - Staging. Reads data from the Amazon Redshift source by staging the data in the S3 bucket.
Default is Staging.
Fetch Size1
Determines the number of rows to read in one resultant set from Amazon Redshift. Applies only when you select the Direct read mode.
Default is 10000.
Note: If you specify a fetch size of 0 or if you don't specify a fetch size, the entire data set is read at the same time rather than in batches.
S3 Bucket Name*
Amazon S3 bucket name for staging the data.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options.
Enable Compression*
Compresses the staging files into the Amazon S3 staging directory.
The task performance improves when the Secure Agent compresses the staging files. Default is selected.
Staging Directory Location1*
Location of the local staging directory.
When you run a task in Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
Specify the directory path in the following manner: <staging directory>
For example, C:\Temp. Ensure that you have the write permissions on the directory.
Unload Options*
Unload command options.
Add options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3. Provide an Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • - DELIMITER
  • - ESCAPE
  • - PARALLEL
  • - NULL1
  • - AWS_IAM_ROLE
  • - REGION
  • - ADDQUOTES
For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
Specify a directory on the machine that hosts the Secure Agent.
Note: If you do not add the options to the Unload command manually, the Secure Agent uses the default values.
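To illustrate how these options translate, the following is a rough sketch of the kind of UNLOAD statement that Amazon Redshift runs when the Secure Agent stages data. The query, bucket path, and role ARN are placeholders; the Secure Agent builds the actual command internally.
  -- Approximate UNLOAD issued during staging; all names are placeholders.
  UNLOAD ('SELECT * FROM public.orders')
  TO 's3://my-staging-bucket/orders_part_'
  IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-role'
  DELIMITER AS '\036'
  NULL AS 'text'
  PARALLEL ON
  REGION 'ap-south-1';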
Treat NULL Value as NULL*
Retains the null values when you read data from Amazon Redshift.
Encryption Type*
Encrypts the data in the Amazon S3 staging directory.
You can select the following encryption types:
  • - None
  • - SSE-S3
  • - SSE-KMS
  • - CSE-SMK1
Default is None.
Download S3 Files in Multiple Parts1*
Downloads large Amazon S3 objects in multiple parts.
When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel.
Default is 5 MB.
Multipart Download Threshold Size1*
The maximum threshold size to download an Amazon S3 object in multiple parts.
Default is 5 MB.
Schema Name
Overrides the default schema name.
Note: You cannot configure a custom query when you use the schema name.
Source Table Name
Overrides the default source table name.
Ensure that the metadata and column order in the override table match those in the source table imported during design time.
Note: When you select the source type as Multiple Objects or Query, you cannot use the Source Table Name option.
Pre-SQL
The pre-SQL commands to run a query before you read data from Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Post-SQL
The post-SQL commands to run a query after you write data to Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
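For example, a pre-SQL command might refresh table statistics before the read, and a post-SQL command might record the extraction in a hypothetical audit table. The table names below are placeholders.
  -- Pre-SQL (placeholder table):
  ANALYZE public.orders;
  -- Post-SQL (hypothetical audit table):
  INSERT INTO public.extract_audit (table_name, extracted_at) VALUES ('orders', GETDATE());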
Select Distinct
Selects unique values.
The Secure Agent includes a SELECT DISTINCT statement if you choose this option. Amazon Redshift ignores trailing spaces. Therefore, the Secure Agent might extract fewer rows than expected.
Note: If you select the source type as query or use the SQL Query property and select the Select Distinct option, the Secure Agent ignores the Select Distinct option.
SQL Query
Overrides the default SQL query.
Enclose column names in double quotes. The SQL query is case sensitive. Specify an SQL statement supported by the Amazon Redshift database.
When you specify the columns in the SQL query, ensure that the column name in the query matches the source column name in the mapping.
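For example, an override query with double-quoted column names that match the source column names in the mapping might look like the following. The schema, table, and column names are placeholders.
  SELECT "ORDER_ID", "CUSTOMER_ID", "ORDER_DATE"
  FROM public.orders
  WHERE "ORDER_DATE" >= '2023-01-01'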
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
Tracing Level
Use the verbose tracing level to get the amount of detail that appears in the log for the Source transformation.
1Does not apply to mappings in advanced mode.
*Does not apply to direct read mode.

Rules and guidelines for reading data

Consider the rules and guidelines in the following sections when you configure a read operation.

Direct read mode

Consider the following rules and guidelines when you read directly from an Amazon Redshift source:

Filter using IS NULL or IS NOT NULL operators

When you convert a simple filter condition with the IS NULL or IS NOT NULL operators to an advanced filter, these operators are converted into SQL functions and the resultant query fails. To avoid this issue when you use these operators, you need to use the Amazon Redshift compatible syntax in the advanced filter condition.
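For example, instead of relying on the converted function syntax, you might enter the condition directly in Redshift-compatible SQL in the advanced filter. The table and column names are placeholders.
  orders.shipped_date IS NOT NULL AND orders.cancel_reason IS NULL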

Rules and guidelines for configuring SQL query

Consider the following rules and guidelines when you configure an SQL query:

Adding multiple source objects

When you create a Source transformation, you can select Multiple Objects as the source type and then configure a join to combine the tables. You can define a relationship condition or a query to join the tables.
    1. In the Source transformation, select Multiple Objects as the source type.
    2. From the Actions menu, click Add Source Object.
    3. Select the source object that you want to add from the displayed list.
    4. Click OK.
    5. From the Related Objects Actions menu, select Advanced Relationship.
    6. In the Advanced Relationship window, click Add Object to add more objects.
    7. Click OK.
    8. Set your own conditions or specify a query to define the relationship between the tables.
    Note: When you configure a join expression, select the fields and define a join condition or a query syntax. To join two tables, you can specify a simple condition. For example, employee.department=dept.id. To join more than two tables, you must specify the complete query. For example, <schema_name>.<table_name1> LEFT OUTER JOIN <schema_name>.<table_name2> ON <schema_name>.<table_name1>.<column_name1>=<schema_name>.<table_name2>.<column_name2>
    9. Click OK.
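For example, a complete query that joins three hypothetical tables in an advanced relationship might look like the following. The schema, table, and column names are placeholders.
  public.orders LEFT OUTER JOIN public.customers ON public.orders.customer_id = public.customers.id LEFT OUTER JOIN public.items ON public.orders.item_code = public.items.code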

Rules and guidelines for adding multiple source objects

Consider the following rules and guidelines when you add multiple source objects:

Amazon Redshift V2 targets in mappings

To write data to Amazon Redshift, configure an Amazon Redshift V2 object as the target in a mapping.
When you enable the source partition, the Secure Agent uses pass-through partitioning to write data to Amazon Redshift and optimize the mapping performance at run time. Specify the name and description of the Amazon Redshift V2 target. Configure the target and advanced properties for the target object.
The following table describes the target properties that you can configure in a Target transformation:
Property
Description
Connection
Name of the target connection. Select a target connection, or click New Parameter to define a new parameter for the target connection.
Target Type
Type of the target object.
Select Single Object or Parameter.
Object
Name of the target object.
You can select an existing target object or create a new target object at runtime. To create a new target at runtime, select the Create New at Runtime option and specify the object name, table type, and path.
Create New at Runtime
Creates a new target at runtime based on the object name, table type, and path that you specify.
Use Exact Source Field Names in Target1
Applies to the Create New at Runtime option. This option does not appear for an existing target.
Determines whether the target object created at runtime retains the same field names as the source. Select this option to retain all the source field names in the target exactly as in the source, including any special characters. If you disable this option, special characters in the source are replaced with underscore characters in the target.
Default is disabled.
Parameter
Select an existing parameter for the target object or click New Parameter to define a new parameter for the target object.
The Parameter property appears only if you select Parameter as the target type. If you want to overwrite the parameter at runtime, select the Overwrite Parameter option.
Operation
Type of the target operation.
Select one of the following operations:
  • - Insert
  • - Update
  • - Upsert
  • - Delete
  • - Data Driven1
Select Data Driven if you want to create a mapping to capture changed data from a CDC source.
Data Driven Condition
Enables you to define expressions that flag rows for an insert, update, delete, or reject operation.
You must specify the data driven condition for non-CDC sources. For CDC sources, you must leave the field empty as the rows in the CDC source tables are already marked with the operation types.
Note: Appears only when you select Data Driven as the operation type.
Update Columns
Select columns you want to use as a logical primary key for performing update, upsert, and delete operations on the target.
Note: This field is not required if the target table already has a primary key.
Create Target
Creates a new target.
When you create a new target, enter values for the following fields:
  • - Name: Enter a name for the target object.
  • - Path: Provide a schema name and create a target table within the schema. By default, the field is empty.
1Doesn't apply to mappings in advanced mode.
The following table describes the Amazon Redshift V2 advanced target properties:
Property
Description
S3 Bucket Name
Amazon S3 bucket name for writing the files to Amazon Redshift target.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Copy command options.
Enable Compression
Compresses the staging files before writing the files to Amazon Redshift.
The task performance improves when the Secure Agent compresses the staged files. Default is selected.
Staging Directory Location1
Location of the local staging directory.
When you run a task in Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
Specify the directory path in the following manner: <staging directory>
For example, C:\Temp. Ensure that you have the write permissions on the directory.
Batch Size1
Minimum number of rows in a batch.
Enter a number greater than 0. Default is 2000000.
Max Errors per Upload Batch for INSERT
Number of error rows that causes an upload insert batch to fail.
Enter a positive integer. Default is 1.
If the number of errors is equal to or greater than the property value, the Secure Agent writes the entire batch to the error file.
Truncate Target Table Before Data Load
Deletes all the existing data in the Amazon Redshift target table before loading new data.
Require Null Value For Char and Varchar
Replaces the string value with NULL when you write data to Amazon Redshift columns of Char and Varchar data types.
Default is an empty string.
Note: When you run a mapping to write null values to a table that contains a single column of the Int, Bigint, numeric, real, or double data type, the mapping fails. You must provide a value other than the default value in the Require Null Value For Char And Varchar property.
WaitTime In Seconds For S3 File Consistency1
Number of seconds to wait for the Secure Agent to make the staged files consistent with the list of files available on Amazon S3.
Default is 0.
Copy Options
Copy command options.
Add options to the Copy command to write data from Amazon S3 to the Amazon Redshift target when the default delimiter comma (,) or double-quote (") is used in the data. Provide the Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • - DELIMITER
  • - ACCEPTINVCHARS
  • - QUOTE
  • - COMPUPDATE
  • - AWS_IAM_ROLE
  • - REGION
For example:
DELIMITER = \036;ACCEPTINVCHARS = #;QUOTE = \037
COMPUPDATE = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
Specify a directory on the machine that hosts the Secure Agent.
Note: If you do not add the options to the Copy command manually, the Secure Agent uses the default values.
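To illustrate how these options translate, the following is a rough sketch of the kind of COPY statement that loads the staged files into the target. The table, bucket path, and role ARN are placeholders; the Secure Agent builds the actual command internally.
  -- Approximate COPY issued when loading staged files; all names are placeholders.
  COPY public.orders
  FROM 's3://my-staging-bucket/orders_part_'
  IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
  DELIMITER AS '\036'
  ACCEPTINVCHARS AS '#'
  COMPUPDATE ON
  REGION 'ap-south-1';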
S3 Server Side Encryption
Indicates that Amazon S3 encrypts data during upload.
Provide a customer master key ID in the connection property to enable this property. Default is not selected.
S3 Client Side Encryption1
Indicates that the Secure Agent encrypts data using a private key.
Provide a master symmetric key ID in the connection property to enable this property. If you enable both server-side and client-side encryptions, the Secure Agent ignores the server-side encryption.
Analyze Target Table
Runs an ANALYZE command on the target table.
The query planner on Amazon Redshift updates the statistical metadata to build and choose optimal plans to improve the efficiency of queries.
Vacuum Target Table
Recovers disk space and sorts the rows in a specified table or all tables in the database.
You can select the following recovery options:
  • - None
  • - Full
  • - Sort Only
  • - Delete Only
  • - Reindex
Default is None.
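For reference, these options correspond to standard Amazon Redshift maintenance commands of the following form. The table name is a placeholder.
  ANALYZE public.orders;
  VACUUM FULL public.orders;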
Prefix to retain staging files on S3
Retains staging files on Amazon S3.
Provide both a directory prefix and a file prefix separated by a slash (/) or only a file prefix to retain staging files on Amazon S3. For example, backup_dir/backup_file or backup_file.
Success File Directory1
Directory for the Amazon Redshift success file.
Specify a directory on the machine that hosts the Secure Agent.
Error File Directory1
Directory for the Amazon Redshift error file.
Specify a directory on the machine that hosts the Secure Agent.
Treat Source Rows As
Overrides the default target operation.
Default is INSERT.
Select one of the following override options:
NONE
By default, none is enabled. The Secure Agent considers the task operation that you select in the Operation target property.
INSERT
Performs insert operation. If enabled, the Secure Agent inserts all rows flagged for insert. If disabled, the Secure Agent rejects the rows flagged for insert.
DELETE
Performs delete operation. If enabled, the Secure Agent deletes all rows flagged for delete. If disabled, the Secure Agent rejects all rows flagged for delete.
UPDATE and UPSERT
Performs update and upsert operations. To perform an update operation, you must map the primary key column and at least one column other than primary key column. You can select the following data object operation attributes:
  • - Update as Update: The Secure Agent updates all rows as updates.
  • - Update else Insert: The Secure Agent updates existing rows and inserts other rows as if marked for insert.
For more information, see the Troubleshooting for Amazon Redshift V2 Connector topic.
Amazon Redshift V2 Connector does not support the Upsert operation in the Update Strategy transformation. To use an Update Strategy transformation to write data to an Amazon Redshift target, you must set Treat Source Rows As to None.
By default, the Secure Agent performs the task operation based on the value that you specify in the Operation target property. However, if you specify an option in the Treat Source Rows As property, the Secure Agent ignores the value that you specify in the Operation target property or in the Update Strategy transformation.
Override Target Query
Overrides the default update query that the Secure Agent generates for the update operation with the update query that you specify.
TransferManager Thread Pool Size1
Number of threads to write data in parallel.
Default is 10.
Pre-SQL
The pre-SQL commands to run a query before you read data from Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Post-SQL
The post-SQL commands to run a query after you write data to Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Preserve record order on write1
Retains the order of the records when you read data from a CDC source and write data to an Amazon Redshift target.
Use this property when you create a mapping to capture the changed record from a CDC source. This property enables you to avoid inconsistencies between the CDC source and target.
Minimum Upload Part Size1
Minimum size of a part when the Secure Agent uploads an object in multiple parts.
Default is 5 MB.
Number of files per batch1
Specifies the number of staging files per batch.
If you do not provide the number of files, Amazon Redshift V2 Connector calculates the number of staging files.
When you connect to Amazon Redshift Serverless and do not provide the number of files, the default is 20 files per batch.
Schema Name
Overrides the default schema name.
Target table name
Overrides the default target table name.
Ensure that the metadata and column order in the override table match those in the target table selected during design time.
Recovery Schema Name1
Schema that contains recovery information stored in the infa_recovery_table table on the target system to resume the extraction of the changed data from the last checkpoint.
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
Forward Rejected Rows
This property is not applicable for Amazon Redshift V2 Connector.
1Doesn't apply to mappings in advanced mode.

Rules and guidelines for creating a target

The Secure Agent converts the target table names that you specify in the Create Target field into lower case.
When you create a target, you can view and edit the metadata of the target object in the Target Fields tab. You can edit the data type and precision, and define the primary key of the columns in the target object. To edit the metadata, click Options > Edit Metadata in the Target Fields tab.
Consider the following rules and guidelines when you create a target:

Rules and guidelines to override the target query

You can specify a target query override to override the update query that the Secure Agent generates for the update operation.
Consider the following rules and guidelines when you use the override target query property for an Amazon Redshift target:

Amazon Redshift V2 lookups in mappings

You can create lookups for objects in an Amazon Redshift V2 mapping. You can retrieve data from an Amazon Redshift V2 lookup object based on the specified lookup condition.
Use an Amazon Redshift V2 Lookup transformation to look up data in an Amazon Redshift object. For example, the source table includes the customer code, but you want to include the customer name in the target table to make summary data easy to read. You can use the Amazon Redshift V2 Lookup transformation to look up the customer name in another Amazon Redshift object.
You can add the following lookups to a mapping:
Use the JDBC URL specified in the connection properties to create lookups.
The following table describes the Amazon Redshift V2 lookup object properties that you can configure in a Lookup transformation:
Property
Description
Connection
Name of the lookup connection.
You can select an existing connection, create a new connection, or define parameter values for the lookup connection property.
If you want to overwrite the lookup connection properties at runtime, select the Allow parameter to be overridden at run time option.
Source Type
Type of the source object. Select Single Object, Query, or Parameter.
Note: You cannot configure uncached lookups when you select the source type as query.
Parameter
A parameter file where you define values that you want to update without having to edit the task.
Select an existing parameter for the lookup object or click New Parameter to define a new parameter for the lookup object.
The Parameter property appears only if you select Parameter as the source type.
If you want to overwrite the parameter at runtime, select the Allow parameter to be overridden at run time option.
When the task runs, the Secure Agent uses the parameters from the file that you specify in the advanced session properties.
Lookup Object
Name of the lookup object for the mapping.
Multiple Matches
Behavior when the lookup condition returns multiple matches. You can return all rows, any row, the first row, the last row, or an error.
You can select from the following options in the lookup object properties to determine the behavior:
  • - Return first row
  • - Return last row
  • - Return any row
  • - Return all rows
  • - Report error
Filter
Not applicable
Sort
Not applicable
The following table describes the Amazon Redshift V2 advanced lookup properties that you can configure in a Lookup transformation:
Property
Description
S3 Bucket Name1
Amazon S3 bucket name for staging the data.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options.
Enable Compression1
Compresses the staging files into the Amazon S3 staging directory.
The task performance improves when the Secure Agent compresses the staging files. Default is selected.
Staging Directory Location1
Location of the local staging directory.
When you run a task in Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
Specify the directory path in the following manner: <staging directory>
For example, C:\Temp. Ensure that you have the write permissions on the directory.
Does not apply to mappings in advanced mode.
Unload Options1
Unload command options.
Add options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3. Provide an Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • - DELIMITER
  • - ESCAPE
  • - PARALLEL
  • - NULL
  • - AWS_IAM_ROLE
  • - REGION
  • - ADDQUOTES
For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
You cannot use the NULL option in a mapping in advanced mode.
Specify a directory on the machine that hosts the Secure Agent.
Note: If you do not add the options to the Unload command manually, the Secure Agent uses the default values.
Treat NULL Value as NULL1
Retains the null values when you read data from Amazon Redshift.
Encryption Type1
Encrypts the data in the Amazon S3 staging directory.
You can select the following encryption types:
  • - None
  • - SSE-S3
  • - SSE-KMS
  • - CSE-SMK
You can only use SSE-S3 encryption in a mapping configured in advanced mode.
Default is None.
Download S3 Files in Multiple Parts1
Downloads large Amazon S3 objects in multiple parts.
When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel. Default is 5 MB.
Does not apply to mappings in advanced mode.
Multipart Download Threshold Size1
The maximum threshold size to download an Amazon S3 object in multiple parts.
Default is 5 MB.
Does not apply to mappings in advanced mode.
Schema Name
Overrides the default schema name.
Note: You cannot configure a custom query when you use the schema name.
Source Table Name
Overrides the default source table name.
Note: When you select the source type as Multiple Objects or Query, you cannot use the Source Table Name option.
Pre-SQL
The pre-SQL commands to run a query before you read data from Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Post-SQL
The post-SQL commands to run a query after you write data to Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Select Distinct1
Selects unique values.
The Secure Agent includes a SELECT DISTINCT statement if you choose this option. Amazon Redshift ignores trailing spaces. Therefore, the Secure Agent might extract fewer rows than expected.
Note: If you select the source type as query or use the SQL Query property and select the Select Distinct option, the Secure Agent ignores the Select Distinct option.
SQL Query1
Overrides the default SQL query.
Enclose column names in double quotes. The SQL query is case sensitive. Specify an SQL statement supported by the Amazon Redshift database.
When you specify the columns in the SQL query, ensure that the column name in the query matches the source column name in the mapping.
Lookup Data Filter
Limits the number of lookups that the mapping performs on the cache of the lookup source table based on the value you specify in the filter condition.
This property is applicable when you select Single Object as the source type and enable lookup cache on the Advanced tab in the Lookup transformation properties.
Maximum length is 32768 characters.
For more information about this property, see Lookup source filter in Transformations.
Temporary Credential Duration1
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
Tracing Level
Use the verbose tracing level to get the amount of detail that appears in the log for the Source transformation.
1 Does not apply to uncached lookups.
For more information about the Lookup transformation, see Transformations.

Unconnected Lookup transformation

You can configure an unconnected Lookup transformation for the Amazon Redshift source in a mapping. Use the Lookup transformation to retrieve data from Amazon Redshift based on a specified lookup condition.
An unconnected Lookup transformation is a Lookup transformation that is not connected to any source, target, or transformation in the pipeline.
An unconnected Lookup transformation receives input values from the result of a :LKP expression in another transformation. The Integration Service queries the lookup source based on the lookup ports and condition in the Lookup transformation and passes the returned value to the port that contains the :LKP expression. The :LKP expression can pass lookup results to an expression in another transformation.
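For example, an Expression transformation output field might call a hypothetical unconnected lookup named lkp_CustomerName with the expression :LKP.lkp_CustomerName(CUSTOMER_ID), which passes the CUSTOMER_ID field to the lookup condition and receives the designated return value.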
For more information about the Lookup transformation, see Transformations.

Configuring an unconnected Lookup transformation

To configure an unconnected Lookup transformation, select the Unconnected Lookup option, add incoming fields, configure the lookup condition, and designate a return value. Then configure a lookup expression in a different transformation.
    1. Add a Lookup transformation in a mapping.
    2. On the General tab of the Lookup transformation, enable the Unconnected Lookup option.
    3. On the Incoming Fields tab of the Lookup transformation, create an incoming field for each argument in the :LKP expression.
    For each lookup condition you create, add an incoming field to the Lookup transformation.
    4. On the Lookup Object tab, import the lookup object.
    The Return all rows option of the Multiple Matches property is not applicable to an unconnected lookup.
    5. Designate a return value.
    You can pass multiple input values into a Lookup transformation and return one column of data. Data Integration can return one value from the lookup query. Use the return field to specify the return value.
    6. Configure a lookup expression in another transformation.
    Supply input values for the unconnected Lookup transformation from a :LKP expression in another transformation, such as an Expression transformation. The arguments are local input fields that match the Lookup transformation input fields used in the lookup condition.
    7. Map the fields with the target.

Enabling lookup caching

When you configure a Lookup transformation in a mapping, you can cache the lookup data during the runtime session.
When you select Lookup Caching Enabled, Data Integration queries the lookup source once and caches the values for use during the session, which can improve performance. You can specify the directory to store the cached lookup.
Lookup Cache Persistent
Use a persistent lookup cache to save the lookup cache file and reuse it the next time Data Integration processes a Lookup transformation configured to use the cache.
You can specify the file name prefix to use with persistent lookup cache files in the Cache File Name Prefix field.
If the lookup table changes occasionally, you can enable the Re-cache from Lookup Source property to rebuild the lookup cache.
Dynamic Lookup Cache
Use a dynamic lookup cache to keep the lookup cache synchronized with the target. By default, the dynamic lookup cache is disabled, and the cache is static.
If the cache is static, the data in the lookup cache does not change as the mapping task runs.
If the task uses the cache multiple times, the task uses the same data. If the cache is dynamic, the task updates the cache based on the actions in the task, so if the task uses the lookup multiple times, downstream transformations can use the updated data.
For information about lookup caching, see Transformations in the Data Integration documentation.

Rules and guidelines for configuring lookup transformations

Consider the following rules and guidelines when you configure an Amazon Redshift lookup transformation:

Mapping task with Oracle CDC sources example

Your organization needs to replicate real-time changed data from a mission-critical Oracle production system to minimize intrusive, non-critical work, such as offline reporting or analytical operations, on the production system. You can use Amazon Redshift V2 Connector to capture changed data from the Oracle CDC source and write the changed data to an Amazon Redshift target table. Add the Oracle CDC sources in mappings, and then run the associated mapping tasks to write the changed data to the target.
    1. In Data Integration, click New > Mapping > Create.
    The New Mapping dialog box appears.
    2. Enter a name and description for the mapping.
    3. On the Source transformation, specify a name and description in the general properties.
    4. On the Source tab, select the configured Oracle CDC connection and specify the required source properties.
    5. On the Target transformation, specify a name and description in the general properties.
    6. On the Target tab, perform the following steps to configure the target properties:
    a. In the Connection field, select the Amazon Redshift V2 connection.
    b. In the Target Type field, select the type of the target object.
    c. In the Object field, select the required target object.
    d. In the Operation field, select Data Driven to properly handle insert, update, and delete records from the source.
    e. In the Data Driven Condition field, leave the field empty.
    f. In the Advanced Properties section, provide the values of the required target properties. You must select the Preserve record order on write check box and enter the value of the Recovery Schema Name property.
    7. On the Field Mapping tab, map the incoming fields to the target fields. You can manually map an incoming field to a target field or automatically map fields based on the field names.
    8. In the Actions menu, click New Mapping Task.
    The New Mapping Task page appears.
    9. On the Definition tab, enter the task name and select the configured mapping.
    10. On the CDC Runtime tab, specify the required properties.
    For more information about the CDC Runtime properties, see the help for Oracle CDC Connector.
    11. On the Runtime Options tab, specify the following properties in the Advanced Session Properties section:
    a. Select Commit on End of File from the menu, and keep the property disabled.
    b. Select the Commit Type field as Source.
    c. Select Recovery Strategy and set Resume from last checkpoint as the value of the property.
    12. Click Save > Run to run the mapping.
    Alternatively, you can create a schedule that runs the mapping task on a recurring basis without manual intervention. You can define the schedule to minimize the time between mapping task runs.
In Monitor, you can monitor the status of the logs after you run the task.

Mapping in advanced mode example

You work for an organization that stores a large amount of purchase order details, such as customer IDs, item codes, and item quantities, in Amazon S3. You need to port the data from Amazon S3 to another cloud-based environment to quickly analyze the purchase order details and to increase future revenues.
Create a mapping that runs in advanced mode to achieve faster performance when you read all the purchase records from Amazon S3 and write the records to an Amazon Redshift target.
    1. In Data Integration, click New > Mappings > Mapping.
    2. In the Mapping Designer, click Switch to Advanced.
    The following image shows the Switch to Advanced button in the Mapping Designer.
    3. In the Switch to Advanced dialog box, click Switch to Advanced.
    The Mapping Designer updates the mapping canvas to display the transformations and functions that are available in advanced mode.
    4. Enter a name, location, and description for the mapping.
    5. Add a Source transformation, and specify a name and description in the general properties.
    6. On the Source tab, perform the following steps to provide the source details to read data from the Amazon S3 source:
    a. In the Connection field, select the Amazon S3 V2 source connection.
    b. In the Source Type field, select the type of the source.
    c. In the Object field, select the required object.
    d. In the Advanced Properties section, provide the appropriate values.
    7. On the Fields tab, map the Amazon S3 source fields to the target fields.
    8. On the Target transformation, specify a name and description in the general properties.
    9. On the Target tab, perform the following steps to provide the target details to write data to the Amazon Redshift target:
    a. In the Connection field, select the Amazon Redshift V2 target connection.
    b. In the Target Type field, select the type of the target.
    c. In the Object field, select the required object.
    d. In the Operation field, select the required operation.
    e. In the Advanced Properties section, provide appropriate values for the advanced target properties.
    10. Map the Amazon S3 source and the Amazon Redshift target.
    11. Click Save > Run to validate the mapping.
    In Monitor, you can monitor the status of the logs after you run the task.

SQL transformation

You can configure an SQL transformation to process SQL queries midstream in an Amazon Redshift V2 mapping. You cannot configure an SQL transformation for a mapping in advanced mode.
When you add an SQL transformation to the mapping, on the SQL tab, you define the connection and the type of SQL that the transformation processes.

Stored procedure

You can configure an SQL transformation to call a stored procedure in Amazon Redshift. The stored procedure must exist in the Amazon Redshift database before you create the SQL transformation. When the SQL transformation processes a stored procedure, it passes input parameters to the stored procedure. The stored procedure passes the return value to the output fields of the transformation.
The following image shows a stored procedure in a SQL transformation:
Select the SQL Type as Stored Procedure and select a stored procedure.
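For reference, a stored procedure that an SQL transformation can call might be created in Amazon Redshift as follows. The procedure, table, and column names are hypothetical.
  CREATE OR REPLACE PROCEDURE public.sp_get_customer_name(p_customer_id IN integer, p_customer_name OUT varchar(256))
  AS $$
  BEGIN
    -- Look up the customer name for the given ID; the table and columns are placeholders.
    SELECT c_name INTO p_customer_name
    FROM public.customers
    WHERE c_id = p_customer_id;
  END;
  $$ LANGUAGE plpgsql;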
Consider the following rules and guidelines for stored procedures:
For more information about stored procedures, see Transformations in the Data Integration documentation.

SQL query

You can configure an SQL transformation to process an entered query that you define in the SQL editor. The SQL transformation processes the query and returns the rows. The SQL transformation also returns any errors that occur from the underlying database or the user syntax.
You can process an entered query to call a stored procedure that returns a value or no values.
The following image shows an entered query in a SQL transformation:
Select the SQL Type as SQL Query and select the Query Type as Entered Query.
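For example, an entered query can call a hypothetical stored procedure and bind an incoming field, assuming the ?field? notation used for entered queries. The procedure name and field are placeholders.
  CALL public.sp_refresh_orders(?ORDER_DATE?)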
Consider the following rules and guidelines for SQL queries:
For more information about SQL queries, see Transformations in the Data Integration documentation.

Dynamic schema handling for mappings

You can choose how Data Integration handles changes that you make to the data object schemas. To refresh the schema every time the mapping task runs, you can enable dynamic schema handling in the task.
A schema change includes one or more of the following changes to the data object:
Configure schema change handling on the Runtime Options page when you configure the task.
The following table describes the schema change handling options:
Option
Description
Asynchronous
Default. Data Integration refreshes the schema when you edit the mapping or mapping task, and when Informatica Intelligent Cloud Services is upgraded.
Dynamic
Data Integration refreshes the schema every time the task runs.
You can choose from the following options to refresh the schema:
  • - Alter and Apply Changes. Data Integration alters the target schema and adds the new fields from the source.
  • - Drop Current and Recreate. Drops the existing target table and then recreates the target table at runtime using all the incoming metadata fields from the source.
  • - Don't Apply DDL Changes. Data Integration does not apply the schema changes to the target.
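For example, when you select Alter and Apply Changes and a new field appears in the source, the target table typically gains a matching column, conceptually similar to the following statement. The table and column names are hypothetical.
  ALTER TABLE public.orders ADD COLUMN order_status varchar(256);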
For more information, see the "Schema change handling" topic in Tasks.

Rules and guidelines for dynamic schema handling for mappings

Consider the following rules and guidelines when you enable dynamic schema change handling:

Configuring key range partition

Configure key range partition to partition Amazon Redshift data based on field values.
    1. In Source Properties, click the Partitions tab.
    2. Select the required Partition Key from the list.
    3. Click Add New Key Range to add partitions.
    4. Specify the Start range and End range.
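For example, to partition a hypothetical ORDERS source on the ORDER_ID field into two partitions, you might define one key range with a start of 1 and an end of 100000, and a second key range with a start of 100001 and an end of 200000.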

Bulk processing for write operations

You can enable bulk processing to write large amounts of data to Amazon Redshift. Bulk processing uses a minimal number of API calls and enhances the performance of the write operation.
To enable bulk processing, specify the property -DENABLE_WRITER_BULK_PROCESSING=true in the Secure Agent properties.
Perform the following steps to configure bulk processing before you run a mapping:
  1. In Administrator, select the Secure Agent listed on the Runtime Environments tab.
  2. Click Edit.
  3. In the System Configuration Details section, select Data Integration Server as the service and DTM as the type.
  4. Edit the JVMOption1 property, and enter -DENABLE_WRITER_BULK_PROCESSING=true.
  5. Click Save.
Note: Not applicable to mapping tasks configured with SQL ELT optimization.

Optimize the staging performance for a mapping

Data Integration, by default, creates a flat file locally in a temporary folder to stage the data before reading from and writing to Amazon Redshift. You can set Data Integration to optimize the staging performance.
If you do not set the optimization property, Data Integration performs staging without the optimized settings, which might impact the performance of the task.
Consider the following rules when you enable the optimization property:

Enabling Amazon Redshift Connector to optimize the staging performance

Perform the following steps to enable the Amazon Redshift V2 Connector to optimize the staging performance:
  1. In Administrator, click Runtime Environments.
  2. Edit the Secure Agent for which you want to set the property.
  3. In the System Configuration Details section, select the service as Data Integration Server and the type as Tomcat.
  4. Set the value of the Tomcat property INFA_DTM_RDR_STAGING_ENABLED_CONNECTORS for the read operation, INFA_DTM_STAGING_ENABLED_CONNECTORS for the write operation, and INFA_DTM_LKP_STAGING_ENABLED_CONNECTORS for the cached lookup operation to the plugin ID of Amazon Redshift Connector.
  You can find the plugin ID in the manifest file located in the following directory: <Secure Agent installation directory>/downloads/<AmazonRedshift package>/CCIManifest
When you run the mapping, the flat file is created in the following directory on your machine: C:\Windows\Temp\AmazonRedshift\stage\<AmazonRedshift_Target.txt>
You can check the session logs to verify if the optimization property is enabled.