
Creating a connection map configuration

Create a Connection Map configuration to map the PowerCenter connections to IDMC connections.
The Connection Map configuration is associated with a domain and a repository. This configuration contains connection types and connection names used in the assessed repository. If you want to repoint the assets to a new endpoint, map these connections to relevant IDMC connections.
The following video shows you how to create a connection map configuration:
https://docs.informatica.com/content/dam/videos/prod/dmsui/ConnectionMap.mp4
    1. On the left navigation menu, click New.
    The New Asset dialog box appears.
    2. Click Configuration, and then click Create.
    The New Configuration page appears.
    3. From the Configuration Type list, select Connection Map.
    4. Enter the required information as shown in the following table:
    Property
    Description
    Configuration Name
    Name of the configuration.
    Description
    (Optional) Description of the configuration.
    Domain
    The domain where you want to use this configuration.
    The list of repositories within the selected domain appears in the Repository field.
    Repository Type
    The type of repository where you want to use this configuration.
    Repository
    The repository where you want to use this configuration.
    Important: The selected repository must be assessed to create the configuration.
    5. Click Next.
    On the Set Properties tab, the Connection Map Configuration section displays the list of connections used in the repository. You can filter the connections based on a connection type.
    6. In the Cloud Connection Type and Cloud Connection Name columns, select the IDMC connection type and connection name of the endpoint to which you want to repoint the assets.
    7. In the Transformation column, select the transformations where you want to use this repoint.
    You can use the following transformations for a repoint:
    8. If you want to configure additional properties for the repoint, click Set Properties in the Advanced Properties column.
    9. Click Save Configuration.
    The configuration is saved. You can use this configuration in conversion tasks.
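
Conceptually, each row in the Connection Map configuration pairs a PowerCenter connection from the assessed repository with an IDMC connection type and name, along with the transformations where the repoint applies (steps 6 and 7). The following Python sketch is only an illustration of that pairing; the field names and values are hypothetical and not part of the product:

from dataclasses import dataclass, field

@dataclass
class ConnectionMapEntry:
    # PowerCenter connection discovered in the assessed repository
    pc_connection_type: str
    pc_connection_name: str
    # IDMC connection chosen in the Cloud Connection Type and Cloud Connection Name columns
    idmc_connection_type: str
    idmc_connection_name: str
    # Transformations where the repoint applies (step 7)
    transformations: list = field(default_factory=list)

# Hypothetical entry: repoint a PowerCenter Oracle connection to an IDMC Snowflake connection
entry = ConnectionMapEntry(
    pc_connection_type="Oracle",
    pc_connection_name="ORA_SALES",
    idmc_connection_type="Snowflake Data Cloud",
    idmc_connection_name="SF_SALES",
    transformations=["Source", "Target", "Lookup"],
)
print(entry)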

Advanced properties

If you are repointing your assets to another endpoint, you can configure additional properties in the Connection Map configuration.
Based on the IDMC connection that you choose for the repoint, you can configure the properties specific to that connection.

Amazon Redshift advanced properties

The following tables list the Amazon Redshift properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Amazon Redshift connection properties:
Connection property
Description
Schema
The Amazon Redshift schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.

Source advanced properties

The following table describes the Amazon Redshift source advanced properties:
Property name
Description
Schema Name
Overrides the default schema name.
S3 Bucket Name
Amazon S3 bucket name for staging the data.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options.
Read Mode
Specifies the read mode to read data from the Amazon Redshift source.
You can specify one of the following read modes:
  • - Direct: Reads data directly from the Amazon Redshift source without staging the data in Amazon S3.
  • - Staging: Reads data from the Amazon Redshift source by staging the data in the S3 bucket.
Default is Staging.
Fetch Size
Determines the number of rows to read in one resultant set from Amazon Redshift. Applies only when you select the Direct read mode.
Default is 10000.
Enable Compression
Compresses the staging files into the Amazon S3 staging directory. The task performance improves when the Secure Agent compresses the staging files.
Default is enabled.
Staging Directory Location
Location of the local staging directory. When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
For example, C:\Temp. Ensure that you have the write permissions to the directory.
Unload Options
Unload command options. Enter the options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3.
You can enter the following unload options:
  • - DELIMITER
  • - ESCAPE
  • - PARALLEL
  • - NULL
  • - AWS_IAM_ROLE
  • - REGION
  • - ADDQUOTES
For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
For the AWS_IAM_ROLE option, provide the Amazon Resource Name (ARN) of an Amazon Redshift Role (see the sketch after this table). Also, specify a directory on the machine that hosts the Secure Agent.
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
Treat NULL Value as NULL
Retains the null values when you read data from Amazon Redshift.
Encryption Type
Encrypts the data in the Amazon S3 staging directory.
You can specify the following encryption types:
  • - None
  • - SSE-S3
  • - SSE-KMS
  • - CSE-SMK
Default is None.
Download S3 Files In Multiple Parts
Downloads large Amazon S3 objects in multiple parts.
When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel.
Default is 5 MB.
Multipart Download Threshold Size
The maximum threshold size to download an Amazon S3 object in multiple parts.
Default is 5 MB.
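
The Unload Options value above is a single semicolon-separated string, and the AWS_IAM_ROLE option takes a role ARN in the arn:aws:iam::<account ID>:role/<role-name> form. The following Python sketch shows one way such a string could be assembled; the account ID, role name, and option values are placeholders, not required settings:

# Assemble an Unload Options string for the Amazon Redshift source advanced properties.
account_id = "123456789012"    # placeholder AWS account ID
role_name = "redshift-unload"  # placeholder IAM role name
iam_role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"

options = {
    "DELIMITER": r"\036",
    "ESCAPE": "OFF",
    "NULL": "text",
    "PARALLEL": "ON",
    "AWS_IAM_ROLE": iam_role_arn,
    "REGION": "ap-south-1",
}
unload_options = ";".join(f"{key}={value}" for key, value in options.items())
print(unload_options)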

Target advanced properties

The following table describes the Amazon Redshift target advanced properties:
Property
Description
S3 Bucket Name
Amazon S3 bucket name for writing the files to Amazon Redshift target.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Copy Options property.
Enable Compression
Compresses the staging files before writing the files to Amazon Redshift.
The task performance improves when the Secure Agent compresses the staged files. Default is enabled.
Staging Directory Location
Location of the local staging directory.
When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
For example, C:\Temp. Ensure that you have the write permissions on the directory.
Batch Size
Minimum number of rows in a batch.
Enter a number greater than 0. Default is 2000000.
Max Errors per Upload Batch for INSERT
Number of error rows that causes an upload insert batch to fail.
Enter a positive integer. Default is 1.
If the number of errors is equal to or greater than the property value, the Secure Agent writes the entire batch to the error file.
Require Null Value For Char and Varchar
Replaces the string value with NULL when you write data to Amazon Redshift columns of Char and Varchar data types.
Default is an empty string.
Note: When you run a mapping to write null values to a table that contains a single column of the Int, Bigint, numeric, real, or double data type, the mapping fails. You must provide a value other than the default value in the Require Null Value For Char And Varchar property.
WaitTime In Seconds For S3 File Consistency
Number of seconds to wait for the Secure Agent to make the staged files consistent with the list of files available on Amazon S3.
Default is 0.
Copy Options
Copy command options.
Enter the options to the Copy command to write data from Amazon S3 to the Amazon Redshift target when the default delimiter comma (,) or double-quote (") is used in the data. Provide the Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • - DELIMITER
  • - ACCEPTINVCHARS
  • - QUOTE
  • - COMPUPDATE
  • - AWS_IAM_ROLE
  • - REGION
For example:
DELIMITER = \036;ACCEPTINVCHARS = #;QUOTE = \037
COMPUPDATE = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
Specify a directory on the machine that hosts the Secure Agent.
Note: If you do not add the options to the Copy command manually, the Secure Agent uses the default values.
S3 Server Side Encryption
Indicates that Amazon S3 encrypts data during upload.
Provide a customer master key ID in the connection property to enable this property. Default is not selected.
S3 Client Side Encryption
Indicates that the Secure Agent encrypts data using a private key.
Provide a master symmetric key ID in the connection property to enable this property. If you enable both server-side and client-side encryptions, the Secure Agent ignores the server-side encryption.
Analyze Target Table
Runs an ANALYZE command on the target table.
The query planner on Amazon Redshift updates the statistical metadata to build and choose optimal plans to improve the efficiency of queries.
Vacuum Target Table
Recovers disk space and sorts the rows in a specified table or all tables in the database.
You can select the following recovery options:
  • - None
  • - Full
  • - Sort Only
  • - Delete Only
  • - Reindex
Default is None.
Prefix to retain staging files on S3
Retains staging files on Amazon S3.
Provide both a directory prefix and a file prefix separated by a slash (/) or only a file prefix to retain staging files on Amazon S3. For example, backup_dir/backup_file or backup_file.
Success File Directory
Directory for the Amazon Redshift success file.
Specify a directory on the machine that hosts the Secure Agent.
Error File Directory
Directory for the Amazon Redshift error file.
Specify a directory on the machine that hosts the Secure Agent.
Transfer Manager Thread Pool Size
Number of threads to write data in parallel.
Default is 10.
Preserve record order on write
Retains the order of the records when you read data from a CDC source and write data to an Amazon Redshift target.
Use this property when you create a mapping to capture the changed record from a CDC source. This property enables you to avoid inconsistencies between the CDC source and target.
Minimum Upload Part Size
Minimum size of the Amazon Redshift object to upload an object.
Default is 5 MB.
Number of files per batch
The number of staging files per batch.
If you do not provide the number of files, Amazon Redshift V2 Connector calculates the number of staging files.
When you connect to Amazon Redshift serverless and do not provide the number of files, 20 files per batch is considered as the default value.
Schema Name
Overrides the default schema name.
Recovery Schema Name
Schema that contains recovery information stored in the infa_recovery_table table on the target system to resume the extraction of the changed data from the last checkpoint.
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
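
The Batch Size and Max Errors per Upload Batch for INSERT properties above work together: rows are written in batches, and a batch whose error count reaches the threshold is written in its entirety to the error file. The following Python sketch is a conceptual illustration of that behavior, not the connector's implementation; the row data and error detection are simulated:

def write_in_batches(rows, batch_size=2_000_000, max_errors_per_batch=1):
    """Split rows into batches of batch_size; if the number of error rows in a
    batch reaches max_errors_per_batch, the whole batch goes to the error file."""
    error_file, written = [], 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        errors = [row for row in batch if row.get("error")]  # simulated error detection
        if len(errors) >= max_errors_per_batch:
            error_file.extend(batch)   # entire batch is written to the error file
        else:
            written += len(batch)      # batch is loaded to Amazon Redshift
    return written, error_file

# Illustrative run with a tiny batch size
rows = [{"id": 1}, {"id": 2, "error": True}, {"id": 3}, {"id": 4}]
print(write_in_batches(rows, batch_size=2, max_errors_per_batch=1))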

Lookup advanced properties

The following table describes the Amazon Redshift lookup transformation advanced properties:
Property
Description
Read Mode
Specifies the read mode to read data from the Amazon Redshift source.
You can specify one of the following read modes:
  • - Direct: Reads data directly from the Amazon Redshift source without staging the data in Amazon S3.
  • - Staging: Reads data from the Amazon Redshift source by staging the data in the S3 bucket.
Default is Staging.
Fetch Size
Determines the number of rows to read in one resultant set from Amazon Redshift. Applies only when you select the Direct read mode.
Default is 10000.
S3 Bucket Name
Amazon S3 bucket name for staging the data.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options.
Enable Compression
Compresses the staging files into the Amazon S3 staging directory.
The task performance improves when the Secure Agent compresses the staging files. Default is selected.
Staging Directory Location
Location of the local staging directory.
When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
Specify the directory path in the following manner: <staging directory>
For example, C:\Temp. Ensure that you have the write permissions on the directory.
Unload Options
Unload command options.
Add options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3. Provide an Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • - DELIMITER
  • - ESCAPE
  • - PARALLEL
  • - NULL
  • - AWS_IAM_ROLE
  • - REGION
  • - ADDQUOTES
For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
Specify a directory on the machine that hosts the Secure Agent.
Note: If you do not add the options to the Unload command manually, the Secure Agent uses the default values.
Treat NULL Value as NULL
Retains the null values when you read data from Amazon Redshift.
Encryption Type
Encrypts the data in the Amazon S3 staging directory.
You can select the following encryption types:
  • - None
  • - SSE-S3
  • - SSE-KMS
  • - CSE-SMK
Default is None.
Download S3 Files in Multiple Parts
Downloads large Amazon S3 objects in multiple parts.
When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel. Default is 5 MB.
Multipart Download Threshold Size
The maximum threshold size to download an Amazon S3 object in multiple parts.
Default is 5 MB.
Schema Name
Overrides the default schema name.
Note: You cannot configure a custom query when you use the schema name.
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.

Databricks Delta advanced properties

The following tables list the Databricks Delta properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Databricks Delta connection properties:
Connection property
Description
Schema
The Databricks Delta schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.

Source advanced properties

The following table describes the Databricks Delta source advanced properties:
Property name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Database Name
Overrides the database name provided in the connection and the database name provided during metadata import.
Note: To read from multiple objects, ensure that you have specified the database name in the connection properties.
Job Timeout
Maximum time in seconds that is taken by the Spark job to complete processing.
If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion.
Default is 30 seconds.
DB REST API Timeout
The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
The time interval in seconds at which the Secure Agent retries the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.
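
The Job Timeout, Job Status Poll Interval, DB REST API Timeout, and DB REST API Retry Interval properties above describe a poll-and-retry pattern. The following Python sketch illustrates that pattern conceptually; the check_job_status callable is a hypothetical stand-in for the Databricks REST call, and the defaults mirror the values listed above:

import time

def wait_for_job(check_job_status, job_timeout=0, poll_interval=30,
                 rest_timeout=600, retry_interval=30):
    """Poll a job until it finishes, the optional Job Timeout expires, or REST
    calls keep failing for longer than the DB REST API Timeout."""
    start = time.monotonic()
    first_rest_error = None
    while True:
        if job_timeout and time.monotonic() - start > job_timeout:
            raise TimeoutError("Job Timeout exceeded; the job terminates and the mapping fails")
        try:
            status = check_job_status()               # hypothetical REST status call
            first_rest_error = None
        except ConnectionError:
            first_rest_error = first_rest_error or time.monotonic()
            if time.monotonic() - first_rest_error > rest_timeout:
                raise                                 # DB REST API Timeout exhausted
            time.sleep(retry_interval)                # DB REST API Retry Interval
            continue
        if status in ("SUCCESS", "FAILED"):
            return status
        time.sleep(poll_interval)                     # Job Status Poll Interval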

Target advanced properties

The following table describes the Databricks Delta target advanced properties:
Property Name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Target Database Name
Overrides the database name provided in the connection and the database selected in the metadata browser for existing targets.
Write Disposition
Overwrites or adds data to the existing data in a table. You can select from the following options:
  • - Append. Appends data to the existing data in the table even if the table is empty.
  • - Truncate. Overwrites the existing data in the table. Only applies to Insert operation and non-empty sources.
  • - Truncate Always. Overwrites the existing data in the table. Applies to insert, update, upsert, and delete target operations for empty and non-empty sources.
DTM Staging File Size
The size of the flat file that Data Integration creates locally in a temporary folder to stage the data before writing to Databricks.
Default is 50 MB.
Job Timeout
Maximum time in seconds that is taken by the Spark job to complete processing.
If the job is not completed within the time specified, the job terminates and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion.
Default is 30 seconds.
DB REST API Timeout
The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
The time interval in seconds at which the Secure Agent retries the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.

Lookup advanced properties

The following table describes the Databricks Delta lookup transformation advanced properties:
Property Name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Database Name
Overrides the database specified in the connection.
Job Timeout
Maximum time in seconds that is taken by the Spark job to complete processing. If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion. Default is 30 seconds.
DB REST API Timeout
The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
The time interval in seconds at which the Secure Agent retries the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.

Google BigQuery advanced properties

The following tables list the Google BigQuery properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Google BigQuery connection properties:
Connection property
Description
Source Dataset ID
Name of the dataset that contains the source table and target table that you want to connect to.
Default is datasetID_changeit.
Note: Google BigQuery supports the datasets that reside only in the US region.

Source advanced properties

The following table describes the Google BigQuery source advanced properties:
Property name
Description
Source Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$gbq_datasetID.
Source Staging Dataset
Overrides the Google BigQuery staging dataset name that you specified in the connection and the Source Dataset ID source advanced property.
Number of Rows to Read
Specifies the number of rows to read from the Google BigQuery source table.
Job Poll Interval In Seconds
The number of seconds after which Google BigQuery Connector polls the status of the read job operation.
Default is 10.
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one of the following read modes:
  • - Direct. In direct mode, Google BigQuery Connector reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • - Staging. In staging mode, Google BigQuery Connector exports data from the Google BigQuery source into Google Cloud Storage. After the export is complete, Google BigQuery Connector downloads the data from Google Cloud Storage into the local stage file and then reads data from the local stage file.
Default is Staging mode.
Use EXPORT DATA Statement to stage
Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage.
If the query contains an ORDER BY clause, the specified order is maintained when you export the data.
This property applies to staging mode.
Default is false.
Number of Threads for Downloading Staging Files
Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download.
This property applies to staging mode.
Default is 1.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery Connector stores the Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Default is $PMTempDir.
Staging File Name
Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage.
This property applies to staging mode.
Persist Extract Staging File After Download
Indicates whether Google BigQuery V2 Connector must persist the staging file after it reads data from the staging file.
By default, Google BigQuery V2 Connector deletes the staging file.
Persist Destination Table
Indicates whether Google BigQuery V2 Connector must persist the query results table after it reads data from the query results table.
By default, Google BigQuery V2 Connector deletes the query results table.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • - JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • - CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Default is JSON.
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery Connector reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Default is false.
Retry Options
Comma-separated list to specify the following retry options:
  • - Retry Count. The number of retry attempts to read data from Google BigQuery.
  • - Retry Interval. The time in seconds to wait between each retry attempt.
  • - Retry Exceptions. The list of exceptions separated by pipe (|) character for which the retries are made.
Default is $$gbq_retry_options.
Use Legacy SQL for SQL Override
Indicates that the SQL Override query is specified in legacy SQL.
Use the following format to specify a legacy SQL query for the SQL Override Query property:
SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName]
Clear this option to define a standard SQL override query.
Use the following format to specify a standard SQL query for the SQL Override Query property:
SELECT * FROM `projectID.datasetID.tableName`
Default is false.
pre SQL Configuration
Specify a pre SQL configuration.
For example,
DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
post SQL Configuration
Specify a post SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
Label
You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer.
Billing Project ID
The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and extract jobs.
If you omit the project ID here, the Secure Agent runs query and extract jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection.
Number of Spark Partitions
Specifies the maximum number of partitions that the Spark engine splits the data into.
Default is 1.
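
The Retry Options property above is a comma-separated list of a retry count, a retry interval, and pipe-separated exceptions. The following Python sketch shows one way such a string could be parsed and applied; the key names, separators, and the read_from_bigquery callable are illustrative assumptions, not the connector's actual syntax:

import time

def parse_retry_options(value):
    """Parse a comma-separated retry options string such as
    'RetryCount:3,RetryInterval:30,RetryExceptions:Timeout|InternalError'.
    The key names and separators here are assumptions for illustration."""
    options = dict(item.split(":", 1) for item in value.split(","))
    return (int(options.get("RetryCount", 0)),
            int(options.get("RetryInterval", 0)),
            options.get("RetryExceptions", "").split("|"))

def read_with_retries(read_from_bigquery, retry_options):
    """Retry a hypothetical read call according to the parsed retry options."""
    count, interval, exceptions = parse_retry_options(retry_options)
    for attempt in range(count + 1):
        try:
            return read_from_bigquery()               # hypothetical read call
        except Exception as err:
            if attempt == count or type(err).__name__ not in exceptions:
                raise
            time.sleep(interval)                      # wait between retry attempts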

Target advanced properties

The following table describes the Google BigQuery target advanced properties:
Property name
Description
Target Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$gbq_datasetID.
Write Mode
Specifies the mode to write data to the Google BigQuery target.
You can select one of the following modes:
  • - Bulk. Google BigQuery V2 Connector first writes the data to a staging file in Google Cloud Storage. When the staging file contains all the data, Google BigQuery V2 Connector loads the data from the staging file to the BigQuery target. Google BigQuery V2 Connector then deletes the staging file unless you configure the task to persist the staging file.
  • - Streaming. Google BigQuery V2 Connector directly writes data to the BigQuery target. Google BigQuery V2 Connector writes the data into the target row by row.
  • - CDC. Applies only when you capture changed data from a CDC source. In CDC mode, Google BigQuery V2 Connector captures changed data from any CDC source and writes the changed data to a Google BigQuery target table.
Default is Bulk mode.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • - Avro
  • - JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • - Parquet
  • - CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Only the JSON format is applicable for mappings in advanced mode.
This property applies to bulk and CDC modes.
The Avro and Parquet formats are not applicable when you perform a data driven operation.
Default is JSON.
Enable Staging File Compression
Select this option to compress the staging file before Google BigQuery V2 Connector writes the data to Google Cloud Storage and to decompress the staging file before it loads the data to the Google BigQuery target.
You can enable staging file compression to reduce cost and transfer time.
Default is false.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery V2 Connector stores the files temporarily before writing the data to the staging file in Google Cloud Storage.
This property applies to bulk mode.
Default is $PMTempDir.
Use Default Column Values
Applicable when the selected data format for the staging file is CSV and the mapping contains unconnected ports. Includes the default column values for the unconnected ports from the staging file to create the target. This applies when you have defined a default constraint value in the Google BigQuery source column. If you do not enable this option, the agent creates a target only with the connected ports and populates null or empty strings for the unconnected ports.
Default is true.
Enable Merge
Implements the Merge query to perform an update, upsert, delete or data driven operation on a Google BigQuery target table.
If you select the Enable Data Driven property, you must select this option.
Default is disabled.
Target Staging Dataset
Optional. Overrides the Google BigQuery staging dataset name that you specified in the connection and the Target Dataset ID target advanced property.
Create Disposition
Specifies whether Google BigQuery V2 Connector must create the target table if it does not exist.
You can select one of the following values:
  • - Create if needed. If the table does not exist, Google BigQuery V2 Connector creates the table.
  • - Create never. If the table does not exist, Google BigQuery V2 Connector does not create the table and displays an error message.
Create disposition is applicable only when you perform an insert operation on a Google BigQuery target.
Write Disposition
Specifies how Google BigQuery V2 Connector must write data in bulk mode if the target table already exists.
You can select one of the following values:
  • - Write append. If the target table exists, Google BigQuery V2 Connector appends the data to the existing data in the table.
  • - Write truncate. If the target table exists, Google BigQuery V2 Connector overwrites the existing data in the table.
  • - Write empty. If the target table exists and contains data, Google BigQuery V2 Connector displays an error and does not write the data to the target. Google BigQuery V2 Connector writes the data to the target only if the target table does not contain any data.
Write disposition is applicable for bulk mode.
Write disposition is applicable only when you perform an insert operation on a Google BigQuery target.
Streaming Template Table Suffix
Specify the suffix to add to the individual target tables that Google BigQuery V2 Connector creates based on the template target table.
This property applies to streaming mode.
If you select the Enable Merge option, Google BigQuery V2 Connector ignores this property.
Streaming mode is not applicable when you perform a data driven operation.
Rows per Streaming Request
Specifies the number of rows that Google BigQuery V2 Connector streams to the BigQuery target for each request.
Default is 500 rows.
The maximum row size that Google BigQuery V2 Connector can stream to the Google BigQuery target for each request is 10 MB.
This property applies to streaming mode.
Streaming mode is not applicable when you perform a data driven operation.
Staging File Name
Name of the staging file that Google BigQuery V2 Connector creates in the Google Cloud Storage before it loads the data to the Google BigQuery target.
This property applies to bulk mode.
Persist Staging File After Loading
Indicates whether Google BigQuery V2 Connector must persist the staging file in the Google Cloud Storage after it writes the data to the Google BigQuery target. You can persist the staging file if you want to archive the data for future reference.
By default, Google BigQuery V2 Connector deletes the staging file in Google Cloud Storage.
This property applies to bulk mode.
Job Poll Interval In Seconds
The number of seconds after which Google BigQuery V2 Connector polls the status of the write job operation.
Default is 10.
Number of Threads for Uploading Staging File
The number of files that Google BigQuery V2 Connector must create to upload the staging file in bulk mode.
Allow Quoted Newlines
Indicates whether Google BigQuery V2 Connector must allow the quoted data sections with newline character in a .csv file.
Field Delimiter
Indicates whether Google BigQuery V2 Connector must allow field separators for the fields in a .csv file.
Allow Jagged Rows
Indicates whether Google BigQuery V2 Connector must accept the rows without trailing columns in a .csv file.
Suppress post SQL on Error
Indicates whether the Secure Agent must abort the post-SQL query execution in case the task fails to write data to the Google BigQuery target table due to errors.
Default is disabled.
pre SQL Configuration
Specify a pre-SQL configuration.
For example,
DestinationTable:PRESQL_TGT2,
DestinationDataset:EMPLOYEE,
FlattenResults:False,
WriteDisposition:WRITE_TRUNCATE,
UseLegacySql:False
post SQL Configuration
Specify a post-SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,
DestinationDataset:EMPLOYEE,
FlattenResults:True,
UseLegacySQL:False
Quote Char
Specifies the quote character to skip when you write data to Google BigQuery. When you write data to Google BigQuery and the source table contains the specified quote character, the task fails. Change the quote character value to a value that does not exist in the source table.
Default is double quotes.
Allow Duplicate Inserts
Indicates that the Secure Agent can insert duplicate rows into the Google BigQuery target.
Applicable only when you perform a data driven operation and DD_INSERT is specified in the data driven condition.
Default is not selected.
Disable Duplicate Update Rows
Determines whether the Secure Agent processes only one of the incoming rows and ignores the rest when multiple incoming rows attempt to update the same target row.
Select this option to configure the mapping to process only one of the incoming rows and ignore the rest of the incoming rows.
Default is disabled.
Label
You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer.
Billing Project ID
The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and load jobs.
If you omit the project ID here, the Secure Agent runs query and load jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection.
Forward Rejected Rows
Applicable only when you configure DD_REJECT constant in the data driven condition to reject all the rows.
Otherwise, this property is not applicable for Google BigQuery V2 Connector.
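
In streaming mode, the Rows per Streaming Request property above caps the rows per request at 500 by default, and each request can carry at most 10 MB. The following Python sketch illustrates that chunking conceptually; the send_streaming_request callable and the row-size estimate are hypothetical placeholders:

def stream_rows(rows, send_streaming_request, rows_per_request=500,
                max_request_bytes=10 * 1024 * 1024):
    """Group rows into streaming requests of at most rows_per_request rows,
    flushing early if the estimated request size would exceed 10 MB."""
    batch, batch_bytes = [], 0
    for row in rows:
        row_bytes = len(repr(row).encode())       # rough size estimate for the sketch
        if batch and (len(batch) >= rows_per_request
                      or batch_bytes + row_bytes > max_request_bytes):
            send_streaming_request(batch)          # hypothetical streaming insert call
            batch, batch_bytes = [], 0
        batch.append(row)
        batch_bytes += row_bytes
    if batch:
        send_streaming_request(batch)              # flush the final partial request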

Lookup advanced properties

The following table describes the Google BigQuery lookup transformation advanced properties:
Property name
Description
Source Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$gbq_datasetID.
Source Staging Dataset
Overrides the Google BigQuery staging dataset name that you specified in the Lookup transformation.
Job Poll Interval In Seconds
The number of seconds after which Google BigQuery V2 Connector polls the status of the read job operation.
Default is 10.
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one of the following read modes:
  • - Direct. In direct mode, Google BigQuery V2 Connector reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • - Staging. In staging mode, Google BigQuery V2 Connector exports data from the Google BigQuery source into Google Cloud Storage. After the export is complete, Google BigQuery V2 Connector downloads the data from Google Cloud Storage into the local stage file and then reads data from the local stage file.
Default is Staging mode.
Use EXPORT DATA Statement to stage
Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage.
If the query contains an ORDER BY clause, the specified order is maintained when you export the data.
This property applies to staging mode.
Default is true.
Number of Threads for Downloading Staging Files
Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download.
This property applies to staging mode.
Default is 1.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery V2 Connector stores the Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Default is $PMTempDir.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • - Avro
  • - JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • - Parquet
  • - CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Only the JSON format is applicable for mappings in advanced mode.
This property applies to bulk and CDC modes.
The Avro and Parquet formats are not applicable when you perform a data driven operation.
Default is JSON.
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery V2 Connector reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Default is false.
Retry Options
Comma-separated list to specify the following retry options:
  • - Retry Count. The number of retry attempts to read data from Google BigQuery.
  • - Retry Interval. The time in seconds to wait between each retry attempt.
  • - Retry Exceptions. The list of exceptions separated by pipe (|) character for which the retries are made.
Default is $$gbq_retry_options.
Use Legacy SQL for SQL Override
Indicates that the SQL Override query is specified in legacy SQL.
Use the following format to specify a legacy SQL query for the SQL Override Query property:
SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName]
Clear this option to define a standard SQL override query.
Use the following format to specify a standard SQL query for the SQL Override Query property:
SELECT * FROM `projectID.datasetID.tableName`
Number of Rows to Read
Specifies the number of rows to read from the Google BigQuery source table.
Staging File Name
Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage.
This property applies to staging mode.
Persist Extract Staging File After Download
Indicates whether Google BigQuery V2 Connector must persist the staging file after it reads data from the staging file. You can persist the staging file if you want to archive the data for future reference.
By default, Google BigQuery V2 Connector deletes the staging file in Google Cloud Storage.
This property applies to staging mode.
Persist Destination Table
Indicates whether Google BigQuery V2 Connector must persist the query results table after it reads data from the query results table.
By default, Google BigQuery V2 Connector deletes the query results table.
pre SQL Configuration
Specify a pre SQL configuration.
For example,
DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
post SQL Configuration
Specify a post SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
Label
You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer.
Billing Project ID
The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and extract jobs.
If you omit the project ID here, the Secure Agent runs query and extract jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection.
Number of Spark Partitions
Specifies the maximum number of partitions that the Spark engine splits the data into.
Default is 1.

Microsoft Azure Synapse SQL advanced properties

The following tables list the Microsoft Azure Synapse SQL properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Microsoft Azure Synapse SQL connection properties:
Connection property
Description
Schema
The Microsoft Azure Synapse SQL schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.

Source advanced properties

The following table describes the Microsoft Azure Synapse SQL source advanced properties:
Property
Description
Azure Blob Container Name
The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters.
Default is $$blob_container_name.
ADLS FileSystem Name
Required if you select ADLS Gen2 storage in the connection properties.
The name of the file system in Microsoft Azure Data Lake Storage Gen2.
The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path.
Default is $$adls_folder_name.
Schema Name Override
Overrides the schema specified in the connection.
Default is $$syn_schema.
Staging File Format
Type of file format to use when you stage the files in Azure storage.
Select one of the following formats:
  • - Delimited Text
  • - Parquet
Field Delimiter
Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal.
Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters:
0x00, 0x0A, 0x1B, 0x0D, and 0x1F
Number of Concurrent Connections to Blob Store
Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Configure Blob Part Size to partition a large-size blob into smaller parts.
Default is 4. Maximum is 10.
Blob Part Size
Partitions a blob into smaller parts, each of the specified part size. When reading a large-size blob, consider partitioning the blob into smaller parts and configuring concurrent connections to spawn the required number of threads to process data in parallel.
Default is 8 MB.
On Pre-Post SQL Error
Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors.
Select one of the following options:
  • - Continue. The task continues regardless of errors.
  • - Stop. The task stops when errors occur while executing pre-SQL or post-SQL commands.
Quote Character
Specifies the quote character to skip when you read data from Microsoft Azure Synapse SQL.
The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value.
Default is 0x1f.
Interim Directory
Optional. Path to the staging directory in the Secure Agent machine.
Specify the staging directory where you want to stage the files when you read data from Microsoft Azure SQL Data Warehouse. Ensure that the directory has sufficient space and you have read permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory when you use the Hosted Agent.
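
The Blob Part Size and Number of Concurrent Connections to Blob Store properties above work together: a large blob is split into parts of the configured size, and up to the configured number of threads read the parts in parallel. The following Python sketch is a conceptual illustration; the download_part callable is a hypothetical placeholder:

from concurrent.futures import ThreadPoolExecutor

def read_blob_in_parts(blob_size, download_part, part_size=8 * 1024 * 1024,
                       concurrent_connections=4):
    """Split a blob into part_size byte ranges and download them with up to
    concurrent_connections threads (the documented maximum is 10)."""
    connections = min(concurrent_connections, 10)
    ranges = [(offset, min(offset + part_size, blob_size))
              for offset in range(0, blob_size, part_size)]
    with ThreadPoolExecutor(max_workers=connections) as pool:
        return list(pool.map(lambda byte_range: download_part(*byte_range), ranges))

# Illustrative call: a 20 MB blob yields three byte ranges, read on 4 threads
parts = read_blob_in_parts(20 * 1024 * 1024, lambda start, end: (start, end))
print(parts)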

Target advanced properties

The following table describes the Microsoft Azure Synapse SQL target advanced properties:
Advanced Property
Description
Azure Blob Container Name
The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters.
Default is $$blob_container_name.
ADLS FileSystem Name
Required if you select ADLS Gen2 storage in the connection properties.
The name of the file system in Microsoft Azure Data Lake Storage Gen2.
The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path.
Default is $$adls_folder_name.
Copy Method
The method to load data from the staging location to Microsoft Azure Synapse SQL. Select one of the following options:
  • - Polybase
  • - Copy Command
Default is Polybase.
Copy Command Options
Options for the copy command in key=value format.
Specify each option on a new line.
Schema Name Override
Overrides the schema specified in the connection.
Default is $$syn_schema.
Staging File Format
Type of file format to use when you stage the files in Azure storage.
Select one of the following formats:
  • - Delimited Text
  • - Parquet
Field Delimiter
Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal.
Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters:
0x00, 0x0A, 0x1B, 0x0D, and 0x1F
Number of Concurrent Connections to Blob Storage
Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data.
Default is 4. Maximum is 10.
On Pre-Post SQL Error
Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. You can select any of the following options:
  • - Continue. The task continues regardless of errors.
  • - Stop. The task stops when errors occur while executing pre-SQL or post-SQL commands.
Batch Size
Minimum number of rows in a batch. Enter a number greater than 0.
Default is 2000000.
Reject Threshold
Number of errors within a batch that causes a batch to fail. Enter a positive integer.
If the number of errors is equal to or greater than the property value, the Secure Agent rejects the entire batch to the error file and marks the session failed.
Note: When you do not set the reject threshold, the mapping fails when an error is encountered.
Quote Character
Specifies the quote character to skip when you write data to Microsoft Azure Synapse SQL.
The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value.
Compression Format
Compresses the staging files in the Gzip format. Default is None.
Interim Directory
Optional. Path to the staging directory in the Secure Agent machine.
Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Synapse SQL. Ensure that the directory has sufficient space and you have write permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory when you use the Hosted Agent.

Lookup advanced properties

The following table describes the Microsoft Azure Synapse SQL lookup transformation advanced properties:
Property name
Description
Schema Name Override
Overrides the schema specified in the connection.
Default is $$syn_schema.
Azure Blob Container Name
The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters.
Default is $$blob_container_name.
ADLS FileSystem Name
Required if you select ADLS Gen2 storage in the connection properties.
The name of the file system in Microsoft Azure Data Lake Storage Gen2.
The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path.
Default is $$adls_folder_name.
Staging File Format
Type of file format to use when you stage the files in Azure storage.
Select one of the following formats:
  • - Delimited Text
  • - Parquet
Field Delimiter
Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal.
Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters:
0x00, 0x0A, 0x1B, 0x0D, and 0x1F
Number of concurrent connections to Blob Store
Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Configure Blob Part Size to partition a large-size blob into smaller parts.
Default is 4. Maximum is 10.
Blob Part Size
Partitions a blob into smaller parts, each of the specified part size. When reading a large-size blob, consider partitioning the blob into smaller parts and configuring concurrent connections to spawn the required number of threads to process data in parallel.
Default is 8 MB.
On Pre-Post SQL Error
Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. You can select any of the following options:
  • - Continue. The task continues regardless of errors.
  • - Stop. The task stops when errors occur while executing pre-SQL or post-SQL commands.
Quote Character
The Secure Agent skips the specified character when you read data from Microsoft Azure SQL Data Warehouse.
Default is 0x1f.
Interim Directory
Optional. Path to the staging directory in the Secure Agent machine.
Specify the staging directory where you want to stage the files when you read data from Microsoft Azure SQL Data Warehouse. Ensure that the directory has sufficient space and you have write permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory when you use the Hosted Agent.

PostgreSQL advanced properties

The following tables list the PostgreSQL properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the PostgreSQL connection properties:
Connection property
Description
Schema
The PostgreSQL schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.

Source advanced properties

The following table describes the PostgreSQL source advanced properties:
Property name
Description
Schema Name
Overrides the schema name of the source object.
Default is $$pg_schema.

Target advanced properties

The following table describes the PostgreSQL target advanced properties:
Property name
Description
Schema Name
Overrides the schema name of the target object.
Default is $$pg_schema.

Lookup advanced properties

The following table describes the PostgreSQL lookup transformation advanced properties:
Property name
Description
Schema Name
Overrides the schema name of the lookup object.
Default is $$pg_schema.

Snowflake advanced properties

The following tables list the Snowflake properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Snowflake connection properties:
Connection property
Description
Database
The Snowflake database name to be used when creating the object. The database name is similar to the database name specified in the DATABASEMAP.properties file.
Default is dbname_changeit.
Schema
The Snowflake schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.
Note: Specify the parameters in the following format: <database name>/<schema name>. You must specify both the database and schema names; see the validation sketch after this table. If you specify only the database name, source objects do not appear in the Select Source Object window. If you specify only the schema name, an Invalid Schema exception occurs while reading data.
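
The note above requires both names in a single <database name>/<schema name> value. The following Python sketch shows a minimal validation of that format; the error messages and the example value are illustrative:

def parse_snowflake_parameters(value):
    """Split a '<database name>/<schema name>' value and require both parts,
    mirroring the rule in the note above."""
    database, sep, schema = value.partition("/")
    if not sep or not schema:
        raise ValueError("Specify both the database and schema name, for example MYDB/MYSCHEMA")
    if not database:
        raise ValueError("Missing database name; source objects do not appear without it")
    return database, schema

print(parse_snowflake_parameters("SALES_DB/PUBLIC"))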

Source advanced properties

The following table describes the Snowflake source advanced properties:
Property name
Description
Database
Overrides the database specified in the connection.
Default is $$sf_database.
Schema
Overrides the schema specified in the connection.
Default is $$sf_schema.
Warehouse
Overrides the Snowflake warehouse name specified in the connection.
Role
Overrides the Snowflake role assigned to the user that you specified in the connection.

Target advanced properties

The following table describes the Snowflake target advanced properties:
Property Name
Description
Database
Overrides the database that you used to import the object.
Default is $$sf_database.
Schema
Overrides the schema that you used to import the object.
Default is $$sf_schema.
Warehouse
Overrides the Snowflake warehouse name specified in the connection. The warehouse name in the mapping overrides the warehouse name you specify in the connection.
Role
Overrides the Snowflake role assigned to the user specified in the connection.
Batch Row Size
The number of rows written to a single file in the agent location. When the number of rows written to the file reaches the value specified, the agent flushes the data queue and starts processing the write commands.
Number of local staging files
The number of files that represents a single batch of data.
The default number of files considered is 64.
After the agent uploads the specified number of local staging files to the Snowflake user stage, Snowflake unloads the data to the target table.
Additional Write Runtime Parameters
The additional runtime parameters that you can use when you write to Snowflake.
You can enter multiple write runtime parameters, separated by ampersand (&).
Rejected File Path
The filename and path of the file on the agent machine where you want to write the rejected records.
For example, \rejectedfiles\reject7
Forward Rejected Rows
Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the agent forwards rejected rows to the next transformation.
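
The Batch Row Size and Number of local staging files properties above describe two levels of batching: rows accumulate into a local staging file until the row limit is reached, and a batch is loaded to Snowflake once the configured number of files has been staged. The following Python sketch is a conceptual illustration; the flush_file and load_batch callables and the batch_row_size default are hypothetical placeholders:

def stage_and_load(rows, flush_file, load_batch, batch_row_size=10000,
                   local_staging_files=64):
    """Write rows to local staging files of batch_row_size rows each, and load a
    batch to Snowflake once local_staging_files files have been staged."""
    current_file, staged_files = [], []
    for row in rows:
        current_file.append(row)
        if len(current_file) >= batch_row_size:
            staged_files.append(flush_file(current_file))   # flush the data queue to a file
            current_file = []
        if len(staged_files) >= local_staging_files:
            load_batch(staged_files)                        # Snowflake loads the batch
            staged_files = []
    # load any remaining rows and files
    if current_file:
        staged_files.append(flush_file(current_file))
    if staged_files:
        load_batch(staged_files)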

Lookup advanced properties

The following table describes the Snowflake lookup transformation advanced properties:
Property Name
Description
Database
Overrides the database specified in the connection.
Default is $$sf_database.
Schema
Overrides the schema specified in the connection.
Default is $$sf_schema.
Warehouse
Overrides the Snowflake warehouse name specified in the connection.
Role
Overrides the Snowflake role assigned to the user that you specified in the connection.