
Property | Description |
|---|---|
Configuration Name | Name of the configuration. |
Description | (Optional) Description of the configuration. |
Domain | The domain where you want to use this configuration. The list of repositories within the selected domain appears in the Repository field. |
Repository Type | The type of repository where you want to use this configuration. |
Repository | The repository where you want to use this configuration. Important: The selected repository must be assessed to create the configuration. |
Connection property | Description |
|---|---|
Schema | The Amazon Redshift schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file. Default is schema_changeit. |
Property name | Description |
|---|---|
Schema Name | Overrides the default schema name. |
S3 Bucket Name | Amazon S3 bucket name for staging the data. You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options. |
Read Mode | Specifies the read mode to read data from the Amazon Redshift source. You can specify the Direct or Staging read mode. Default is Staging. |
Fetch Size | Determines the number of rows to read in one resultant set from Amazon Redshift. Applies only when you select the Direct read mode. Default is 10000. |
Enable Compression | Compresses the staging files into the Amazon S3 staging directory. The task performance improves when the Secure Agent compresses the staging files. Default is enabled. |
Staging Directory Location | Location of the local staging directory. When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment. For example, C:\Temp. Ensure that you have the write permissions to the directory. |
Unload Options | Unload command options. Enter the options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3, such as DELIMITER, ESCAPE, NULL, PARALLEL, AWS_IAM_ROLE, and REGION. For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1 For the AWS_IAM_ROLE option, provide the Amazon Resource Name (ARN) of an Amazon Redshift role. Also, specify a directory on the machine that hosts the Secure Agent. An illustrative UNLOAD statement appears after this table. |
Temporary Credential Duration | The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds. Default is 900 seconds. If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property. |
Treat NULL Value as NULL | Retains the null values when you read data from Amazon Redshift. |
Encryption Type | Encrypts the data in the Amazon S3 staging directory. You can specify the following encryption types:
Default is None. |
Download S3 Files In Multiple Parts | Downloads large Amazon S3 objects in multiple parts. When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel. Default is 5 MB. |
Multipart Download Threshold Size | The maximum threshold size to download an Amazon S3 object in multiple parts. Default is 5 MB. |
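To show how the Unload command options in the table above map onto an actual statement, the following is a minimal sketch of a Redshift UNLOAD that stages data to Amazon S3. The schema, table, bucket, and IAM role ARN are placeholders, and the exact statement that the Secure Agent generates may differ.

```sql
-- Illustrative only: placeholder names throughout.
UNLOAD ('SELECT * FROM my_schema.my_table')                   -- source query
TO 's3://my-staging-bucket/staging_prefix_'                   -- S3 Bucket Name
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'    -- AWS_IAM_ROLE option
DELIMITER AS '|'                                              -- DELIMITER option
NULL AS 'text'                                                -- NULL option
PARALLEL ON                                                   -- PARALLEL option
REGION 'ap-south-1';                                          -- REGION option
```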
Property | Description |
|---|---|
S3 Bucket Name | Amazon S3 bucket name for writing the files to Amazon Redshift target. You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Copy Options property. |
Enable Compression | Compresses the staging files before writing the files to Amazon Redshift. The task performance improves when the Secure Agent compresses the staged files. Default is enabled. |
Staging Directory Location | Location of the local staging directory. When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment. For example, C:\Temp. Ensure that you have the write permissions on the directory. |
Batch Size | Minimum number of rows in a batch. Enter a number greater than 0. Default is 2000000. |
Max Errors per Upload Batch for INSERT | Number of error rows that causes an upload insert batch to fail. Enter a positive integer. Default is 1. If the number of errors is equal to or greater than the property value, the Secure Agent writes the entire batch to the error file. |
Require Null Value For Char and Varchar | Replaces the string value with NULL when you write data to Amazon Redshift columns of Char and Varchar data types. Default is an empty string. Note: When you run a mapping to write null values to a table that contains a single column of the Int, Bigint, numeric, real, or double data type, the mapping fails. You must provide a value other than the default value in the Require Null Value For Char And Varchar property. |
WaitTime In Seconds For S3 File Consistency | Number of seconds to wait for the Secure Agent to make the staged files consistent with the list of files available on Amazon S3. Default is 0. |
Copy Options | Copy command options. Enter the options to the Copy command to write data from Amazon S3 to the Amazon Redshift target when the default delimiter comma (,) or double-quote (") is used in the data. Provide the Amazon Redshift Role Amazon Resource Name (ARN). You can add options such as DELIMITER, ACCEPTINVCHARS, QUOTE, COMPUPDATE, AWS_IAM_ROLE, and REGION. For example: DELIMITER = \036;ACCEPTINVCHARS = #;QUOTE = \037;COMPUPDATE = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1 Specify a directory on the machine that hosts the Secure Agent. Note: If you do not add the options to the Copy command manually, the Secure Agent uses the default values. An illustrative COPY statement appears after this table. |
S3 Server Side Encryption | Indicates that Amazon S3 encrypts data during upload. Provide a customer master key ID in the connection property to enable this property. Default is not selected. |
S3 Client Side Encryption | Indicates that the Secure Agent encrypts data using a private key. Provide a master symmetric key ID in the connection property to enable this property. If you enable both server-side and client-side encryptions, the Secure Agent ignores the server-side encryption. |
Analyze Target Table | Runs an ANALYZE command on the target table. The query planner on Amazon Redshift updates the statistical metadata to build and choose optimal plans to improve the efficiency of queries. |
Vacuum Target Table | Recovers disk space and sorts rows in a specified table or all tables in the database. You can select the following recovery options:
Default is None. |
Prefix to retain staging files on S3 | Retains staging files on Amazon S3. Provide both a directory prefix and a file prefix separated by a slash (/) or only a file prefix to retain staging files on Amazon S3. For example, backup_dir/backup_file or backup_file. |
Success File Directory | Directory for the Amazon Redshift success file. Specify a directory on the machine that hosts the Secure Agent. |
Error File Directory | Directory for the Amazon Redshift error file. Specify a directory on the machine that hosts the Secure Agent. |
Transfer Manager Thread Pool Size | Number of threads to write data in parallel. Default is 10. |
Preserve record order on write | Retains the order of the records when you read data from a CDC source and write data to an Amazon Redshift target. Use this property when you create a mapping to capture the changed records from a CDC source. This property enables you to avoid inconsistencies between the CDC source and target. |
Minimum Upload Part Size | Minimum size of the Amazon Redshift object to upload an object. Default is 5 MB. |
Number of files per batch | Calculates the number of staging files per batch. If you do not provide the number of files, Amazon Redshift V2 Connector calculates the number of staging files. When you connect to Amazon Redshift serverless and do not provide the number of files, the default is 20 files per batch. |
Schema Name | Overrides the default schema name. |
Recovery Schema Name | Schema that contains recovery information stored in the infa_recovery_table table on the target system to resume the extraction of the changed data from the last checkpoint. |
Temporary Credential Duration | The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds. Default is 900 seconds. If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property. |
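The Copy command options in the table above correspond to clauses of a Redshift COPY statement, and the Analyze Target Table and Vacuum Target Table options run ANALYZE and VACUUM after the load. The following is a minimal sketch with placeholder schema, table, bucket, and role ARN values; the statements that the Secure Agent actually generates may differ.

```sql
-- Illustrative only: placeholder names throughout.
COPY my_schema.my_table
FROM 's3://my-staging-bucket/staging_prefix_'                 -- S3 Bucket Name
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'    -- AWS_IAM_ROLE option
DELIMITER AS '|'                                              -- DELIMITER option
ACCEPTINVCHARS AS '#'                                         -- ACCEPTINVCHARS option
COMPUPDATE ON                                                 -- COMPUPDATE option
REGION 'ap-south-1';                                          -- REGION option

-- Post-load maintenance when Analyze Target Table and Vacuum Target Table are enabled:
ANALYZE my_schema.my_table;
VACUUM FULL my_schema.my_table;
```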
Property | Description |
|---|---|
Read Mode | Specifies the read mode to read data from the Amazon Redshift source. You can specify the Direct or Staging read mode. Default is Staging. |
Fetch Size | Determines the number of rows to read in one resultant set from Amazon Redshift. Applies only when you select the Direct read mode. Default is 10000. |
S3 Bucket Name | Amazon S3 bucket name for staging the data. You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options. |
Enable Compression | Compresses the staging files into the Amazon S3 staging directory. The task performance improves when the Secure Agent compresses the staging files. Default is selected. |
Staging Directory Location | Location of the local staging directory. When you run a task in the Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment. For example, C:\Temp. Ensure that you have the write permissions on the directory. |
Unload Options | Unload command options. Add options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3, such as DELIMITER, ESCAPE, NULL, PARALLEL, AWS_IAM_ROLE, and REGION. Provide an Amazon Redshift Role Amazon Resource Name (ARN). For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1 Specify a directory on the machine that hosts the Secure Agent. Note: If you do not add the options to the Unload command manually, the Secure Agent uses the default values. |
Treat NULL Value as NULL | Retains the null values when you read data from Amazon Redshift. |
Encryption Type | Encrypts the data in the Amazon S3 staging directory. You can select the following encryption types:
Default is None. |
Download S3 Files in Multiple Parts | Downloads large Amazon S3 objects in multiple parts. When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel. Default is 5 MB. |
Multipart Download Threshold Size | The maximum threshold size to download an Amazon S3 object in multiple parts. Default is 5 MB. |
Schema Name | Overrides the default schema name. Note: You cannot configure a custom query when you use the schema name. |
Temporary Credential Duration | The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds. Default is 900 seconds. If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property. |
Connection property | Description |
|---|---|
Schema | The Databricks Delta schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file. Default is schema_changeit. |
Property name | Description |
|---|---|
Schema Name | Overrides the schema specified in the connection. Default is $$dbd_schema. |
Staging Location | Relative directory path to store the staging files. If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket. If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name. Default is $$dbd_staging_loc. |
Database Name | Overrides the database name provided in the connection and the database name provided during metadata import. Note: To read from multiple objects, ensure that you have specified the database name in the connection properties. |
Job Timeout | Maximum time in seconds that is taken by the Spark job to complete processing. If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails. If the job timeout is not specified, the mapping shows success or failure based on the job completion. |
Job Status Poll Interval | Poll interval in seconds at which the Secure Agent checks the status of the job completion. Default is 30 seconds. |
DB REST API Timeout | The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. Default is 10 minutes. |
DB REST API Retry Interval | The time interval in seconds at which the Secure Agent must retry the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. This value does not apply to the Job status REST API. Use the Job Status Poll Interval value for the Job status REST API. Default is 30 seconds. |
Property Name | Description |
|---|---|
Schema Name | Overrides the schema specified in the connection. Default is $$dbd_schema. |
Staging Location | Relative directory path to store the staging files. If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket. If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name. Default is $$dbd_staging_loc. |
Target Database Name | Overrides the database name provided in the connection and the database selected in the metadata browser for existing targets. |
Write Disposition | Overwrites or adds data to the existing data in a table. You can select from the following options:
|
DTM Staging File Size | The size of the flat file that Data Integration creates locally in a temporary folder to stage the data before writing to Databricks. Default is 50 MB. |
Job Timeout | Maximum time in seconds that is taken by the Spark job to complete processing. If the job is not completed within the time specified, the job terminates and the mapping fails. If the job timeout is not specified, the mapping shows success or failure based on the job completion. |
Job Status Poll Interval | Poll interval in seconds at which the Secure Agent checks the status of the job completion. Default is 30 seconds. |
DB REST API Timeout | The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. Default is 10 minutes. |
DB REST API Retry Interval | The time interval in seconds at which the Secure Agent must retry the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. This value does not apply to the Job status REST API. Use the Job Status Poll Interval value for the Job status REST API. Default is 30 seconds. |
Property Name | Description |
|---|---|
Schema Name | Overrides the schema specified in the connection. Default is $$dbd_schema. |
Staging Location | Relative directory path to store the staging files. If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket. If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name. Default is $$dbd_staging_loc. |
Database Name | Overrides the database specified in the connection. |
Job Timeout | Maximum time in seconds that is taken by the Spark job to complete processing. If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails. If the job timeout is not specified, the mapping shows success or failure based on the job completion. |
Job Status Poll Interval | Poll interval in seconds at which the Secure Agent checks the status of the job completion. Default is 30 seconds. |
DB REST API Timeout | The maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. Default is 10 minutes. |
DB REST API Retry Interval | The time interval in seconds at which the Secure Agent must retry the REST API call when there is a network connection error or the REST endpoint returns a 5xx HTTP error code. This value does not apply to the Job status REST API. Use the Job Status Poll Interval value for the Job status REST API. Default is 30 seconds. |
Connection property | Description |
|---|---|
Source Dataset ID | Name of the dataset that contains the source table and target table that you want to connect to. Default is datasetID_changeit. Note: Google BigQuery supports the datasets that reside only in the US region. |
Property name | Description |
|---|---|
Source Dataset ID | Overrides the Google BigQuery dataset name that you specified in the connection. Default is $$sf_database. |
Source Staging Dataset | Overrides the Google BigQuery staging dataset name that you specified in the connection and the Source Dataset ID source advanced property. |
Number of Rows to Read | Specifies the number of rows to read from the Google BigQuery source table. |
Job Poll Interval In Seconds | The number of seconds after which Google BigQuery Connector polls the status of the read job operation. Default is 10. |
Read Mode | Specifies the read mode to read data from the Google BigQuery source. You can select the Direct or Staging read mode. When you use hybrid or complex connection mode, you cannot use direct mode to read data from the Google BigQuery source. Default is Staging mode. |
Use EXPORT DATA Statement to stage | Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage. If the query contains an ORDER BY clause, the specified order is maintained when you export the data. This property applies to staging mode. Default is false. An illustrative EXPORT DATA statement appears after this table. |
Number of Threads for Downloading Staging Files | Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download. This property applies to staging mode. Default is 1. |
Local Stage File Directory | Specifies the directory on your local machine where Google BigQuery Connector stores the Google BigQuery source data temporarily before it reads the data. This property applies to staging mode. Default is $PMTempDir. |
Staging File Name | Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage. This property applies to staging mode. |
Persist Extract Staging File After Download | Indicates whether Google BigQuery V2 Connector must persist the staging file after it reads data from the staging file. By default, Google BigQuery V2 Connector deletes the staging file. |
Persist Destination Table | Indicates whether Google BigQuery V2 Connector must persist the query results table after it reads data from the query results table. By default, Google BigQuery V2 Connector deletes the query results table. |
Data format of the staging file | Specifies the data format of the staging file. You can select the Avro, JSON, CSV, or Parquet data format. In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ. Default is JSON. |
Enable Staging File Compression | Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery Connector reads data from the staging file. You can enable staging file compression to reduce cost and transfer time. This property applies to staging mode. Default is false. |
Retry Options | Comma-separated list to specify the following retry options:
Default is $$gbq_retry_options. |
Use Legacy SQL for SQL Override | Indicates that the SQL Override query is specified in legacy SQL. Use the following format to specify a legacy SQL query for the SQL Override Query property: SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName] Clear this option to define a standard SQL override query. Use the following format to specify a standard SQL query for the SQL Override Query property: SELECT * FROM `projectID.datasetID.tableName` Default is false. |
pre SQL Configuration | Specify a pre SQL configuration. For example, DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE, FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False |
post SQL Configuration | Specify a post SQL configuration. For example, DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE, FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False |
Label | You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer. |
Billing Project ID | The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and extract jobs. If you omit the project ID here, the Secure Agent runs query and extract jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection. |
Number of Spark Partitions | Specifies the maximum number of partitions that the Spark engine splits the data into. Default is 1. |
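When the Use EXPORT DATA Statement to stage option is enabled, the source data is exported from Google BigQuery to Google Cloud Storage with an EXPORT DATA statement. The following is a minimal sketch of such a statement with placeholder bucket, project, dataset, table, and column names; the statement that the connector actually issues may differ.

```sql
-- Illustrative only: placeholder bucket, project, dataset, table, and column names.
EXPORT DATA OPTIONS (
  uri = 'gs://my-staging-bucket/staging/file_*.csv',  -- staging files in Google Cloud Storage
  format = 'CSV',                                     -- data format of the staging file
  overwrite = true
) AS
SELECT *
FROM `my-project.my_dataset.my_table`
ORDER BY id;                                          -- ORDER BY order is preserved in the export
```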
Property name | Description |
|---|---|
Target Dataset ID | Overrides the Google BigQuery dataset name that you specified in the connection. Default is $$gbq_datasetID. |
Write Mode | Specifies the mode to write data to the Google BigQuery target. You can select the Bulk, Streaming, or CDC mode. Default is Bulk mode. |
Data format of the staging file | Specifies the data format of the staging file. You can select the Avro, JSON, CSV, or Parquet data format. In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ. Only JSON format is applicable for mappings in advanced mode. This property applies to bulk and CDC mode. Avro and Parquet formats are not applicable when you perform a data driven operation. Default is JSON. |
Enable Staging File Compression | Select this option to compress the size of the staging file before Google BigQuery writes the data to the Google Cloud Storage and decompress the staging file before it loads the data to the Google BigQuery target. You can enable staging file compression to reduce cost and transfer time. Default is false. |
Local Stage File Directory | Specifies the directory on your local machine where Google BigQuery V2 Connector stores the files temporarily before writing the data to the staging file in Google Cloud Storage. This property applies to bulk mode. Default is $PMTempDir. |
Use Default Column Values | Applicable when the selected data format for the staging file is CSV and the mapping contains unconnected ports. Includes the default column values for the unconnected ports from the staging file to create the target. This is applicable when you have defined the default constraint value in the Google BigQuery source column. When you do not enable this option, the agent creates a target only with the connected ports. The agent populates null or empty strings for unconnected ports. Default is true. |
Enable Merge | Implements the Merge query to perform an update, upsert, delete, or data driven operation on a Google BigQuery target table. If you select the Enable Data Driven property, you must select this option. Default is disabled. An illustrative MERGE query appears after this table. |
Target Staging Dataset | Optional. Overrides the Google BigQuery staging dataset name that you specified in the connection and the Target Dataset ID target advanced property. |
Create Disposition | Specifies whether Google BigQuery V2 Connector must create the target table if it does not exist. You can select one of the following values:
Create disposition is applicable only when you perform an insert operation on a Google BigQuery target. |
Write Disposition | Specifies how Google BigQuery V2 Connector must write data in bulk mode if the target table already exists. You can select one of the following values:
Write disposition is applicable for bulk mode. Write disposition is applicable only when you perform an insert operation on a Google BigQuery target. |
Streaming Template Table Suffix | Specify the suffix to add to the individual target tables that Google BigQuery V2 Connector creates based on the template target table. This property applies to streaming mode. If you select the Enable Merge option, Google BigQuery V2 Connector ignores this property. Streaming mode is not applicable when you perform a data driven operation. |
Rows per Streaming Request | Specifies the number of rows that Google BigQuery V2 Connector streams to the BigQuery target for each request. Default is 500 rows. The maximum row size that Google BigQuery V2 Connector can stream to the Google BigQuery target for each request is 10 MB. This property applies to streaming mode. Streaming mode is not applicable when you perform a data driven operation. |
Staging File Name | Name of the staging file that Google BigQuery V2 Connector creates in the Google Cloud Storage before it loads the data to the Google BigQuery target. This property applies to bulk mode. |
Persist Staging File After Loading | Indicates whether Google BigQuery V2 Connector must persist the staging file in the Google Cloud Storage after it writes the data to the Google BigQuery target. You can persist the staging file if you want to archive the data for future reference. By default, Google BigQuery V2 Connector deletes the staging file in Google Cloud Storage. This property applies to bulk mode. |
Job Poll Interval In Seconds | The number of seconds after which Google BigQuery V2 Connector polls the status of the write job operation. Default is 10. |
Number of Threads for Uploading Staging File | The number of files that Google BigQuery V2 Connector must create to upload the staging file in bulk mode. |
Allow Quoted Newlines | Indicates whether Google BigQuery V2 Connector must allow the quoted data sections with newline character in a .csv file. |
Field Delimiter | Indicates whether Google BigQuery V2 Connector must allow field separators for the fields in a .csv file. |
Allow Jagged Rows | Indicates whether Google BigQuery V2 Connector must accept the rows without trailing columns in a .csv file. |
Suppress post SQL on Error | Indicates whether the Secure Agent must abort the post-SQL query execution in case the task fails to write data to the Google BigQuery target table due to errors. Default is disabled. |
pre SQL Configuration | Specify a pre-SQL configuration. For example, DestinationTable:PRESQL_TGT2, DestinationDataset:EMPLOYEE, FlattenResults:False, WriteDisposition:WRITE_TRUNCATE, UseLegacySql:False |
post SQL Configuration | Specify a post-SQL configuration. For example, DestinationTable:POSTSQL_SRC, DestinationDataset:EMPLOYEE, FlattenResults:True, UseLegacySQL:False |
Quote Char | Specifies the quote character to skip when you write data to Google BigQuery. When you write data to Google BigQuery and the source table contains the specified quote character, the task fails. Change the quote character value to a value that does not exist in the source table. Default is double quotes. |
Allow Duplicate Inserts | Indicates that the Secure Agent can insert duplicate rows into the Google BigQuery target. Applicable only when you perform a data driven operation and DD_INSERT is specified in the data driven condition. Default is not selected. |
Disable Duplicate Update Rows | Determines whether the Secure Agent must process only one of the incoming rows and ignore the rest when multiple incoming rows attempt to update the same target row. Select this option to configure the mapping to process only one of the incoming rows and ignore the rest of the incoming rows. Default is disabled. |
Label | You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer. |
Billing Project ID | The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and load jobs. If you omit the project ID here, the Secure Agent runs query and load jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection. |
Forward Rejected Rows | Applicable only when you configure DD_REJECT constant in the data driven condition to reject all the rows. Otherwise, this property is not applicable for Google BigQuery V2 Connector. |
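When the Enable Merge option is selected, update, upsert, delete, and data driven operations are implemented as a MERGE query against the target table. The following is a minimal sketch of such a query with placeholder project, dataset, table, and column names; the query that the connector generates depends on the configured operation and key fields.

```sql
-- Illustrative only: placeholder names; the generated MERGE depends on the operation and keys.
MERGE `my-project.my_dataset.target_table` T
USING `my-project.my_dataset.staging_table` S
ON T.id = S.id
WHEN MATCHED THEN
  UPDATE SET T.name = S.name, T.updated_at = S.updated_at
WHEN NOT MATCHED THEN
  INSERT (id, name, updated_at)
  VALUES (S.id, S.name, S.updated_at);
```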
Property name | Description |
|---|---|
Source Dataset ID | Overrides the Google BigQuery dataset name that you specified in the connection. Default is $$gbq_datasetID. |
Source Staging Dataset | Overrides the Google BigQuery staging dataset name that you specified in the Lookup transformation. |
Job Poll Interval In Seconds | The number of seconds after which Google BigQuery V2 Connector polls the status of the read job operation. Default is 10. |
Read Mode | Specifies the read mode to read data from the Google BigQuery source. You can select the Direct or Staging read mode. When you use hybrid or complex connection mode, you cannot use direct mode to read data from the Google BigQuery source. Default is Staging mode. |
Use EXPORT DATA Statement to stage | Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage. If the query contains an ORDER BY clause, the specified order is maintained when you export the data. This property applies to staging mode. Default is true. |
Number of Threads for Downloading Staging Files | Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download. This property applies to staging mode. Default is 1. |
Local Stage File Directory | Specifies the directory on your local machine where Google BigQuery V2 Connector stores the Google BigQuery source data temporarily before it reads the data. This property applies to staging mode. Default is $PMTempDir. |
Data format of the staging file | Specifies the data format of the staging file. You can select the Avro, JSON, CSV, or Parquet data format. In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ. Only JSON format is applicable for mappings in advanced mode. This property applies to bulk and CDC mode. Avro and Parquet formats are not applicable when you perform a data driven operation. Default is JSON. |
Enable Staging File Compression | Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery V2 Connector reads data from the staging file. You can enable staging file compression to reduce cost and transfer time. This property applies to staging mode. Default is false. |
Retry Options | Comma-separated list to specify the following retry options:
Default is $$gbq_retry_options. |
Use Legacy SQL for SQL Override | Indicates that the SQL Override query is specified in legacy SQL. Use the following format to specify a legacy SQL query for the SQL Override Query property: SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName] Clear this option to define a standard SQL override query. Use the following format to specify a standard SQL query for the SQL Override Query property: SELECT * FROM `projectID.datasetID.tableName` |
Number of Rows to Read | Specifies the number of rows to read from the Google BigQuery source table. |
Staging File Name | Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage. This property applies to staging mode. |
Persist Extract Staging File After Download | Indicates whether Google BigQuery V2 Connector must persist the staging file in Google Cloud Storage after it reads data from the staging file. You can persist the staging file if you want to archive the data for future reference. By default, Google BigQuery V2 Connector deletes the staging file in Google Cloud Storage. This property applies to staging mode. |
Persist Destination Table | Indicates whether Google BigQuery V2 Connector must persist the query results table after it reads data from the query results table. By default, Google BigQuery V2 Connector deletes the query results table. |
pre SQL Configuration | Specify a pre SQL configuration. For example, DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE, FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False |
post SQL Configuration | Specify a post SQL configuration. For example, DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE, FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False |
Label | You can assign a label for the transformation to organize and filter the associated jobs in the Google Cloud Platform Log Explorer. |
Billing Project ID | The project ID for the Google Cloud project that is linked to an active Google Cloud Billing account where the Secure Agent runs query and extract jobs. If you omit the project ID here, the Secure Agent runs query and extract jobs in the Google Cloud project corresponding to the Project ID value specified in the Google BigQuery V2 connection. |
Number of Spark Partitions | Specifies the maximum number of partitions that the Spark engine splits the data into. Default is 1. |
Connection property | Description |
|---|---|
Schema | The Microsoft Azure Synapse SQL schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file. Default is schema_changeit. |
Property | Description |
|---|---|
Azure Blob Container Name | The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters. Default is $$blob_container_name. |
ADLS FileSystem Name | Required if you select ADLS Gen2 storage in the connection properties. The name of the file system in Microsoft Azure Data Lake Storage Gen2. The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path. Default is $$adls_folder_name. |
Schema Name Override | Overrides the schema specified in the connection. Default is $$syn_schema. |
Staging File Format | Type of file format to use when you stage the files in Azure storage. Select one of the following formats:
|
Field Delimiter | Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal. Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters: 0x00, 0x0A, 0x1B, 0x0D, and 0x1F. |
Number of Concurrent Connections to Blob Store | Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Configure Blob Part Size to partition a large-size blob into smaller parts. Default is 4. Maximum is 10. |
Blob Part Size | Partitions a blob into smaller parts each of specified part size. When reading a large-size blob, consider partitioning the blob into smaller parts and configure concurrent connections to spawn required number of threads to process data in parallel. Default is 8 MB. |
On Pre-Post SQL Error | Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. Select one of the following options:
|
Quote Character | Specifies the quote character to skip when you read data from Microsoft Azure Synapse SQL. The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value. Default is 0x1f. |
Interim Directory | Optional. Path to the staging directory in the Secure Agent machine. Specify the staging directory where you want to stage the files when you read data from Microsoft Azure SQL Data Warehouse. Ensure that the directory has sufficient space and you have read permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory when you use the Hosted Agent. |
Advanced Property | Description |
|---|---|
Azure Blob Container Name | The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters. Default is $$blob_container_name. |
ADLS FileSystem Name | Required if you select ADLS Gen2 storage in the connection properties. The name of the file system in Microsoft Azure Data Lake Storage Gen2. The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path. Default is $$adls_folder_name. |
Copy Method | The method to load data from the staging location to Microsoft Azure Synapse SQL. You can select the Polybase or Copy command option. Default is Polybase. |
Copy Command Options | Options for the copy command in key=value format. Specify each option on a new line. An illustrative copy statement appears after this table. |
Schema Name Override | Overrides the schema specified in the connection. Default is $$syn_schema. |
Staging File Format | Type of file format to use when you stage the files in Azure storage. Select one of the following formats:
|
Field Delimiter | Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal. Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters: 0x00, 0x0A, 0x1B, 0x0D, and 0x1F. |
Number of Concurrent Connections to Blob Storage | Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Default is 4. Maximum is 10. |
On Pre-Post SQL Error | Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. You can select any of the following options:
|
Batch Size | Minimum number of rows in a batch. Enter a number greater than 0. Default is 2000000. |
Reject Threshold | Number of errors within a batch that causes a batch to fail. Enter a positive integer. If the number of errors is equal to or greater than the property value, the Secure Agent rejects the entire batch to the error file and marks the session failed. Note: When you do not set the reject threshold, the mapping fails when an error is encountered. |
Quote Character | Specifies the quote character to skip when you write data to Microsoft Azure Synapse SQL. The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value. |
Compression Format | Compresses the staging files in the .Gzip format. Default is None. |
Interim Directory | Optional. Path to the staging directory in the Secure Agent machine. Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Synapse SQL. Ensure that the directory has sufficient space and you have write permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory when you use the Hosted Agent. |
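When the Copy Method is the copy command, the Copy Command Options are entered in key=value format, one option per line. As a sketch, assuming those options map onto the WITH clause of a Microsoft Azure Synapse SQL COPY INTO statement, a load might look like the following. The target table, storage path, and the MAXERRORS and COMPRESSION options are placeholders, and the statement that the Secure Agent actually generates may differ.

```sql
-- Illustrative only: placeholder target table and staging path.
-- Copy Command Options such as MAXERRORS=10 and COMPRESSION=GZIP map to the WITH clause.
COPY INTO dbo.my_target
FROM 'https://mystorageaccount.blob.core.windows.net/my-container/staging/'
WITH (
    FILE_TYPE = 'CSV',
    FIELDTERMINATOR = '0x1e',   -- matches the default field delimiter
    MAXERRORS = 10,
    COMPRESSION = 'Gzip'
);
```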
Property name | Description |
|---|---|
Schema Name Override | Overrides the schema specified in the connection. Default is $$syn_schema. |
Azure Blob Container Name | The name of the container in Microsoft Azure Blob Storage. The container name must not contain special characters. Default is $$blob_container_name. |
ADLS FileSystem Name | Required if you select ADLS Gen2 storage in the connection properties. The name of the file system in Microsoft Azure Data Lake Storage Gen2. The file system name must not contain special characters. You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path. Default is $$adls_folder_name. |
Staging File Format | Type of file format to use when you stage the files in Azure storage. Select one of the following formats:
|
Field Delimiter | Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal. Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters: 0x00, 0x0A, 0x1B, 0x0D, and 0x1F. |
Number of concurrent connections to Blob Store | Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Configure Blob Part Size to partition a large-size blob into smaller parts. Default is 4. Maximum is 10. |
Blob Part Size | Partitions a blob into smaller parts each of specified part size. When reading a large-size blob, consider partitioning the blob into smaller parts and configure concurrent connections to spawn required number of threads to process data in parallel. Default is 8 MB. |
On Pre-Post SQL Error | Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. You can select any of the following options:
|
Quote Character | The Secure Agent skips the specified character when you read data from Microsoft Azure SQL Data Warehouse. Default is 0x1f. |
Interim Directory | Optional. Path to the staging directory in the Secure Agent machine. Specify the staging directory where you want to stage the files when you read data from Microsoft Azure SQL Data Warehouse. Ensure that the directory has sufficient space and you have write permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory when you use the Hosted Agent. |
Connection property | Description |
|---|---|
Schema | The PostgreSQL schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file. Default is schema_changeit. |
Property name | Description |
|---|---|
Schema Name | Overrides the schema name of the source object. Default is $$pg_schema. |
Property name | Description |
|---|---|
Schema Name | Overrides the schema name of the target object. Default is $$pg_schema. |
Property name | Description |
|---|---|
Schema Name | Overrides the schema name of the lookup object. Default is $$pg_schema. |
Connection property | Description |
|---|---|
Database | The Snowflake database name to be used when creating the object. The database name is similar to the database name specified in the DATABASEMAP.properties file. Default is dbname_changeit. |
Schema | The Snowflake schema name to be used when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file. Default is schema_changeit. |
Property name | Description |
|---|---|
Database | Overrides the database specified in the connection. Default is $$sf_database. |
Schema | Overrides the schema specified in the connection. Default is $$sf_schema. |
Warehouse | Overrides the Snowflake warehouse name specified in the connection. |
Role | Overrides the Snowflake role assigned to the user that you specified in the connection. |
Property Name | Description |
|---|---|
Database | Overrides the database that you used to import the object. Default is $$sf_database. |
Schema | Overrides the schema that you used to import the object. Default is $$sf_schema. |
Warehouse | Overrides the Snowflake warehouse name specified in the connection. The warehouse name in the mapping overrides the warehouse name you specify in the connection. |
Role | Overrides the Snowflake role assigned to the user specified in the connection. |
Batch Row Size | The number of rows written to a single file in the agent location. When the number of rows written to the file reaches the value specified, the agent flushes the data queue and starts processing the write commands. |
Number of local staging files | The number of files that represents a single batch of data. The default number of files considered is 64. After the agent uploads the specified number of local staging files to the Snowflake user stage, Snowflake unloads the data to the target table. |
Additional Write Runtime Parameters | The additional runtime parameters that you can use when you write to Snowflake. You can enter multiple write runtime parameters, separated by ampersand (&). |
Rejected File Path | The filename and path of the file on the agent machine where you want to write the rejected records. For example, \rejectedfiles\reject7 |
Forward Rejected Rows | Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the agent forwards rejected rows to the next transformation. |
Property Name | Description |
|---|---|
Database | Overrides the database specified in the connection. Default is $$sf_database. |
Schema | Overrides the schema specified in the connection. Default is $$sf_schema. |
Warehouse | Overrides the Snowflake warehouse name specified in the connection. |
Role | Overrides the Snowflake role assigned to the user that you specified in the connection. |