Target objects and operations

In a Target transformation, you can use a single object or parameter as the target type.
When you choose the target type, you can select the operation to insert, update, upsert, or delete data in a Snowflake target. You can also use the data driven operation to define expressions that flag rows for an insert, update, delete, or reject operation.
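For example, a data driven condition might use an expression such as the following, where STATUS is a hypothetical source field, to update changed rows and insert all other rows:
IIF(STATUS = 'CHANGED', DD_UPDATE, DD_INSERT)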

Target operation restrictions

Some rules apply when you configure a target operation.

Parameterized target

When you parameterize the target object and connection and enable the Allow parameter to be overridden at run time option in a transformation, you cannot override the object name with a fully qualified name such as db.schema.tablename. Instead, pass the db=<dbname>&schema=<schemaname> values in the Additional JDBC URL Parameters field in the Snowflake Data Cloud connection.
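For example, if the target table resides in a hypothetical database MYDB and schema MYSCHEMA, you might set the following value in the Additional JDBC URL Parameters field:
db=MYDB&schema=MYSCHEMA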

Update columns

To update or delete data in a Snowflake target, you must specify a temporary key column. If the Snowflake target does not include a primary key column, click Add to add a temporary key. You can select multiple columns.
If the records from the source tables contain duplicate primary keys, perform one of the following tasks in mappings to update or delete records in Snowflake:

Specify a target

You can use an existing target or create a new target to write data to Snowflake. If you choose to create a new target, the agent creates the target when it runs the task.
Ensure that the path for the database and schema that you specify for the target object is in uppercase letters. Specify the Snowflake database name and schema in the following format: <database name>/<schema name>. If you do not enter the path, the Secure Agent uses the schema and database name specified in the Additional JDBC URL Parameters field in the Snowflake Data Cloud connection properties.
Specify the TableType as table. The TableType property is optional.
To create a new target object at runtime, specify the object name and the path where you want to create the table in Snowflake.
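For example, to create a table named SALES_TARGET in a hypothetical database MYDB and schema PUBLIC, you might enter SALES_TARGET as the object name and specify the following path:
MYDB/PUBLIC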

Restrictions for a Snowflake target

The following rules apply when you configure the target:

Override the update operation

You can specify an update override to override the update query that the Secure Agent generates for the update operation.
When you configure an update override, the Secure Agent uses the query that you specify, stages the data in files, and then loads that data into a temporary table by using Snowflake's loader COPY command. The data from the temporary table is then loaded to the Snowflake target table. The syntax that you specify for the update query must be supported by Snowflake.
Specify the update query in the following format:
UPDATE <Target table name>
SET <Target table name>.<Column1> = :TU.<Column1>,
<Target table name>.<Column2> = :TU.<Column2>,
… <Target table name>.<ColumnN> = :TU.<ColumnN>
FROM :TU
WHERE <Target table name>.<Update Column1> = :TU.<Update Column1>
AND <Target table name>.<Update Column2> = :TU.<Update Column2>
AND … <Target table name>.<Update ColumnN> = :TU.<Update ColumnN>
where :TU. represents the incoming data source for the target port.
The Secure Agent replaces :TU. with a temporary table name while running the mapping and does not validate the update query.
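For example, for a hypothetical target table EMPLOYEE that is matched on the EMP_ID column, the update override might look like the following query:
UPDATE EMPLOYEE
SET EMPLOYEE.NAME = :TU.NAME,
EMPLOYEE.SALARY = :TU.SALARY
FROM :TU
WHERE EMPLOYEE.EMP_ID = :TU.EMP_ID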
When you configure an update override in a mapping to write to Snowflake, consider the following rules:

Optimize the .csv file size

When you create a mapping to write to Snowflake, you can specify the size of the local staging .csv file in bytes. Specify the local staging file size property, csvFileSize, in the Additional Write Runtime Parameters field in the advanced Snowflake target properties. The default file size is 50 MB.
For example, if the intended file size is 50 MB, calculate the csvFileSize value in bytes as 50*1024*1024 and specify 52428800 as the csvFileSize. For optimal performance, keep the number and the size of the .csv files balanced against each other.
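For example, to reduce the staging file size to approximately 10 MB, calculate 10*1024*1024 and set the following hypothetical value in the Additional Write Runtime Parameters field:
csvFileSize=10485760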

Configuring the batch size and the number of local staging files

To achieve optimal performance of a mapping that writes data to Snowflake, allow the mapping to use the default values for both the batch size and the number of local staging files based on the amount of data it processes. Unless your use case requires specific values, it is recommended that you run the mapping with the default values.
The mapping gives precedence to whichever of these two thresholds is reached first. For example, if the number of local staging files threshold is reached before the specified batch row size, the number of local staging files takes precedence.
For example, see the following table which displays the number of insert commands issued for various configurations of the specified batch size and the number of local staging files in a write operation to Snowflake:
| Number of Source Rows | Row Size (bytes) | Default File Size (bytes) | Batch Row Size | Number of Local Staging Files | Number of Inserts |
| --- | --- | --- | --- | --- | --- |
| 4 | 52019206 | 52428800 | 4 | 1 | 2 |
| 4 | 52019206 | 52428800 | 4 | Not configured. Considers the default value of 64. | 1 |
| 4 | 52019206 | 52428800 | 1 | Not configured. Considers the default value of 64. | 4 |
| 4 | 52019206 | 52428800 | 3 | 1 | 2 |
| 4 | 52019206 | 52428800 | 1 | 1 | 4 |
Scenario 1: The batch row size is 4 and the number of local staging files is 1.
With a batch row size of 4 and 1 local staging file, two insert statements are executed in Snowflake when processing data.
Because the file size exceeds the default 50 MB after two rows, a new file is created for the next incoming row. Given that the number of local staging files is set to 1, a stage can hold a maximum of one file. Hence, two stages are created, each holding one file with two rows. Each file is 104038412 bytes, which is 52019206 bytes per row. For each stage, an insert command is issued, leading to two insert commands.
Scenario 2: The batch row size is 4 and the number of local staging files value is not set.
Using the default value of 64 for the local staging files, the mapping considers the batch size because the local staging file threshold is not reached. The batch row size of 4 covers all four source rows, so a single insert command is issued.
Scenario 3: The batch row size is 1 and the number of local staging files value is not set.
With a batch row size of 1 and no specific local staging files value set, the mapping issues four insert statements for four rows as the staging files threshold is not reached.
Scenario 4: The batch row size is 3 and the number of local staging files value is 1.
Using a batch row size of 3 and one local staging file, the mapping issues two insert commands after processing two rows in the first stage and two rows in the second stage.
Scenario 5: The batch row size is 1 and the number of local staging files value is 1.
With a batch row size of 1 and one local staging file, four insert commands are issued, one for each row, because the batch row size of 1 is reached before the staging file threshold.

Configure load properties in mappings

You can configure write properties to load data to Snowflake in the Additional Write Runtime Parameters field in the Snowflake Data Cloud advanced target properties of the Target transformation.
The following table lists some of the additional runtime parameters that you can specify to load data to Snowflake:
| Property | Description |
| --- | --- |
| oneBatch | Process all data in a single batch. Type is Boolean. Default is false. |
| remoteStage | Specifies the internal stage to use. External stages are not applicable. Type is String. Default is "~" (user stage). |
| onError | Specifies the action to perform when an error is encountered while loading data from a file. Valid values are ABORT_STATEMENT, CONTINUE, and SKIP_FILE. Type is String. Default is CONTINUE. |
| compressFileByPut | Compress the file by using PUT. Type is Boolean. Default is false. |
| compressDataBeforePut | Compress the data before PUT. The loader compresses the data to the gzip format before uploading the data. Type is Boolean. Default is true. |
| copyEmptyFieldAsEmpty | The COPY command option to set incoming empty fields as null. Type is Boolean. |
| enablePGZIP | Enables parallelism for the file compression. Type is Boolean. Default is true. |
| onError=ABORT_STATEMENT&oneBatch=true | Loads the entire data in a single batch and stops the task if an error occurs. Simultaneously, validates the user-specified reject file path and writes the error records to this file and to the session log. onError is of type String and oneBatch is of type Boolean. |

Example 1. Compress files

You want to compress files by using the PUT command before loading data to Snowflake.
Specify the following compression option: compressDataBeforePut=false&compressFileByPut=true
If you specify both the options as true, Snowflake considers the compressDataBeforePut option.

Example 2. Replace empty values as null

You want to replace incoming fields that contain empty values with NULL while loading the data to Snowflake.
Specify the copyEmptyFieldAsEmpty Boolean option and set the value to true or false based on your requirement.
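For example, you might set the following value in the Additional Write Runtime Parameters field, choosing true or false based on how you want the empty values handled:
copyEmptyFieldAsEmpty=true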
Consider the following scenarios before you configure the copyEmptyFieldAsEmpty Boolean parameter:

Example 3. Write data in a single batch

You want to write source data in a single batch to Snowflake. If an error is encountered while loading, you also want to capture the errors.
Specify the onError=ABORT_STATEMENT&oneBatch=true property based on your requirement.
While loading in a single batch, if an error occurs, the Secure Agent checks for the specified reject file name, runs the COPY command, validates the reject file, and then passes the file name to capture the errors, if any.