Snowflake Data Cloud Connector > Part III: SQL ELT with Snowflake Data Cloud Connector > SQL ELT optimization for mapping tasks > Transformations with Snowflake Data Cloud

Transformations with Snowflake Data Cloud

When you configure SQL ELT optimization, Data Integration tries to push the configured transformation to Snowflake.

You can use full SQL ELT to push the following transformations to Snowflake:

•Aggregator
•Expression
•Filter
•Joiner
•Lookup
•Normalizer
•Rank
•Router
•Sequence Generator
•SQL
•Sorter
•Union
•Update Strategy

Note: Router transformation is applicable only for source SQL ELT.

For more information about configuring transformations, see Transformations in the Data Integration documentation.

Aggregator transformation

You can configure full SQL ELT optimization to push an Aggregator transformation to process in Snowflake.

Aggregate calculations

You can perform the following aggregate calculations:

•AVG
•COUNT
•MAX
•MIN
•MEDIAN
•SUM
•VARIANCE

Incoming ports

When you configure an Aggregator transformation and the incoming port is not used in an aggregate function or in a group by field in mappings, the ANY_VALUE() function is used for columns that are not part of the group by or the aggregator function. In this case, the output is not deterministic as the ANY_VALUE() function returns any value from the port.

Expression transformation

You can configure full SQL ELT optimization to push an Expression transformation to process in Snowflake.

You can add an Expression transformation to each of the sources in the mapping, followed by a join downstream in the mapping. Additionally, you can add multiple Expression transformations that branch out from a transformation and then branch in into a transformation downstream in the mapping.

When you configure an Expression transformation, consider the following rules to include variables in the expression:

•You cannot use variables where you are using the value assigned while processing a previous row for calculations in the current row. if you do, the mapping runs without SQL ELT optimization.
•The variables can be nested, but you cannot refer to a variable before it is defined in the expression.

If the variables are not defined in that order, the mapping runs without SQL ELT optimization.

For example,

var: AGEPLUS2 = AGEPLUS1 + 1
var: AGEPLUS1 = AGE + 1
out: NEXTAGE = AGEPLUS2 + 1

Here, AGE +1 is defined later. AGEPLUS2 in the first variable refers to AGEPLUS1 and remains unresolved.

To resolve this, specify the variables in the following order:

var: AGEPLUS1 = AGE + 1
var: AGEPLUS2 = AGEPLUS1 + 1
out: NEXTAGE = AGEPLUS2 + 1

•The variables cannot have an expression that is cyclic or refers to itself:

For example,

var: AGEPLUS1 = AGEPLUS2 + 1
var: AGEPLUS2 = AGEPLUS1 + 1
out: NEXTAGE= AGEPLUS2

Here, AGEPLUS1 refers to AGEPLUS2 and remains unresolved.

Lookup transformation

You can configure full SQL ELT optimization to push a Lookup transformation to process in Snowflake. You can push both a connected and an unconnected lookup.

When the mapping contains an unconnected lookup, you can also nest the unconnected lookup function with other expression functions. For example, :LKP.U_LOOKUP(Upper(argument1), argument)

Lookup objects

Consider the following rules when you configure lookups:

•You can configure a lookup for Snowflake when the Source transformation uses the following sources:

- Amazon S3
- Google Cloud Storage
- Microsoft Azure Data Lake Storage Gen2
- Snowflake source

•You can configure a lookup for an Amazon S3, Google Cloud Storage, Microsoft Azure Data Lake Storage Gen2, or Snowflake object only when the Source transformation uses the corresponding Amazon S3, Google Cloud Storage, Microsoft Azure Data Lake Storage Gen2, or Snowflake source.

Connected lookup

In a mapping with a Snowflake Data Cloud source and target, set the Multiple Matches option for the lookup object to Return all rows.

Unconnected lookup

When you configure an unconnected Lookup transformation, consider the following rules:

•Do not configure an expression for an output received from an unconnected lookup.
•In a mapping with a Snowflake Data Cloud source and target where some of input fields from the unconnected lookup transformation is not mapped to the Snowflake target object, the select query includes all the unmapped fields.

Multiple matches behavior in a connected and unconnected lookup

If you enable SQL ELT optimization for a mapping that contains a connected or unconnected lookup, you must follow these guidelines:

•In an unconnected lookup, ensure that you always select the Multiple Matches option to Report Error. When you look up data and the lookup condition finds multiple matches, all the matching rows are selected and the task runs with SQL ELT optimization. If you enabled Multiple Matches to any option other than Report Error, the mapping runs without SQL ELT optimization.
• In a connected lookup, you can set the Multiple Matches option to Return all rows or Report Error. When you set the Multiple Matches option to Report Error, you can set the Lkp_apdo_allow_report_error custom flag in the task advanced session properties to determine how Data Integration handles multiple matches:

- When you set the property to Yes and if there are multiple matches in the data, the multiple match policy is ignored and the job runs successfully with SQL ELT optimization.
- When you do not set the property, and if there are multiple matches in the data, Data Integration considers the policy and displays a warning message. SQL ELT optimization is ignored and the task fails.

FileName port

When you configure a lookup for an Amazon S3 source in a mapping that contains an Amazon S3 source and Snowflake target, remove the filename port from both the Amazon S3 source and lookup object. The FileName port is not applicable.

Lookup query object

When you use a lookup object as a query in a Lookup transformation in a mapping to lookup data in Snowflake, specify the database and schema in the advanced lookup properties or in the additional JDBC URL parameters in the Snowflake Data Cloud connection.

Normalizer transformation

You can configure a Normalizer transformation in a mapping to return a row for each instance of the multiple-occurring data.

For example, a relational source includes four fields with quarterly sales data. You can configure a Normalizer transformation to generate a separate output row for each quarter.

Router transformation

You can configure source SQL ELT optimization to push a Router transformation to the database for processing.

When you configure a Router transformation, connect or map only one output group to the target transformation.

Sequence Generator transformation

When you configure a Sequence Generator transformation, you can connect the NEXTVAL port to single or multiple ports in the transformations that follow the Sequence Generator transformation.

You can push a Sequence Generator transformation with the following restrictions:

•You cannot use a shared sequence in a Sequence Generator transformation.
•You can add only an Expression or a Target transformation after a Sequence Generator transformation in a mapping.

SQL transformation

You can use an SQL transformation only to push certain functions and shared sequence.

Use functions to run queries

You can include functions in an entered query in an SQL transformation and run queries with the Snowflake target endpoint.

You must use only the SELECT clause SQL statement to push a function. Specify the column name in the select query or function. Do not push functions using statements such as "SELECT * FROM TABLE".

You can use the following functions in an entered query:

•UUID_STRING
•RANDOM
•RANDSTR
•SIGN
•CURRENT_REGION
•CURRENT_ACCOUNT
•CURRENT_ROLE
•CURRENT_USER
•CURRENT_DATABASE
•CURRENT_SCHEMA
•DAYNAME
•SPLIT
•SPLIT_PART

To use the CURRENT_ROLE, CURRENT_DATABASE, and CURRENT_SCHEMA functions in an SQL transformation, ensure to provide the database, role, and schema name in the additional JDBC parameters field in the Snowflake Data Cloud connection. If you do not specify the values in the connection, Data Integration inserts null to the target.

Reuse shared sequence

You can push a mapping with a shared sequence defined in an SQL transformation to a Snowflake endpoint. Data Integration writes the data in the same sequence to the target as in the Snowflake source.

Get the shared sequence from Snowflake and define the sequence in an entered query in an SQL transformation.

Specify the shared sequence in the entered query in the following syntax: Select <Snowflake_schema_name>.<Snowflake_database_name>.<sequence_name>.NEXTVAL

User defined functions

You can configure a custom query in an SQL transformation to read from Java or SQL user-defined functions (UDF) in Snowflake.

The following guidelines apply for UDFs:

•You cannot read UDFs that have newline characters in the UDF name.
•If the UDF contains array parameters, the mapping runs without SQL ELT optimization.

Union transformation

You can push a Union transformation with the following restrictions:

•The Source transformation in the mapping must only include Snowflake source objects.
•A mapping runs without SQL ELT optimization when the source is Amazon S3, Google Cloud Storage, or Microsoft Azure Data Lake Storage Gen2.

Update Strategy transformation

You cannot use an Update Strategy transformation.

You can instead use the update and upsert operations in the Target transformation to write to Snowflake.