You can create mappings in SQL ELT mode to read data from Databricks, Amazon S3, or Microsoft Azure Data Lake Storage Gen2, load it to Databricks, and perform all of the data transformation within Databricks.
To create a mapping in SQL ELT mode, you create a mapping, and then select Mapping - SQL ELT as the mapping type. You're then prompted to choose Databricks as the cloud ecosystem and a Databricks target connection. If your organization doesn't have any Databricks connections, you're prompted to create one.
After you choose the target connection, the Mapping Designer opens. When you create a mapping, a Source transformation and a Target transformation are already on the canvas for you to configure.
Sources in mappings in SQL ELT mode
When you configure the source connection in the Source transformation, you can choose an Amazon S3 V2, a Databricks, or a Microsoft Azure Data Lake Storage Gen2 connection.
Amazon S3 V2 source properties
You can configure the following properties for an Amazon S3 V2 source:
•Source connection - Parameter, Allow parameter to be overridden at run time
•Source Type - Single, Parameter
•Format - Delimited, Avro, JSON, ORC, and Parquet
•Source Type - File and directory
•Folder Path
•File Name
For information on how to configure the supported properties, see the help for Amazon S3 V2 Connector.
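For example, to read a single object you might set Folder Path to a value such as sales-bucket/2024/orders and File Name to orders_01.csv. The bucket, folder, and file names here are illustrative.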
Databricks source properties
You can configure the following properties for a Databricks source:
•Source connection - Parameter, Allow parameter to be overridden at run time
•Source Type - Single, Query, Parameter
•Database Name
•Table Name
•SQL Override
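For example, to read a filtered subset of a table, you might enter an SQL override similar to the following sketch. The database, table, and column names are illustrative:
SELECT order_id, customer_id, order_total
FROM sales_db.orders
WHERE order_status = 'SHIPPED'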
Microsoft Azure Data Lake Storage Gen2 source properties
You can configure the following properties for a Microsoft Azure Data Lake Storage Gen2 source:
•Source connection - Parameter, Allow parameter to be overridden at run time
•Source Type - Single, Parameter
•Format - Delimited, Avro, JSON, ORC, and Parquet
•Filesystem Name Override
•Source Type - File and directory
•Allow Wildcard Characters
•Directory Override
•File Name Override
For information on how to configure the supported properties, see the help for Microsoft Azure Data Lake Storage Gen2 Connector.
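For example, if you select Allow Wildcard Characters, you might set File Name Override to a pattern such as orders_*.csv to read all files that match the pattern. The file name is illustrative.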
Targets in mappings in SQL ELT mode
When you configure a Target transformation in a mapping in SQL ELT mode, you can use only a Databricks connection.
You can configure the following properties in a Databricks target transformation in a mapping in SQL ELT mode:
•Target Type - Single Object, Parameter, Create New Target at Runtime
•Operation - Insert, Update, Delete, Data Driven
•Target Database Name
•Target Table Name
•Write Disposition - Append, Truncate
•Update Mode - Update as Update, Update else Insert
For information on how targets in mappings in SQL ELT mode behave differently from targets in other types of mappings, see Targets in mappings in SQL ELT mode in Mappings in the Data Integration help.
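For example, the Update else Insert update mode is conceptually similar to a Databricks MERGE statement. The following sketch uses illustrative database, table, and column names and is not the exact SQL that Data Integration generates:
MERGE INTO sales_db.customers AS t
USING sales_db.customers_staging AS s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *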
Transformations in mappings in SQL ELT mode
A mapping in SQL ELT mode includes transformations that Databricks can process.
You can use the following transformations in a mapping in SQL ELT mode:
•Source
•Target
•Aggregator
•Expression
•Filter
•Joiner
•Lookup
•Rank
•Router
•Sorter
•Union
Functions in mappings in SQL ELT mode
When you create expressions within a mapping in SQL ELT mode, you must use the native functions and expression syntax of Databricks and not Informatica functions and expression syntax.
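For example, where an Informatica expression uses IIF(quantity > 100, 'BULK', 'RETAIL'), a mapping in SQL ELT mode uses the equivalent Databricks function instead, such as IFF(quantity > 100, 'BULK', 'RETAIL'). The column name quantity is illustrative.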
You can use the following native functions in a mapping in SQL ELT mode:
Aggregate functions
ANY(), ANY_VALUE(), APPROX_COUNT_DISTINCT(), APPROX_PERCENTILE(), AVG(), BIT_AND(), BIT_OR(), BIT_XOR(), BITMAP_CONSTRUCT_AGG(), BITMAP_OR_AGG(), BOOL_AND(), BOOL_OR(), CORR(), COUNT(), COUNT_IF(), COUNT_MIN_SKETCH(), COVAR_POP(), COVAR_SAMP(), EVERY(), FIRST(), FIRST_VALUE(), HLL_SKETCH_AGG(), HLL_UNION_AGG(), KURTOSIS(), LAST(), LAST_VALUE(), MAX(), MAX_BY(), MEAN(), MEDIAN(), MIN(), MIN_BY(), MODE(), PERCENTILE(), PERCENTILE_APPROX(), REGR_AVGX(), REGR_AVGY(), REGR_COUNT(), REGR_INTERCEPT(), REGR_R2(), REGR_SLOPE(), REGR_SXX(), REGR_SXY(), REGR_SYY(), SKEWNESS(), SOME(), STD(), STDDEV(), STDDEV_POP(), STDDEV_SAMP(), SUM(), TRY_AVG(), TRY_SUM(), VAR_POP(), VAR_SAMP(), VARIANCE()
String and binary functions
AES_DECRYPT(), AES_ENCRYPT(), ASCII(), BASE64(), BIN(), BINARY(), BIT_LENGTH(), BITMAP_COUNT(), BTRIM(), CHAR(), CHAR_LENGTH(), CHARACTER_LENGTH(), CHARINDEX(), CHR(), CONCAT(), CONCAT_WS(), CONTAINS(), CRC32(), DECODE(), ENCODE(), ENDSWITH(), FIND_IN_SET(), FORMAT_NUMBER(), HEX(), INITCAP(), INSTR(), LCASE(), LEFT(), LEN(), LENGTH(), LEVENSHTEIN(), LOCATE(), LOWER(), LPAD(), LTRIM(), MASK(), MD5(), OCTET_LENGTH(), OVERLAY(), PARSE_URL(), POSITION(), REGEXP_COUNT(), REGEXP_EXTRACT(), REGEXP_INSTR(), REGEXP_REPLACE(), REGEXP_SUBSTR(), REPEAT(), REPLACE(), REVERSE(), RIGHT(), RPAD(), RTRIM(), SHA(), SHA1(), SHA2(), SOUNDEX(), SPACE(), SPLIT_PART(), STARTSWITH(), STRING(), SUBSTR(), SUBSTRING(), SUBSTRING_INDEX(), TO_BINARY(), TRANSLATE(), TRIM(), TRY_AES_DECRYPT(), TRY_TO_BINARY(), UCASE(), UNBASE64(), UNHEX(), UPPER(), URL_DECODE(), URL_ENCODE()
Numeric functions
ABS(), ACOS(), ACOSH(), ASIN(), ASINH(), ATAN(), ATAN2(), ATANH(), BIGINT(), BIT_COUNT(), BIT_GET(), BIT_REVERSE(), BITMAP_BIT_POSITION(), BITMAP_BUCKET_NUMBER(), BROUND(), CBRT(), CEIL(), CEILING(), CONV(), COS(), COSH(), COT(), CSC(), DECIMAL(), DEGREES(), DOUBLE(), E(), EXP(), EXPM1(), FACTORIAL(), FLOAT(), FLOOR(), GETBIT(), HYPOT(), INT(), ISNAN(), LN(), LOG(), LOG10(), LOG1P(), LOG2(), MOD(), NANVL(), NEGATIVE(), PI(), PMOD(), POSITIVE(), POW(), POWER(), RADIANS(), RAND(), RANDN(), RANDOM(), RINT(), ROUND(), SEC(), SHIFTLEFT(), SHIFTRIGHT(), SHIFTRIGHTUNSIGNED(), SIGN(), SIGNUM(), SIN(), SINH(), SMALLINT(), SQRT(), TAN(), TANH(), TINYINT(), TRY_ADD(), TRY_DIVIDE(), TRY_MULTIPLY(), TRY_SUBTRACT(), WIDTH_BUCKET()
Date and time functions
ADD_MONTHS(), CURDATE(), CURRENT_DATE(), CURRENT_TIMESTAMP(), CURRENT_TIMEZONE(), DATE(), DATE_ADD(), DATE_FORMAT(), DATE_FROM_UNIX_DATE(), DATE_PART(), DATE_SUB(), DATE_TRUNC(), DATEADD(), DATEDIFF(), DAY(), DAYOFMONTH(), DAYOFWEEK(), DAYOFYEAR(), FROM_UNIXTIME(), FROM_UTC_TIMESTAMP(), GETDATE(), HOUR(), LAST_DAY(), MAKE_DATE(), MAKE_TIMESTAMP(), MINUTE(), MONTH(), MONTHS_BETWEEN(), NEXT_DAY(), NOW(), QUARTER(), SECOND(), TIMESTAMP(), TIMESTAMP_MICROS(), TIMESTAMP_MILLIS(), TIMESTAMP_SECONDS(), TO_DATE(), TO_TIMESTAMP(), TO_UNIX_TIMESTAMP(), TO_UTC_TIMESTAMP(), TRUNC(), TRY_TO_TIMESTAMP(), UNIX_DATE(), UNIX_MICROS(), UNIX_MILLIS(), UNIX_SECONDS(), UNIX_TIMESTAMP(), WEEKDAY(), WEEKOFYEAR(), YEAR()
Cast functions
BOOLEAN(), TO_CHAR(), TO_NUMBER(), TO_VARCHAR(), TRY_TO_NUMBER()
Miscellaneous functions
ASSERT_TRUE(), COALESCE(), CURRENT_CATALOG(), CURRENT_DATABASE(), CURRENT_METASTORE(), CURRENT_SCHEMA(), CURRENT_USER(), ELT(), EQUAL_NULL(), GREATEST(), HASH(), HLL_SKETCH_ESTIMATE(), HLL_UNION(), IF(), IFF(), IFNULL(), INPUT_FILE_BLOCK_LENGTH(), INPUT_FILE_BLOCK_START(), INPUT_FILE_NAME(), IS_ACCOUNT_GROUP_MEMBER(), IS_MEMBER(), ISNOTNULL(), ISNULL(), JAVA_METHOD(), LEAST(), LUHN_CHECK(), MONOTONICALLY_INCREASING_ID(), NULLIF(), NVL(), RAISE_ERROR(), REFLECT(), SECRET(), SESSION_USER(), SPARK_PARTITION_ID(), TRY_REFLECT(), TYPEOF(), USER(), UUID(), VERSION(), XXHASH64()
For more information about the native functions and expression syntax of Databricks, see Databricks functions in Databricks documentation.
Operators in mappings in SQL ELT mode
When you run a mapping in SQL ELT mode, Data Integration converts the expressions in each transformation by determining equivalent operators in the database. If there is no equivalent operator, Data Integration processes the transformation logic.
You can push the following operators to Databricks:
+, -, *, /, %, ||, =, !=, <, <=, >, >=, AND, OR, NOT
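For example, you might use these operators in a Filter transformation condition such as unit_price * quantity >= 1000 AND region != 'EU', or concatenate strings with first_name || ' ' || last_name. The column names are illustrative.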
Rules and guidelines in mappings in SQL ELT mode
Consider the following rules and guidelines when you run mappings in SQL ELT mode:
Functions
•When you use the following functions, ensure that all the arguments are of the same data type, as shown in the example after this list:
- COALESCE
- DECODE
- ELT
- GREATEST
- IF
- IFF
- IFNULL
- LEAST
- NVL
- NVL2
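For example, COALESCE(ship_city, 'Unknown') is valid because both arguments are strings, whereas COALESCE(order_total, 'Unknown') mixes a numeric argument with a string argument. The column names are illustrative.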
•When you use the following functions, the number of optional arguments must not exceed five:
- COALESCE
- CONCAT
- CONCAT_WS
- COUNT
- ELT
- GREATEST
- HASH
- JAVA_METHOD
- LEAST
- REFLECT
- TRY_REFLECT
- XXHASH64
In the following example of the CONCAT_WS function, the first argument ',' is mandatory and the remaining arguments are optional.
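CONCAT_WS(',', first_name, middle_name, last_name, city, country)
Here ',' is the mandatory separator, and the five column names, which are illustrative, are the optional arguments, the maximum number allowed.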