Consider the following rules and guidelines for Databricks objects used as sources, targets, and lookups in mappings:
•If the authentication type field in mappings shows PAT, the mapping uses Personal Access Token authentication to access Databricks resources.
•When you use a volume or a personal staging location, or set the staging property, and you want to stage the data in a directory of your choice, configure the DTM property -Ddatabricks.tmpdir=/my/dir/path in the JVM options in the system configuration settings of the Administrator service.
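For example, to stage data in a hypothetical directory named /data/dbx_staging, you might add the following property to the DTM JVM options:
-Ddatabricks.tmpdir=/data/dbx_staging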
•When you use a volume and run a mapping, the error log might show warning messages even though the mapping runs successfully. You can ignore these warning messages.
•If you run a mapping on a job cluster to write to a Databricks target and the Timestamp data in the source contains the date 01-01-0001, the date is incorrectly written to the target.
•Dates earlier than 1582-10-15 in the date data type are incorrectly written to the target.
•You cannot use input or in-out parameters in a parameter file to parameterize multiple objects or parameterize the relationships between the objects.
•When you use a parameter file to parameterize the table name, specify only the table name in the parameter value. Do not specify the database name with the table name, for example, <databasename>.<tablename> or <databasename>/<tablename>.
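For example, assuming a hypothetical parameter named $$TargetTable and a table named customer_orders, the parameter file entry might look like the following, without a database qualifier:
$$TargetTable=customer_orders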
•Ensure that column names do not contain Unicode characters.
•A mapping with null values in an uncached lookup condition generates incorrect results.
•When you do not specify the database name in the Databricks connection and read multiple objects that have the same table name from different databases, you must append the database name to each object in the advanced relationship.
•When you specify the SESSSTARTTIME variable in a query in a mapping task to return Datetime values, specify the query in the following format:
select to_timestamp('$$$SESSSTARTTIME', 'MM/dd/yyyy HH:mm:ss.SSSSSS') as t;
•When you run multiple concurrent mappings to write data to Databricks targets, a transaction commit conflict error might occur and the mappings might fail.
•View objects are displayed in the Table panel instead of the View panel when you import a Databricks object. This issue occurs when the Databricks cluster is deployed on AWS.
•To avoid a Java heap space error when you read or write complex files, set the JVM options for the DTM type to increase the -Xms and -Xmx values in the system configuration details of the Secure Agent. The recommended value for -Xms is 512 MB and for -Xmx is 1024 MB.
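For example, based on the recommended values, the DTM JVM options might include the following settings:
-Xms512m -Xmx1024m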
•When you import views, the Select Source Object dialog box does not display view objects.
•When you test the Databricks connection, the Secure Agent does not validate the values you specify in the Org ID connection parameter.
•You cannot use the Hosted Agent as a runtime environment when you configure a mapping to run on the SQL warehouse to read or write data that contains Unicode characters.
•The number of clusters that the Secure Agent creates to run the mapping depends on the number of Databricks connections used in the transformations in a mapping. For example, if multiple transformations use the same Databricks connection, the mapping runs on a single cluster.
•When the Mapping Designer remains idle for more than 15 minutes, the metadata fetch throws an exception.
•If you change the database name in the connection, existing mappings fail. You must re-import the objects in the existing mappings before you run them.
•When you import a Databricks source object that contains Date or Boolean data types and apply a simple source filter condition, use the following formats to run the mapping successfully (see the sample conditions after this list):
- Boolean = 0 or 1
- Date = YYYY-MM-DD HH24:MM:SS.US
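For example, assuming hypothetical source columns named is_active of the Boolean data type and order_date of the Date data type, the filter conditions might look like the following:
is_active = 1
order_date = 2023-06-15 14:30:00.000000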
•When you run a mapping where a source column of the String data type contains TRUE or FALSE values and you write the data to a Boolean data type column in a Databricks target table, the Secure Agent writes the data as 0 to the target.
•When the Databricks all-purpose cluster is down and you perform a test connection or import an object, the connection times out after 10 minutes.
•When you parameterize the source or target connection in a mapping and do not specify the database name, ensure that you specify the database name in lowercase when you assign a default value to the parameter.
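For example, assuming a hypothetical parameter named $$DatabaseName, you might assign the default value in lowercase as follows:
$$DatabaseName=salesdb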
•When you parameterize the source filter condition or any expression in a mapping, ensure that you specify the table name in lowercase when you add the source filter condition or the expression in the mapping task. Otherwise, the Secure Agent throws the following exception:
Invalid expression string for filter condition
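For example, assuming a hypothetical table named customer_orders, specify a filter condition such as customer_orders.order_id > 100 instead of CUSTOMER_ORDERS.order_id > 100.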
•When you run a mapping to write data to a Databricks target using create target at runtime and the target table already exists, ensure that the target table schema is the same. Otherwise, the mapping fails.
•When you run a mapping to write data to multiple Databricks targets that use the same Databricks connection and the Secure Agent fails to write data to one of the targets, the mapping fails and the Secure Agent does not write data to the remaining targets.
•When you use the Create New at Runtime option to create a Databricks target, you can parameterize only the target connection and the table name using a parameter file. You cannot parameterize other properties such as Path or DBname.
•The pre-SQL and post-SQL commands do not run in a linear order. For example, the session logs might show that the target pre-SQL queries run before the source pre-SQL queries.
•When you run pre-SQL and post-SQL commands that contain semicolons within the query to read from sources, the mapping fails. Queries can contain a semicolon only at the end.
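For example, assuming a hypothetical staging table named stg_orders, the following pre-SQL command is supported because the semicolon appears only at the end:
DELETE FROM stg_orders WHERE order_id > 100;
A command that contains a semicolon within the query, such as in a string literal, is not supported:
INSERT INTO stg_orders VALUES ('a;b');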
•When you read or write Unicode data to Databricks on the SQL endpoint, you need to set properties for the Secure Agent before you run the mapping. Perform one of the following tasks:
- Set the following environment variables on the Secure Agent machine, and then restart the Secure Agent:
- export LANGUAGE="en_US.UTF-8"
- export LC_ALL="en_US.UTF-8"
- Configure the property -Dfile.encoding=UTF-8 in the JVM options in the Secure Agent properties.
Note: You cannot read or write Unicode data when you use the Hosted Agent in the connection.
•When you configure a mapping that stages data in the personal staging location and the mapping stops abruptly, the temporary data is not deleted.
•When you parameterize the source or lookup object, ensure that the column names in the source object and the lookup object are not the same. Otherwise, the mapping fails.