Connect to Databricks Delta

Let's configure the Databricks Delta connection properties to connect to Databricks Delta.

Before you begin

Before you get started, you'll need to get information from your Databricks Delta account.

You also need to configure the AWS or Azure staging environment to use the SQL warehouse or the Databricks cluster in the connection.

To learn about the staging prerequisites for the Azure or AWS environment, check out SQL warehouse or Databricks cluster.

Connection details

The following table describes the connection properties:

Property	Description
Connection Name	Name of the connection. Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + - Maximum length is 255 characters.
Runtime Environment	The runtime environment where you want to run tasks. Select Informatica Cloud Hosted Agent.
SQL Warehouse JDBC URL	Databricks SQL warehouse JDBC connection URL. Required to connect to a Databricks SQL warehouse. Doesn't apply to Databricks clusters.
Databricks Token	Personal access token to access Databricks. Required for SQL warehouse and Databricks cluster.
Catalog Name	If you use Unity Catalog, the name of an existing catalog in the metastore. Optional for SQL warehouse. Doesn't apply to Databricks cluster. You can also specify the catalog name at the end of the SQL warehouse JDBC URL. Note: The catalog name cannot contain special characters. For more information about Unity Catalog, see the Databricks Delta documentation.
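
The naming rules for the Connection Name property can be checked before you create a connection. The following Python sketch is illustrative only and is not part of the product. The function name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits. It does not check that the name is unique within the organization.

```python
import re

# Allowed characters per the Connection Name description:
# ASCII letters, digits, spaces, and the special characters _ . + -
# (ASCII-only is an assumption; the documentation says "alphanumeric".)
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _.+\-]+")
MAX_NAME_LENGTH = 255


def is_valid_connection_name(name: str) -> bool:
    """Return True if the name meets the documented character and length rules."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("Delta_prod-1.0 +test", "bad/name"):
        print(candidate, "->", is_valid_connection_name(candidate))
```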

Advanced settings

The following table describes the advanced connection properties:

Property	Description
Database	The database name that you want to connect to in Databricks Delta. Optional for SQL warehouse and Databricks cluster. By default, all databases available in the workspace are listed.
JDBC Driver Class Name	The name of the JDBC driver class. Optional for SQL warehouse and Databricks cluster. Specify the driver class name as com.simba.spark.jdbc.Driver for the data loader task.
Staging Environment	The cloud provider where the Databricks cluster is deployed. Required for SQL warehouse and Databricks cluster. Select one of the following options: AWS, Azure, or Personal Staging Location. Default is Personal Staging Location. Select Personal Staging Location instead of the AWS or Azure staging environment to stage data locally for mappings and tasks. Personal Staging Location doesn't apply to Databricks cluster. Note: You cannot switch between clusters after you establish a connection.
Databricks Host	Doesn't apply to a data loader task.
Cluster ID	Doesn't apply to a data loader task.
Organization ID	Doesn't apply to a data loader task.
Min Workers	Doesn't apply to a data loader task.
Max Workers	Doesn't apply to a data loader task.
DB Runtime Version	Doesn't apply to a data loader task.
Worker Node Type	Doesn't apply to a data loader task.
Driver Node Type	Doesn't apply to a data loader task.
Instance Pool ID	Doesn't apply to a data loader task.
Elastic Disk	Doesn't apply to a data loader task.
Spark Configuration	Doesn't apply to a data loader task or to Mass Ingestion tasks.
Spark Environment Variables	Doesn't apply to a data loader task or to Mass Ingestion tasks.

AWS staging environment

The following table describes the properties for the AWS staging environment:

Property	Description
S3 Access Key	The key to access the Amazon S3 bucket.
S3 Secret Key	The secret key to access the Amazon S3 bucket.
S3 Data Bucket	The existing bucket to store the Databricks Delta data.
S3 Staging Bucket	The existing bucket to store the staging files.
S3 Authentication Mode	The authentication mode to access Amazon S3. Select one of the following authentication modes: Permanent IAM credentials, which uses the S3 access key and S3 secret key to connect to Databricks Delta; or IAM Assume Role, which uses AssumeRole for IAM authentication to connect to Databricks Delta. Doesn't apply to Databricks cluster.
IAM Role ARN	The Amazon Resource Name (ARN) of the IAM role assumed by the user to use the dynamically generated temporary security credentials. Set the value of this property if you want to use the temporary security credentials to access the Amazon S3 staging bucket. For more information about how to get the ARN of the IAM role, see the AWS documentation.
Use EC2 Role to Assume Role	Optional. Select the check box to enable the EC2 role to assume another IAM role specified in the IAM Role ARN option. The EC2 role must have a policy attached with a permission to assume an IAM role from the same or different AWS account.
S3 Region Name	The AWS cluster region in which the bucket you want to access resides. Select a cluster region if you provide a custom JDBC URL that does not contain a cluster region name.
S3 Service Regional Endpoint	The S3 regional endpoint when the S3 data bucket and the S3 staging bucket need to be accessed through a region-specific S3 regional endpoint. Doesn't apply to Databricks cluster. Default is s3.amazonaws.com.
Zone ID	The zone ID for the Databricks job cluster. Optional for Databricks cluster. Doesn't apply to SQL warehouse. Applies only if you want to create a Databricks job cluster in a particular zone at runtime. For example, us-west-2a. Note: The zone must be in the same region where your Databricks account resides.
EBS Volume Type	The type of EBS volumes launched with the cluster. Optional for Databricks cluster. Doesn't apply to SQL warehouse.
EBS Volume Count	The number of EBS volumes launched for each instance. You can choose up to 10 volumes. Optional for Databricks cluster. Doesn't apply to SQL warehouse. Note: In a Databricks Delta connection, specify at least one EBS volume for node types with no instance store. Otherwise, cluster creation fails.
EBS Volume Size	The size of a single EBS volume in GiB launched for an instance. Optional for Databricks cluster. Doesn't apply to SQL warehouse.
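
The Use EC2 Role to Assume Role property requires that the EC2 role has a policy that permits it to assume the role named in IAM Role ARN. As a rough sketch of what such a policy document can look like, the following Python snippet builds one with the standard `sts:AssumeRole` action. The function name and the ARN are placeholders for illustration, and your organization's policy might need additional conditions.

```python
import json


def build_assume_role_policy(role_arn: str) -> dict:
    """Build an IAM policy document that allows assuming the given role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                # The role specified in the IAM Role ARN property.
                "Resource": role_arn,
            }
        ],
    }


# Placeholder ARN for illustration only; replace with your own role ARN.
policy = build_assume_role_policy(
    "arn:aws:iam::123456789012:role/example-staging-role"
)
print(json.dumps(policy, indent=2))
```

Attach the resulting policy to the EC2 role, not to the role that is being assumed.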

Azure staging environment

The following table describes the properties for the Azure staging environment:

Property	Description
ADLS Storage Account Name	The name of the Microsoft Azure Data Lake Storage account.
ADLS Client ID	The ID of your application to complete the OAuth Authentication in the Active Directory.
ADLS Client Secret	The client secret key to complete the OAuth Authentication in the Active Directory.
ADLS Tenant ID	The ID of the Microsoft Azure Data Lake Storage directory that you use to write data.
ADLS Endpoint	The OAuth 2.0 token endpoint from where authentication based on the client ID and client secret is completed.
ADLS Filesystem Name	The name of an existing file system to store the Databricks Delta data.
ADLS Staging Filesystem Name	The name of an existing file system to store the staging data.
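
The ADLS Endpoint value is typically derived from the tenant ID. The following Python sketch assumes the Azure Active Directory v1 token endpoint format under `login.microsoftonline.com`. The function name and the tenant ID shown are placeholders, so confirm the actual endpoint in the Endpoints view of your app registration in the Azure portal.

```python
def build_adls_token_endpoint(tenant_id: str) -> str:
    """Return an OAuth 2.0 token endpoint URL for the given tenant ID.

    Assumes the Azure AD v1 endpoint shape; verify it against your
    app registration before you use it in a connection.
    """
    tenant_id = tenant_id.strip()
    if not tenant_id:
        raise ValueError("tenant_id must not be empty")
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"


# Placeholder tenant ID for illustration only.
endpoint = build_adls_token_endpoint("00000000-0000-0000-0000-000000000000")
print(endpoint)
```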