Create a Databricks Cluster Configuration

A Databricks cluster configuration is an object in the domain that contains configuration information about the Databricks cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the Databricks environment.
Use the Administrator tool to create a cluster configuration by importing configuration properties. You can import the properties directly from the cluster or from a file that contains cluster properties. You can also choose to create a Databricks connection when you perform the import.

Importing a Databricks Cluster Configuration from the Cluster

When you import the cluster configuration directly from the cluster, you provide information to connect to the cluster.
Before you import the cluster configuration, get cluster information from the Databricks administrator.
  1. From the Connections tab, click the ClusterConfigurations node in the Domain Navigator.
  2. From the Actions menu, select New > Cluster Configuration.
     The Cluster Configuration wizard opens.
  3. Configure the following properties:
     Property
       Description
     Cluster configuration name
       Name of the cluster configuration.
     Description
       Optional description of the cluster configuration.
     Distribution type
       The distribution type. Choose Databricks.
     Method to import the cluster configuration
       Choose Import from cluster.
     Databricks domain
       Domain name of the Databricks deployment.
     Databricks access token
       The token ID created within Databricks, required for authentication.
       Note: If the token has an expiration date, verify that you get a new token from the Databricks administrator before it expires.
     Databricks cluster ID
       The cluster ID of the Databricks cluster.
       To find the cluster ID on the Databricks portal, follow these steps:
         a. Select Clusters from the object bar on the left side.
         b. Select the cluster you want to integrate with the Informatica domain.
         c. Click the Spark ID tab and expand the list of Spark Properties.
         d. Select the Tags tab.
     Create connection
       Choose to create a Databricks connection.
       If you choose to create a connection, the Cluster Configuration wizard associates the cluster configuration with the Databricks connection.
       If you do not choose to create a connection, you must manually create one and associate the cluster configuration with it.
  4. Click Next to verify the information on the summary page.
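
Before you enter the Databricks domain, access token, and cluster ID in the wizard, it can help to confirm that the three values belong together. The following is a minimal sketch, not part of the Informatica product, that builds a request against the Databricks REST API cluster lookup endpoint (GET /api/2.0/clusters/get) with bearer-token authentication. The function name is hypothetical, and the sketch assumes that the Databricks domain value is a host name without the https:// prefix.

```python
import urllib.parse
import urllib.request


def build_cluster_lookup_request(databricks_domain, access_token, cluster_id):
    """Build (but do not send) a request for the details of one cluster.

    Assumption: databricks_domain is a bare host name such as
    westus.azuredatabricks.net, without a scheme or trailing slash.
    """
    query = urllib.parse.urlencode({"cluster_id": cluster_id})
    url = f"https://{databricks_domain}/api/2.0/clusters/get?{query}"
    # Databricks personal access tokens are sent as a bearer token.
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers)
```

Passing the returned request to urllib.request.urlopen sends it. A JSON response that describes the cluster suggests that the values are usable, while an HTTP error suggests that the domain, token, or cluster ID should be rechecked with the Databricks administrator.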

Importing a Databricks Cluster Configuration from a File

You can import properties from an archive file to create a cluster configuration.
Complete the following tasks to import a Databricks cluster configuration from a file:
  1. Get required cluster properties from the Databricks administrator.
  2. Create an .xml file with the cluster properties, and compress it into a .zip or .tar file.
  3. Log in to the Administrator tool and import the file.

Create the Import File

To import the cluster configuration from a file, you must create an archive file.
To create the .xml file for import, you must get required information from the Databricks administrator. You can provide any name for the file and store it locally.
The following table describes the properties required to import the cluster information:
Property Name
  Description
cluster_name
  Name of the Databricks cluster.
cluster_id
  The cluster ID of the Databricks cluster.
baseURL
  URL to access the Databricks cluster.
  This is the domain URL that appears in your browser address bar. It commonly incorporates your account region. For example, https://southcentralus.azuredatabricks.net or https://westus.azuredatabricks.net.
accesstoken
  The token ID created within Databricks, required for authentication.
Optionally, you can include other properties specific to the Databricks environment.
When you complete the .xml file, compress it into a .zip or .tar file for import.
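
The file creation and compression steps can also be scripted. The following is a minimal sketch that uses only the Python standard library. The function names and the default file name cluster_properties.xml are hypothetical, and the property names follow the sample import file in this section. The sketch writes a .zip archive; the standard tarfile module could be used in the same way for a .tar file.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def build_import_xml(properties):
    """Return the import file content as UTF-8 encoded bytes.

    properties maps each property name to its value, for example
    {"cluster_name": "my_cluster", ...}.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # xml_declaration=True emits the <?xml ...?> header line.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def write_import_archive(properties, archive_path, xml_name="cluster_properties.xml"):
    """Write the .xml content into a .zip archive and return the archive path."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(xml_name, build_import_xml(properties))
    return archive_path
```

For example, calling write_import_archive with a dictionary of the four required properties and the path cluster_import.zip produces an archive that you can select in the Administrator tool. Because the archive contains the access token in plain text, store it in a protected location.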

Sample Import File

The following text shows a sample import file with the required properties. The baseURL value uses the westus region as an example. Substitute the domain URL of your own Databricks deployment:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>cluster_name</name>
    <value>my_cluster</value>
  </property>
  <property>
    <name>cluster_id</name>
    <value>0926-294544-bckt123</value>
  </property>
  <property>
    <name>baseURL</name>
    <value>https://westus.azuredatabricks.net</value>
  </property>
  <property>
    <name>accesstoken</name>
    <value>dapicf76c2d4567c6sldn654fe875936e778</value>
  </property>
</configuration>
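
Before you compress the file, a quick check that it is well-formed and contains every required property can save a failed import. The following is a minimal sketch of such a check, assuming the property names used in the sample file. The function name and the REQUIRED_PROPERTIES constant are hypothetical and are not part of the Administrator tool.

```python
import xml.etree.ElementTree as ET

# Required property names, as they appear in the sample import file.
REQUIRED_PROPERTIES = ("cluster_name", "cluster_id", "baseURL", "accesstoken")


def missing_properties(xml_data):
    """Return the required property names that are absent or have an empty value.

    xml_data is the content of the import file as bytes or str.
    Raises xml.etree.ElementTree.ParseError if the content is not well-formed XML.
    """
    root = ET.fromstring(xml_data)
    found = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            found[name] = value
    return [name for name in REQUIRED_PROPERTIES if not found.get(name)]
```

An empty list means that all four required properties are present with a non-empty value. The check does not confirm that the values themselves are valid for your Databricks deployment.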

Import the Cluster Configuration

After you create the .xml file with the cluster properties and compress it into a .zip or .tar file, use the Administrator tool to import the archive file into the domain and create the cluster configuration.
  1. From the Connections tab, click the ClusterConfigurations node in the Domain Navigator.
  2. From the Actions menu, select New > Cluster Configuration.
     The Cluster Configuration wizard opens.
  3. Configure the following properties:
     Property
       Description
     Cluster configuration name
       Name of the cluster configuration.
     Description
       Optional description of the cluster configuration.
     Distribution type
       The distribution type. Choose Databricks.
     Method to import the cluster configuration
       Choose Import from file.
     Upload configuration archive file
       The full path and file name of the file. Click the Browse button to navigate to the file.
     Create connection
       Choose to create a Databricks connection.
       If you choose to create a connection, the Cluster Configuration wizard associates the cluster configuration with the Databricks connection.
       If you do not choose to create a connection, you must manually create one and associate the cluster configuration with it.
  4. Click Next to verify the information on the summary page.