• You have a configured Databricks Delta connection in your organization to stage the data in Databricks as an intermediate step. For more information about the properties required to create a Databricks Delta connection, see the Connections for INFACore documentation. For more information about how to create a Databricks Delta connection, see the INFACore SDK for Python Reference documentation.
Set up the Databricks cluster
Set up the Databricks cluster for use with Databricks Connect. Databricks Connect runs your jobs remotely on a Databricks cluster using Spark APIs. The Databricks cluster must use Databricks Runtime version 5.1 or later.
1. In the compute configuration of the Databricks cluster, go to the advanced options.
2. Edit the Spark configuration section and enter the following code snippet:
spark.databricks.service.server.enabled true
3. Enter the following code snippet, based on whether you want to use AWS Databricks or Azure Databricks:
- To use AWS Databricks, enter the following code snippet:
spark.databricks.service.port 15001
- To use Azure Databricks, enter the following code snippet:
spark.databricks.service.port 8787
4. Restart the cluster.
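As an illustration, the Spark configuration entries from steps 2 and 3 can be generated with a small Python helper. This is a minimal sketch and not part of INFACore or Databricks Connect. The names SERVICE_PORTS and spark_config_lines are hypothetical, and the port values are the ones listed in step 3.

```python
from typing import List

# Port values come from step 3 above. The dictionary and function names
# are hypothetical and exist only for this illustration.
SERVICE_PORTS = {"aws": 15001, "azure": 8787}


def spark_config_lines(platform: str) -> List[str]:
    """Return the Spark configuration lines for AWS or Azure Databricks."""
    key = platform.strip().lower()
    if key not in SERVICE_PORTS:
        raise ValueError(
            "Unsupported platform: {!r}; expected 'aws' or 'azure'".format(platform)
        )
    return [
        "spark.databricks.service.server.enabled true",
        "spark.databricks.service.port {}".format(SERVICE_PORTS[key]),
    ]


if __name__ == "__main__":
    # Print the lines to paste into the Spark configuration section.
    print("\n".join(spark_config_lines("azure")))
```

Paste the printed lines into the Spark configuration section of the cluster, and then restart the cluster as described in step 4.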
Install Databricks Connect in your development environment
Install the Databricks Connect library in your development environment.
1. Create a development environment. Ensure that your Python environment is compatible with the cluster version.
For example, if you are using Anaconda, run the following code snippet to create a Databricks environment that is compatible with cluster version 3.x: