Connections for INFACore > Connections to source and target endpoints > Microsoft Azure SQL Data Warehouse
  

Microsoft Azure SQL Data Warehouse

Create a Microsoft Azure SQL Data Warehouse V2 connection to read from or write to Microsoft Azure SQL Data Warehouse.

Feature snapshot

Operation
Support
Read
Yes
Write
Yes

Before you begin

Before you configure the connection properties, you'll need to get information from your Azure Synapse account.
The following video shows you how to get information from your Azure Synapse account:
https://infa.media/3csGICQ

Connection properties

The following table describes the Microsoft Azure Synapse SQL connection properties:
Property
Description
Connection Name
Name of the connection.
Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -,
Maximum length is 255 characters.
Azure DW JDBC URL
The Microsoft Azure Synapse SQL JDBC connection string.
Enter the connection string in the following format for Microsoft SQL Server authentication:
jdbc:sqlserver://<Server>.database.windows.net:1433;database=<Database>
Enter the connection string in the following format for Azure Active Directory (AAD) authentication:
jdbc:sqlserver://<Server>.database.windows.net:1433;
database=<Database>;encrypt=true;trustServerCertificate=false;
hostNameInCertificate=*.database.windows.net;loginTimeout=30;
Authentication=ActiveDirectoryPassword;
Default is Microsoft SQL Server authentication.
Azure DW JDBC Username
User name to connect to the Microsoft Azure Synapse SQL account. Provide AAD user name for AAD authentication.
Azure DW JDBC Password
Password to connect to the Microsoft Azure Synapse SQL account.
Azure DW Schema Name
Name of the schema in Microsoft Azure Synapse SQL.
Azure Storage Type
Type of Azure storage to stage the files.
Select one of the following storage types:
  • - Azure Blob. Uses Microsoft Azure Blob Storage to stage the files.
  • - ADLS Gen2. Uses Microsoft Azure Data Lake Storage Gen2 to stage the files.
Default is Azure Blob.
Authentication Type
Authentication type to connect to Azure storage to stage the files.
Select one of the following options:
  • - Shared Key Authentication. Uses the account name and account key to connect to Microsoft Azure Blob Storage or Microsoft Azure Data Lake Storage Gen2.
  • - Service Principal Authentication. Applies to Microsoft Azure Data Lake Storage Gen2. Uses the client ID, client secret, and tenant ID to connect to Microsoft Azure Data Lake Storage Gen2. To use Service Principal authentication, register an application in the Azure Active Directory, generate a client secret, and then assign the Storage Blob Contributor role to the application.
  • - Managed Identity Authentication. Applies to Microsoft Azure Data Lake Storage Gen2. Select to authenticate using identities that are assigned to applications in Azure to access Azure resources in Microsoft Azure Data Lake Storage Gen2.
Azure Blob Account Name
Applies to Shared Key Authentication for Microsoft Azure Blob Storage.
Name of the Microsoft Azure Blob Storage account to stage the files.
Azure Blob Account Key
Applies to Shared Key Authentication for Microsoft Azure Blob Storage.
The Microsoft Azure Blob Storage access key to stage the files.
Container Name
Applies to Microsoft Azure Blob Storage.
The name of the container in the Azure Blob Storage account.
ADLS Gen2 Storage Account Name
Applies to Shared Key Authentication and Service Principal Authentication for Microsoft Azure Data Lake Storage Gen2.
Name of the Microsoft Azure Data Lake Storage Gen2 account to stage the files.
ADLS Gen2 Account Key
Applies to Shared Key Authentication for Microsoft Azure Data Lake Storage Gen2.
The Microsoft Azure Data Lake Storage Gen2 access key to stage the files.
Client ID
Applies to Service Principal Authentication and Managed Identity Authentication for Microsoft Azure Data Lake Storage Gen2.
The client ID of your application.
To use service principal authentication, enter the application ID or client ID for your application registered in the Azure Active Directory.
To use managed identity authentication, enter the client ID for the user-assigned managed identity. If the managed identity is system-assigned, leave the field empty.
Client Secret
Applies to Service Principal Authentication for Microsoft Azure Data Lake Storage Gen2.
The client secret for your application.
Tenant ID
Applies to Service Principal Authentication for Microsoft Azure Data Lake Storage Gen2.
The directory ID or tenant ID for your application.
File System Name
Applies to Microsoft Azure Data Lake Storage Gen2.
The name of the file system in the Microsoft Azure Data Lake Storage Gen2 account.
Blob End-point
Type of Microsoft Azure endpoints.
Select one of the following endpoints:
  • - core.windows.net. Connects to Azure endpoints.
  • - core.usgovcloudapi.net. Connects to US Government Microsoft Azure Synapse SQL endpoints.
  • - core.chinacloudapi.cn. Connects to Microsoft Azure Synapse SQL endpoints in the China region.
Default is core.windows.net.
VNet Rule
Enable to connect to a Microsoft Azure Synapse SQL endpoint residing in a virtual network (VNet).

Read properties

The following table describes the advanced source properties that you can configure in the Python code to read from Microsoft Azure SQL Data Warehouse:
Property
Description
Azure Blob Container Name
Required if you select Azure Blob storage in the connection properties.
The name of the container in Microsoft Azure Blob Storage.
The container name must not contain special characters.
ADLS FileSystem Name
Required if you select ADLS Gen2 storage in the connection properties.
The name of the file system in Microsoft Azure Data Lake Storage Gen2.
The file system name must not contain special characters.
You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path.
Schema Name Override
Overrides the schema specified in the connection.
Table Name Override
Overrides the table name of the imported source table.
Staging File Format
Type of file format to use when you stage the files.
Select one of the following formats:
  • - Delimited Text
  • - Parquet
Field Delimiter
Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal.
Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters:
00x0, 0x0, 0x0A , 0x1B, 0x0D, and 0x1F
Number of Concurrent Connections to Blob Store
Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data. Configure Blob Part Size to partition a large-size blob into smaller parts.
Default is 4. Maximum is 10.
Blob Part Size
Partitions a blob into smaller parts each of specified part size. When reading a large-size blob, consider partitioning the blob into smaller parts and configure concurrent connections to spawn required number of threads to process data in parallel.
Default is 8 MB.
Pre-SQL
Pre-SQL command that must be run before reading data from the source.
Post-SQL
Post-SQL command that must be run after reading data from the source.
SQL Override
When you read data from an object, you can configure SQL overrides and define constraints.
On Pre-Post SQL Error
Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors.
Select one of the following options:
  • - Continue. The task continues regardless of errors.
  • - Stop. The task stops when errors occur while executing pre-SQL or post-SQL commands.
Quote Character
Specifies the quote character to skip when you read data from Microsoft Azure Synapse SQL.
The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value.
Default is 0x1f.
Interim Directory
Optional. Path to the staging directory in the Secure Agent machine.
Specify the staging directory where you want to stage the files when you read data. Ensure that the directory has sufficient space and you have write permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory when you use the Hosted Agent.
Tracing Level
Sets the amount of detail that appears in the log file. You can choose terse, normal, verbose initialization, or verbose data. Default is normal.

Write properties

The following table describes the advanced target properties that you can configure in the Python code to write to Microsoft Azure SQL Data Warehouse:
Advanced Property
Description
Azure Blob Container Name
Required if you select Azure Blob storage in the connection properties.
The name of the container in Microsoft Azure Blob Storage.
The container name cannot contain special characters.
ADLS FileSystem Name
Required if you select ADLS Gen2 storage in the connection properties.
The name of the file system in Microsoft Azure Data Lake Storage Gen2.
The file system name cannot contain special characters.
You can also specify the path of the directory under the file system. Use only a forward slash to specify the directory path.
Copy Method
The method to load data from the staging location to Microsoft Azure Synapse SQL. Select one of the following options:
  • - Polybase
  • - Copy Command
Default is Polybase.
Copy Command Options
Options for the copy command in key=value format.
Specify each option on a new line.
For more information on copy command options, see the topic Copy Commands in the Cloud Data Integration documentation.
Schema Name Override
Overrides the schema specified in the connection.
Table Name Override
Overrides the table name of the imported Microsoft Azure Synapse SQL Data Warehouse target table.
Field Delimiter
Character used to separate fields in the file. Default is 0x1e. You can specify 'TAB' or 0-256 single-char printable and non-printable ASCII characters. Non-printable characters must be specified in hexadecimal.
Note: Multi-char ASCII characters except TAB are not applicable. You cannot use the following non-printable characters:
00x0, 0x0, 0x0A , 0x1B, 0x0D, and 0x1F
Number of Concurrent Connections to Blob Storage
Number of concurrent connections to extract data from the Microsoft Azure Blob Storage. When reading a large-size blob, you can spawn multiple threads to process data.
Default is 4. Maximum is 10.
Truncate Table
Truncates the target before inserting data to the target.
Pre-SQL
Pre-SQL command that must be run before reading data from the source.
Post-SQL
Post-SQL command that must be run after writing data to the target.
On Pre-Post SQL Error
Determines the behavior when a task that includes pre-SQL or post-SQL commands encounters errors. You can select any of the following options:
  • - Continue. The task continues regardless of errors.
  • - Stop. The task stops when errors occur while executing pre-SQL or post-SQL commands.
Treat Source Rows As
Select one of the following options:
  • - NONE
  • - INSERT
  • - DELETE
  • - UPDATE
  • - UPSERT
  • - DATA DRIVEN. Select to honor the flagged rows from the update strategy or any other custom transformation, or a CDC source.
  • Default is None.
Batch Size
Minimum number of rows in a batch. Enter a number greater than 0.
Default is 2000000.
Reject Threshold
Number of errors within a batch that causes a batch to fail. Enter a positive integer.
If the number of errors is equal to or greater than the property value, the Secure Agent rejects the entire batch to the error file and marks the session failed.
Note: When you do not set the reject threshold, the mapping fails when an error is encountered.
Quote Character
Specifies the quote character to skip when you write data to Microsoft Azure Synapse SQL.
The quote character that you specify must not exist in the source table. If it exists, enter a different quote character value.
Compression Format
Compresses the staging files in the .Gzip format. Default is None.
Update Override
Overrides the default update SQL statement that the Secure Agent generates.
Interim Directory
Optional. Path to the staging directory in the Secure Agent machine.
Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Synapse SQL. Ensure that the directory has sufficient space and you have write permissions to the directory.
Default staging directory is /tmp.
You cannot specify an interim directory when you use the Hosted Agent.