Microsoft Azure Data Lake Storage Gen2 connection properties
Create a Microsoft Azure Data Lake Storage Gen2 connection to securely read data from or write data to Microsoft Azure Data Lake Storage Gen2.
Prepare for authentication
You can configure Shared Key, Managed Identity, and Service Principal authentication types to access Microsoft Azure Data Lake Storage Gen2. Before you configure the authentication, you need to set up your environment and keep the authentication details handy.
Create storage account and configure access
To access Microsoft Azure Data Lake Storage Gen2, follow these steps to set up your environment:
1. Set up a storage account to use with Microsoft Azure Data Lake Storage Gen2 and create a blob container in the storage account. You can use role-based access control or access control lists to authorize users to access the resources in the storage account.
2. Register the application in Azure Active Directory to authenticate users to access the Microsoft Azure Data Lake Storage Gen2 account. You can use role-based access control or access control lists to authorize the application.
3. Create an Azure Active Directory web application for service-to-service authentication with Microsoft Azure Data Lake Storage Gen2. Ensure that you have superuser privileges to access the folders or files created in the application.
Ensure you get all the required authentication details based on the authentication method you want to use in the connection:
Service principal authentication
You need the client ID, client secret, and tenant ID for your application registered in Azure Active Directory.
Shared key authentication
You need the account key for the Microsoft Azure Data Lake Storage Gen2 account.
Managed identity authentication
You need the client ID or application ID for your application registered in Azure Active Directory. Before you get the client ID or application ID, complete the prerequisites for managed identity authentication.
Managed identity authentication
Managed Identity authentication uses managed identities in Azure Active Directory to authenticate and authorize access to Azure resources securely.
Before you use managed identity authentication to connect to Microsoft Azure Data Lake Storage Gen2, complete the following prerequisites:
1. Create an Azure virtual machine.
To configure managed identity authentication in a Microsoft Azure Data Lake Storage Gen2 connection, select the Azure virtual machine on which you have installed the Secure Agent.
2. Install the Secure Agent on the Azure virtual machine.
3. Enable system-assigned identity or user-assigned identity for the Azure virtual machine.
If you enable system-assigned identity, assign the required role or permissions to the Azure virtual machine to run mappings and tasks. If you enable user-assigned identity, assign the required role or permissions to the user-assigned identity. For example, if you use role-based access control, assign the Storage Blob Data Contributor role. If you use access control lists, assign the read, write, and execute permissions. If you enable both identity types and do not specify the client ID, the system-assigned identity is used for authentication.
4. After you add or remove a managed identity, restart the Azure virtual machine.
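The identity selection rules described above can be summarized in a short sketch. This is only an illustration of the documented behavior, not the Secure Agent's actual implementation. The function name `resolve_identity` and its parameters are hypothetical, and the error raised when several user-assigned identities exist without a client ID is an assumption, since the documentation does not describe that case.

```python
from typing import Optional, Sequence


def resolve_identity(
    client_id: Optional[str],
    system_assigned: bool,
    user_assigned_ids: Sequence[str],
) -> str:
    """Return a label for the managed identity that would be used (hypothetical helper)."""
    if client_id:
        # An explicit client ID selects that user-assigned identity.
        if client_id not in user_assigned_ids:
            raise ValueError("client ID does not match an enabled user-assigned identity")
        return f"user-assigned:{client_id}"
    if system_assigned:
        # No client ID given: the system-assigned identity is used,
        # even when user-assigned identities are also enabled.
        return "system-assigned"
    if len(user_assigned_ids) == 1:
        # No system-assigned identity and exactly one user-assigned identity.
        return f"user-assigned:{user_assigned_ids[0]}"
    # Assumption: any other combination needs an explicit client ID.
    raise ValueError("specify the client ID of the user-assigned identity")
```

For example, `resolve_identity(None, True, ["id-a", "id-b"])` resolves to the system-assigned identity, which matches the rule for the case where both identity types are enabled and no client ID is specified.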
Connect to Microsoft Azure Data Lake Storage Gen2
Let's configure the Microsoft Azure Data Lake Storage Gen2 connection properties to connect to Microsoft Azure Data Lake Storage Gen2.
Before you begin
Before you get started, you'll need to get information from your Microsoft Azure Data Lake Storage Gen2 account based on the authentication type that you want to configure.
The following table describes the basic connection properties:
Property
Description
Connection Name
Name of the connection.
Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -
Maximum length is 255 characters.
Description
Description of the connection. Maximum length is 4000 characters.
Type
Microsoft Azure Data Lake Storage Gen2
Runtime Environment
The name of the runtime environment where you want to run tasks.
You cannot run an application ingestion and replication task, database ingestion and replication task, or streaming ingestion and replication task on a Hosted Agent or serverless runtime environment.
Account Name
Microsoft Azure Data Lake Storage Gen2 account name or the service name.
File System Name
The name of the file system in the Microsoft Azure Data Lake Storage Gen2 account.
Directory Path
The path of a directory without the file system name.
You can select from the following directory structures:
- / for root directory
- /dir1
- dir1/dir2
Default is /.
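As a rough illustration of how the basic properties fit together, the sketch below checks a connection name against the documented character rules and combines the account name, file system name, and directory path into a Data Lake Storage Gen2 (`dfs`) URL. The helper names `is_valid_connection_name` and `dfs_url` are hypothetical. The sketch assumes that "alphanumeric" means ASCII letters and digits, and that the connector resolves paths the way the public `dfs` endpoint does.

```python
import re

# Letters, digits, spaces, and the special characters _ . + - (1 to 255 characters).
# Assumption: "alphanumeric" is limited to ASCII here.
_NAME_RE = re.compile(r"[A-Za-z0-9 _.+\-]{1,255}")


def is_valid_connection_name(name: str) -> bool:
    """Check the documented character set and the 255-character limit."""
    return _NAME_RE.fullmatch(name) is not None


def dfs_url(account: str, file_system: str, directory: str = "/",
            endpoint_suffix: str = "core.windows.net") -> str:
    """Join the connection properties into a dfs endpoint URL."""
    # The directory path excludes the file system name; "/", "/dir1",
    # and "dir1/dir2" are all accepted forms.
    path = directory.strip("/")
    base = f"https://{account}.dfs.{endpoint_suffix}/{file_system}"
    return f"{base}/{path}" if path else base
```

With these helpers, `dfs_url("myaccount", "myfs", "dir1/dir2")` yields `https://myaccount.dfs.core.windows.net/myfs/dir1/dir2`, where `myaccount` and `myfs` are placeholder values.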
Authentication types
You can select service principal authentication, shared key authentication, or managed identity authentication to access the Microsoft Azure Data Lake Storage Gen2 account.
Note: Data Ingestion and Replication supports managed identity authentication. However, Streaming Ingestion and Replication does not support shared key authentication or managed identity authentication.
Select your preferred authentication type and then configure the authentication-specific parameters.
Service principal authentication
Service principal authentication uses the client ID, client secret, and tenant ID to connect to Microsoft Azure Data Lake Storage Gen2.
The following table describes the basic connection properties for service principal authentication:
Property
Description
Client ID
The client ID of your application.
Specify the client ID for your application registered in Azure Active Directory.
Client Secret
The client secret key generated for the client ID.
Specify the client secret key to complete the OAuth authentication in Azure Active Directory.
Tenant ID
The directory ID of the Azure Active Directory.
Endpoint Suffix
The type of Microsoft Azure endpoints.
Select one of the following endpoints:
- core.windows.net. Connects to Azure endpoints.
- core.usgovcloudapi.net. Connects to US government Microsoft Azure Data Lake Storage Gen2 endpoints.
- core.chinacloudapi.cn. Connects to Microsoft Azure Data Lake Storage Gen2 endpoints in the China region.
Default is core.windows.net.
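To show how the client ID, client secret, and tenant ID are typically used together, here is a minimal sketch of an OAuth 2.0 client credentials token request against the Azure Active Directory token endpoint. It only builds the request and does not send it. The function name `build_token_request` is hypothetical, the sketch assumes the global Azure cloud (`login.microsoftonline.com`), and it is not a description of what the connector does internally.

```python
from typing import Tuple
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str) -> Tuple[str, str]:
    """Return the token endpoint URL and the form-encoded request body."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Scope for Azure Storage resources.
        "scope": "https://storage.azure.com/.default",
    })
    return url, body
```

The body would be sent as an HTTP POST with the `application/x-www-form-urlencoded` content type. The response contains an access token that is then presented as a bearer token to the storage endpoint.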
Shared key authentication
Shared key authentication uses the account key to connect to Microsoft Azure Data Lake Storage Gen2.
The following table describes the basic connection properties for shared key authentication:
Property
Description
Account Key
The account key for the Microsoft Azure Data Lake Storage Gen2 account.
Endpoint Suffix
The type of Microsoft Azure endpoints.
Select one of the following endpoints:
- core.windows.net. Connects to Azure endpoints.
- core.usgovcloudapi.net. Connects to US government Microsoft Azure Data Lake Storage Gen2 endpoints.
- core.chinacloudapi.cn. Connects to Microsoft Azure Data Lake Storage Gen2 endpoints in the China region.
Default is core.windows.net.
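For background on what the account key is used for, the sketch below shows the signing step of Azure Storage Shared Key authorization: the Base64-encoded account key is decoded and used as an HMAC-SHA256 key over a request-specific string-to-sign. The function name `shared_key_header` is hypothetical, and building the canonical string-to-sign is deliberately left out. This is an illustration of the general scheme, not the connector's own code.

```python
import base64
import hashlib
import hmac


def shared_key_header(account: str, account_key_b64: str,
                      string_to_sign: str) -> str:
    """Build an Authorization header value from a prepared string-to-sign."""
    # The account key is distributed as Base64 text; decode it to raw bytes.
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {account}:{signature}"
```

Because the key signs every request, anyone who holds it has full access to the storage account, which is why the account key should be handled as a secret.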
Managed identity authentication
Managed identity authentication uses identities that are assigned to applications in Azure to access resources in Microsoft Azure Data Lake Storage Gen2.
The following table describes the basic connection properties for managed identity authentication:
Property
Description
Client ID
The client ID of your application.
To use managed identity authentication, specify the client ID for the user-assigned managed identity.
Leave the field blank in the following scenarios:
- If the permission is provided by system-assigned managed identity.
- If there is no system-assigned identity but only a single user-assigned managed identity.
Endpoint Suffix
The type of Microsoft Azure endpoints.
Select one of the following endpoints:
- core.windows.net. Connects to Azure endpoints.
- core.usgovcloudapi.net. Connects to US government Microsoft Azure Data Lake Storage Gen2 endpoints.
- core.chinacloudapi.cn. Connects to Microsoft Azure Data Lake Storage Gen2 endpoints in the China region.
Default is core.windows.net.
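The role of the optional client ID can be illustrated with a sketch of a token request to the Azure Instance Metadata Service, which runs at 169.254.169.254 on an Azure virtual machine. The sketch only builds the URL and headers. The function name `build_imds_request` is hypothetical, and whether the Secure Agent issues exactly this request is an assumption.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_request(client_id: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for a managed identity token request."""
    params = {
        "api-version": "2018-02-01",
        "resource": "https://storage.azure.com/",
    }
    if client_id:
        # Only needed to pick a specific user-assigned identity.
        params["client_id"] = client_id
    # The metadata service rejects requests without this header.
    headers = {"Metadata": "true"}
    return f"{IMDS_TOKEN_URL}?{urlencode(params)}", headers
```

Leaving `client_id` unset corresponds to leaving the Client ID field blank in the connection.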
Proxy Server Settings
If your organization uses an outgoing proxy server to connect to the Internet, the Secure Agent connects to Informatica Intelligent Cloud Services through the proxy server.
Note: You cannot use a proxy server with managed identity authentication.
You can use one of the following types of proxy servers:
• Unauthenticated proxy - Requires only the host and port address for configuration.
• Authenticated proxy - Requires the host address, port address, user name, and password for configuration.
To configure proxy settings for the Secure Agent, use one of the following methods:
• Configure the Secure Agent through the Secure Agent Manager on Windows or a shell command on Linux.
- To bypass the proxy server for service principal authentication, append login.microsoftonline.com to the command.
- To bypass the proxy server for managed identity authentication, append 169.254.169.254 to the command.
For example, InfaAgent.NonProxyHost=localhost|127.*|[\:\:1]|<accountname>.blob.core.windows.net|<accountname>.dfs.core.windows.net|login.microsoftonline.com|169.254.169.254
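The bypass value above is a pipe-separated host list, so it can be assembled programmatically when you manage several accounts. The following is a minimal sketch. The function name `non_proxy_hosts` and its parameters are hypothetical, and the host patterns are taken from the example above rather than from any additional specification.

```python
def non_proxy_hosts(account: str, endpoint_suffix: str = "core.windows.net",
                    service_principal: bool = False,
                    managed_identity: bool = False) -> str:
    """Assemble an InfaAgent.NonProxyHost entry for one storage account."""
    hosts = [
        "localhost",
        "127.*",
        r"[\:\:1]",  # escaped IPv6 loopback, as written in the example
        f"{account}.blob.{endpoint_suffix}",
        f"{account}.dfs.{endpoint_suffix}",
    ]
    if service_principal:
        # Token requests for service principal authentication.
        hosts.append("login.microsoftonline.com")
    if managed_identity:
        # Azure metadata endpoint used for managed identity tokens.
        hosts.append("169.254.169.254")
    return "InfaAgent.NonProxyHost=" + "|".join(hosts)
```

Calling `non_proxy_hosts("myaccount", service_principal=True)` produces an entry that bypasses the proxy for the storage endpoints and the login host only.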