Operation | Support |
---|---|
Read | Yes |
Write | Yes |
Property | Description |
---|---|
Connection Name | Name of the connection. Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + - Maximum length is 255 characters. |
Account Name | Microsoft Azure Data Lake Storage Gen2 account name or the service name. |
Authentication Type | Authentication type to access the Microsoft Azure Data Lake Storage Gen2 account. Select one of the following options: Service Principal Authentication, Shared Key Authentication, or Managed Identity Authentication. |
Client ID | Applies to Service Principal Authentication and Managed Identity Authentication. The client ID of your application. To use service principal authentication, specify the application ID or client ID for your application registered in the Azure Active Directory. To use managed identity authentication, specify the client ID for the user-assigned managed identity. If the permission is provided by system-assigned managed identity, leave the field empty. If there is no system-assigned identity but only a single user-assigned managed identity, you may also leave the field empty. |
Client Secret | Applies to Service Principal Authentication. The client secret key to complete the OAuth authentication in the Azure Active Directory. |
Tenant ID | Applies to Service Principal Authentication. The directory ID of the Azure Active Directory. |
Account Key | Applies to Shared Key Authentication. The account key for the Microsoft Azure Data Lake Storage Gen2 account. |
File System Name | The name of the file system in the Microsoft Azure Data Lake Storage Gen2 account. |
Directory Path | The path of an existing directory without the file system name. There is no default directory. |
Adls Gen2 End-point | The type of Microsoft Azure endpoint to connect to. Default is core.windows.net. |
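As a quick illustration of the connection name rules described above (alphanumeric characters, spaces, the special characters _ . + -, and a 255-character maximum), the following is a minimal validation sketch. The function name and helper are illustrative only, not part of any product API.

```python
import re

# Allowed characters per the documented connection name rules:
# letters, digits, spaces, and the special characters _ . + -
_NAME_PATTERN = re.compile(r"^[A-Za-z0-9 _.+-]+$")


def is_valid_connection_name(name: str) -> bool:
    """Return True if the name follows the documented rules.

    Illustrative helper: non-empty, at most 255 characters, and
    containing only the allowed character set.
    """
    return 0 < len(name) <= 255 and bool(_NAME_PATTERN.fullmatch(name))
```

For example, `is_valid_connection_name("ADLS Gen2 prod_1.2+a-b")` returns `True`, while a name containing `/` or one longer than 255 characters returns `False`.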
Advanced Source Property | Description |
---|---|
Concurrent Threads | Number of concurrent connections to extract data from Microsoft Azure Data Lake Storage Gen2. When you read a large file or object, you can spawn multiple threads to process data. Configure Block Size to divide a large file into smaller parts. Default is 4. Maximum is 10. |
Filesystem Name Override | Overrides the default file system name. |
Source Type | Type of source from which you want to read data. You can select one of the following source types: File or Directory. Default is File. |
Allow Wildcard Characters | Indicates whether you want to use wildcard characters for the directory source type. |
Directory Override | Microsoft Azure Data Lake Storage Gen2 directory that you use to read data. Default is the root directory. The directory path specified at run time overrides the path specified while creating the connection. You can specify an absolute or a relative directory path. Example of an absolute path: Dir1/Dir2. Example of a relative path: /Dir1/Dir2. When you use a relative path, the imported object path is added to the file path used during the metadata fetch at run time. Do not specify the root directory (/) to override the directory. |
File Name Override | Source object. Select the file from which you want to read data. The file specified at run time overrides the file specified in Object. |
Block Size | Applicable to the flat file format. Divides a large file into smaller parts of the specified block size. When you read a large file, divide the file into smaller parts and configure concurrent connections to spawn the required number of threads to process the data in parallel. Specify an integer value for the block size. Default is 8388608 bytes. |
Timeout Interval | Not applicable. |
Recursive Directory Read | Indicates whether you want to read objects stored in subdirectories in mappings. |
Incremental File Load | Not applicable. |
Compression Format | Reads compressed data from the source based on the format that you select. You cannot read compressed JSON files. You cannot preview data for a compressed flat file. |
Interim Directory | Optional. Applicable to flat files and JSON files. Path to the staging directory in the Secure Agent machine. Specify the staging directory where you want to stage the files when you read data from Microsoft Azure Data Lake Storage Gen2. Ensure that the directory has sufficient space and you have write permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory when you use the Hosted Agent. |
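The Concurrent Threads and Block Size properties above work together: a large file is divided into blocks of the configured size, and the configured threads process those blocks in parallel. A minimal sketch of that relationship, using the documented defaults (8388608-byte blocks, 4 threads); the function and its return shape are illustrative, not a product API.

```python
import math

# Documented defaults for the advanced source properties.
DEFAULT_BLOCK_SIZE = 8_388_608  # bytes
DEFAULT_CONCURRENT_THREADS = 4  # maximum is 10


def plan_read(file_size: int,
              block_size: int = DEFAULT_BLOCK_SIZE,
              concurrent_threads: int = DEFAULT_CONCURRENT_THREADS) -> dict:
    """Illustrative sketch: how a file splits into blocks and how many
    rounds the configured threads need to process every block."""
    blocks = max(1, math.ceil(file_size / block_size))
    passes = math.ceil(blocks / concurrent_threads)
    return {"blocks": blocks, "passes": passes}
```

For example, a file of exactly 100 blocks (`100 * 8_388_608` bytes) yields `{"blocks": 100, "passes": 25}` with the default 4 threads, which is why raising Concurrent Threads (up to the maximum of 10) speeds up reads of large files.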
Advanced Target Property | Description |
---|---|
Concurrent Threads | Number of concurrent connections to write data to Microsoft Azure Data Lake Storage Gen2. When you write a large file, you can spawn multiple threads to process data. Configure Block Size to divide a large file into smaller parts. Default is 4. Maximum is 10. |
Filesystem Name Override | Overrides the default file system name. |
Directory Override | Microsoft Azure Data Lake Storage Gen2 directory that you use to write data. Default is the root directory. The Secure Agent creates the directory if it does not exist. The directory path specified at run time overrides the path specified while creating the connection. You can specify an absolute or a relative directory path. Example of an absolute path: Dir1/Dir2. Example of a relative path: /Dir1/Dir2. When you use a relative path, the imported object path is added to the file path used during the metadata fetch at run time. Do not specify the root directory (/) to override the directory. |
File Name Override | Target object. Select the file to which you want to write data. The file specified at run time overrides the file specified in Object. |
Write Strategy | Applicable to flat files in mappings. When you create a mapping in advanced mode, you can use a write strategy for both flat files and complex files. If the file exists in Microsoft Azure Data Lake Storage Gen2, you can choose to overwrite or append to the existing file. The maximum size of data that you can append is 450 MB. When you append data for mappings in advanced mode, the data is appended as a new part file in the existing target directory. |
Block Size | Applicable to the flat, Avro, and Parquet file formats. Divides a large file into smaller parts of the specified block size. When you write a large file, divide the file into smaller parts and configure concurrent connections to spawn the required number of threads to process the data in parallel. Specify an integer value for the block size. Default is 8388608 bytes. |
Compression Format | Compresses and writes data to the target based on the format that you select. You cannot write compressed JSON files. When the task runs, the .gz or .snappy file extensions do not appear in the target object name. |
Timeout Interval | Not applicable. |
Interim Directory | Optional. Applicable to flat files and JSON files. Path to the staging directory in the Secure Agent machine. Specify the staging directory where you want to stage the files when you write data to Microsoft Azure Data Lake Storage Gen2. Ensure that the directory has sufficient space and you have write permissions to the directory. Default staging directory is /tmp. You cannot specify an interim directory for mappings in advanced mode. You cannot specify an interim directory when you use the Hosted Agent. |
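The Write Strategy rules above (overwrite or append when the target file exists, with a 450 MB limit on appended data) can be sketched as a small decision helper. This is a minimal sketch of the documented behavior only; the function name, mode strings, and the "create" result for a missing target are illustrative assumptions, not a product API.

```python
# Documented limit on the amount of data you can append.
MAX_APPEND_BYTES = 450 * 1024 * 1024  # 450 MB


def validate_write(strategy: str, target_exists: bool, data_size: int) -> str:
    """Illustrative check of the documented write strategy rules.

    Returns the effective action: "overwrite", "append", or "create"
    (the last is an assumed fallback when the target does not exist).
    """
    if strategy not in ("overwrite", "append"):
        raise ValueError("strategy must be 'overwrite' or 'append'")
    if strategy == "append":
        if not target_exists:
            return "create"  # nothing to append to yet (assumed behavior)
        if data_size > MAX_APPEND_BYTES:
            raise ValueError("data to append exceeds the 450 MB limit")
    return strategy
```

For example, appending 1 KB to an existing file returns `"append"`, while attempting to append more than 450 MB raises an error, matching the documented limit.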