Microsoft Azure Data Lake Storage Gen2 source properties

You can use a Microsoft Azure Data Lake Storage Gen2 object as a source in a data loader task.
When you create a data loader task to read data from Microsoft Azure Data Lake Storage Gen2, specify the Microsoft Azure Data Lake Storage Gen2 source properties and formatting options.
The following table describes the Microsoft Azure Data Lake Storage Gen2 source properties:
Property
Description
Connection
Name of the source connection.
Select a source connection or create a new connection.
Source Path
The path to the directory that contains the source data.
Default is the file system name specified in the Microsoft Azure Data Lake Storage Gen2 connection.
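As a rough illustration of the Source Path default, the following sketch models the two source properties as a plain Python data class. The class name SourceProperties, its field names, and the file_system argument are hypothetical and aren't part of any product API. The only behavior taken from the table above is that an empty source path falls back to the file system name from the connection.

```python
from dataclasses import dataclass


@dataclass
class SourceProperties:
    """Hypothetical container for the two source properties in the table."""

    connection: str        # name of the source connection
    source_path: str = ""  # directory that contains the source data

    def effective_path(self, file_system: str) -> str:
        # Assumption: an empty Source Path resolves to the file system
        # name specified in the connection.
        return self.source_path or file_system


# Placeholder names, used only for illustration.
props = SourceProperties(connection="adls_gen2_conn")
print(props.effective_path("myfilesystem"))
```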

Rules and guidelines for primary key fields and watermark fields

When you read data from Microsoft Azure Data Lake Storage Gen2, you can manually define the primary key fields and watermark fields. The default values are Primary key fields not required and Watermark fields not required.

Formatting options

Select the format of the Microsoft Azure Data Lake Storage Gen2 file and configure the formatting options.
The following table describes the formatting options for Avro, Parquet, JSON, and ORC files:
Property
Description
Formatting Type
Specifies the format of the file to read from Microsoft Azure Data Lake Storage Gen2.
Select one of the following options:
  • None
  • Avro
  • Parquet
  • JSON
  • ORC
To read binary files, select None as the Formatting Type. Default is None.
Schema Source
The schema of the source file.
Select one of the following options to specify a schema:
  • Read from data file. Imports the schema from a file in Microsoft Azure Data Lake Storage Gen2.
  • Import from Schema File. Imports the schema from a schema definition file on your local machine.
Schema File
The schema definition file on the agent machine from which you want to upload the schema.
Data elements to sample
Doesn't apply to a data loader task.
Memory available to process data
Doesn't apply to a data loader task.
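To make the Import from Schema File option more concrete, the following sketch writes a small schema definition file and reads its field names back. It assumes that the schema definition file for an Avro source is a standard Avro schema in JSON. The record name, field names, file name customer.avsc, and both helper functions are made up for illustration, and the exact schema file format that the task expects isn't confirmed here.

```python
import json
import tempfile
from pathlib import Path
from typing import List

# A minimal Avro record schema. The record and field names are made up.
SCHEMA = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}


def write_schema_file(directory: Path) -> Path:
    """Write the schema as JSON to a hypothetical customer.avsc file."""
    path = directory / "customer.avsc"
    path.write_text(json.dumps(SCHEMA, indent=2), encoding="utf-8")
    return path


def field_names(schema_path: Path) -> List[str]:
    """Return the field names declared in an Avro record schema file."""
    loaded = json.loads(schema_path.read_text(encoding="utf-8"))
    return [field["name"] for field in loaded["fields"]]


with tempfile.TemporaryDirectory() as tmp:
    schema_file = write_schema_file(Path(tmp))
    print(field_names(schema_file))
```

Listing the field names before you upload the file is a quick way to check that the schema matches the columns you expect in the source data.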
The following table describes the formatting options for delimited files:
Property
Description
Formatting Type
Specifies the format of the file to read from Microsoft Azure Data Lake Storage Gen2.
Select Delimited.
Schema Source
The schema of the source file.
Select one of the following options to specify a schema:
  • Read from data file. Imports the schema from a file in Microsoft Azure Data Lake Storage Gen2.
  • Import from Schema File. Imports the schema from a schema definition file on the agent machine.
Schema File
The schema definition file on the agent machine from which you want to upload the schema.
Delimiter
The character used to separate columns of data. You can use a comma, tab, colon, semicolon, or another character.
You can't set a tab as a delimiter directly in the Delimiter field. To set a tab as a delimiter, type the tab character in any text editor. Then, copy the tab character to the Delimiter field.
EscapeChar
The character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string.
Qualifier
Quote character that defines the boundaries of data.
Enter a single quote or double quote.
Qualifier Mode
Doesn't apply to a data loader task.
Code Page
The code page to read or write data.
Select UTF-8. Ignore the other code pages.
Header Line Number
The line number that you want to use as the header when you read data from Microsoft Azure Data Lake Storage Gen2.
To read data from a file with no header, enter 0 in the Header Line Number field.
First Data Row
The line number from which you want to start reading the data.
Enter a value greater than or equal to one.
To read data from the header, the value of the Header Line Number and the First Data Row fields must be the same.
Default is 1.
Target Header
Doesn't apply to a data loader task.
Distribution Column
Doesn't apply to a data loader task.
Fixed Width File Format
Doesn't apply to a data loader task.
Retain Escape Character in Data
Doesn't apply to a data loader task.
Maximum Rows to Preview
Doesn't apply to a data loader task.
Row Delimiter
Doesn't apply to a data loader task.
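The following sketch shows how the Delimiter, EscapeChar, Qualifier, Header Line Number, and First Data Row options interact when a delimited file is parsed. It uses Python's standard csv module as a stand-in parser, so it approximates the behavior described in the table and isn't the product's own implementation. The function read_delimited, its parameter names, and the generated FIELD1, FIELD2 column names for header-less files are hypothetical. The sketch also assumes one record per line.

```python
import csv
from typing import Dict, List


def read_delimited(text: str, *, header_line: int, first_data_row: int,
                   delimiter: str = ",", qualifier: str = '"',
                   escape_char: str = "\\") -> List[Dict[str, str]]:
    """Parse delimited text using 1-based header and data line numbers."""
    if first_data_row < 1:
        raise ValueError("First Data Row must be greater than or equal to 1")

    lines = text.splitlines()

    def parse(line: str) -> List[str]:
        # Parse a single line so that line numbers match the file exactly.
        reader = csv.reader([line], delimiter=delimiter,
                            quotechar=qualifier, escapechar=escape_char)
        return next(reader)

    data = [parse(line) for line in lines[first_data_row - 1:]]

    if header_line == 0:
        # Assumption: with no header, generate placeholder column names.
        width = max((len(row) for row in data), default=0)
        names = [f"FIELD{i}" for i in range(1, width + 1)]
    else:
        names = parse(lines[header_line - 1])

    return [dict(zip(names, row)) for row in data]


SAMPLE = (
    'id,comment\n'
    '1,"a, quoted value"\n'      # qualifier protects the embedded comma
    '2,escaped\\, comma\n'       # escape character before a delimiter
    '3,"escaped \\" quote"\n'    # escape character before the qualifier
)

for row in read_delimited(SAMPLE, header_line=1, first_data_row=2):
    print(row)

# A tab delimiter is written as "\t" in code, so no copy and paste is needed.
print(read_delimited("a\tb\n1\t2\n", delimiter="\t",
                     header_line=1, first_data_row=2))
```

In this sketch, passing the same value for header_line and first_data_row returns the header line as the first data row, which is one reading of the rule that both fields must match to read data from the header.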