Google Cloud Storage V2 source properties

You can use a Google Cloud Storage object as a source in a data loader task.
When you create a data loader task to read data from Google Cloud Storage, specify the Google Cloud Storage V2 connection and source properties.
The following table describes the Google Cloud Storage V2 source properties:
Property
Description
Connection
Name of the source connection.
Select a source connection or create a new connection.
Source Path
The bucket name or folder path that contains the source objects.

Rules and guidelines for primary key fields and watermark fields

When you read data from Google Cloud Storage, you can manually define the primary key fields and watermark fields. The default values are "Primary key fields not required" and "Watermark fields not required".

Formatting options

When you select the format of a Google Cloud Storage file, you can configure the formatting options.
The following table describes the formatting options for Avro, Parquet, JSON, and ORC files:
Property
Description
Formatting Type
The file format to read data from Google Cloud Storage.
You can select the following file format types:
  • None
  • Avro
  • Parquet
  • ORC
  • JSON
Default is None.
To read binary files, select None as the Formatting Type.
Schema Source
The schema of the source or target file.
You can select one of the following options to specify a schema:
  • Read from data file. Imports the schema from the file in Google Cloud Storage.
  • Import from schema file. Imports the schema from a schema definition file on the agent machine.
If you select an Avro, JSON, or Parquet format type and select the Read from data file option, you can't configure the delimiter, escape character, or qualifier options.
If you select an Avro, JSON, or Parquet format type and select the Import from schema file option, you can only upload a schema file in the Schema File property field. You can't configure the delimiter, escape character, or qualifier options.
Schema File
The schema definition file on the agent machine from which you want to upload the schema.
Data elements to sample
Doesn't apply to a data loader task.
Memory available to process data
Doesn't apply to a data loader task.
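As an illustration of the Import from schema file option, the following sketch builds a small Avro schema and serializes it as JSON. Avro schemas are defined in JSON, but this page does not state the exact schema file format or file extension that the connector expects, so treat that as an assumption. The record name Customer and its fields are hypothetical and are not taken from the product.

```python
import json

# Hypothetical Avro record schema, for illustration only.
# The record name and field names are assumptions, not product values.
customer_schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        # A nullable field: a union of null and string with a null default.
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

# Serialize the schema as JSON text. Saving this text to a file on the
# agent machine would give you a candidate schema definition file.
schema_text = json.dumps(customer_schema, indent=2)

# Round-trip the text to confirm that it is valid JSON.
parsed = json.loads(schema_text)
field_names = [field["name"] for field in parsed["fields"]]
```

Here, field_names lists the columns that a reader of this schema would expect, in the order declared.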
The following table describes the formatting options for delimited files:
Property
Description
Formatting Type
The file format to read data from Google Cloud Storage.
Select the Delimited file format type.
Schema Source
The schema of the source or target file.
You can select one of the following options to specify a schema:
  • Read from data file. Imports the schema from the file in Google Cloud Storage.
  • Import from schema file. Imports the schema from a schema definition file on the agent machine.
Schema File
The schema definition file on the agent machine from which you want to upload the schema.
Delimiter
Character used to separate columns of data. You can set the value to a comma, tab, colon, semicolon, or another character.
You can't set a tab as a delimiter directly in the Delimiter field. To set a tab as a delimiter, you must type the tab character in any text editor. Then, copy the tab character to the Delimiter field.
Escape Character
Character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string.
Qualifier
Quote character that defines the boundaries of text strings. For example, you can use a single quote or a double quote.
Use the text qualifier when the data contains the delimiter character.
Qualifier Mode
Doesn't apply to a data loader task.
Code Page
The code page to read data.
Google Cloud Storage V2 Connector supports the following code pages:
  • UTF-8. Select for Unicode and non-Unicode data.
  • MS Windows Latin 1. Select for ISO 8859-1 Western European data.
  • Shift-JIS. Select for double-byte character data.
  • ISO 8859-15 Latin 9 (Western European).
  • ISO 8859-2 Eastern European.
  • ISO 8859-3 Southeast European.
  • ISO 8859-5 Cyrillic.
  • ISO 8859-9 Latin 5 (Turkish).
  • IBM EBCDIC International Latin-1.
Header Line Number
The line number that you want to use as the header when you read data from Google Cloud Storage.
You can also read a file that doesn't have a header. Default is 1.
To read data from a file with no header, specify the value of the Header Line Number field as 0. To read data from a file with a header, set the value of the Header Line Number field to a value that is greater than or equal to one.
First Data Row
The line number from where you want to read data.
You must enter a value that is greater than or equal to one.
To read the header line as data, set the Header Line Number and First Data Row fields to the same value. Default is 1.
Target Header
Doesn't apply to a data loader task.
Distribution Column
Doesn't apply to a data loader task.
Maximum Rows to Preview
Doesn't apply to a data loader task.
Row Delimiter
Doesn't apply to a data loader task.
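
The delimited-file options described above can be illustrated with a short Python sketch. It is not the connector's implementation. It uses Python's standard csv module to show how a delimiter, qualifier, escape character, code page, header line number, and first data row might interact. The function read_delimited, the CODE_PAGES mapping to Python codec names, and the sample data are all assumptions made for illustration. The sketch also counts parsed rows rather than physical lines, which differs if a quoted value contains a line break.

```python
import csv
import io

# Assumed mapping from the listed code pages to Python codec names.
# This mapping is an illustration and is not taken from the product.
CODE_PAGES = {
    "UTF-8": "utf-8",
    "MS Windows Latin 1": "cp1252",
    "Shift-JIS": "shift_jis",
    "ISO 8859-15 Latin 9 (Western European)": "iso8859_15",
    "ISO 8859-2 Eastern European": "iso8859_2",
    "ISO 8859-3 Southeast European": "iso8859_3",
    "ISO 8859-5 Cyrillic": "iso8859_5",
    "ISO 8859-9 Latin 5 (Turkish)": "iso8859_9",
    "IBM EBCDIC International Latin-1": "cp500",
}


def read_delimited(raw, delimiter=",", qualifier='"', escape_char="\\",
                   code_page="UTF-8", header_line_number=1, first_data_row=1):
    """Return (header, data_rows) from delimited bytes (hypothetical helper)."""
    if header_line_number < 0 or first_data_row < 1:
        raise ValueError("header_line_number must be >= 0 and first_data_row >= 1")
    # Decode the raw bytes with the codec assumed for the chosen code page.
    text = raw.decode(CODE_PAGES[code_page])
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,     # separates columns
        quotechar=qualifier,     # bounds text that may contain the delimiter
        escapechar=escape_char,  # makes the next character literal
    )
    rows = list(reader)
    # A header line number of 0 means the file has no header.
    header = rows[header_line_number - 1] if header_line_number >= 1 else None
    # Row numbers are 1-based, so row N is at index N - 1.
    data = rows[first_data_row - 1:]
    return header, data


# Sample file: a qualified value and an escaped delimiter in an unquoted value.
sample = b'id,name,note\n1,"Smith, J",plain\n2,O\\,Brien,"say \\"hi\\""\n'

# Header on line 1, data starting on line 2.
header, data = read_delimited(sample, header_line_number=1, first_data_row=2)

# No header: every line, including the first, is returned as data.
no_header, all_rows = read_delimited(sample, header_line_number=0, first_data_row=1)
```

With the sample data, the qualified value "Smith, J" and the escaped value O\,Brien each stay in a single column even though they contain the delimiter.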