Hadoop Files V2 targets in mappings

To write data to Hadoop Files V2, configure a Hadoop Files V2 object as the Target transformation in a mapping. You can configure a Target transformation to represent a single Hadoop Files V2 target.
When you use a Hadoop Files V2 target object, select a Hadoop Files V2 object as the target.
The following table describes the Hadoop Files V2 target properties that you can configure in a Target transformation:
Target Property
Description
Connection
Name of the target connection. Select an existing connection, or create a connection parameter.
Target Type
Type of target object. Select Single Object or Parameter.
Object
Select the file to which you want to write data. Though selecting a target object is mandatory, the agent ignores this object. The agent processes the target object specified in File Directory and File Name in advanced target properties. You can select an existing object or create an object at runtime.
Create New at Runtime
Creates a complex file target object at runtime.
Enter a name for the target object and map the source fields that you want to use. By default, all source fields are mapped.
You can use parameters defined in a parameter file in the target name.
Format
File format of the target object.
You can select from the following file format types:
- None
- Flat file
- Avro
- Parquet
- JSON
Default is None. If you select None as the format type, the Secure Agent writes data in binary format. In advanced mode, None is not applicable.
Note: You can only use Avro, Parquet, and JSON file format types in Hadoop Files V2 Connector. You cannot write data to ORC file format types even though they are listed in the Formatting Options.
Parameter
The parameter for the target object. Select an existing parameter or create a new one.
Note: The Parameter property appears only if you select Parameter as the target type.
Operation
Select the target operation. You can use only the insert operation.
The following table describes the Hadoop Files V2 target advanced properties that you can configure in a Target transformation:
Advanced Property
Description
File Directory
Optional. The directory location of one or more output files. Maximum length is 255 characters. If you do not specify a directory location, the output files are created at the location specified in the connection.
If the directory is in HDFS, enter the path without the node URI. For example, /user/lib/testdir specifies the location of a directory in HDFS. The path must not contain more than 512 characters.
If the file or directory is in the local system, enter the fully qualified path. For example, /user/testdir specifies the location of a directory in the local system.
File Name
Optional. Renames the output file.
Enter the file name for your output file in the following format:
<file name>.<file extension>
If you enter a file name in this field, the Secure Agent generates the output file name in the following format:
<file name>_<partition index>_<hex of time in milliseconds>_<file sequence number>.<file extension>
For example, if you enter abc.txt, the Secure Agent generates the output file named abc_1_190fea05bdd_0001.txt.
Where,
- file name is abc
- partition index is 1
- hex of time in milliseconds is 190fea05bdd (the current time in milliseconds converted to a hexadecimal string)
- file sequence number is 0001
- file extension is .txt
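As a rough illustration only (not the Secure Agent's actual implementation, and details such as zero-padding of the sequence number are assumptions), the naming scheme above can be sketched as:

```python
import os

def generate_output_file_name(file_name, partition_index, time_ms, sequence_number):
    # Hypothetical sketch of the naming scheme described above:
    # <file name>_<partition index>_<hex of time in ms>_<file sequence number>.<file extension>
    base, ext = os.path.splitext(file_name)
    return f"{base}_{partition_index}_{time_ms:x}_{sequence_number:04d}{ext}"

print(generate_output_file_name("abc.txt", 1, 0x190fea05bdd, 1))
# abc_1_190fea05bdd_0001.txt
```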
Overwrite Target
Indicates whether the Secure Agent deletes existing target data before writing new data.
If you select this option, the Secure Agent deletes all files that match the file name specified in the File Name field.
For example, if you enter abc.txt in the File Name field with the Overwrite Target option enabled, the Secure Agent deletes all files named abc*.txt.
If you do not select this option, the Secure Agent creates a new file in the target and writes the data to that file.
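The matching behavior described above can be sketched as follows. This is a hedged illustration of the pattern, not the agent's code; the function name and file list are hypothetical.

```python
import fnmatch
import os

def matching_target_files(existing_files, file_name):
    # With File Name "abc.txt" and Overwrite Target enabled, files matching
    # "abc*.txt" are candidates for deletion, as described above.
    base, ext = os.path.splitext(file_name)
    pattern = f"{base}*{ext}"
    return [name for name in existing_files if fnmatch.fnmatch(name, pattern)]

print(matching_target_files(
    ["abc.txt", "abc_1_190fea05bdd_0001.txt", "report.txt"], "abc.txt"))
# ['abc.txt', 'abc_1_190fea05bdd_0001.txt']
```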
File Format
Specifies the file format of the complex file target. Select one of the following options:
- Binary
- Custom Input
- Sequence File Format
Default is Binary.
Output Format
The class name for files of the output format. If you select Output Format in the File Format field, you must specify the fully qualified class name implementing the OutputFormat interface.
Output Key Class
The class name for the output key. If you select Output Format in the File Format field, you must specify the fully qualified class name for the output key.
You can specify one of the following output key classes:
- BytesWritable
- Text
- LongWritable
- IntWritable
Note: Hadoop Files V2 generates the key in ascending order.
Output Value Class
The class name for the output value. If you select Output Format in the File Format field, you must specify the fully qualified class name for the output value.
You can use any custom writable class that Hadoop supports. Determine the output value class based on the type of data that you want to write.
Note: When you use custom output formats, the value part of the data that is streamed to the complex file data object write operation must be in a serialized form.
Compression Format
Compression format of the output files. Select one of the following options:
- None
- Auto
- DEFLATE
- gzip
- bzip2
- LZO
- Snappy
- Custom
Custom Compression Codec
Required if you use custom compression format. Specify the fully qualified class name implementing the CompressionCodec interface.
Sequence File Compression Type
Optional. The compression format for sequence files. Select one of the following options:
- None
- Record
- Block
streamRolloverSize
The maximum size that a file can reach before the Secure Agent rolls the data over to a new file.
The rollover size manages large data streams by breaking them into smaller, more manageable files.
streamRolloverTime
The time interval after which a new file is created to write the data, regardless of the file size.
This time interval helps to manage files by time and ensures data segmentation over fixed time periods.
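The interaction of the two rollover properties can be sketched as a simple check. This is an assumed simplification for illustration; the function name, units, and the "whichever comes first" behavior are inferred from the descriptions above rather than taken from the product's implementation.

```python
def should_roll_over(current_size_bytes, elapsed_seconds,
                     stream_rollover_size, stream_rollover_time):
    # A new output file is started when either the size threshold
    # (streamRolloverSize) or the time interval (streamRollovertime)
    # is reached, whichever comes first.
    return (current_size_bytes >= stream_rollover_size
            or elapsed_seconds >= stream_rollover_time)

print(should_roll_over(1024, 5, 512, 60))   # size threshold crossed
print(should_roll_over(100, 5, 512, 60))    # neither threshold crossed
```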
Forward Rejected Rows
Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the mapping task forwards rejected rows to the next transformation.

Writing to multiple target objects

When you import target objects, the Secure Agent appends a FilePath field to the imported target object. When you map the FilePath field in the target object to an incoming field, the Secure Agent creates the folder structure and the target files based on the FilePath field. For example:
Syntax:
<tgt_FilePath_folder>/<tgt_FilePath=incoming_value_folder>/part_file
Sample:
emp_tgt.parquet/emp_tgt.parquet=128000/part-0000-e9ca8-6af-efd43-455c-8709.c000.parquet
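The syntax and sample above can be sketched as a small path-building function. This is an illustration assumed from the documented syntax; the function name is hypothetical and the part-file name is taken from the sample rather than generated.

```python
def partitioned_target_path(target_object, filepath_value, part_file):
    # Folder structure the Secure Agent creates when the FilePath field is
    # mapped to an incoming field:
    # <tgt_FilePath_folder>/<tgt_FilePath=incoming_value_folder>/part_file
    return f"{target_object}/{target_object}={filepath_value}/{part_file}"

print(partitioned_target_path(
    "emp_tgt.parquet", "128000",
    "part-0000-e9ca8-6af-efd43-455c-8709.c000.parquet"))
# emp_tgt.parquet/emp_tgt.parquet=128000/part-0000-e9ca8-6af-efd43-455c-8709.c000.parquet
```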
The FilePath field is applicable to the following file formats:
Consider the following guidelines when using the target FilePath field in mappings: