
Hadoop Files V2 compression formats

You can read or write compressed complex files, specify compression formats for the target objects, and decompress the output files.
You can use compression formats such as Gzip and DEFLATE.
The following table describes the complex file compression formats:
| Compression option | Description |
|---|---|
| None or empty | The file is not compressed. |
| Auto | The Data Integration Service detects the compression format of the file based on the file extension. |
| DEFLATE | The DEFLATE compression format that uses a combination of the LZ77 algorithm and Huffman coding. |
| Gzip | The GNU zip compression format that uses the DEFLATE algorithm. |
| Bzip2 | The Bzip2 compression format that uses the Burrows–Wheeler algorithm. |
| LZO | The Lempel-Ziv-Oberhumer (LZO) compression format that uses the LZO data-compression algorithm. |
| Snappy | The Snappy compression format that uses the Snappy compression algorithm. |
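As a rough illustration of how the Auto option could resolve a format from a file extension, consider the following minimal sketch. The extension mapping is an assumption based on the extensions conventionally associated with Hadoop codecs, and `detect_compression` is a hypothetical helper, not part of the connector. The connector's actual detection rules are not described here.

```python
import os

# Assumed mapping of file extensions to compression options. These are the
# extensions conventionally associated with Hadoop codecs; the connector's
# real detection rules may differ.
EXTENSION_TO_FORMAT = {
    ".gz": "Gzip",
    ".deflate": "DEFLATE",
    ".bz2": "Bzip2",
    ".lzo": "LZO",
    ".snappy": "Snappy",
}


def detect_compression(file_name: str) -> str:
    """Return the compression option implied by the file extension.

    Hypothetical helper: falls back to "None" when the extension is unknown.
    """
    # splitext returns only the last extension, so "data.tar.gz" yields ".gz".
    _, extension = os.path.splitext(file_name.lower())
    return EXTENSION_TO_FORMAT.get(extension, "None")
```

For example, under this assumed mapping a file named `part-0001.gz` would resolve to Gzip, while a file with an unrecognized extension would be treated as uncompressed.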

Data compression for Hadoop Files V2 sources and targets

You can decompress data when you read data from Hadoop Files V2 and compress the data when you write data to Hadoop Files V2.
Configure the compression format in the Compression Format option under the advanced source and target properties.
The following table lists the compression formats that you can use with each file format:
| File format | Compression format |
|---|---|
| Avro | None, Auto, DEFLATE, Snappy |
| JSON | None, Auto |
| Parquet | None, Auto, Gzip, LZO, Snappy |
| Binary | None, Auto, Bzip2, DEFLATE, Gzip, LZO, Snappy |
| Flat | None |
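The compatibility rules above can be captured in a small lookup, which may be handy for checking a mapping design before you configure it. This is only a sketch: `SUPPORTED_COMPRESSION` and `is_supported` are hypothetical names, and the data is copied from the table rather than taken from any connector API.

```python
# Compression formats allowed per file format, transcribed from the table.
SUPPORTED_COMPRESSION = {
    "Avro": {"None", "Auto", "DEFLATE", "Snappy"},
    "JSON": {"None", "Auto"},
    "Parquet": {"None", "Auto", "Gzip", "LZO", "Snappy"},
    "Binary": {"None", "Auto", "Bzip2", "DEFLATE", "Gzip", "LZO", "Snappy"},
    "Flat": {"None"},
}


def is_supported(file_format: str, compression: str) -> bool:
    """Return True if the table lists the compression for the file format.

    Matching is case-insensitive so that "Deflate" and "DEFLATE" are
    treated as the same option. Unknown file formats return False.
    """
    allowed = {
        name.lower(): {option.lower() for option in options}
        for name, options in SUPPORTED_COMPRESSION.items()
    }
    return compression.lower() in allowed.get(file_format.lower(), set())
```

With this lookup, `is_supported("Parquet", "Gzip")` evaluates to true, whereas `is_supported("Flat", "Gzip")` evaluates to false because flat files accept only None.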

Configuring LZO compression format

To write files in the LZO compression format, you must first copy the LZO native binaries from the cluster to the machine where the Secure Agent runs.
Perform the following steps to configure the Secure Agent for LZO compression:
  1. Copy the LZO native binaries from the cluster to one of the following directories on the machine on which the Secure Agent runs:
  2. In the Control Panel window, click System and Security.
  3. In the System and Security window, click Advanced system settings.
  4. On the Advanced tab, click Environment Variables.
     The Edit Environment Variables dialog box appears.
  5. Click New to add a new environment variable.
     The New Environment Variables dialog box appears.
  6. Enter LD_LIBRARY_PATH in the Name field.
  7. Enter the following path in the Value field:
     <agent-root>/downloads/package-<distribution>/package/<distribution name>/infaLib/
  8. Restart the Secure Agent.
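After the restart, you may want to confirm that the environment variable actually includes the infaLib directory. The following is a minimal sketch of such a check, assuming the variable is named `LD_LIBRARY_PATH`. `library_path_contains` is a hypothetical helper, and the directory you pass to it must be your own resolved infaLib path, because the placeholders in the path above are installation-specific.

```python
import os


def library_path_contains(directory, env=None, var_name="LD_LIBRARY_PATH"):
    """Return True if `directory` is one of the entries in the variable.

    Hypothetical helper: `env` defaults to the current process environment,
    and `var_name` is an assumption about how the variable is spelled.
    """
    env = os.environ if env is None else env
    # Entries are separated by ":" on Linux and ";" on Windows.
    entries = env.get(var_name, "").split(os.pathsep)

    def normalize(path):
        # Ignore trailing separators and platform-specific case differences.
        return os.path.normcase(os.path.normpath(path))

    wanted = normalize(directory)
    return any(normalize(entry) == wanted for entry in entries if entry)
```

Run the check from a process that inherits the same environment as the Secure Agent, since a variable set only in another session would not be visible to it.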