Hadoop Files V2 compression formats
You can read or write compressed complex files, specify compression formats for the target objects, and decompress the output files.
You can use compression formats such as Gzip and DEFLATE.
The following table describes the complex file compression formats:
Compression Options | Description |
---|---|
None or empty | The file is not compressed. |
Auto | The Data Integration Service detects the compression format of the file based on the file extension. |
DEFLATE | The DEFLATE compression format that uses a combination of the LZ77 algorithm and Huffman coding. |
Gzip | The GNU zip compression format that uses the DEFLATE algorithm. |
Bzip2 | The Bzip2 compression format that uses the Burrows–Wheeler algorithm. |
LZO | The Lempel-Ziv-Oberhumer (LZO) compression format that uses the LZO data-compression algorithm. |
Snappy | The Snappy compression format that uses the Snappy compression algorithm. |
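The Auto option relies on the file extension. As a rough illustration only, the following Python sketch maps extensions to the formats listed in the table. The extension list is an assumption based on common Hadoop codec defaults, not the documented mapping of the Data Integration Service, and `detect_compression` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Assumed extension-to-format mapping (common Hadoop codec defaults).
# The mapping that the Data Integration Service applies may differ.
EXTENSION_TO_FORMAT = {
    ".deflate": "DEFLATE",
    ".gz": "Gzip",
    ".bz2": "Bzip2",
    ".lzo": "LZO",
    ".snappy": "Snappy",
}


def detect_compression(file_name: str) -> str:
    """Return the format implied by the file extension, or "None" if unknown."""
    suffix = PurePosixPath(file_name).suffix.lower()
    return EXTENSION_TO_FORMAT.get(suffix, "None")
```

A file without a recognized extension falls through to "None", which corresponds to the uncompressed case in the table.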
Data compression for Hadoop Files V2 sources and targets
You can decompress data when you read data from Hadoop Files V2 and compress the data when you write data to Hadoop Files V2.
Configure the compression format in the Compression Format option under the advanced source and target properties.
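To make the compress-on-write and decompress-on-read behavior concrete, here is a minimal sketch that uses Python's standard `gzip` module. It illustrates the general idea only and is not how the connector is implemented. The sample payload is made up for the example.

```python
import gzip

# Made-up sample payload for illustration.
record = b'{"id": 1, "name": "example"}\n' * 100

# Write path: compress the payload before it is stored.
compressed = gzip.compress(record)

# Read path: decompress the stored bytes back to the original payload.
restored = gzip.decompress(compressed)

# A gzip stream starts with the magic bytes 1f 8b, followed by
# compression method 8, which denotes DEFLATE.
is_gzip = compressed[:2] == b"\x1f\x8b"
uses_deflate = compressed[2] == 8
```

The header check reflects the relationship between the formats: Gzip wraps a DEFLATE stream in its own header and trailer.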
The following table lists the compression formats that you can use for each file format:
File format | Compression format |
---|---|
Avro | None, Auto, DEFLATE, Snappy |
JSON | None, Auto |
Parquet | None, Auto, Gzip, LZO, Snappy |
Binary | None, Auto, Bzip2, DEFLATE, Gzip, LZO, Snappy |
Flat | None |
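The table can be expressed as a lookup so that a mapping configuration is checked before a job runs. The following Python sketch transcribes the table as-is. `SUPPORTED_COMPRESSION` and `is_supported` are hypothetical names, not part of any product API. Comparison is case-insensitive, and an empty value is treated as None, in line with the "None or empty" option.

```python
# Supported compression formats per file format, transcribed from the table.
SUPPORTED_COMPRESSION = {
    "avro": {"none", "auto", "deflate", "snappy"},
    "json": {"none", "auto"},
    "parquet": {"none", "auto", "gzip", "lzo", "snappy"},
    "binary": {"none", "auto", "bzip2", "deflate", "gzip", "lzo", "snappy"},
    "flat": {"none"},
}


def is_supported(file_format: str, compression: str) -> bool:
    """Return True if the compression format is listed for the file format."""
    allowed = SUPPORTED_COMPRESSION.get(file_format.strip().lower())
    if allowed is None:
        raise ValueError(f"Unknown file format: {file_format!r}")
    # An empty value means the same as None: the file is not compressed.
    normalized = compression.strip().lower() or "none"
    return normalized in allowed
```

For example, `is_supported("Parquet", "Gzip")` is true, while `is_supported("JSON", "Snappy")` is false because JSON accepts only None and Auto.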
Configuring LZO compression format
To write files in the LZO compression format, you must copy the LZO native binaries to the machine where the Secure Agent runs.
Perform the following steps to configure the Secure Agent for LZO compression:
1. Copy the LZO native binaries from the cluster to one of the following directories on the machine where the Secure Agent runs:
   - <agent-root>/downloads/package-<distribution>/package/<distribution name>/lib/native
2. In the Control Panel window, click System and Security.
3. In the System and Security window, click Advanced system settings.
4. On the Advanced tab, click Environment Variables.
   The Edit Environment Variables dialog box appears.
5. Click New to add a new environment variable.
   The New Environment Variables dialog box appears.
6. In the Name field, enter LD_LIBRARY_PATH.
7. Enter the following path in the Value field:
   <agent-root>/downloads/package-<distribution>/package/<distribution name>/infaLib/
8. Restart the Secure Agent.
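After the restart, it can help to confirm that the configuration took effect. The following Python sketch checks whether a directory appears in the variable's value and lists the files in the native library directory. The helper names are hypothetical, and the paths in the usage section are placeholders that you would replace with the real agent root and distribution.

```python
import os
from pathlib import Path


def path_list_contains(path_list: str, directory: str, sep: str = os.pathsep) -> bool:
    """Return True if directory is an entry in a separator-delimited path list."""
    target = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in path_list.split(sep) if p]
    return target in entries


def native_files(directory: str) -> list[str]:
    """List the file names in the native library directory, if it exists."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    # Placeholder locations (assumptions): substitute your own values.
    infa_lib = "/opt/agent/downloads/package-dist/package/dist/infaLib"
    native_dir = "/opt/agent/downloads/package-dist/package/dist/lib/native"
    value = os.environ.get("LD_LIBRARY_PATH", "")
    print("infaLib on LD_LIBRARY_PATH:", path_list_contains(value, infa_lib))
    print("native files:", native_files(native_dir))
```

Paths are normalized before comparison, so a trailing slash in the variable's value does not cause a false mismatch.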