Rank Transformation in the Hadoop Environment
How the Rank transformation is processed in the Hadoop environment depends on the engine that runs the transformation.
Rank Transformation Support on the Blaze Engine
The data cache for the Rank transformation is optimized to use variable length to store binary and string data types that pass through the Rank transformation. The optimization is enabled for record sizes up to 8 MB. If the record size is greater than 8 MB, variable length optimization is disabled.
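The size rule above can be sketched as a simple check. This is only an illustration: the function and constant names are hypothetical and are not part of the product, and the sketch assumes that 1 MB means 1,048,576 bytes.

```python
# Hypothetical illustration of the 8 MB rule; these names are not product APIs.
# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_VARIABLE_LENGTH_RECORD_BYTES = 8 * 1024 * 1024


def uses_variable_length(record_size_bytes: int) -> bool:
    """Return True when variable length optimization would apply.

    The optimization is enabled for record sizes up to and including
    8 MB, and disabled for anything larger.
    """
    return record_size_bytes <= MAX_VARIABLE_LENGTH_RECORD_BYTES
```

Under these assumptions, a record of exactly 8 MB still qualifies for the optimization, while a record one byte larger does not.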
When the data cache uses variable length to store data that passes through the Rank transformation, the Rank transformation is optimized to use sorted input, and a pass-through Sorter transformation is inserted before the Rank transformation in the run-time mapping.
To view the Sorter transformation, view the optimized mapping or view the execution plan in the Blaze validation environment.
During data cache optimization, the data cache and the index cache for the Rank transformation are set to Auto. The sorter cache for the Sorter transformation is set to the same size as the data cache for the Rank transformation. To configure the sorter cache, you must configure the size of the data cache for the Rank transformation.
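The relationship between the caches can be sketched as follows. The class and function names are hypothetical and exist only for illustration. The sketch models just the rule stated above: the sorter cache is never set directly and always takes the size of the Rank transformation data cache.

```python
from dataclasses import dataclass


@dataclass
class RankCacheSettings:
    """Hypothetical model of the Rank transformation cache settings."""
    # During data cache optimization, both caches are set to Auto.
    data_cache: str = "Auto"
    index_cache: str = "Auto"


def sorter_cache_size(rank: RankCacheSettings) -> str:
    """Return the sorter cache size for the inserted Sorter transformation.

    The sorter cache mirrors the Rank data cache, so the only way to
    change it is to change the Rank data cache size.
    """
    return rank.data_cache
```

For example, `sorter_cache_size(RankCacheSettings())` returns `"Auto"`, and changing `data_cache` on the settings object is the only way to change the result.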
Rank Transformation Support on the Spark Engine
Some processing rules for the Spark engine differ from the processing rules for the Data Integration Service.
Mapping Validation
Mapping validation fails in the following situations:
- Case sensitivity is disabled.
- The rank port is of binary data type.
Data Cache Optimization
You cannot optimize the data cache for the Rank transformation to store data using variable length.
Rank Transformation Support on the Hive Engine
Some processing rules for the Hive engine differ from the processing rules for the Data Integration Service.
Mapping Validation
Mapping validation fails in the following situations:
- Case sensitivity is disabled.
Data Cache Optimization
You cannot optimize the data cache for the Rank transformation to store data using variable length.
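The mapping validation rules for the Spark and Hive engines can be summarized in a short sketch. The function name, parameter names, and engine strings are hypothetical and do not correspond to any product API. The sketch returns no errors for the Blaze engine only because this topic lists no validation failures for it.

```python
from typing import List


def rank_validation_errors(engine: str, case_sensitive: bool,
                           rank_port_type: str) -> List[str]:
    """Return the validation failures described in this topic.

    engine: "blaze", "spark", or "hive" (hypothetical labels).
    case_sensitive: whether case sensitivity is enabled.
    rank_port_type: data type of the rank port, for example "binary".
    """
    errors: List[str] = []
    # Both the Spark and Hive engines reject disabled case sensitivity.
    if engine in ("spark", "hive") and not case_sensitive:
        errors.append("Case sensitivity is disabled.")
    # Only the Spark engine is listed as rejecting a binary rank port.
    if engine == "spark" and rank_port_type == "binary":
        errors.append("The rank port is of binary data type.")
    return errors
```

An empty list means that none of the failure conditions listed in this topic applies to the mapping.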