Sorter Cache

The Data Integration Service creates a cache in memory to run the Sorter transformation. The Data Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation. If the Data Integration Service requires more space than is available in the memory cache, it temporarily stores data in the Sorter transformation work directory.
If the configured cache size is too small to sort all of the data in memory, a warning appears in the session log stating that the Data Integration Service made multiple passes on the source data. The Data Integration Service makes multiple passes on the data when it has to page information to disk to complete the sort. The message specifies the amount of memory required for a single pass, in which the Data Integration Service reads the data once and performs the sort in memory without paging to disk. To optimize mapping performance, configure the cache size so that the Data Integration Service makes one pass on the data.
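The difference between a single in-memory pass and a multi-pass sort that pages to disk can be illustrated with a generic external merge sort. This is a minimal sketch of the general technique, not the Data Integration Service's implementation. The function names, the row-count cache limit, and the use of newline-free string rows are all assumptions made for illustration.

```python
import heapq
import os
import tempfile


def _write_run(sorted_rows, work_dir):
    """Write one sorted run to a temporary file in work_dir; return its path."""
    fd, path = tempfile.mkstemp(dir=work_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for row in sorted_rows:
            f.write(row + "\n")
    return path


def sort_with_cache(rows, cache_rows, work_dir):
    """Sort string rows using a cache that holds at most cache_rows rows.

    Hypothetical helper: rows must not contain newlines.
    Returns (sorted_rows, spilled_runs). spilled_runs == 0 means the sort
    completed in a single in-memory pass.
    """
    run_paths = []
    cache = []
    for row in rows:
        if len(cache) == cache_rows:
            # Cache is full: page a sorted run to the work directory.
            run_paths.append(_write_run(sorted(cache), work_dir))
            cache = []
        cache.append(row)

    if not run_paths:
        # Everything fit in the cache: one pass, no disk access.
        return sorted(cache), 0

    if cache:
        run_paths.append(_write_run(sorted(cache), work_dir))

    # Second pass: merge the sorted runs back from disk.
    files = [open(p, "r", encoding="utf-8") for p in run_paths]
    try:
        streams = [(line.rstrip("\n") for line in f) for f in files]
        merged = list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
    return merged, len(run_paths)
```

In this sketch, a cache large enough to hold every row avoids the write and re-read of each run, which is the extra work that the single-pass recommendation is meant to eliminate.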
If the amount of incoming data is greater than the Sorter cache size, the Data Integration Service temporarily stores data in the Sorter transformation work directory. When it stores data in the work directory, the Data Integration Service requires disk space of at least twice the amount of incoming data.
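The disk space rule above is simple arithmetic, and a short sketch can check it against a work directory before a run. The function names and the idea of estimating the incoming data size in bytes are assumptions for illustration; they are not part of the product.

```python
import shutil


def required_work_dir_bytes(incoming_bytes):
    """Minimum work directory space: at least twice the incoming data."""
    return 2 * incoming_bytes


def work_dir_has_space(incoming_bytes, work_dir):
    """Return True if work_dir has enough free space for a paged sort.

    Hypothetical pre-flight check. incoming_bytes is an estimate that
    the caller supplies.
    """
    free = shutil.disk_usage(work_dir).free
    return free >= required_work_dir_bytes(incoming_bytes)


# Example: 5 GB of incoming data needs at least 10 GB in the work directory.
five_gb = 5 * 1024**3
print(required_work_dir_bytes(five_gb) // 1024**3, "GB required")
```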
For best performance, configure the Sorter cache size with a value less than or equal to the amount of available physical memory on the machine that runs the mapping. To sort data using a Sorter transformation, allocate at least 16 MB (16,777,216 bytes) of physical memory. Sorter cache size is set to Auto by default.
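The two sizing constraints above, a 16 MB minimum and an upper bound of available physical memory, can be expressed as a small validation sketch. The function name and the choice to pass the available memory in as a parameter are assumptions for illustration. How you obtain the available physical memory depends on the platform and is left out here.

```python
MIN_SORTER_CACHE_BYTES = 16 * 1024 * 1024  # 16 MB = 16,777,216 bytes


def check_sorter_cache_size(cache_bytes, available_physical_bytes):
    """Return a list of problems with a proposed Sorter cache size.

    Hypothetical helper: an empty list means the value satisfies both
    guidelines described in the text.
    """
    problems = []
    if cache_bytes < MIN_SORTER_CACHE_BYTES:
        problems.append("cache size is below the 16 MB minimum")
    if cache_bytes > available_physical_bytes:
        problems.append("cache size exceeds available physical memory")
    return problems


# Example: a 64 MB cache on a machine with 8 GB of available memory.
print(check_sorter_cache_size(64 * 1024**2, 8 * 1024**3))
```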

Optimizing the Sorter Cache

The Sorter cache uses variable-length storage for binary and string data types that pass through the Sorter transformation.
Variable-length storage reduces the amount of data that the Data Integration Service stores in the Sorter cache and the disk space that it consumes on the Data Integration Service machine.
For example, suppose that you store customer data, and some customers have longer names than others. If the Data Integration Service used fixed length to store customer names, it might store 20 characters for each name. With variable length, the Data Integration Service might store names with an average length of 10 characters.
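The saving in the customer name example can be estimated with a short sketch. It counts characters only and ignores any per-value length overhead that a real variable-length format would add. The function names and sample names are hypothetical.

```python
def fixed_length_chars(names, width):
    """Characters stored when every name occupies a fixed width."""
    return len(names) * width


def variable_length_chars(names):
    """Characters stored when each name occupies only its own length."""
    return sum(len(name) for name in names)


# Hypothetical sample data for illustration only.
names = ["Ann Lee", "Maria Gonzalez", "Li Wei", "Christopher Okafor"]
print("fixed:   ", fixed_length_chars(names, 20))
print("variable:", variable_length_chars(names))
```

The gap between the two totals grows with the number of rows and with the difference between the fixed width and the average value length.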