
Cache Size

Cache size determines how much memory the Data Integration Service allocates for each transformation cache at the start of a mapping run. You can configure a transformation cache size to use auto cache mode or to use a specific value.

Auto Cache Size

By default, a transformation cache size is set to Auto. The Data Integration Service automatically calculates the cache memory requirements at run time. You define the maximum amount of memory that the service can allocate.
The Data Integration Service uses the following guidelines to automatically allocate the memory:
Allocates more memory to transformations with higher processing times.
The Data Integration Service allocates more memory to transformations that typically have higher processing times. For example, the Data Integration Service allocates more memory to the Sorter transformation because the Sorter transformation typically takes longer to run.
Allocates more memory to the data cache than to the index cache.
Aggregator, Joiner, Lookup, and Rank transformations require an index cache and a data cache. When the Data Integration Service divides the memory allocated for the transformation across the index and data cache, it allocates more memory to the data cache.
Sorter transformations require a single cache. The service allocates all of the memory allocated for the transformation to the Sorter cache.
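The two guidelines above can be illustrated with a minimal sketch. The weights and the data cache share below are hypothetical values chosen only for illustration. This guide does not state the actual figures that the Data Integration Service uses, and the function name auto_allocate is likewise an assumption.

```python
# Hypothetical relative weights: the guide says only that transformations with
# higher processing times (for example, the Sorter) receive more memory.
RELATIVE_WEIGHT = {"Sorter": 3, "Aggregator": 2, "Joiner": 2, "Lookup": 2, "Rank": 1}

# Assumed share: the guide says only that the data cache gets more than the index cache.
DATA_CACHE_SHARE = 0.75

# Sorter transformations use a single cache instead of an index and data cache.
SINGLE_CACHE_TYPES = {"Sorter"}


def auto_allocate(transformations, max_bytes):
    """Split max_bytes across transformations, then across index and data caches.

    transformations is a list of (name, transformation_type) pairs.
    """
    total_weight = sum(RELATIVE_WEIGHT[kind] for _, kind in transformations)
    plan = {}
    for name, kind in transformations:
        # Each transformation receives a share proportional to its weight.
        share = max_bytes * RELATIVE_WEIGHT[kind] // total_weight
        if kind in SINGLE_CACHE_TYPES:
            plan[name] = {"cache": share}
        else:
            data = int(share * DATA_CACHE_SHARE)
            plan[name] = {"index": share - data, "data": data}
    return plan


plan = auto_allocate([("srt_orders", "Sorter"), ("agg_totals", "Aggregator")], 1000)
```

In this sketch the Sorter receives the larger share in a single cache, and the Aggregator share is divided so that the data cache is larger than the index cache.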

Maximum Memory for Auto Cache Size

In the Administrator tool, use the Maximum Memory Per Request property of each Data Integration Service module to define the maximum amount of memory that the Data Integration Service can allocate to transformation caches.
Each module runs different types of requests that have different memory requirements. For example, mapping and profile requests typically require more cache memory than SQL service or web service requests. You can configure the Maximum Memory Per Request property for the following Data Integration Service modules:
Mapping Service Module
Profiling Service Module
SQL Service Module
Web Service Module
Note: Mapping Service Module requests include mappings and mappings run from Mapping tasks within a workflow.
For the Profiling Service Module, Maximum Memory Per Request defines the maximum amount of memory that the Data Integration Service can allocate for each mapping run for a single profile request.
For the remaining modules, the behavior of Maximum Memory Per Request depends on how the Launch Job Options property and the Maximum Memory Size property are configured on the Data Integration Service.
The following table describes the behavior of Maximum Memory Per Request for the mapping, SQL service, and web service modules based on the Data Integration Service configuration:
Data Integration Service Configuration: Runs jobs in separate local or remote system processes, or Maximum Memory Size is 0 (default)
Maximum Memory Per Request Behavior: Maximum amount of memory, in bytes, that the Data Integration Service can allocate for all transformations that use auto cache mode in a single request.
The value that you define for Maximum Memory Per Request affects only transformations that use auto cache mode. The Data Integration Service allocates memory separately to transformations for which you configure a specific cache size. The total memory used by the request can exceed the value of Maximum Memory Per Request.
For example, Maximum Memory Per Request is set to 800 MB. A mapping has three transformations that require caching. You configure two transformations to use auto cache mode and configure the third transformation to use a total of 500 MB for the cache sizes. The Data Integration Service allocates a total of 1,300 MB of memory for all of the transformation caches.

Data Integration Service Configuration: Runs jobs in the Data Integration Service process, and Maximum Memory Size is greater than 0
Maximum Memory Per Request Behavior: Maximum amount of memory, in bytes, that the Data Integration Service can allocate for a single request.
The value that you define for the Maximum Memory Per Request property affects all transformations. The total memory used by the request cannot exceed the value of Maximum Memory Per Request.
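The two behaviors described above can be expressed as a short calculation. The following is a minimal sketch, not part of the product: the function name, its parameters, and the assumption that 1 MB equals 1,048,576 bytes are all illustrative.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def max_request_cache_bytes(max_memory_per_request, specific_cache_sizes,
                            runs_in_service_process, max_memory_size):
    """Return the upper bound, in bytes, on cache memory for a single request.

    specific_cache_sizes lists the sizes of transformations that are
    configured with a specific cache size instead of auto cache mode.
    """
    if runs_in_service_process and max_memory_size > 0:
        # The limit covers every transformation in the request.
        return max_memory_per_request
    # The limit covers only auto cache mode. Specific cache sizes are
    # allocated separately, so the total can exceed the limit.
    return max_memory_per_request + sum(specific_cache_sizes)


# The example from the text: 800 MB limit, one transformation fixed at 500 MB.
total = max_request_cache_bytes(800 * MB, [500 * MB], False, 0)
print(total // MB)  # 1300
```

With jobs running in the Data Integration Service process and a Maximum Memory Size greater than 0, the same call returns the 800 MB limit regardless of the specific cache sizes.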
When you increase the maximum amount of memory used for auto cache mode, you increase the maximum cache size that can be used for all requests to the module. You can increase the maximum amount of memory to ensure that no cache files are paged to disk. However, because this value applies to all requests, the Data Integration Service might allocate more memory than some requests need.

Specific Cache Size

You can configure a specific cache size for a transformation. The Data Integration Service allocates the specified amount of memory to the transformation cache at the start of the mapping run. Configure a specific value in bytes when you tune the cache size.
The first time that you configure a cache size, use auto cache mode. After you run the mapping, analyze transformation statistics in the mapping log to determine the cache sizes required to run the transformations in memory. When you configure the cache size to use the value specified in the mapping log, you can ensure that no allocated memory is wasted. However, the optimal cache size varies based on the size of the source data. Review the mapping logs after subsequent mapping runs to monitor changes to the cache size. If you configure a specific cache size for a reusable transformation, verify that the cache size is optimal for each use of the transformation in a mapping.
To define specific cache sizes, configure the cache size values in the transformation properties in the Developer tool.
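The tuning cycle described above, in which you run the mapping, read the required cache size from the mapping log, and review it after later runs, can be sketched as follows. The sketch does not parse a mapping log, because the log format is not described here. It assumes that you have already collected the required cache size, in bytes, from each run. Taking the largest observed value and the 10 percent review tolerance are illustrative policies, not product recommendations.

```python
def pick_cache_size(required_bytes_per_run):
    """Return the largest cache requirement observed across runs, in bytes."""
    if not required_bytes_per_run:
        raise ValueError("need at least one observed run")
    return max(required_bytes_per_run)


def needs_review(configured_bytes, latest_required_bytes, tolerance=0.10):
    """Return True when the latest requirement differs from the configured
    cache size by more than the given fraction of the configured size."""
    difference = abs(latest_required_bytes - configured_bytes)
    return difference > tolerance * configured_bytes


# Hypothetical requirements, in bytes, collected from three mapping runs.
observed = [20_000_000, 26_000_000, 24_000_000]
configured = pick_cache_size(observed)
```

A later run that requires substantially more or less memory than the configured value signals that the source data size has changed and that the cache size is worth revisiting.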