Cache Size Optimization
For optimal mapping performance, configure the cache sizes so that the Data Integration Service can run the complete transformation in memory.
To configure optimal cache sizes, perform the following tasks:
1. Set the tracing level to verbose initialization.
2. Run the mapping in auto cache mode.
3. Analyze caching performance in the mapping log.
4. Configure specific values for the cache sizes.
Step 1. Set the Tracing Level to Verbose Initialization
In the Developer tool, set the tracing level to verbose initialization to enable the Data Integration Service to write transformation statistics to the mapping log. The transformation statistics list the cache sizes required for optimal performance. By default, the tracing level is set to normal.
Set the tracing level to verbose initialization in one of the following ways:
- Modify the advanced properties for each transformation that uses a cache.
- Modify the default mapping configuration properties if you plan to run the mapping for the first time from the Developer tool. For more information, see the Informatica Developer Tool Guide.
- Modify the advanced properties for an application that contains the mapping if you plan to run the deployed mapping for the first time from the command line. For more information, see the Informatica Developer Tool Guide.
Step 2. Run the Mapping in Auto Cache Mode
The first time that you run the mapping, use auto cache mode for the transformation cache sizes.
You can run the mapping from the Developer tool. Alternatively, add the mapping to an application and deploy the application to the Data Integration Service so that you can run the mapping from the command line.
Step 3. Analyze Caching Performance
After you run the mapping in auto cache mode, analyze the transformation statistics in the mapping log to determine the cache sizes required for optimal mapping performance.
When an Aggregator, Joiner, Lookup, or Rank transformation pages to the disk, the mapping log specifies the index and data cache sizes required to run the transformation in memory. For example, you run an Aggregator transformation called AGG_TRANS. The mapping log contains the following text:
CMN_1791, The index cache size that would hold [1098] aggregate groups of input rows for [AGG_TRANS], in memory, is [286720] bytes
CMN_1790, The data cache size that would hold [1098] aggregate groups of input rows for [AGG_TRANS], in memory, is [1774368] bytes
The log shows that the index cache requires 286,720 bytes and the data cache requires 1,774,368 bytes to run the transformation in memory without paging to the disk.
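When a mapping log contains many of these messages, it can help to collect the recommended values with a script. The following Python sketch is not part of the product. It assumes that the messages follow the same wording as the two sample lines above, and the function name `recommended_cache_sizes` is hypothetical. Messages with a different layout are not matched.

```python
import re

# Pattern modeled only on the two sample messages shown above (assumption):
# other transformation types or product versions may word the message differently.
CACHE_MSG = re.compile(
    r"The (index|data) cache size that would hold \[(\d+)\] .*? for \[([^\]]+)\], "
    r"in memory, is \[(\d+)\] bytes"
)


def recommended_cache_sizes(log_text):
    """Return {transformation name: {"index": bytes, "data": bytes}}."""
    sizes = {}
    for kind, _row_groups, name, size in CACHE_MSG.findall(log_text):
        sizes.setdefault(name, {})[kind] = int(size)
    return sizes


# Sample text copied from the example log lines above.
sample_log = (
    "CMN_1791, The index cache size that would hold [1098] aggregate groups "
    "of input rows for [AGG_TRANS], in memory, is [286720] bytes\n"
    "CMN_1790, The data cache size that would hold [1098] aggregate groups "
    "of input rows for [AGG_TRANS], in memory, is [1774368] bytes\n"
)

print(recommended_cache_sizes(sample_log))
# {'AGG_TRANS': {'index': 286720, 'data': 1774368}}
```

The values are kept in bytes because the log reports the index and data cache sizes in bytes.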
When a Sorter transformation pages to the disk, the mapping log states that the Data Integration Service made multiple passes on the source data. The Data Integration Service makes multiple passes on the data when it has to page to the disk to complete the sort. The message specifies the cache size required for a single pass, in which the Data Integration Service reads the data once and performs the sort in memory without paging to the disk.
For example, you run a Sorter transformation called SRT_TRANS. The mapping log contains the following text:
SORT_40427, Sorter Transformation [SRT_TRANS] required 2-pass sort (1-pass temp I/O: 13126221824 bytes). You may try to set the cache size to 14128 MB or higher for 1-pass in-memory sort.
The log shows that the Sorter cache requires at least 14,128 MB so that the Data Integration Service makes one pass on the data.
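The Sorter message reports the recommended cache size in megabytes. The following Python sketch, which is not part of the product, extracts that recommendation and converts it to bytes. It assumes that the message follows the wording of the sample line above and that 1 MB equals 1,048,576 bytes. The function name `sorter_recommendations` is hypothetical.

```python
import re

# Pattern modeled only on the sample SORT_40427 message above (assumption).
SORT_MSG = re.compile(
    r"Sorter Transformation \[([^\]]+)\] required (\d+)-pass sort .*?"
    r"set the cache size to (\d+) MB or higher"
)

# Assumption: the log uses binary megabytes (1 MB = 1024 * 1024 bytes).
BYTES_PER_MB = 1024 * 1024


def sorter_recommendations(log_text):
    """Return {transformation name: passes made and recommended cache size}."""
    result = {}
    for name, passes, cache_mb in SORT_MSG.findall(log_text):
        result[name] = {
            "passes": int(passes),
            "cache_mb": int(cache_mb),
            "cache_bytes": int(cache_mb) * BYTES_PER_MB,
        }
    return result


# Sample text copied from the example log line above.
sample_sorter_log = (
    "SORT_40427, Sorter Transformation [SRT_TRANS] required 2-pass sort "
    "(1-pass temp I/O: 13126221824 bytes). You may try to set the cache size "
    "to 14128 MB or higher for 1-pass in-memory sort."
)

print(sorter_recommendations(sample_sorter_log))
# {'SRT_TRANS': {'passes': 2, 'cache_mb': 14128, 'cache_bytes': 14814281728}}
```

Under these assumptions, 14,128 MB corresponds to 14,814,281,728 bytes. Confirm the unit that the Sorter cache size property expects before you enter the value.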
Step 4. Configure Specific Cache Sizes
For optimal performance, configure the transformation cache sizes to use the values specified in the mapping log. Update the index and data cache size transformation properties in the Developer tool.
1. In the Developer tool, open the reusable or non-reusable transformation.
2. Locate the cache size properties depending on the following transformation types:
Transformation type | Location of the cache size properties |
---|---|
Reusable Aggregator, Joiner, Rank, or Sorter transformation | Click the Advanced view. |
Non-reusable Aggregator, Joiner, Rank, or Sorter transformation | Click the Advanced tab in the Properties view. |
Reusable Lookup transformation | Click the Run-time view. |
Non-reusable Lookup transformation | Click the Run-time tab in the Properties view. |
3. Enter, in bytes, the values that the mapping log recommended for the index and data cache sizes.
The following image shows a non-reusable Aggregator transformation that has specific values configured for the index and data cache sizes:
4. Click File > Save.