Joiner Caches

When you run a mapping that uses a Joiner transformation, the Data Integration Service creates an index cache and a data cache in memory to run the transformation. If the Data Integration Service requires more space than is available in the memory cache, it stores overflow data in cache files.
During the run, the Data Integration Service reads rows from the master and detail sources concurrently and builds the index and data caches from the master rows. It then performs the join by matching the detail source data against the cached master data.
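The cache-then-probe flow described above can be illustrated with a minimal Python sketch. It is not the Data Integration Service's implementation: the function names, the `cust_id` key, and the sample rows are hypothetical, the join key is assumed to have the same field name in both sources, and concurrent reading and overflow to cache files are not modeled.

```python
from collections import defaultdict


def build_master_caches(master_rows, key_fields):
    """Build an index cache (join key -> row positions) and a data cache (rows)."""
    index_cache = defaultdict(list)
    data_cache = []
    for row in master_rows:
        key = tuple(row[f] for f in key_fields)
        index_cache[key].append(len(data_cache))  # remember where the row lives
        data_cache.append(row)
    return index_cache, data_cache


def join_detail_rows(detail_rows, key_fields, index_cache, data_cache):
    """Inner join: emit one merged row per matching master/detail pair."""
    for detail in detail_rows:
        key = tuple(detail[f] for f in key_fields)
        for pos in index_cache.get(key, []):
            yield {**data_cache[pos], **detail}


# Hypothetical sample data: customers are the master, orders are the detail.
master = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Lin"}]
detail = [
    {"cust_id": 1, "order_id": 10},
    {"cust_id": 1, "order_id": 11},
    {"cust_id": 3, "order_id": 12},  # no matching master row, so it is dropped
]

index_cache, data_cache = build_master_caches(master, ["cust_id"])
joined = list(join_detail_rows(detail, ["cust_id"], index_cache, data_cache))
```

In this sketch the index cache holds only keys and row positions, while the data cache holds the full master rows. That split is one plausible reading of how the two caches divide the work.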
The type of Joiner transformation determines the number of rows that the Data Integration Service stores in the cache.
The following table describes the information that the Data Integration Service stores in the caches for different types of Joiner transformations:
| Joiner Transformation Type | Index Cache | Data Cache |
| --- | --- | --- |
| Unsorted Input | Stores all master rows in the join condition with unique index keys. | Stores all master rows. |
| Sorted Input with Different Sources | Stores 100 master rows in the join condition with unique index keys. | Stores master rows that correspond to the rows stored in the index cache. If the master data contains multiple rows with the same key, the Data Integration Service stores more than 100 rows in the data cache. |
| Sorted Input with the Same Source | Stores all master or detail rows in the join condition with unique keys. Stores detail rows if the Data Integration Service processes the detail pipeline faster than the master pipeline. Otherwise, stores master rows. The number of rows it stores depends on the processing rates of the master and detail pipelines. If one pipeline processes its rows faster than the other, the Data Integration Service caches all rows that have already been processed. The service keeps the rows cached until the other pipeline finishes processing its rows. | Stores data for the rows stored in the index cache. If the index cache stores keys for the master pipeline, the data cache stores the data for the master pipeline. If the index cache stores keys for the detail pipeline, the data cache stores the data for the detail pipeline. |
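To illustrate why sorted input lets a join keep far fewer rows in cache than unsorted input, here is a hedged Python sketch of a generic sort-merge join. It is not the Data Integration Service's algorithm and does not model the 100-row index cache. The function name `sorted_merge_join`, the `k` key, and the sample rows are hypothetical, and both inputs are assumed to be sorted in ascending order on the join key.

```python
from itertools import groupby
from operator import itemgetter


def sorted_merge_join(master_rows, detail_rows, key):
    """Inner join of two inputs that are already sorted on `key`.

    Only the master rows for the current key are cached at any time.
    """
    get_key = itemgetter(key)
    master_groups = groupby(master_rows, key=get_key)
    detail_groups = groupby(detail_rows, key=get_key)
    m = next(master_groups, None)
    d = next(detail_groups, None)
    while m is not None and d is not None:
        m_key, m_rows = m
        d_key, d_rows = d
        if m_key < d_key:
            m = next(master_groups, None)  # master key has no detail match
        elif m_key > d_key:
            d = next(detail_groups, None)  # detail key has no master match
        else:
            cached = list(m_rows)  # cache just this key's master rows
            for detail in d_rows:
                for master in cached:
                    yield {**master, **detail}
            m = next(master_groups, None)
            d = next(detail_groups, None)


# Hypothetical sample data, sorted on "k".
master = [
    {"k": 1, "m": "a"},
    {"k": 2, "m": "b"},
    {"k": 2, "m": "c"},
    {"k": 4, "m": "d"},
]
detail = [{"k": 2, "d": "x"}, {"k": 3, "d": "y"}, {"k": 4, "d": "z"}]

merged = list(sorted_merge_join(master, detail, "k"))
```

Because both inputs arrive in key order, the sketch can discard a key's master rows as soon as the detail side moves past that key. A key with many duplicate master rows still enlarges the cached group, which mirrors the note above that the data cache can hold more rows than the index cache.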