Partitioned Relational Sources

The Data Integration Service queries the DB2 for LUW or Oracle database system for partition information. If the source tables support database partitioning, the Data Integration Service can use multiple threads to read the partitioned data from the corresponding nodes in the database. The Data Integration Service generates an SQL query for each reader thread.

Number of database partitions is less than or equal to the maximum parallelism value.

The Data Integration Service uses one reader thread for each database partition. The Data Integration Service distributes one database partition to each reader thread.

For Oracle sources that use composite partitioning, the Data Integration Service uses one reader thread for each database subpartition. For example, if an Oracle source contains three partitions and two subpartitions for each partition, then the Data Integration Service uses six reader threads.

Number of database partitions is more than the maximum parallelism value.

The Data Integration Service uses the number of reader threads defined by the maximum parallelism value. The Data Integration Service distributes multiple database partitions to some of the reader threads. For example, a DB2 for LUW source has five database partitions, and the maximum parallelism value is three. The Data Integration Service uses three reader threads. The Data Integration Service distributes two database partitions to the first reader thread and the second reader thread. The service distributes one database partition to the third reader thread.

No database partitions.

The Data Integration Service uses one thread to read from the source. The Data Integration Service can use multiple threads for the remaining mapping pipeline stages.

Relational Connection Types for Partitioning

SQL Queries for Partitioned Relational Sources

If the database source has more database partitions than the maximum parallelism value, the Data Integration Service distributes the data across the reader threads. The Data Integration Service can generate SQL queries that read from multiple database partitions. When an Oracle source contains subpartitions, the Data Integration Service can generate SQL queries that read from multiple database subpartitions.

DB2 for LUW or Oracle Source Example

The maximum parallelism value is three, and the relational source has five database partitions. When the Data Integration Service runs SQL queries against the database partitions, the first and second reader threads receive data from two database partitions. The third reader thread receives data from one database partition. In this example, the simply query in the Read transformation does not have the select distinct option enabled.

Oracle Source with Subpartitions Example

An Oracle source has five partitions, 1–5, and two subpartitions, a and b, in each partition. The maximum parallelism value is three. The first reader thread receives data from four database subpartitions. The second and third reader threads receive data from three database subpartitions. In this example, the simple query in the Read transformation does not have the select distinct option enabled.

SELECT <column list> FROM <table name> SUBPARTITION <database_subpartition1_a name> UNION ALL
SELECT <column list> FROM <table name> SUBPARTITION <database_subpartition1_b name> UNION ALL
SELECT <column list> FROM <table name> SUBPARTITION <database_subpartition4_a name> UNION ALL
SELECT <column list> FROM <table name> SUBPARTITION <database_subpartition4_b name> UNION ALL

Partitioned Relational Sources

Relational Connection Types for Partitioning

SQL Queries for Partitioned Relational Sources

DB2 for LUW or Oracle Source Example

Oracle Source with Subpartitions Example

Rules and Guidelines for Relational Source Partitions