Data Integration Service Grid Overview

If your license includes grid, you can configure the Data Integration Service to run on a grid. A grid is an alias assigned to a group of nodes. When you run jobs on a Data Integration Service grid, you improve scalability and performance by distributing jobs to processes running on multiple nodes in the grid.
To configure a Data Integration Service to run on a grid, you create a grid object and assign nodes to the grid. Then, you assign the Data Integration Service to run on the grid.
When you enable a Data Integration Service that is assigned to a grid, a Data Integration Service process runs on each node in the grid that has the service role. If a service process shuts down unexpectedly, the Data Integration Service remains available as long as another service process runs on another node. Jobs can run on each node in the grid that has the compute role. The Data Integration Service balances the workload among the nodes based on the type of job and on how the grid is configured.
When the Data Integration Service runs on a grid, the service and compute components of the Data Integration Service can run on the same node or on different nodes, based on how you configure the grid and the node roles. Nodes in a Data Integration Service grid can have a combination of the service only role, the compute only role, and both the service and compute roles.
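The relationship between node roles, service processes, and availability can be illustrated with a minimal sketch. The Node class, the function names, and the node names below are hypothetical and exist only for illustration. They are not part of any Informatica API.

```python
from dataclasses import dataclass


# Hypothetical model for illustration only; not an Informatica API.
@dataclass(frozen=True)
class Node:
    name: str
    service_role: bool
    compute_role: bool


def service_process_nodes(grid):
    """Nodes that run a Data Integration Service process (service role)."""
    return [n.name for n in grid if n.service_role]


def job_nodes(grid):
    """Nodes on which jobs can run (compute role)."""
    return [n.name for n in grid if n.compute_role]


def service_available(grid, failed):
    """The service stays available while a service process runs on some node."""
    return any(n.service_role and n.name not in failed for n in grid)


# Example grid: one node per role combination.
grid = [
    Node("node1", service_role=True, compute_role=False),   # service only
    Node("node2", service_role=True, compute_role=True),    # service and compute
    Node("node3", service_role=False, compute_role=True),   # compute only
]
```

In this example grid, service processes run on node1 and node2, jobs can run on node2 and node3, and the service remains available if node1 fails because node2 still runs a service process.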

Grid Configuration by Job Type

A Data Integration Service that runs on a grid can run Data Transformation Manager (DTM) instances in the Data Integration Service process, in separate DTM processes on the local node, or in separate DTM processes on remote nodes.
Configure a Data Integration Service grid based on the following types of jobs that the service runs:
SQL data services and web services
When a Data Integration Service grid runs SQL queries and web service requests, configure the service to run jobs in the Data Integration Service process. All nodes in the grid must have both the service and compute roles. The Data Integration Service dispatches jobs to available nodes in a round-robin fashion.
SQL data service and web service jobs typically achieve better performance when the Data Integration Service runs jobs in the service process.
Mappings, profiles, and workflows that run in local mode
When a Data Integration Service grid runs mappings, profiles, and workflows, you can configure the service to run jobs in separate DTM processes on the local node. All nodes in the grid must have both the service and compute roles. The Data Integration Service dispatches jobs to available nodes in a round-robin fashion.
When the Data Integration Service runs jobs in separate local processes, stability increases because an unexpected interruption to one job does not affect all other jobs.
Mappings, profiles, and workflows that run in remote mode
When a Data Integration Service grid runs mappings, profiles, and workflows, you can configure the service to run jobs in separate DTM processes on remote nodes. The nodes in the grid can have different combinations of roles. The Data Integration Service designates one node with the compute role as the master compute node. The Service Manager on the master compute node communicates with the Resource Manager Service to dispatch jobs to an available worker compute node. The Resource Manager Service matches job requirements with resource availability to identify the best compute node to run the job.
When the Data Integration Service runs jobs in separate remote processes, stability increases because an unexpected interruption to one job does not affect all other jobs. In addition, you can better use the resources available on each node in the grid. When a node has the compute role only, the node does not have to run the service process. The machine uses all available processing power to run mappings.
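The role requirements and dispatch behavior of the three configurations can be sketched as follows. This is a conceptual model, not product code. The mode labels, the Node class, and the function names are hypothetical. The sketch also makes two assumptions that this guide does not state: a remote-mode grid needs at least one node with each role, and "best compute node" is approximated as the compute node with the most free memory that satisfies the job requirement. The actual Resource Manager Service logic is not described here.

```python
from dataclasses import dataclass
from itertools import cycle

# Hypothetical labels for the three configurations; not Informatica identifiers.
IN_SERVICE_PROCESS = "in_service_process"
SEPARATE_LOCAL = "separate_local_processes"
SEPARATE_REMOTE = "separate_remote_processes"


@dataclass
class Node:
    name: str
    service_role: bool
    compute_role: bool
    free_memory_mb: int = 0  # assumed stand-in for "resource availability"


def validate_roles(mode, nodes):
    """Return a list of role problems for the given configuration."""
    problems = []
    if mode in (IN_SERVICE_PROCESS, SEPARATE_LOCAL):
        # Both of these configurations require every node to have both roles.
        for n in nodes:
            if not (n.service_role and n.compute_role):
                problems.append(f"{n.name} needs both the service and compute roles")
    elif mode == SEPARATE_REMOTE:
        # Assumption: at least one node must hold each role.
        if not any(n.service_role for n in nodes):
            problems.append("no node has the service role")
        if not any(n.compute_role for n in nodes):
            problems.append("no node has the compute role")
    else:
        raise ValueError(f"unknown mode: {mode}")
    return problems


def round_robin(nodes):
    """Dispatch order for jobs run in the service process or in local processes."""
    return cycle(n.name for n in nodes)


def pick_compute_node(nodes, required_memory_mb):
    """Assumed matching rule: the compute node with the most free memory that fits."""
    candidates = [
        n for n in nodes
        if n.compute_role and n.free_memory_mb >= required_memory_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.free_memory_mb).name
```

For example, a grid with one service-only node and two compute-only nodes passes validate_roles for the remote configuration but fails it for the other two configurations, because those require every node to have both roles.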
Note: Ad hoc jobs, with the exception of profiles, can run in the Data Integration Service process or in separate DTM processes on the local node. Ad hoc jobs include mappings run from the Developer tool, as well as previews, scorecards, and drill-downs on profile results run from the Developer tool or the Analyst tool. If you configure a Data Integration Service grid to run jobs in separate remote processes, the service runs ad hoc jobs in separate local processes.
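The rule in the note can be expressed as a small decision function. The function name, the mode labels, and the job type strings are hypothetical. The sketch assumes that the profile exception means that profiles follow the configured mode, which is an interpretation of the note rather than a documented behavior.

```python
def effective_mode(configured_mode, ad_hoc, job_type):
    """Return the mode in which a job runs, given the service configuration.

    Hypothetical labels: "in_service_process", "separate_local_processes",
    "separate_remote_processes".
    """
    runs_locally = ad_hoc and job_type != "profile"
    if runs_locally and configured_mode == "separate_remote_processes":
        # Ad hoc jobs other than profiles fall back to separate local processes.
        return "separate_local_processes"
    return configured_mode
```

Under these assumptions, a preview submitted to a service that is configured for remote processes runs in a separate local process, while a deployed mapping on the same service runs in a remote process.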
By default, each Data Integration Service is configured to run jobs in separate local processes, and each node has both the service and compute roles.
If you run SQL queries or web service requests, and you also run other job types for which stability and scalability are important, create multiple Data Integration Services. Configure one Data Integration Service grid to run SQL queries and web service requests in the Data Integration Service process. Configure the other Data Integration Service grid to run mappings, profiles, and workflows in separate local processes or in separate remote processes.
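A two-service layout of this kind can be summarized as a simple lookup from job type to service. The service names, mode labels, and job type strings below are hypothetical placeholders, and the sketch does not represent actual service properties.

```python
# Hypothetical service names and mode labels, for illustration only.
SERVICES = {
    "DIS_Requests": {
        "mode": "in_service_process",
        "job_types": {"sql_query", "web_service_request"},
    },
    "DIS_Batch": {
        "mode": "separate_remote_processes",  # or "separate_local_processes"
        "job_types": {"mapping", "profile", "workflow"},
    },
}


def service_for(job_type):
    """Return the name of the service that is configured for the job type."""
    for name, config in SERVICES.items():
        if job_type in config["job_types"]:
            return name
    raise ValueError(f"no service configured for job type: {job_type}")
```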