Run Jobs in Separate Processes
The Data Integration Service can run jobs in the Data Integration Service process or in separate DTM processes on local or remote nodes. To optimize service performance, configure the recommended option based on the types of jobs that the service runs.
When the Data Integration Service receives a request to run a job, the service creates a DTM instance to run the job. A DTM instance is a specific, logical representation of the execution Data Transformation Manager. You can configure the Data Integration Service to run DTM instances in the Data Integration Service process, in a separate DTM process on the local node, or in a separate DTM process on a remote node.
A DTM process is an operating system process started to run DTM instances. Multiple DTM instances can run within the Data Integration Service process or within the same DTM process.
The Launch Job Options property on the Data Integration Service determines where the service starts DTM instances. Configure the property based on whether the Data Integration Service runs on a single node or a grid and based on the types of jobs that the service runs.
Choose one of the following options for the Launch Job Options property:
- In the service process
Configure when you run SQL data service and web service jobs on a single node or on a grid where each node has both the service and compute roles.
SQL data service and web service jobs typically achieve better performance when the Data Integration Service runs jobs in the service process.
- In separate local processes
Configure when you run mapping, profile, and workflow jobs on a single node or on a grid where each node has both the service and compute roles.
Configure when the Data Integration Service uses operating system profiles.
When the Data Integration Service runs jobs in separate local processes, stability increases because an unexpected interruption to one job does not affect all other jobs.
- In separate remote processes
Configure when you run mapping, profile, and workflow jobs on a grid where nodes have a different combination of roles. If you choose this option when the Data Integration Service runs on a single node, then the service runs jobs in separate local processes.
When the Data Integration Service runs jobs in separate remote processes, stability increases because an unexpected interruption to one job does not affect all other jobs. In addition, you can better use the resources available on each node in the grid. When a node has the compute role only, the node does not have to run the service process. The machine uses all available processing power to run mappings.
Note: If you run multiple job types, create multiple Data Integration Services. Configure one Data Integration Service to run SQL data service and web service jobs in the Data Integration Service process. Configure the other Data Integration Service to run mappings, profiles, and workflows in separate local processes or in separate remote processes.
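The mapping from the Launch Job Options value to where DTM instances run, including the single-node fallback for the remote option, can be sketched as follows. This is an illustrative model only; the option names are placeholders, not Informatica configuration values.

```python
# Illustrative sketch (option names are placeholders, not Informatica
# settings) of where DTM instances run for each Launch Job Options value.
# Per the documentation, "in separate remote processes" on a single node
# falls back to running jobs in separate local processes.
def dtm_launch_target(option: str, runs_on_grid: bool) -> str:
    if option == "in_service_process":
        return "Data Integration Service process"
    if option == "in_separate_remote_processes" and runs_on_grid:
        return "separate DTM process on a remote node"
    # "in_separate_local_processes", or the remote option on a single node:
    return "separate DTM process on the local node"

print(dtm_launch_target("in_separate_remote_processes", runs_on_grid=False))
```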
DTM Process Pool Management
When the Data Integration Service runs jobs in separate local or remote processes, the Data Integration Service maintains a pool of reusable DTM processes.
The DTM process pool includes DTM processes that are running jobs and DTM processes that are idle. Each running DTM process in the pool is reserved for use by one of the following groups of related jobs:
- Jobs from the same deployed application
- Preview jobs
- Profiling jobs
- Mapping jobs run from the Developer tool
For example, if you run two jobs from the same deployed application, two DTM instances are created in the same DTM process. If you run a preview job, the DTM instance is created in a different DTM process.
When a DTM process finishes running a job, the process closes the DTM instance. When the DTM process finishes running all jobs, the DTM process is released to the pool as an idle DTM process. An idle DTM process is available to run any type of job.
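The pool behavior described above can be sketched as a small model. This is an assumed illustration of the documented rules, not Informatica internals: a running process is reserved for one group of related jobs, and an idle process can be claimed by any group.

```python
# Minimal sketch (not the Informatica API) of DTM process pool management:
# a running DTM process is reserved for one group of related jobs; when all
# of its jobs finish, it returns to the pool as idle and any group can
# claim it.
class DTMProcess:
    def __init__(self):
        self.group = None       # job group this process is reserved for
        self.instances = []     # DTM instances currently running

    @property
    def idle(self):
        return not self.instances

class DTMProcessPool:
    def __init__(self):
        self.processes = []

    def run_job(self, group, job):
        # Reuse the process already reserved for this group of related jobs.
        proc = next((p for p in self.processes if p.group == group), None)
        if proc is None:
            # Otherwise claim an idle process, or start a new one.
            proc = next((p for p in self.processes if p.idle), None)
            if proc is None:
                proc = DTMProcess()
                self.processes.append(proc)
            proc.group = group
        proc.instances.append(job)   # new DTM instance in this process
        return proc

    def finish_job(self, proc, job):
        proc.instances.remove(job)   # the process closes the DTM instance
        if proc.idle:
            proc.group = None        # released to the pool as idle

pool = DTMProcessPool()
p1 = pool.run_job("app_A", "job1")
p2 = pool.run_job("app_A", "job2")    # same deployed application
p3 = pool.run_job("preview", "job3")  # preview job
print(p1 is p2, p1 is p3)             # True False
```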
Rules and Guidelines when Jobs Run in Separate Processes
Consider the following rules and guidelines when you configure the Data Integration Service to run jobs in separate local or remote processes:
- You cannot use the Maximum Memory Size property for the Data Integration Service to limit the amount of memory that the service allocates to run jobs. If you set the maximum memory size, the Data Integration Service ignores it.
- If the Data Integration Service runs on UNIX, the hosts file on each node with the compute role and on each node with both the service and compute roles must contain a localhost entry. If the hosts file does not contain a localhost entry, jobs that run in separate processes fail. Windows does not require a localhost entry in the hosts file.
- If you configure connection pooling, each DTM process maintains its own connection pool library. All DTM instances running in the DTM process can use the connection pool library. The number of connection pool libraries depends on the number of running DTM processes.
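On UNIX nodes, the localhost requirement above can be verified before you enable separate processes. A minimal check, assuming the standard /etc/hosts location:

```shell
# check_localhost FILE — succeeds if FILE contains a standalone localhost
# entry. Jobs that run in separate processes fail without one.
check_localhost() {
  grep -qE '(^|[[:space:]])localhost([[:space:]]|$)' "$1"
}

# Example: verify the node's hosts file.
if check_localhost /etc/hosts; then
  echo "localhost entry present"
else
  echo "add an entry such as '127.0.0.1 localhost' to /etc/hosts"
fi
```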
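The connection pooling guideline above can be sketched as a simple model. This is an assumed illustration, not Informatica internals: each DTM process keeps its own connection pool library, shared by every DTM instance in that process.

```python
# Sketch (assumed model, not Informatica internals) of per-process
# connection pooling: one connection pool library per DTM process, shared
# by all DTM instances that run in that process.
class Connection:
    def __init__(self, name):
        self.name = name

class ConnectionPoolLibrary:
    def __init__(self):
        self._pool = {}

    def get(self, name):
        # DTM instances in the same process reuse pooled connections.
        return self._pool.setdefault(name, Connection(name))

class DTMProcess:
    def __init__(self):
        self.pool = ConnectionPoolLibrary()   # one library per process

proc_a, proc_b = DTMProcess(), DTMProcess()
# Two instances in the same DTM process share a pooled connection:
print(proc_a.pool.get("oracle_src") is proc_a.pool.get("oracle_src"))  # True
# Instances in different DTM processes use separate pool libraries:
print(proc_a.pool.get("oracle_src") is proc_b.pool.get("oracle_src"))  # False
```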