Configuration for the PowerCenter Integration Service Grid
A grid is an alias assigned to a group of nodes that run sessions and workflows. When you run a workflow on a grid, you improve scalability and performance by distributing Session and Command tasks to service processes running on nodes in the grid. When you run a session on a grid, you improve scalability and performance by distributing session threads to multiple DTM processes running on nodes in the grid.
To run a workflow or session on a grid, you assign resources to nodes, create and configure the grid, and configure the PowerCenter Integration Service to run on a grid.
To configure a grid, complete the following tasks:
1. Create a grid and assign nodes to the grid.
2. Configure the PowerCenter Integration Service to run on a grid.
3. Configure the PowerCenter Integration Service processes for the nodes in the grid. If the PowerCenter Integration Service uses operating system profiles, all nodes on the grid must run on UNIX.
4. Assign resources to nodes. You assign resources to a node to allow the PowerCenter Integration Service to match the resources required to run a task or session thread with the resources available on a node.
After you configure the grid and PowerCenter Integration Service, you configure a workflow to run on the PowerCenter Integration Service assigned to a grid.
Creating a Grid
To create a grid, create the grid object and assign nodes to the grid. You can assign a node to more than one grid.
When you create a grid for the Data Integration Service, the nodes assigned to the grid must have specific roles depending on the types of jobs that the Data Integration Service runs. For more information, see Grid Configuration by Job Type.
1. In the Administrator tool, click the Manage tab.
2. Click the Services and Nodes view.
3. In the Domain Navigator, select the domain.
4. On the Navigator Actions menu, click New > Grid.
The Create Grid dialog box appears.
5. Enter the following properties:
Property | Description |
---|---|
Name | Name of the grid. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters: ` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [ |
Description | Description of the grid. The description cannot exceed 765 characters. |
Nodes | Select nodes to assign to the grid. |
Path | Location in the Navigator, such as: DomainName/ProductionGrids |
6. Click OK.
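The naming rules in the Name property can be checked before you open the Create Grid dialog box. The following is a minimal sketch in Python, not part of any Informatica tool. The function name `grid_name_problems` and the `existing_names` argument are hypothetical, and the sketch assumes that you supply the names already used in the domain yourself.

```python
# Characters that the Name property lists as not allowed, plus the space.
FORBIDDEN_CHARS = set("`~%^*+={}\\;:'\"/?.,<>|!()][ ")
MAX_NAME_LENGTH = 128


def grid_name_problems(name, existing_names=()):
    """Return the documented naming rules that the name breaks (empty list if none)."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name exceeds 128 characters")
    if name.startswith("@"):
        problems.append("name begins with @")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append(
            "name contains characters that are not allowed: "
            + " ".join(repr(c) for c in bad)
        )
    # Names are not case sensitive, so compare them case-insensitively.
    # existing_names is an assumption: the caller collects it from the domain.
    if name.lower() in {n.lower() for n in existing_names}:
        problems.append("name is already used in the domain")
    return problems


print(grid_name_problems("ProductionGrid"))
print(grid_name_problems("@prod grid"))
```

An empty list means that the name passes the checks modeled here. The Administrator tool remains the authority on whether a name is accepted.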
Configuring the PowerCenter Integration Service to Run on a Grid
You configure the PowerCenter Integration Service by assigning the grid to the PowerCenter Integration Service.
To assign the grid to a PowerCenter Integration Service:
1. In the Administrator tool, select the PowerCenter Integration Service Properties tab.
2. Edit the grid and node assignments, and select Grid.
3. Select the grid you want to assign to the PowerCenter Integration Service.
Configuring the PowerCenter Integration Service Processes
When you run a session or a workflow on a grid, a service process runs on each node in the grid. The service processes on the nodes must be compatible with each other or configured identically. Each service process must also have access to the directories and input files used by the PowerCenter Integration Service.
To ensure consistent results, complete the following tasks:
- Verify the shared storage location. Verify that the shared storage location is accessible to each node in the grid. If the PowerCenter Integration Service uses operating system profiles, the operating system user must have access to the shared storage location.
- Configure the service process. Configure $PMRootDir to the shared location on each node in the grid. Configure service process variables with identical absolute paths to the shared directories on each node in the grid. If the PowerCenter Integration Service uses operating system profiles, the service process variables you define in the operating system profile override the service process variable setting for every node. The operating system user must have access to the $PMRootDir configured in the operating system profile on every node in the grid.
Complete the following process to configure the service processes:
1. Select the PowerCenter Integration Service in the Navigator.
2. Click the Processes tab.
The tab displays the service process for each node assigned to the grid.
3. Configure $PMRootDir to point to the shared location.
4. Configure the following service process settings for each node in the grid:
   - Code pages. For accurate data movement and transformation, verify that the code pages are compatible for each service process. Use the same code page for each node where possible.
   - Service process variables. Configure the service process variables the same for each service process. For example, the setting for $PMCacheDir must be identical on each node in the grid.
   - Directories for Java components. Point to the same Java directory to ensure that Java components are available to objects that access Java, such as Custom transformations that use Java coding.
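One way to confirm that the service process variables match on every node is to compare the values side by side. The following Python sketch assumes that you have copied the values from the Processes tab into a dictionary by hand. The node names, the paths, and the function `find_variable_mismatches` are hypothetical and are not part of the Administrator tool.

```python
def find_variable_mismatches(settings_by_node):
    """Map each variable whose value differs between nodes to its per-node values."""
    variables = set()
    for settings in settings_by_node.values():
        variables.update(settings)

    mismatches = {}
    for var in sorted(variables):
        # A variable that is missing on a node shows up as None.
        values = {
            node: settings.get(var)
            for node, settings in settings_by_node.items()
        }
        if len(set(values.values())) > 1:
            mismatches[var] = values
    return mismatches


# Hypothetical values copied by hand from the Processes tab of each node.
settings = {
    "node01": {"$PMRootDir": "/shared/infa", "$PMCacheDir": "/shared/infa/Cache"},
    "node02": {"$PMRootDir": "/shared/infa", "$PMCacheDir": "/local/Cache"},
}

for var, values in find_variable_mismatches(settings).items():
    print(var, values)
```

Any variable that the sketch reports is a candidate for correction, because the setting must be identical on each node in the grid.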
Resources
Informatica resources are the database connections, files, directories, node names, and operating system types required by a task. You can configure the PowerCenter Integration Service to check resources. When you do this, the Load Balancer matches the resources available to nodes in the grid with the resources required by the workflow. It dispatches tasks in the workflow to nodes where the required resources are available. If the PowerCenter Integration Service is not configured to run on a grid, the Load Balancer ignores resource requirements.
For example, if a session uses a parameter file, it must run on a node that has access to the file. You create a resource for the parameter file and make it available to one or more nodes. When you configure the session, you assign the parameter file resource as a required resource. The Load Balancer dispatches the Session task to a node that has the parameter file resource. If no node has the parameter file resource available, the session fails.
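The resource check in the parameter file example can be modeled as a subset test: a node is eligible when its available resources include every resource that the task requires. The following Python sketch illustrates only that check and is not the Load Balancer implementation. The function names, node names, session name, and resource names are hypothetical, and the sketch does not model how the Load Balancer chooses among eligible nodes.

```python
def eligible_nodes(required, available_by_node):
    """Return the nodes whose available resources cover every required resource."""
    needed = set(required)
    return sorted(
        node
        for node, available in available_by_node.items()
        if needed <= set(available)
    )


def dispatch(task_name, required, available_by_node):
    """Pick a node for the task, or raise if no node has the required resources."""
    nodes = eligible_nodes(required, available_by_node)
    if not nodes:
        raise RuntimeError(f"{task_name} fails: no node provides {sorted(required)}")
    return nodes[0]  # Placeholder choice; the real selection is not modeled here.


# Hypothetical grid: only node02 has the parameter file resource.
grid = {
    "node01": {"Oracle"},
    "node02": {"Oracle", "sessionparamfile_sales1"},
}

print(dispatch("s_load_sales", {"sessionparamfile_sales1"}, grid))
```

If the required resource is removed from every node in `grid`, `dispatch` raises an error, which mirrors the session failure described above.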
Resources for a node can be predefined or user-defined. Informatica creates predefined resources during installation. Predefined resources include the connections available on a node, node name, and operating system type. When you create a node, all connection resources are available by default. Disable the connection resources that are not available on the node. For example, if the node does not have Oracle client libraries, disable the Oracle Application connections. If the Load Balancer dispatches a task to a node where the required resources are not available, the task fails. You cannot disable or remove node name or operating system type resources.
User-defined resources include file/directory and custom resources. Use file/directory resources for parameter files or file server directories. Use custom resources for any other resources available to the node, such as database client version.
The following table lists the types of resources you use in Informatica:
Type | Predefined/User-Defined | Description |
---|---|---|
Connection | Predefined | Any resource installed with PowerCenter, such as a plug-in or a connection object. A connection object may be a relational, application, FTP, external loader, or queue connection. When you create a node, all connection resources are available by default. Disable the connection resources that are not available to the node. Any Session task that reads from or writes to a relational database requires one or more connection resources. The Workflow Manager assigns connection resources to the session by default. |
Node Name | Predefined | A resource for the name of the node. A Session, Command, or predefined Event-Wait task requires a node name resource if it must run on a specific node. |
Operating System Type | Predefined | A resource for the type of operating system on the node. A Session or Command task requires an operating system type resource if it must run on a specific operating system. |
Custom | User-defined | Any other resource available to the node, such as a specific database client version. For example, a Session task requires a custom resource if it accesses a Custom transformation shared library or if it requires a specific database client version. |
File/Directory | User-defined | Any resource for files or directories, such as a parameter file or a file server directory. For example, a Session task requires a file resource if it accesses a session parameter file. |
You configure resources required by Session, Command, and predefined Event-Wait tasks in the task properties.
You define resources available to a node on the Resources tab of the node in the Administrator tool.
Note: When you define a resource for a node, you must verify that the resource is available to the node. If the resource is not available and the PowerCenter Integration Service runs a task that requires the resource, the task fails.
You can view the resources available to all nodes in a domain on the Resources view of the domain. The Administrator tool displays a column for each node. It displays a checkmark when a resource is available for a node.
Assigning Connection Resources
You can assign the connection resources available to a node in the Administrator tool.
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. In the Domain Navigator, select a node.
3. In the contents panel, click the Resources view.
4. Click the resource that you want to edit.
5. On the Manage tab Actions menu, click Enable Selected Resource or Disable Selected Resource.
Defining Custom and File/Directory Resources
You can define custom and file/directory resources available to a node in the Administrator tool. When you define a custom or file/directory resource, you assign a resource name. The resource name is a logical name that you create to identify the resource.
You assign the resource to a PowerCenter task or PowerCenter mapping object instance using this name. To coordinate resource usage, you may want to use a naming convention for file/directory and custom resources.
To define a custom or file/directory resource:
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. In the Domain Navigator, select a node.
3. In the contents panel, click the Resources view.
4. On the Manage tab Actions menu, click New Resource.
5. Enter a name for the resource.
The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters: ` ~ % ^ * + = { } \ ; : / ? . , < > | ! ( ) ] [
6. Select a resource type.
7. Click OK.
To remove a custom or file/directory resource, select a resource and click Delete Selected Resource on the Manage tab Actions menu.
Resource Naming Conventions
Using resources with PowerCenter requires coordination and communication between the domain administrator and the workflow developer. The domain administrator defines resources available to nodes. The workflow developer assigns resources required by Session, Command, and predefined Event-Wait tasks. To coordinate resource usage, you can use a naming convention for file/directory and custom resources.
Use the following naming convention:
resourcetype_description
For example, multiple nodes in a grid contain a session parameter file called sales1.txt. Create a file resource for it named sessionparamfile_sales1 on each node that contains the file. A workflow developer creates a session that uses the parameter file and assigns the sessionparamfile_sales1 file resource to the session.
When the PowerCenter Integration Service runs the workflow on the grid, the Load Balancer distributes the session assigned the sessionparamfile_sales1 resource to nodes that have the resource defined.
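The naming convention is simple enough to generate rather than type, which helps the domain administrator and the workflow developer arrive at the same string. The following Python sketch is illustrative only. The helper names and node names are hypothetical, and the resource lists are assumed to be maintained by hand.

```python
def resource_name(resource_type, description):
    """Build a name that follows the resourcetype_description convention."""
    return f"{resource_type}_{description}"


def nodes_with_resource(name, resources_by_node):
    """Return the nodes that define the resource. Names are not case sensitive."""
    wanted = name.lower()
    return sorted(
        node
        for node, names in resources_by_node.items()
        if wanted in {n.lower() for n in names}
    )


# Hypothetical record of the resources that the administrator defined on each node.
defined = {
    "node01": ["sessionparamfile_sales1"],
    "node02": ["SessionParamFile_Sales1"],
    "node03": [],
}

name = resource_name("sessionparamfile", "sales1")
print(name, nodes_with_resource(name, defined))
```

The nodes returned are the ones to which the session assigned this resource can be distributed.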
Editing and Deleting a Grid
You can edit or delete a grid from the domain. Edit the grid to change the description, add nodes to the grid, or remove nodes from the grid. You can delete the grid if the grid is no longer required.
Before you remove a node from the grid, disable the PowerCenter Integration Service process running on the node.
Before you delete a grid, disable any PowerCenter Integration Services running on the grid.
1. In the Administrator tool, click the Manage tab > Services and Nodes view.
2. Select the grid in the Domain Navigator.
3. To edit the grid, click Edit in the Grid Details section.
You can change the grid description, add nodes to the grid, or remove nodes from the grid.
4. To delete the grid, select Actions > Delete.
Troubleshooting a Grid
- I changed the nodes assigned to the grid, but the Integration Service to which the grid is assigned does not show the latest Integration Service processes.
When you change the nodes in a grid, the Service Manager performs the following transactions in the domain configuration database:
1. Updates the grid based on the node changes. For example, if you add a node, the node appears in the grid.
2. Updates the Integration Services to which the grid is assigned. All nodes with the service role in the grid appear as service processes for the Integration Service.
If the Service Manager cannot update an Integration Service and the latest service processes do not appear for the Integration Service, restart the Integration Service. If that does not work, reassign the grid to the Integration Service.