Configuring Interactive Data Preparation Service on Grid for Scalability
The Interactive Data Preparation Service needs most of its memory and CPU resources for the in-memory database that supports high-performance interactive data preparation. When too many users prepare data simultaneously, the performance of interactive preparation can decline, and the administrator might need to upgrade the hardware to improve it. To support increased data preparation volumes, the administrator can scale horizontally by creating an Interactive Data Preparation Service grid with multiple service nodes.
Each user is assigned a node in the grid using the round-robin method, which distributes the load across the nodes. Only homogeneous combinations of nodes are allowed: combine nodes that have the same operating system, CPU, memory, and security setup. This allows for seamless restoration of data after node failures, which makes the Interactive Data Preparation Service highly available.
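The round-robin assignment described above can be illustrated with a minimal Python sketch. It is not the service's implementation, which is not documented here. The function name `assign_round_robin` and the user and node names are hypothetical and serve only to show how cycling through the nodes spreads users evenly.

```python
from itertools import cycle


def assign_round_robin(users, nodes):
    """Map each user to a node by cycling through the nodes in order.

    Illustrative only: the names and data are assumptions, not the
    service's internal logic.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    node_cycle = cycle(nodes)
    # Each new user takes the next node; after the last node, wrap around.
    return {user: next(node_cycle) for user in users}


# Hypothetical users and nodes.
assignments = assign_round_robin(
    ["user1", "user2", "user3", "user4", "user5"],
    ["node_a", "node_b"],
)
```

With two nodes, consecutive users alternate between `node_a` and `node_b`, so neither node holds more than one user above the other.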
To configure the Interactive Data Preparation Service on a grid, complete the following steps:
1. Install the Enterprise Data Preparation binaries on every node that is part of the grid.
2. Select Grid while configuring the Interactive Data Preparation Service.
3. Ensure that all of the folder locations mentioned in the configuration exist on every node.
You can add nodes to or remove nodes from a grid dynamically. When a node is added to an active grid, the Interactive Data Preparation Service process does not start automatically on that node. The Enterprise Data Preparation administrator must enable the process in the Processes tab of the Interactive Data Preparation Service to start it on the node.
Adding a New Node when the Interactive Data Preparation Service is Running
When you add a new node to a grid where the Interactive Data Preparation Service is running, the new node is in the Disabled state. To enable it, perform the following steps:
1. Log in to the Administrator tool.
2. Click Services.
3. Select the Interactive Data Preparation Service from the list.
4. Click the Processes tab of the service.
5. Select the newly added node.
6. In the top-right corner, click the Enable icon to start the process.
A warning message appears.
7. Click OK.
Removing Interactive Data Preparation Service Nodes from the Grid
At least one node must be active for the Interactive Data Preparation Service to run.
When you shut down a node or a node goes down, the Interactive Data Preparation Service is not affected as long as at least one node remains enabled in the grid. An active session is not recovered automatically: an error appears, and the user must reconnect the session to continue. If all the nodes in a grid are removed or shut down, the Interactive Data Preparation Service is disabled.
Monitoring Interactive Data Preparation Service Node Status
To troubleshoot, you can check the state of the Interactive Data Preparation Service nodes in a grid at any time.
To find the nodes of the service along with the state, connect to the Data Preparation repository and execute the following SQL query:
select node_id, node_ip, state, created_ts, node_port, isp_node_name from dp_physical_node;
The state column shows the current state of the service on the node. The node can be in one of the following states:
- ACTIVE: The node is ready to take new user sessions.
- SUSPECTED_UNREACHABLE: The node cannot accept new sessions because the peer-check operation is failing on that node. The node might not be completely down, because the server may recover after a brief period of high load.
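As a rough illustration of how the result of the node state query might be checked in a script, the following Python sketch counts nodes per state and lists the nodes that are not ACTIVE. It assumes the rows have already been fetched as tuples in the column order of the query above. The helper name `summarize_node_states` and the sample rows (IP addresses, ports, timestamps, node names) are made up for the example and do not come from a real repository.

```python
from collections import Counter

# Column order follows the SELECT list of the node state query.
COLUMNS = ("node_id", "node_ip", "state", "created_ts", "node_port", "isp_node_name")


def summarize_node_states(rows):
    """Return (count of nodes per state, names of nodes that are not ACTIVE)."""
    records = [dict(zip(COLUMNS, row)) for row in rows]
    counts = Counter(record["state"] for record in records)
    not_active = [r["isp_node_name"] for r in records if r["state"] != "ACTIVE"]
    return counts, not_active


# Hypothetical rows; real values come from the Data Preparation repository.
sample_rows = [
    (1, "10.0.0.1", "ACTIVE", "2024-01-01 00:00:00", 8099, "node01"),
    (2, "10.0.0.2", "SUSPECTED_UNREACHABLE", "2024-01-01 00:00:00", 8099, "node02"),
]
counts, not_active = summarize_node_states(sample_rows)
```

A non-empty `not_active` list indicates nodes worth investigating before new user sessions are expected on them.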
To find the user to node assignment, connect to the Data Preparation repository and execute the following SQL query:
select login_id, node_ip, a.node_id, isp_node_name from dp_physical_node a, dp_user u, dp_user_to_node_map m where a.node_id = m.node_id and u.id = m.user_id;
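To see how evenly users are spread across the grid, the rows returned by the user-to-node query could be grouped by node. The sketch below assumes the rows are tuples in the query's column order (login_id, node_ip, node_id, isp_node_name). The function name `users_per_node` and the sample values are hypothetical.

```python
from collections import defaultdict


def users_per_node(rows):
    """Group login IDs by node name from user-to-node assignment rows."""
    grouped = defaultdict(list)
    for login_id, node_ip, node_id, isp_node_name in rows:
        grouped[isp_node_name].append(login_id)
    return dict(grouped)


# Hypothetical rows; real values come from the Data Preparation repository.
sample_assignments = [
    ("alice", "10.0.0.1", 1, "node01"),
    ("bob", "10.0.0.2", 2, "node02"),
    ("carol", "10.0.0.1", 1, "node01"),
]
by_node = users_per_node(sample_assignments)
```

Comparing the lengths of the lists in `by_node` gives a quick view of whether the load is balanced across the nodes.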