TDM Process

Run a profile against source data, create a subset of the data, and mask the subset data.
The TDM process includes the following high-level steps:
  1. Create policies that define the types of data you want to mask and the rules that you might use to mask the data.
  2. Create a project and import data sources.
  3. Optionally, discover information about the source data. Run profiles for data and metadata discovery to discover primary keys, entities, and data domains.
  4. Define data subset operations and data masking operations. Define the tables that you want to include in the subset database and the relationships between the tables. Assign data masking rules to columns in the source data.
  5. Define data generation operations. Define the tables that you want to include and assign data generation rules to columns in the target table.
  6. To copy flat file results directly to an HP ALM server in addition to the target connection, enter the test tool integration properties in the plan.
  7. To store data in the test data warehouse, select the test data warehouse as the target in the plan.
  8. Generate and run the workflow for data masking, data subset, or data generation.
  9. Monitor the workflow.

Create a Data Masking Policy

Design policies to mask specific types of data. A policy includes the data domains that describe the data that you want to mask. A policy does not reference any specific data source, so you can apply a policy to more than one project in Test Data Manager.
Define data domains to group sensitive fields by column name or by the column data. Define patterns in the column name or the column data using regular expressions. A data domain also contains masking rules that describe how to mask the data.
To design a data masking rule, select a built-in data masking technique in Test Data Manager. A rule is a data masking technique with specific parameters. You can create data masking rules with mapplets imported into TDM from PowerCenter.
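The idea of a data domain that matches columns by name or by data patterns can be sketched with regular expressions. This is an illustrative sketch only: the `SSN_DOMAIN` patterns, the `matches_domain` helper, and the 80% data-match threshold are assumptions for the example, not TDM's built-in definitions.

```python
import re

# Hypothetical data domain for U.S. Social Security numbers; the name
# pattern, data pattern, and match threshold are illustrative assumptions.
SSN_DOMAIN = {
    "name_pattern": re.compile(r"ssn|social.?sec", re.IGNORECASE),
    "data_pattern": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def matches_domain(domain, column_name, sample_values):
    """Assign a column to the domain by its name or by the shape of its data."""
    if domain["name_pattern"].search(column_name):
        return True
    if not sample_values:
        return False
    # Data match: require most sampled values to fit the data pattern.
    hits = sum(1 for v in sample_values if domain["data_pattern"].match(v))
    return hits / len(sample_values) >= 0.8

# A column can match on its name even when no data is sampled...
assert matches_domain(SSN_DOMAIN, "CUST_SSN", [])
# ...or on its data when the name gives nothing away.
assert matches_domain(SSN_DOMAIN, "ID_CODE", ["123-45-6789", "987-65-4321"])
```

Matching by both name and data is what lets a domain catch sensitive columns even when naming conventions vary across sources.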

Create a Project and Import Metadata

Create a project to organize the components for data discovery, data masking, data subset, and data generation operations.
Import data sources into the project. Create a target schema. TDM overwrites any data that already exists in the target schema. Import metadata for the sources on which you want to perform data subset or data masking operations. Import target metadata to perform data generation operations. You can import metadata from a PowerCenter folder or from an external database source.
When you import PowerCenter source metadata, the TDM Server sends a request to the PowerCenter Repository Service to extract source metadata from the PowerCenter repository. The PowerCenter Repository Service loads the source metadata to the TDM repository. When you import external database metadata, the TDM Server extracts metadata from the source tables and loads it into the TDM repository.

Discover Source Information

You can run profiles to discover primary and foreign key data, entity relationships, and data domains in source tables.
When a data source has no keys, you can run a primary key profile to identify possible primary keys. When the project contains multiple sources, you can run an entity profile to discover possible relationships between tables. Select the primary keys and the entities from the profile results to define the subset data structure.
You can run a data domain profile to search for columns in the source data to add to each data domain. Use data domain profile results to determine which columns to mask with the same masking rules.
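The intuition behind a primary key profile can be sketched as a uniqueness check. This is assumed logic for illustration, not TDM's discovery algorithm: a column is proposed as a candidate key when it is fully populated and every value is distinct.

```python
# Illustrative sketch of primary key discovery (assumed logic): a column
# qualifies as a candidate key if it has no nulls and no duplicates.
def candidate_primary_keys(rows, columns):
    candidates = []
    for col in columns:
        values = [row[col] for row in rows]
        if all(v is not None for v in values) and len(set(values)) == len(values):
            candidates.append(col)
    return candidates

# Hypothetical sample rows: only "id" is both complete and unique.
rows = [
    {"id": 1, "email": "a@example.com", "region": "EU"},
    {"id": 2, "email": "b@example.com", "region": "EU"},
    {"id": 3, "email": "b@example.com", "region": "US"},
]
print(candidate_primary_keys(rows, ["id", "email", "region"]))  # ['id']
```

In practice, profile results like these are suggestions: you select which candidate keys and entities to accept before defining the subset data structure.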
When you run profiles for data discovery, the TDM Server sends a request to the Data Integration Service to extract data from the source tables. The Data Integration Service loads the profile results to the profiling warehouse. When you add constraints to tables, the TDM Server stores the constraints in the TDM repository. The TDM Server does not update the data sources.

Define Data Masking and Data Subset Operations

To define data subset operations, define the tables that you want to include in the subset database and the relationships between the tables. To perform data masking operations, create a plan to run the masking operations. Add policies for masking the data. You can also add rules that are not in policies.
Perform the following tasks in Test Data Manager to define the data masking and data subset operations:
  1. Create entities, groups, and templates to define the tables that you want to copy to the subset database. An entity defines a set of tables that are related based on physical or logical constraints. A group defines a set of unrelated tables. A template is an optional component that contains the entities and groups.
  2. Assign data masking rules to columns in the data source.
  3. Create a data subset plan, and add the entities, groups, and templates to it. For each column in a parent table, you can define criteria to filter the data.
  4. Create a data masking plan and assign the policies and rules to the plan that you want to apply.
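The notion that a rule is "a technique with specific parameters" can be sketched as substitution masking with a lookup list. The `FIRST_NAMES` lookup and `substitution_rule` helper are hypothetical; in TDM you configure built-in techniques in Test Data Manager rather than writing code.

```python
import hashlib

# A rule sketched as technique + parameters: substitution masking with a
# lookup list. The lookup values are illustrative assumptions.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor"]

def substitution_rule(value, lookup):
    # Hash the source value so the same input always masks to the same
    # output (a repeatable mask), without storing the original value.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return lookup[digest[0] % len(lookup)]

masked = substitution_rule("Margaret", FIRST_NAMES)
assert masked in FIRST_NAMES
assert masked == substitution_rule("Margaret", FIRST_NAMES)  # repeatable
```

A repeatable mapping like this preserves referential consistency: the same source value masks to the same replacement wherever it appears.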
The TDM Server stores projects, entities, groups, templates, and plans in the TDM repository. When you generate and run workflows from plans, the PowerCenter Integration Service runs the workflows and loads the data into the target database.

Define a Data Generation Operation

To perform a data generation operation, create a data generation plan. Add tables and entities to the plan.
Perform the following tasks in Test Data Manager to define the data generation operation:
  1. Create entities that you want to add to the generation plan.
  2. Create data generation rules and assign the rules to the columns in the target table.
  3. Create a data generation plan, and add the entities and tables to the plan. Assign default data generation rules to the columns that do not have rule assignments.
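Per-column generation rules can be sketched as callables that each produce one value. The column names and the sequence, random-string, and random-range rules below are illustrative assumptions, not TDM's built-in generation rules.

```python
import random
import string

# Hypothetical generation rules: each target column maps to a callable
# that produces one value for row index i.
rules = {
    "customer_id": lambda i: 1000 + i,                                   # sequence rule
    "name": lambda i: "".join(random.choices(string.ascii_uppercase, k=6)),  # random string
    "balance": lambda i: round(random.uniform(0, 5000), 2),              # random range
}

def generate_rows(rules, count):
    """Apply every column rule once per row to build a target data set."""
    return [{col: rule(i) for col, rule in rules.items()} for i in range(count)]

rows = generate_rows(rules, 3)
assert len(rows) == 3 and rows[0]["customer_id"] == 1000
```

In the plan you would also set the number of records to generate, which corresponds to the `count` argument here.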

Create a Plan for Data Masking and Data Subset

Create a plan for the data masking and data subset operations. A plan includes the components that you need to generate a workflow. You can combine a data masking and a data subset operation in the same plan, or you can create separate plans. To save the results to an integrated HP ALM server or to store the results in the test data warehouse, select the appropriate properties in the plan.
  1. Create a data subset plan and add the entities, groups, and templates to it. You can define additional criteria to filter the data.
  2. Create a data masking plan and assign the policies and rules to the plan that you want to apply.
  3. To store the results in the test data warehouse, select the test data warehouse from the list of target connections in the plan.
  4. To also copy flat file target results to an integrated HP ALM server, enter the test tool integration properties in the plan.
  5. Generate a workflow from the plan.
  6. Run the workflow.
When you generate and run workflows from plans, the PowerCenter Integration Service runs the workflows and loads the data into the target database.
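How a filter criterion on a parent table cascades to related child rows can be sketched as follows. The `customers` and `orders` tables, the `cust_id` foreign key, and the `subset` helper are illustrative assumptions about subset behavior, not TDM internals.

```python
# Sketch of subset filtering: select parent rows by a criterion, then
# keep only child rows whose foreign key references a selected parent.
customers = [
    {"cust_id": 1, "region": "EU"},
    {"cust_id": 2, "region": "US"},
]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
]

def subset(parents, children, fk, criterion):
    kept_parents = [p for p in parents if criterion(p)]
    keys = {p[fk] for p in kept_parents}
    kept_children = [c for c in children if c[fk] in keys]
    return kept_parents, kept_children

# Keep only EU customers; orders for other customers drop out too.
eu_customers, eu_orders = subset(customers, orders, "cust_id",
                                 lambda r: r["region"] == "EU")
```

This cascade is why entities capture the relationships between tables: the filter applies once at the parent, and referential integrity determines what the subset keeps elsewhere.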

Create a Plan for Data Generation

Create a plan to perform data generation operations. You cannot combine a data generation operation with a data masking or a data subset operation in the same plan. You must create a separate plan for data generation. To save the results to an integrated HP ALM server or to store the results in the test data warehouse, select the appropriate properties in the plan.
Perform the following tasks when you want to create a data generation plan:
  1. Create a data generation plan and add the tables and entities. You can define additional criteria to filter the data.
  2. Enter the number of records that you want to generate.
  3. To store the results in the test data warehouse, select the test data warehouse from the list of target connections in the plan.
  4. To also copy flat file target results to an integrated HP ALM server, enter the test tool integration properties in the plan.
  5. Generate a workflow from the plan.
  6. Run the workflow.

Monitor the Workflow

Use the Monitor view to track workflow progress and to view the progress and logs of other jobs, such as metadata import and profiling. Each workflow appears as a job in the Monitor view.
Access the Monitor view to determine the status of the workflow jobs. You can run the Row Count Report on a successfully run workflow to view the number of rows that a plan affects. View the workflow job status. Access the TDM job log to troubleshoot problems.