TDM Example
An organization wants to enforce a policy to mask sensitive employee stock data in a large data processing environment.
The IT department needs test data for a new employee stock plan in an organization. The organization must ensure that the sensitive data is not compromised in the test data. The test database must contain representative data from the various application environments, including employee personal data, salary data, stock purchases, and job information. Multiple test teams must be able to access the test data and replace modified test data with the original test data when required. The organization uses TDM to establish and enforce a policy for creating the data in the test environment and to store and reuse the test data in the test data mart.
The organization completes the following steps:
- 1. Create a policy. The compliance officer determines the type of employee data that should be masked. The compliance officer creates an Employee_Stock policy.
- 2. Define data domains. The compliance officer defines data domains to group similar fields for data masking. For example, the data contains columns called Employee_Salary, Yearly_Salary, and Salary_History. All columns that contain "Salary" in the name belong to the same data domain. All columns in the same data domain can receive the same data masking rules.
- 3. Define data masking rules. The compliance officer creates data masking rules to mask the employee data. For example, the compliance officer masks employee names with substitution masking from a dictionary, applies random masking to the salary columns, and applies Social Security masking to Social Security numbers.
- 4. Define a project. A project developer defines an Employee_Stock project and imports the data sources to the project. The project developer performs all the data subset, data profiling, and data masking configuration in the project.
- 5. Run a profile for data discovery. The project developer runs a profile for data discovery. The profile identifies sensitive columns in the source tables and populates the data domains that the compliance officer defined in the policy.
- 6. Create table relationships. The database does not contain primary and foreign keys. The project developer runs a profile for primary keys and entities to find relationships between tables. The project developer examines the primary key profile results and the entity profile results to create relationships. The project developer creates logical primary and foreign keys in the tables. In some cases, the project developer selects an entity to use from the profile results.
- 7. Create entities and groups for data subset. With the constraints in place, the project developer can create entities in the Employee_Stock project. An entity defines a set of related source tables based on constraints. The project includes the Employee, JobHistory, Salary, and Employee_Stock tables. The project developer also creates a group in the project. A group defines unrelated tables to include in the test database. The group includes a table called Stock_History.
- 8. Approve or reject profile job results. The compliance officer reviews the results and approves or rejects the column assignments to the data domains.
- 9. Verify all sensitive fields are masked. The compliance officer reviews reports that describe what source data is masked in the project.
- 10. Create a plan to run data subset and data masking. The project developer creates one plan to run the data masking and subset operations in a workflow. The project developer adds the entities and groups to the plan to define which data to copy to the subset database. The project developer adds the Employee_Stock policy to the plan to define how to mask the data. When the project developer runs a workflow from the plan, the PowerCenter Integration Service runs the workflow and loads the masked data into the subset database.
- 11. Validate the results. The compliance officer validates the results in the subset database.
- 12. Create a plan to move the masked data subset to the test data mart. The project developer creates a plan with the subset database as the source connection and the test data warehouse as the target connection. When the project developer runs a workflow from the plan, the PowerCenter Integration Service runs the workflow and loads the masked data as a data set in the test data mart.
- 13. Reset a data set from the test data mart. The project developer runs a reset operation on the data set to restore the original test data to the required connection. When the reset operation runs, the PowerCenter Integration Service runs the workflow and loads the data set from the test data mart to the target connection.
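The ideas behind steps 2 and 3 can be illustrated outside the product. The following is a minimal Python sketch, not TDM's API: it assumes that a data domain is a column-name pattern and that each domain maps to one masking rule. All names in it (DATA_DOMAINS, NAME_DICTIONARY, MASKING_RULES, assign_domain, mask_row, the sample columns, and the salary range) are hypothetical and chosen only for illustration. The SSN rule shown here only randomizes digits and keeps the separators; it does not reproduce the product's Social Security masking logic.

```python
import hashlib
import random
import re

# Hypothetical data domains: a domain groups columns whose names match a pattern.
DATA_DOMAINS = {
    "Salary": re.compile(r"salary", re.IGNORECASE),
    "SSN": re.compile(r"ssn|social_security", re.IGNORECASE),
    "Name": re.compile(r"name", re.IGNORECASE),
}

# Hypothetical substitution dictionary for employee names.
NAME_DICTIONARY = ["Alex Doe", "Sam Roe", "Pat Poe", "Kim Lee"]


def assign_domain(column):
    """Return the first data domain whose pattern matches the column name, else None."""
    for domain, pattern in DATA_DOMAINS.items():
        if pattern.search(column):
            return domain
    return None


def mask_name(value, rng):
    """Substitution masking: the same input always maps to the same dictionary entry."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return NAME_DICTIONARY[digest[0] % len(NAME_DICTIONARY)]


def mask_salary(value, rng):
    """Random masking: replace the value with a random integer in an assumed range."""
    return rng.randint(30_000, 200_000)


def mask_ssn(value, rng):
    """Replace each digit with a random digit and keep separators (format only)."""
    return "".join(str(rng.randint(0, 9)) if ch.isdigit() else ch for ch in value)


# One masking rule per data domain, so every column in a domain is masked the same way.
MASKING_RULES = {"Name": mask_name, "Salary": mask_salary, "SSN": mask_ssn}


def mask_row(row, rng):
    """Mask every column that belongs to a data domain; copy other columns unchanged."""
    masked = {}
    for column, value in row.items():
        rule = MASKING_RULES.get(assign_domain(column))
        masked[column] = rule(value, rng) if rule else value
    return masked


if __name__ == "__main__":
    sample = {
        "Employee_Name": "Jane Smith",
        "Employee_Salary": 85000,
        "Employee_SSN": "123-45-6789",
        "Job_Title": "Analyst",
    }
    print(mask_row(sample, random.Random(0)))
```

In this sketch, Employee_Salary, Yearly_Salary, and Salary_History all fall into the same hypothetical Salary domain, so a single rule covers all three columns, which mirrors the grouping described in step 2.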