Data Discovery Task Flow
You can run profiles for data discovery to find primary keys and entity relationships between tables in the source data. You can run a data domain profile to search for columns to assign to data domains for data masking.
Before you can run profiles, the administrator must configure a connection to the source database for data discovery. The administrator must also configure connections to the Data Integration Service and the Model Repository Service.
Complete the following high-level steps to perform data discovery:
1. Create a profile.
2. Select the type of profile to run: a primary key profile, an entity profile, or a data domain discovery profile.
3. If you choose a data domain discovery profile, select the data domains to search for.
4. Choose the sampling size for the profile.
5. Run the profile and monitor the job.
6. After the job completes, open the profile again.
7. Review the primary key profile results, the entity profile results, and the data domain profile results.
8. Select and approve the results that you want to use for data masking and data subset operations.
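
To make the review step more concrete, the following is a minimal conceptual sketch of the kind of checks a primary key profile and a data domain profile might perform on sampled rows. It is not the product's API or algorithm. The function names, the sample data, the email pattern, and the 0.8 match threshold are all hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Optional

# A sampled row: column name -> value (None stands for a null value).
Row = Dict[str, Optional[str]]


def candidate_primary_keys(rows: List[Row]) -> List[str]:
    """Return columns whose sampled values are all non-null and unique.

    Hypothetical helper: uniqueness in a sample only suggests a candidate key,
    which is why the results still need review and approval.
    """
    if not rows:
        return []
    candidates = []
    for column in rows[0]:
        values = [row.get(column) for row in rows]
        if all(v is not None for v in values) and len(set(values)) == len(values):
            candidates.append(column)
    return candidates


def columns_matching_domain(rows: List[Row], pattern: str,
                            min_ratio: float = 0.8) -> List[str]:
    """Return columns where at least min_ratio of non-null values match pattern.

    Hypothetical helper: the pattern stands in for a data domain rule.
    """
    if not rows:
        return []
    regex = re.compile(pattern)
    matches = []
    for column in rows[0]:
        values = [row[column] for row in rows if row.get(column) is not None]
        if not values:
            continue
        hits = sum(1 for v in values if regex.fullmatch(v))
        if hits / len(values) >= min_ratio:
            matches.append(column)
    return matches


# Hypothetical sample standing in for rows read from the source table.
sample = [
    {"id": "1", "email": "a@example.com", "city": "Oslo"},
    {"id": "2", "email": "b@example.com", "city": "Oslo"},
    {"id": "3", "email": None, "city": "Bergen"},
]

EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"  # assumed, simplified email rule

pk_candidates = candidate_primary_keys(sample)
email_columns = columns_matching_domain(sample, EMAIL_PATTERN)
```

In this sketch, a larger sample makes a false primary key candidate less likely but takes longer to scan, which is the trade-off behind choosing a sampling size before running the profile.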