Curation for Analysts and Developers
As a data analyst or data steward, you can curate the column profile results and data domain discovery results in the Analyst tool. Curation makes accurate profile information available for discovery search and for further validation of the data assets.
As a developer or data architect, you can curate column profile results, data domain discovery results, primary key discovery results, and foreign key discovery results in the Developer tool.
Curation Examples
When you perform enterprise discovery as a developer, the Developer tool processes the selected data domains for the entire data set. This action can result in multiple data domain inferences for a single column, such as phone number data that is also inferred as the Social Security number data domain. Multiple data domain inferences occur when parts of the data within a column match different data domains. For example, a 10-digit phone number that is missing one digit has nine digits and might have the same pattern as a Social Security number. In this case, the Developer tool might infer both the phone number data domain and the Social Security number data domain. Multiple inferences indicate either potential data quality issues within the column or a pattern that is shared across data domains. You can curate the profile results to select and approve the most appropriate data domain. In this example, phone number is the relevant data domain because the Social Security number inference results from a data quality issue.
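The overlap described above can be illustrated with a minimal Python sketch. It is not how the Developer tool infers data domains. The two regular expressions, the `infer_domains` function, and the sample values are hypothetical, chosen only to show how one truncated phone number can make a second data domain appear in the results.

```python
import re

# Hypothetical, simplified patterns. Real data domain rules are
# configurable and considerably richer than a digit count.
DOMAIN_PATTERNS = {
    "Phone Number": re.compile(r"\d{10}"),
    "Social Security Number": re.compile(r"\d{9}"),
}

def infer_domains(values):
    """Return, per domain, the fraction of values that fully match its pattern."""
    total = len(values)
    scores = {}
    for name, pattern in DOMAIN_PATTERNS.items():
        matches = sum(1 for value in values if pattern.fullmatch(value))
        if matches:
            scores[name] = matches / total
    return scores

# The third value is a phone number that is missing one digit.
phone_column = ["4155550123", "2125550198", "650555014", "3105550177"]
result = infer_domains(phone_column)
print(result)
```

In this sketch, most values conform to the phone number pattern and a single value conforms to the Social Security number pattern. That distribution is the kind of evidence a curator might weigh before approving phone number as the data domain.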
When you run enterprise discovery, the Developer tool might infer multiple datatypes, such as Date, String, and Varchar, for a date column. As a data architect, you might want to choose and approve the Date datatype, which is the most relevant datatype for a date column.
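As a rough illustration of why one column can receive several datatype inferences, the following Python sketch lists every datatype that all values in a column conform to. The `candidate_datatypes` function, the assumed `YYYY-MM-DD` format, and the sample values are hypothetical and do not represent the inference logic of the Developer tool.

```python
from datetime import datetime

def candidate_datatypes(values, date_format="%Y-%m-%d"):
    """List the datatypes that every value in the column conforms to."""
    # Any text value can be stored as String or Varchar.
    candidates = ["String", "Varchar"]
    if not values:
        return candidates
    try:
        for value in values:
            # strptime raises ValueError if the value is not a valid date.
            datetime.strptime(value, date_format)
    except ValueError:
        return candidates
    # Every value parsed as a date, so Date is the most specific candidate.
    return ["Date"] + candidates

date_column = ["2023-01-15", "2024-02-29", "2022-12-31"]
print(candidate_datatypes(date_column))
```

Because every value fits all three datatypes, the sketch cannot decide which one is most appropriate. That choice is the judgment a data architect makes during curation.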
Enterprise discovery in the Developer tool might infer all possible data object relationships based on the column data. As a result, the discovered candidate keys can include unwanted data object relationships. For example, the Developer tool might infer columns that contain sequence numbers as possible keys and discover relationships with other tables that have similar columns. These data object relationships might not be valid relationships in the database. In such cases, you can assess, verify, and approve the most appropriate inferred profile results as part of curation.
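The sequence-column case can be sketched in Python as follows. The sketch assumes a simple data-driven test in which a column is a candidate key if its values are unique, and a candidate relationship exists if the values of one column are contained in another. The column names, the `is_unique` and `inclusion_ratio` helpers, and the data are hypothetical. The Developer tool's actual key discovery is not shown here.

```python
def is_unique(values):
    """A column is a candidate key in this sketch if it has no duplicates."""
    return len(set(values)) == len(values)

def inclusion_ratio(child_values, parent_values):
    """Fraction of child values that also appear in the parent column."""
    parent_set = set(parent_values)
    hits = sum(1 for value in child_values if value in parent_set)
    return hits / len(child_values)

# Two unrelated columns that both happen to hold sequence numbers.
customer_id = list(range(1, 101))        # hypothetical CUSTOMER.ID
shipment_batch_seq = list(range(1, 21))  # hypothetical SHIPMENT.BATCH_SEQ

both_look_like_keys = is_unique(customer_id) and is_unique(shipment_batch_seq)
overlap = inclusion_ratio(shipment_batch_seq, customer_id)
print(both_look_like_keys, overlap)
```

Both columns pass the uniqueness test and every batch sequence value appears in the customer ID column, so a purely data-driven check reports a relationship even though the tables are unrelated. Rejecting such a result is the kind of decision that curation leaves to the developer or data architect.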