To identify exception records in your data set, you perform steps in Data Quality and in Data Profiling.
The following image shows the steps involved in the exception management process:
Note: Although profiling tasks and exception tasks are reusable assets, you must ensure that the tasks are tailored to the data set that you select. You are likely to create a unique profiling task and exception task for each data set that you examine for exception data.
The exception management process includes the following steps:
1. Identify the data set that you will examine for exception records.
You may decide to examine a data set that you previously updated but that may contain records with unresolved data quality issues.
2. In Data Quality, configure one or more rule specifications to identify and update the exception records in the data set.
Each rule specification has the following characteristics:
- The rule specification includes one or more statements that can identify records as exceptions. For example, a record may be an exception if it contains a primary key field and the primary key is null.
- The rule specification adds the following exception indicators as outputs to the data:
▪ A field for status values.
▪ A field that indicates the priority of the data quality issue that identifies the record as an exception.
▪ A field that describes the data quality issue.
- The rule specification uses a single set of outputs for exception indicators.
In basic mode, you configure the exception indicators in the primary rule set. If you add a rule specification within a rule statement, take care that the child rule specification does not create additional outputs for exception indicators. In advanced mode, you can configure exception indicators in logical statements throughout the rule logic.
Create rule specifications according to the number and type of data quality issues that you expect to find in the data. For example, you might create a single rule specification with multiple rule statements that relate to a range of issues. Or, you might create a single rule specification for each type of data quality issue that you are concerned about.
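The logic of such a rule specification can be sketched outside the product. The following Python example is an illustration only, not Data Quality syntax: it applies the null primary key check described above and adds a single set of three exception indicators to each record. The field names (customer_id, exception_status, exception_priority, exception_description) and the values "Exception", "Valid", and "High" are assumptions made for this sketch.

```python
# Hypothetical names for the three exception indicator outputs.
# The actual output names are whatever you configure in the rule specification.
STATUS_FIELD = "exception_status"
PRIORITY_FIELD = "exception_priority"
DESCRIPTION_FIELD = "exception_description"


def apply_rule(record: dict, key_field: str = "customer_id") -> dict:
    """Return a copy of the record with one set of exception indicators added."""
    out = dict(record)
    value = out.get(key_field)
    # Treat a missing, None, or blank key as a null primary key.
    if value is None or str(value).strip() == "":
        out[STATUS_FIELD] = "Exception"
        out[PRIORITY_FIELD] = "High"
        out[DESCRIPTION_FIELD] = f"Primary key '{key_field}' is null"
    else:
        out[STATUS_FIELD] = "Valid"
        out[PRIORITY_FIELD] = ""
        out[DESCRIPTION_FIELD] = ""
    return out


records = [
    {"customer_id": "C001", "name": "Ana"},
    {"customer_id": None, "name": "Ben"},
]
flagged = [apply_rule(r) for r in records]
```

Every record receives the same three indicator fields, which mirrors the requirement that a rule specification use a single set of outputs for exception indicators.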
3. In Data Profiling, create a data profiling task and add one or more rule specifications as rules to the task. Include a rule specification that you configured to find and update exception records. Configure the profiling task to read the data set that contains the exception records.
You can add multiple rule specifications to a profiling task. Select one or more rule specifications that examine the source data in a manner that best suits your project requirements.
4. Create an exception task from the profiling task. You can create an exception task in Data Profiling or in Data Quality.
Add one or more rules from the profiling task to the exception task. Include the rule specification that you configured earlier to find and update exception records.
5. Run the exception task, and review the job summary on the My Jobs page.
If you configure the task to write the job output as a file to your local system, browse to the file and review your data. If you configure the task to write the job output to the exception data store in Informatica cloud, use the link in the summary to download the job data.
You can run the exception task in Data Profiling or Data Quality. Run a single exception task at a time.
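If you write the job output to a local file, a short script can help with the review. The following Python sketch counts exception records by priority. It assumes that the output is a delimited text file with a header row and that the indicator columns are named exception_status and exception_priority. Both the file layout and the column names are assumptions, so adjust them to match your task configuration.

```python
import csv
import io
from collections import Counter


def count_by_priority(rows, status_field="exception_status",
                      priority_field="exception_priority"):
    """Count rows flagged as exceptions, grouped by their priority value."""
    counts = Counter()
    for row in rows:
        # "Exception" is an assumed status value for this sketch.
        if row.get(status_field) == "Exception":
            counts[row.get(priority_field, "")] += 1
    return counts


# In-memory sample that stands in for a job output file.
sample = io.StringIO(
    "customer_id,name,exception_status,exception_priority\n"
    "C001,Ana,Valid,\n"
    ",Ben,Exception,High\n"
    ",Cy,Exception,High\n"
    "C004,Di,Exception,Low\n"
)
summary = count_by_priority(csv.DictReader(sample))
```

To run the same count against a real file, open it with open(path, newline="") and pass csv.DictReader(file) to count_by_priority.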
You can view the status of your exception jobs from the My Jobs page in Data Quality, Data Integration, and Data Profiling.
For more information on data profiling tasks, see Data Profiling in the Data Profiling documentation.