Human Tasks and Exception Data Management
A Human task reads the output of a mapping that contains an Exception transformation. An Exception transformation analyzes output from other transformations to validate the data quality status of the records in a data set. A mapping developer uses the Exception transformation to identify records that need manual processing.
The Exception transformation writes records to one or more database tables based on the data quality status of each record. The transformation specifies a table as a target for records with an unverified data quality status. The objective of the user in a Human task is to verify the data quality status of the records in the table.
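The routing described above can be pictured with a minimal sketch. It is not the Exception transformation itself: the status labels, table names, and the `write_by_status` function are hypothetical, and the choice to treat a missing or unknown status as unverified is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical status labels and table names. A real mapping defines
# its own targets and status values.
TARGET_TABLES = {
    "good": "clean_records",
    "bad": "rejected_records",
    "unverified": "exception_records",
}


def write_by_status(records):
    """Group records into target tables keyed by data quality status."""
    tables = defaultdict(list)
    for record in records:
        # Assumption: a missing or unknown status is routed to the
        # unverified table so that a Human task user reviews the record.
        status = record.get("status", "unverified")
        table = TARGET_TABLES.get(status, TARGET_TABLES["unverified"])
        tables[table].append(record)
    return dict(tables)


records = [
    {"id": 1, "status": "good"},
    {"id": 2, "status": "unverified"},
    {"id": 3, "status": "bad"},
    {"id": 4},
]
tables = write_by_status(records)
```

In this sketch, the `exception_records` list stands in for the table that the Human task reads.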
When the mapping developer completes the mapping that contains the Exception transformation, a workflow developer adds the mapping to a Mapping task in a workflow. When you add a Human task to the workflow, you configure the Human task to read the database table created when the Mapping task runs. The users who perform the Human task examine the records and make any required changes.
The users then update the status of the records in one of the following ways:
- If a record is valid, the user updates the table metadata so that the record is confirmed for persistent storage in the database.
- If a record is not valid, the user updates the table metadata so that the record is removed from the database at a later stage in the workflow.
- If the status of a record cannot be confirmed, the user updates the table metadata so that the record is returned to the workflow for further processing in a Mapping task.
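The three outcomes above can be sketched as a simple partition on a status value stored with each record. The `ReviewStatus` enum, the `review_status` key, and the `apply_review` function are hypothetical names chosen for illustration. They do not correspond to the metadata columns or APIs of the product.

```python
from enum import Enum


class ReviewStatus(Enum):
    """Hypothetical status values that a reviewer might set on a record."""
    ACCEPTED = "accepted"    # valid: keep in persistent storage
    REJECTED = "rejected"    # not valid: remove at a later workflow stage
    REPROCESS = "reprocess"  # unconfirmed: return to a Mapping task


def apply_review(rows):
    """Partition reviewed rows into keep, remove, and reprocess lists."""
    keep, remove, reprocess = [], [], []
    for row in rows:
        status = row["review_status"]
        if status is ReviewStatus.ACCEPTED:
            keep.append(row)
        elif status is ReviewStatus.REJECTED:
            remove.append(row)
        elif status is ReviewStatus.REPROCESS:
            reprocess.append(row)
        else:
            raise ValueError(f"unexpected status: {status!r}")
    return keep, remove, reprocess


rows = [
    {"id": 1, "review_status": ReviewStatus.ACCEPTED},
    {"id": 2, "review_status": ReviewStatus.REJECTED},
    {"id": 3, "review_status": ReviewStatus.REPROCESS},
    {"id": 4, "review_status": ReviewStatus.ACCEPTED},
]
keep, remove, reprocess = apply_review(rows)
```

A later workflow step could then store the `keep` list, delete the `remove` list, and pass the `reprocess` list to another Mapping task.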
Types of Exception Data
The Exception transformation generates database tables that contain records with an unverified data quality status. The Human task user examines each record and attempts to resolve any issue the record contains.
A record can have the following types of data quality issues:
- The record may contain errors or empty cells. The Human task user examines the record and attempts to update it with correct and complete data.
- The record may be a duplicate of another record. The Analyst tool displays duplicate record sets in groups called clusters. The Human task user examines the clusters and attempts to create a single preferred version of the records in each cluster.
The user can apply the following status indicators to a record or cluster:
- The record or cluster issues are resolved, and the record can remain in the database. In the case of clusters, the preferred record remains in the table and the redundant duplicate records are dropped.
- The record or cluster issues are unresolved, and the record needs further processing.
- The record or cluster contains unusable data and can be dropped from the table.
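For clusters, the effect of each status indicator can be illustrated with a short sketch. The status strings, the `resolve_cluster` function, and the `preferred_id` parameter are assumptions made for this example. The product applies these outcomes through its own metadata columns.

```python
def resolve_cluster(cluster, status, preferred_id=None):
    """Return (kept, dropped, reprocess) lists for one duplicate cluster.

    All names and status strings here are hypothetical.
    """
    if status == "resolved":
        # Keep only the preferred record and drop the redundant duplicates.
        preferred = [r for r in cluster if r["id"] == preferred_id]
        if len(preferred) != 1:
            raise ValueError("a resolved cluster needs exactly one preferred record")
        dropped = [r for r in cluster if r["id"] != preferred_id]
        return preferred, dropped, []
    if status == "unresolved":
        # Every record in the cluster returns for further processing.
        return [], [], list(cluster)
    if status == "unusable":
        # Every record in the cluster is dropped from the table.
        return [], list(cluster), []
    raise ValueError(f"unknown status: {status!r}")


cluster = [
    {"id": 10, "name": "Ann Lee"},
    {"id": 11, "name": "Anne Lee"},
    {"id": 12, "name": "A. Lee"},
]
kept, dropped, reprocess = resolve_cluster(cluster, "resolved", preferred_id=11)
```

Here the reviewer selects record 11 as the preferred version, so records 10 and 12 are treated as redundant duplicates.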
Analyst Tool
The Analyst tool is a web-based application that enables users to view and update records and clusters in a Human task.
The Analyst tool uses an Inbox to notify users of the Human tasks assigned to them. A user logs in to the Analyst tool and opens a task from the My Tasks panel.
The Analyst tool provides options to edit record or cluster data and to update the status of a record or cluster. The task view includes metadata columns that contain the status indicators for each record or cluster.
When a user completes a task in the Analyst tool, the records in the task pass to the next step in the Human task.