Duplicate Record Exception Configuration View
Define the match score thresholds and configure where the Duplicate Record Exception transformation writes the different types of output data.
The following figure shows the properties that you can configure:
You can configure the following properties:
- Lower Threshold
- The lower limit for the duplicate record score range. The transformation processes records with match scores less than this value as unique records. The lower threshold value is a number from 0 to 1.
- Upper Threshold
- The upper limit for the duplicate record score range. The transformation processes records with match scores greater than or equal to the upper threshold as duplicate records. The upper threshold value is a number greater than the lower threshold number.
- Automatic Consolidation
- Clusters in which all records have match scores greater than or equal to the upper threshold. Automatic consolidation clusters do not require review because the records are duplicates. You can use the Consolidation transformation to combine the records. By default, the Duplicate Record Exception transformation writes automatic consolidation clusters to standard output ports.
- Manual Consolidation
- Clusters in which all records have match scores greater than or equal to the lower threshold and at least one record has a match score less than the upper threshold. You must perform a manual review of the clusters to determine if they contain duplicate records. By default, the Duplicate Record Exception transformation writes manual consolidation records to the duplicate record table.
- Unique Consolidation
- Clusters with a cluster size equal to one or clusters in which any record has a match score less than the lower threshold. Unique record clusters are not duplicates. By default, the Duplicate Record Exception transformation does not write unique records to an output table.
- Standard Output
- The types of records that the transformation writes to the standard output ports. Default is automatic consolidation records.
- Duplicate Record Table
- The types of records that the transformation writes to the duplicate record output ports. Default is manual consolidation records.
- Create separate output group for unique records
- Creates a separate output group for unique records. If you do not create a separate output group for the unique records, you can configure the transformation to write the unique records to one of the other groups, or you can skip writing unique records to an output table. Default is disabled.
- Generate duplicate record table
- Creates a database object to contain the duplicate record cluster data. When you select this option, the Developer tool creates the database object. The Developer tool adds the object to the Model repository, adds an instance of the object to the mapping canvas, and links the ports to the object.
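The way the two thresholds divide clusters into the three categories can be illustrated with a short Python sketch. This is not Informatica code and does not call any Informatica API. The function name `classify_cluster`, the category labels, and the `DEFAULT_OUTPUT` mapping are hypothetical names chosen for illustration. The sketch assumes that a score equal to the upper threshold counts as a duplicate, following the Upper Threshold definition.

```python
from typing import Dict, List, Optional


def classify_cluster(scores: List[float], lower: float, upper: float) -> str:
    """Classify one cluster from the match scores of its records.

    Hypothetical helper that mirrors the category definitions in this
    section; it is not part of any Informatica product.
    """
    if not scores:
        raise ValueError("a cluster must contain at least one record")
    if not 0 <= lower <= 1:
        raise ValueError("lower threshold must be a number from 0 to 1")
    if upper <= lower:
        raise ValueError("upper threshold must be greater than lower threshold")

    # Unique: a single-record cluster, or any record below the lower threshold.
    if len(scores) == 1 or any(s < lower for s in scores):
        return "unique"
    # Automatic consolidation: every record reaches the upper threshold
    # (assumption: a score equal to the upper threshold counts as a duplicate).
    if all(s >= upper for s in scores):
        return "automatic"
    # Manual consolidation: all records reach the lower threshold,
    # but at least one record falls below the upper threshold.
    return "manual"


# Default destinations described in this section. None means that the
# records are not written to an output table by default.
DEFAULT_OUTPUT: Dict[str, Optional[str]] = {
    "automatic": "standard output",
    "manual": "duplicate record table",
    "unique": None,
}

if __name__ == "__main__":
    for cluster in ([0.95, 0.97], [0.85, 0.95], [0.70, 0.95], [0.99]):
        category = classify_cluster(cluster, lower=0.8, upper=0.9)
        print(cluster, category, DEFAULT_OUTPUT[category])
```

With a lower threshold of 0.8 and an upper threshold of 0.9, a cluster with scores 0.85 and 0.95 falls into manual consolidation, because every record reaches the lower threshold but one record is below the upper threshold.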
Generating a Duplicate Records Table
You can generate a Duplicate Records table from a Duplicate Record Exception transformation instance in a mapping.
1. In the Configuration view, click Generate Duplicate Records table to generate the table.
The Create Relational Data Object dialog box appears.
2. Browse for and select a connection to the database to contain the table.
3. Enter a name for the Duplicate Records table in the database.
4. Enter a name for the Duplicate Records table object in the Model repository.
5. Click Finish.
The Developer tool adds the new table to the mapping canvas.