Test a deduplicate asset to verify that the data flows through the asset in the ways that you expect.
1. Open the deduplicate asset.
2. Select the Deduplication tab.
3. Select a Secure Agent to run the test.
To refresh the list of active Secure Agents, click the Refresh icon.
4. Enter data values in the test panel, or import the data to test. To import the data, click the Import option in the test panel.
Consider the following guidelines before you add or import data to the test panel:
- Configure the asset as completely as you can before you enter or import the test data. The asset may discard your test data if you make further changes to the configuration.
- Verify that the test data structure matches the column structure in the test panel. Provide test data for each mandatory field in the test panel. If the test panel includes any required fields, provide test data for at least one required field.
- If the objective accepts a date value, you can use the calendar and clock options to add a date and time to an input row. The calendar and clock options are synchronized. You can update the date or time after you add either value.
5. Optionally, save the asset to preserve the current test data and configuration.
6. Click Test.
You can sort, search, and filter the test results in the following ways:
- Click the Up and Down arrows to open a menu of the sorting categories, and select a field on which to sort the test data. To reverse the sort order, select the field a second time. The categories reflect the configuration of the asset. The test results refresh to show the values for the category that you choose.
- Click the Filter icon to add a filter option to the test panel. Select a field on which to filter the data, and add a data filter for the field. The test results refresh to show any row that contains the filter value in the field that you specify.
- You can also enter a value in the Find field to search the test results.
The sort, search, and filter options work together. For example, if you apply a filter to the test data and you enter a value in the Find field, the asset displays any row that meets both the filter and search criteria.
7. Verify that the deduplication process analyzes the test data in the manner that you expect.
8. Optionally, select the Consolidation tab.
Use the options on the tab to verify that the consolidation process generates preferred records in the manner that you expect.
The test panel on the Consolidation tab contains the results of any test that you ran on the Deduplication tab.
Note: If you add one or more fields to the Consolidation tab, bear in mind that several field names are reserved on the deduplicate asset. To read the list of reserved field names, see Configuring the consolidation process.
9. Click Test, and review the results of the test. You can test the consolidation options in row-based mode and field-based mode.
Understanding the test results
When you run a test on the Deduplication or Consolidation tab, the test results include a number of predefined fields. The input data and the test results on the Deduplication tab form the basis of the test that you can perform on the Consolidation tab.
The test results on the Deduplication tab include the following predefined fields:
Cluster ID
Contains the identifier of the cluster to which the input record belongs.
In the deduplication process, a cluster is a set of records whose data values match each other to a degree that exceeds the duplicate threshold. Records in the same set are likely to represent the same identity. A set may contain a single record, as every unique record is a perfect match with itself.
Cluster Size
Contains the number of records in the set to which the current record belongs. When a set contains a unique record, the cluster size is 1.
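To make the relationship between the two fields concrete, the following is a minimal sketch in Python. It is not part of the product and does not reproduce how the asset computes its results. It assumes that each test result row is a dictionary with a hypothetical cluster_id key, and it derives the cluster size by counting the rows that share an identifier.

```python
from collections import Counter


def add_cluster_size(rows):
    """Return copies of the rows with a cluster_size value added.

    The key names cluster_id and cluster_size are hypothetical and
    chosen for illustration only.
    """
    # Count how many rows carry each cluster identifier.
    sizes = Counter(row["cluster_id"] for row in rows)
    return [{**row, "cluster_size": sizes[row["cluster_id"]]} for row in rows]


# Illustrative data: two records match each other, one record is unique.
results = add_cluster_size([
    {"name": "Jon Smith", "cluster_id": 1},
    {"name": "John Smith", "cluster_id": 1},
    {"name": "Ana Lopez", "cluster_id": 2},
])
for row in results:
    print(row["name"], row["cluster_id"], row["cluster_size"])
```

In this sketch, the unique record forms a cluster of its own, so its cluster size is 1.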
The test results on the Consolidation tab include the following predefined fields:
Cluster ID
Contains the identifier of the cluster to which the input record belongs. The Cluster ID fields on the Deduplication and Consolidation tabs contain identical information for a given test.
Preferred Record
Contains the values in the preferred record that the test creates for the current input.
The Deduplication and Consolidation tabs also display the mandatory and required fields that apply for the objective and index key that you select. In addition, the Consolidation tab displays all of the fields that the objective can use and any custom field that you add.
Rules and guidelines for test data
You can import data to the test panel in the deduplicate asset and save the test data in the asset configuration.
Consider the following rules and guidelines when you add data to the test panel:
• The import option supports CSV and Microsoft Excel files.
• You can import up to 200 consecutive rows of data from a delimited file. You can specify the row at which the import starts.
Note: Before you import, check the file for column headings. If the first row in the import file contains column headings, start the import at line 2 or a later line.
• You can import or enter an input string of up to 255 characters.
• If you import a CSV file that contains multiple columns or uses a text qualifier, verify that the file uses a delimiter or a text qualifier that the Secure Agent recognizes. By default, the Comma option is the delimiter for the column data. By default, the No quotes option is the text qualifier for the data. You can update the delimiter and text qualifier characters when you select the data to import. The Delimiter and Text Qualifier options are not required when you import a Microsoft Excel file.
• The Secure Agent saves the data that you import to the asset when you save the asset. If you change an option in the asset configuration, you may lose any unsaved test data.
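The limits above can be checked before you import a CSV file. The following Python sketch is an illustration only and is not an Informatica tool. The function name check_import_data and its parameters are hypothetical. It assumes comma-delimited text with no text qualifier by default, and it reports rows beyond the 200-row limit and values longer than 255 characters.

```python
import csv
import io

MAX_ROWS = 200   # row limit stated in the guidelines above
MAX_CHARS = 255  # input string limit stated in the guidelines above


def check_import_data(text, start_line=1, delimiter=",", quotechar=None):
    """Return a list of problems found in the rows selected for import.

    start_line is 1-based; pass 2 to skip a row of column headings.
    quotechar=None mirrors the default No quotes text qualifier.
    """
    if quotechar is None:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quotechar=quotechar)
    rows = list(reader)[start_line - 1:]

    problems = []
    if len(rows) > MAX_ROWS:
        problems.append(f"{len(rows)} rows selected, limit is {MAX_ROWS}")
    # Check value lengths only in the rows that fit within the limit.
    for line_no, row in enumerate(rows[:MAX_ROWS], start=start_line):
        for col_no, value in enumerate(row, start=1):
            if len(value) > MAX_CHARS:
                problems.append(
                    f"line {line_no}, column {col_no}: "
                    f"{len(value)} characters exceeds {MAX_CHARS}"
                )
    return problems


# Illustrative file content: a heading row, one valid row, one oversized value.
sample = "name,city\nJon Smith,Boston\n" + "x" * 300 + ",Austin\n"
problems = check_import_data(sample, start_line=2)
print(problems)
```

The sketch treats each parsed row as one line, which holds when no value contains an embedded line break.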
The following rules and guidelines apply to the deduplicate asset:
• The test panel structure can change based on the objective and the index key that you select. The structure of the test data that you import must match the structure in the test panel.
The test panel can include two types of input field:
- Mandatory: You must populate all mandatory fields with test data.
- Required: You must populate at least one required field with test data.
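The difference between the two field types can be expressed as a simple check. The following Python sketch is an illustration under stated assumptions and does not reflect how the asset validates input. The function check_row and the field names are hypothetical, and the sketch assumes that a value that contains only spaces counts as empty.

```python
def check_row(row, mandatory, required):
    """Return a list of problems for one row of test data.

    row maps field names to string values. mandatory and required are
    lists of field names. All names here are hypothetical.
    """
    def filled(field):
        # Assumption: a missing, None, or whitespace-only value is empty.
        return bool((row.get(field) or "").strip())

    # Every mandatory field must contain data.
    problems = [
        f"mandatory field '{name}' is empty"
        for name in mandatory
        if not filled(name)
    ]
    # At least one of the required fields must contain data.
    if required and not any(filled(name) for name in required):
        problems.append(
            "at least one required field must be populated: "
            + ", ".join(required)
        )
    return problems


# Illustrative rows: the first satisfies both rules, the second satisfies neither.
good = {"person_name": "Jon Smith", "address": "", "phone": "555-0100"}
bad = {"person_name": "", "address": "", "phone": ""}
print(check_row(good, ["person_name"], ["address", "phone"]))
print(check_row(bad, ["person_name"], ["address", "phone"]))
```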