Deduplication objectives

The Objective option on the Deduplication tab defines the type of identity that the Deduplicate transformation analyzes when you run a mapping with the transformation. The objective also indicates the types of information that the transformation expects to read for the identity. When you configure the transformation, you map the input fields that contain the identity information to the most appropriate fields on the deduplicate asset.
Each objective supports one or more index key fields. You select an index key on the Deduplication tab in the asset. Some objectives define similar identity types, for example Family and Household. In each case, the deduplication process uses unique comparison logic.
The set of potential input fields on an objective contains at least one mandatory field and may also identify one or more required fields. To get the most accurate results from the deduplication operations, map each mandatory field on the asset to an appropriate input field on the transformation. Likewise, map at least one required field on the asset to an appropriate input field on the transformation. The mandatory and required input field names indicate the types of information that the input fields must contain.
Each objective also supports a set of input fields that can contain additional data about the identity. Optionally, map any additional asset field to a transformation field that contains the appropriate information. To optimize the analysis of the identity data, ensure that the Deduplicate transformation reads as many of the fields as possible.
For more information about mandatory and required fields on each objective, see Field types and identity objectives.
The following table describes the objectives that you can select and identifies the fields that you can select as the index key field:
Identity objective | Description | Index keys
Address | Identifies records that share an address. | Address Part 1, Code, Date, Exact, Geocode, Telephone Number
Address Code | Identifies records that share an address code. | Address Part 1
Author ISBN | Identifies records that share information about an author who published work with an ISBN. | Email, Exact, Person Name, ISBN10, ISBN13
CC Issuer | Identifies records that share information about a credit card issuer. | Company Name, Credit Card, Exact, Organization Name
CC Owner | Identifies records that share information about a credit card holder. | Address Part 1, Credit Card, Date, Email, Exact, Geocode, Person Name, Telephone Number
Contact | Identifies records that share a contact at a single organization and location. | Address Part 1, Code, Company Name, Date, Email, Exact, Geocode, Person Name, Organization Name, Telephone Number
Corp Entity | Identifies records that share corporate identification data. | Address Part 1, Code, Company Name, Date, Exact, Geocode, Organization Name, Telephone Number
Division | Identifies records that share an office location within an organization. | Address Part 1, Code, Company Name, Exact, Geocode, Organization Name, Telephone Number
Family | Identifies individuals that belong to the same family. | Address Part 1, Code, Email, Exact, Geocode, Person Name, Telephone Number
Fields | Identifies records that share identity data across multiple fields that you select. | Address Part 1, Code, Company Name, Credit Card, Date, Email, Exact, Geocode, ISBN10, ISBN13, Organization Name, Person Name, Product Description, Product Name, Telephone Number, VIN, [Generic field]
Generic | Identifies records that share identity information when different types of identity are represented in a single field. Select the objective for records that contain Australia, United Kingdom, or United States data. Note: The Generic field that you select might contain person names, organization names, or addresses. The objective looks for common values across these information types in the field. | [Generic field]
Geocode | Identifies records that share geocode data. | Geocode
Household | Identifies individuals that belong to the same household. | Address Part 1, Code, Email, Exact, Geocode, Person Name, Telephone Number
Individual | Identifies duplicate individuals. | Date, Code, Email, Exact, Person Name
Organization | Identifies records that share organization data. | Address Part 1, Code, Company Name, Date, Exact, Geocode, Organization Name, Telephone Number
Person Name | Identifies records that share information about a person. | Address Part 1, Code, Date, Email, Exact, Geocode, Person Name, Telephone Number
Product | Identifies records that share information about a product. | Code, Company Name, Exact, Organization Name, Product Description, Product Name
Publisher ISBN | Identifies records that share information about a publishing company. The information includes ISBN data for published works. | Address Part 1, Company Name, Exact, Geocode, ISBN10, ISBN13, Organization Name
Resident | Identifies duplicate individuals at the same address. | Address Part 1, Code, Date, Email, Exact, Geocode, Person Name, Telephone Number
VIN Manufacturer | Identifies records that share information about a vehicle manufacturer. | Address Part 1, Company Name, Exact, Geocode, Organization Name, VIN
VIN Owner | Identifies records that share information about a vehicle owner. | Address Part 1, Code, Company Name, Date, Email, Exact, Geocode, Organization Name, Person Name, VIN
Wide Contact | Identifies records that share a contact at an organization. | Code, Company Name, Email, Exact, Organization Name, Person Name
Wide Household | Identifies individuals that belong to the same household. | Address Part 1, Code, Email, Exact, Geocode, Person Name, Telephone Number
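
The relationship between an objective and its index keys can be pictured as a simple lookup. The following Python sketch is illustrative only: it is not part of the product, and the names INDEX_KEYS, is_valid_index_key, and objectives_supporting are hypothetical. The data is copied from four rows of the index key table in this topic.

```python
# Hypothetical lookup that copies four rows from the index key table.
# This is an illustration, not a product API.
INDEX_KEYS = {
    "Address Code": ["Address Part 1"],
    "Geocode": ["Geocode"],
    "Individual": ["Date", "Code", "Email", "Exact", "Person Name"],
    "Product": [
        "Code", "Company Name", "Exact", "Organization Name",
        "Product Description", "Product Name",
    ],
}


def is_valid_index_key(objective: str, index_key: str) -> bool:
    """Return True if the table lists index_key for the objective."""
    return index_key in INDEX_KEYS.get(objective, [])


def objectives_supporting(index_key: str) -> list[str]:
    """Return the objectives (sorted by name) that list index_key."""
    return sorted(
        name for name, keys in INDEX_KEYS.items() if index_key in keys
    )
```

For example, is_valid_index_key("Individual", "Geocode") returns False because the table does not list Geocode as an index key for the Individual objective.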

Field types and identity objectives

Each objective that you select on the Deduplication tab contributes data from one or more input fields to the process of identity analysis.
An objective can contribute data from one or more of the following field types:
Mandatory field
A field that contains essential information for the type of analysis that the objective specifies. Map every mandatory field to an input on the Deduplicate transformation.
Required field
A field that contains high-priority information for the type of analysis that the objective specifies. Map at least one required field to an input on the Deduplicate transformation.
Optional field
A field that contains additional information about the identity that might be useful during analysis. You can optionally map these fields to inputs on the Deduplicate transformation.
When you select an objective on the Deduplication tab, the test pane displays the mandatory and required fields on the objective. The test pane displays an asterisk (*) beside any mandatory field. The test pane displays a plus symbol (+) beside any required field.
An objective can specify any combination of one or more mandatory, required, or optional fields. The combination of fields can depend on the index key that you select for the objective. The test pane on the Deduplication tab displays the current mandatory and required fields for the objective and index key that you select. The index key is always a mandatory field for an objective.
Tip: If you browse the objectives, the index keys for the objectives can change. Before you save and close a deduplicate asset, verify that the Deduplication tab displays the index key that you intend for the objective that you select.
For more information about the fields that you can choose in each objective, see Mandatory, required, and optional fields.
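
The mapping rule for the field types (map every mandatory field, and map at least one required field) can be sketched as a small check. The following Python example is a minimal sketch, not product code: ObjectiveFields, check_mapping, and CONTACT are hypothetical names, and the Contact values are assumed from the default field list for the Contact objective in this documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectiveFields:
    """Hypothetical model of the field types on one objective."""
    mandatory: frozenset[str]
    required: frozenset[str] = frozenset()


def check_mapping(spec: ObjectiveFields, mapped: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the rule is met."""
    problems = []
    # Every mandatory field must be mapped.
    missing = sorted(spec.mandatory - mapped)
    if missing:
        problems.append("unmapped mandatory fields: " + ", ".join(missing))
    # If the objective has required fields, at least one must be mapped.
    if spec.required and not (spec.required & mapped):
        problems.append(
            "map at least one required field: "
            + ", ".join(sorted(spec.required))
        )
    return problems


# Assumed defaults for the Contact objective.
CONTACT = ObjectiveFields(
    mandatory=frozenset({"Address Part 1", "Person Name"}),
    required=frozenset({"Company Name", "Organization Name"}),
)
```

With these assumptions, check_mapping(CONTACT, {"Address Part 1", "Person Name", "Company Name"}) returns an empty list, while a mapping that contains only Person Name reports both the missing mandatory field and the missing required field.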

Mandatory, required, and optional fields

Each objective on the Deduplication tab specifies a set of fields that the deduplication process can analyze at run time. To get the best results from the deduplication operation, provide input data for every mandatory field that the objective specifies and provide data for at least one required field. You can also provide data for the other fields that the objective specifies.
Note: The mandatory and required fields can vary according to the index key that you select for the objective. The current index key on an objective is always a mandatory field. For example, the deduplicate asset specifies Organization Name as a single index key option for the CC Issuer objective, which means that Organization Name is mandatory in all cases for CC Issuer.
The following tables list the default sets of mandatory, required, and optional fields on each objective:

Address

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Date | Optional
Exact | Optional
Geocode | Optional
Telephone Number | Optional

Address Code

Field | Field type
Address Part 1 | Mandatory

Author ISBN

Field | Field type
Email | Optional
Exact | Optional
Person Name | Mandatory
ISBN10 | Required, unless you specify ISBN13
ISBN13 | Required, unless you specify ISBN10

CC Issuer

Note: Because Organization Name is the single index key option on the CC Issuer objective, Organization Name becomes a mandatory field in addition to Credit Card.
Field | Field type
Company Name | Required, unless you specify Organization Name
Credit Card | Mandatory
Exact | Optional
Organization Name | Required, unless you specify Company Name

CC Owner

Field | Field type
Address Part 1 | Optional
Credit Card | Mandatory
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Telephone Number | Optional

Contact

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Company Name | Required, unless you specify Organization Name
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Organization Name | Required, unless you specify Company Name
Telephone Number | Optional

Corp Entity

Note: As Organization Name is the single required field on the Corp Entity objective, Organization Name becomes a mandatory field.
Field | Field type
Address Part 1 | Optional
Code | Optional
Company Name | Required, unless you specify Organization Name
Date | Optional
Exact | Optional
Geocode | Optional
Organization Name | Required, unless you specify Company Name
Telephone Number | Optional

Division

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Company Name | Required, unless you specify Organization Name
Exact | Optional
Geocode | Optional
Organization Name | Required, unless you specify Company Name
Telephone Number | Optional

Family

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Telephone Number | Mandatory

Fields

Note: In the core field list for the Fields objective, all fields are optional. The deduplicate asset identifies the current index key as a mandatory field.
Field | Field type
Address Part 1 | Optional
Code | Optional
Company Name | Optional
Credit Card | Optional
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
ISBN10 | Optional
ISBN13 | Optional
Organization Name | Optional
Person Name | Optional
Product Description | Optional
Product Name | Optional
Telephone Number | Optional
VIN | Optional
Generic Field | Optional
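
The way the current index key overrides the default field type can be sketched as follows. This Python example is a minimal illustration that models only the one rule stated in this topic, namely that the current index key is always a mandatory field. The function name effective_field_types and the FIELDS_DEFAULTS subset are hypothetical, and the product might change other field types as well when the index key changes.

```python
def effective_field_types(
    defaults: dict[str, str], index_key: str
) -> dict[str, str]:
    """Return a copy of the default field types with the index key
    promoted to Mandatory. The defaults are left unchanged."""
    if index_key not in defaults:
        raise KeyError(f"{index_key!r} is not a field on this objective")
    effective = dict(defaults)
    effective[index_key] = "Mandatory"
    return effective


# A subset of the default field list for the Fields objective,
# where every field is optional by default.
FIELDS_DEFAULTS = {
    "Email": "Optional",
    "Person Name": "Optional",
    "Telephone Number": "Optional",
}
```

With Person Name as the index key, effective_field_types(FIELDS_DEFAULTS, "Person Name") marks Person Name as Mandatory and leaves Email and Telephone Number as Optional.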

Generic

Field | Field type
Generic Field | Mandatory

Geocode

Field | Field type
Geocode | Mandatory

Household

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Telephone Number | Optional

Individual

Field | Field type
Date | Required, unless you specify ID
Code | Optional
Email | Optional
Exact | Optional
Person Name | Mandatory

Organization

Field | Field type
Address Part 1 | Optional
Code | Optional
Company Name | Required, unless you specify Organization Name
Date | Optional
Exact | Optional
Geocode | Optional
Organization Name | Required, unless you specify Company Name
Telephone Number | Optional

Person Name

Field | Field type
Address Part 1 | Optional
Code | Optional
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Telephone Number | Optional

Product

Field | Field type
Code | Optional
Company Name | Optional
Exact | Optional
Organization Name | Optional
Product Description | Required
Product Name | Required

Publisher ISBN

Field | Field type
Address Part 1 | Optional
Company Name | Required, unless you specify Organization Name
Exact | Optional
Geocode | Optional
ISBN10 | Required, unless you specify ISBN13
ISBN13 | Required, unless you specify ISBN10
Organization Name | Required, unless you specify Company Name

Resident

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Mandatory
Telephone Number | Optional

VIN Manufacturer

Field | Field type
Address Part 1 | Optional
Company Name | Required, unless you specify Organization Name
Exact | Optional
Geocode | Optional
Organization Name | Required, unless you specify Company Name
VIN | Mandatory

VIN Owner

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Company Name | Required, unless you specify Organization Name
Date | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Organization Name | Required, unless you specify Company Name
Person Name | Optional
VIN | Mandatory

Wide Contact

Field | Field type
Code | Optional
Company Name | Required, unless you specify Organization Name
Email | Optional
Exact | Optional
Organization Name | Required, unless you specify Company Name
Person Name | Mandatory

Wide Household

Field | Field type
Address Part 1 | Mandatory
Code | Optional
Email | Optional
Exact | Optional
Geocode | Optional
Person Name | Optional
Telephone Number | Mandatory