Quality Metrics

You can add one or more of the following quality metrics to a Data Assessment task:

Address Validation Quality Metric

Use the Address Validation quality metric to validate United States and Canada address information for a Salesforce object. The metric determines the percentage of address-related fields for a Salesforce object that have valid address information.
When you create the Data Assessment task, indicate if you want the Data Assessment task to perform address validation. If you include address validation, select the type of addresses to be validated. By default, the plan validates shipping and billing addresses. You can validate addresses in custom Salesforce fields by mapping them to the billing or shipping address plan fields.
When validating addresses, the Data Assessment task compares the address information in each field selected for address validation against the address reference datasets provided with Informatica Cloud. The Data Assessment task treats the address value in a field as not valid if the value does not match a value in the address reference datasets. The validation check is not case sensitive. The Data Assessment task counts null values in address fields as not valid.
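The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `percent_valid`, the tiny reference set, and the sample field values are hypothetical, and a plain set lookup stands in for the real address reference datasets, which are considerably richer.

```python
from typing import Iterable, Optional, Set


def percent_valid(values: Iterable[Optional[str]], reference: Set[str]) -> float:
    """Return the percentage of values that match an entry in the reference set."""
    # Normalize the reference once so that the comparison ignores case.
    normalized = {entry.casefold() for entry in reference}
    items = list(values)
    if not items:
        return 0.0
    valid = sum(
        1
        for value in items
        # A null (None) value is counted as not valid.
        if value is not None and value.casefold() in normalized
    )
    return 100.0 * valid / len(items)


# Hypothetical reference data and field values, for illustration only.
reference_cities = {"Redwood City", "Toronto"}
field_values = ["redwood city", "TORONTO", None, "Atlantis"]
result = percent_valid(field_values, reference_cities)
```

In this sketch two of the four values match the reference regardless of case, while the null value and the unmatched value count as not valid.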

Completeness Quality Metric

Use the Completeness quality metric to verify that each field does not contain blank or null values. The metric determines the percentage of fields for a Salesforce object that do not have blank or null values. The Data Assessment task can validate completeness of all types of fields for a Salesforce object.
When you create the Data Assessment task, indicate whether you want the Data Assessment task to perform a completeness check. If you include the Completeness quality metric, you can select the fields to check for completeness. For example, you can omit a field that is rarely populated in Salesforce and that is not important to your organization.
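As a rough sketch of how such a percentage could be computed, the Python below counts populated values across the selected fields only. The helper names, the sample records, and the choice to treat whitespace-only strings as blank are assumptions made for illustration, not documented behavior of the Data Assessment task.

```python
from typing import Any, Mapping, Sequence


def is_blank(value: Any) -> bool:
    """Treat None and empty strings as blank (whitespace-only is an assumption)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness_percent(
    records: Sequence[Mapping[str, Any]], fields: Sequence[str]
) -> float:
    """Return the percentage of selected field values that are populated."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    populated = sum(
        1
        for record in records
        for field in fields
        if not is_blank(record.get(field))
    )
    return 100.0 * populated / total


# Hypothetical Account records; Fax is rarely populated.
accounts = [
    {"Name": "Acme", "Phone": "555-0100", "Fax": None},
    {"Name": "Globex", "Phone": "", "Fax": None},
]
with_fax = completeness_percent(accounts, ["Name", "Phone", "Fax"])
without_fax = completeness_percent(accounts, ["Name", "Phone"])
```

Leaving the rarely populated Fax field out of the selection raises the result from 50 to 75 percent in this example, which is why omitting unimportant fields can make the metric more meaningful.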

Conformance Quality Metric

Use the Conformance quality metric to determine the percentage of fields for a Salesforce object that conform to a predefined format. Conformance applies to particular types of fields. When you create the Data Assessment task, indicate if you want the Data Assessment task to perform a conformance check. If you include the Conformance quality metric, you can select the fields to be verified for conformance.

Duplicates Quality Metric

Use the Duplicates quality metric to determine whether there are duplicate records for a Salesforce object. The metric determines the percentage of duplicate records for a given Salesforce object. The Data Assessment task determines whether records in the same group are duplicates based on a patented fuzzy matching algorithm, field weights, and a threshold. The matching algorithm is not case sensitive.
Before comparing records, the Data Assessment task groups the records of each Salesforce object based on a field. The Data Assessment task then compares records within each group based on the matching algorithm, field weights, and threshold. If two records are in different groups, the Data Assessment task assumes the records are not duplicates of each other.
The following table lists the field used to group records for each type of Salesforce object:
Salesforce Object    Field to Group By
Account              BillingPostalCode
Contact              MailingPostalCode
Lead                 PostalCode
Opportunity          Probability
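The grouping step can be pictured with the following Python sketch, which only pairs up records that share the group-by field value for their object type. The function name `candidate_pairs` and the sample records are hypothetical, and placing records with a missing group-by value together in one group is an assumption; this guide does not say how such records are handled.

```python
from collections import defaultdict
from itertools import combinations
from typing import Any, Dict, Iterator, List, Tuple

# Group-by field for each Salesforce object type, as listed in the table above.
GROUP_BY_FIELD = {
    "Account": "BillingPostalCode",
    "Contact": "MailingPostalCode",
    "Lead": "PostalCode",
    "Opportunity": "Probability",
}

Record = Dict[str, Any]


def candidate_pairs(
    object_type: str, records: List[Record]
) -> Iterator[Tuple[Record, Record]]:
    """Yield only the record pairs that fall in the same group."""
    field = GROUP_BY_FIELD[object_type]
    groups: Dict[Any, List[Record]] = defaultdict(list)
    for record in records:
        # Assumption: records without a value share a single None group.
        groups[record.get(field)].append(record)
    for members in groups.values():
        # Records in different groups are never compared with each other.
        yield from combinations(members, 2)


# Hypothetical Account records.
accounts = [
    {"Id": "A1", "BillingPostalCode": "94063"},
    {"Id": "A2", "BillingPostalCode": "94063"},
    {"Id": "A3", "BillingPostalCode": "10001"},
]
pairs = [(a["Id"], b["Id"]) for a, b in candidate_pairs("Account", accounts)]
```

Only A1 and A2 are paired for comparison here. A3 has a different billing postal code, so it is never compared with the other two.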
When you create the Data Assessment task, specify the fields of a record to compare to determine whether two records are duplicates. You can configure the Data Assessment task to compare all or some of the fields in a record.
Customize the weights assigned to fields to specify how significant the fields are in determining duplicate records. You can also specify the threshold at which records are considered duplicates. The Data Assessment task uses a fuzzy matching algorithm to determine duplicates.
When comparing two records, the Data Assessment task compares the values for each field that is included in the Duplicates quality metric. The Data Assessment task assigns a score to each field based on how closely the values match. Each score ranges from 0 to 1, where 0 indicates no match and 1 indicates an exact match. The Data Assessment task multiplies each score by the corresponding field weight and adds the products to determine a matching score for the two records. If the matching score is greater than or equal to the threshold, the Data Assessment task considers the records to be duplicates. The scorecard for the Data Assessment task shows the percentage of all records that are duplicates.
Note: The Data Assessment task uses a matching algorithm that is not case sensitive.
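The weighted comparison described above can be sketched in Python as follows. The patented matching algorithm is not documented in this guide, so the sketch substitutes `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. The function names and sample records are hypothetical, and the scores this stand-in produces will not match the scores the product produces.

```python
from difflib import SequenceMatcher
from typing import Mapping


def field_score(left: str, right: str) -> float:
    """Stand-in similarity in the range 0 to 1; not the product's algorithm."""
    # casefold() makes the comparison case insensitive.
    return SequenceMatcher(None, left.casefold(), right.casefold()).ratio()


def matching_score(
    rec_a: Mapping[str, str], rec_b: Mapping[str, str], weights: Mapping[str, float]
) -> float:
    """Add up score * weight over every field included in the comparison."""
    return sum(
        weight * field_score(rec_a[field], rec_b[field])
        for field, weight in weights.items()
    )


def is_duplicate(
    rec_a: Mapping[str, str],
    rec_b: Mapping[str, str],
    weights: Mapping[str, float],
    threshold: float,
) -> bool:
    """Records are duplicates when the score reaches or exceeds the threshold."""
    return matching_score(rec_a, rec_b, weights) >= threshold


# Hypothetical records and weights.
weights = {"AccountNumber": 50, "Name": 50}
first = {"AccountNumber": "A-100", "Name": "Acme Corp"}
second = {"AccountNumber": "a-100", "Name": "ACME CORP"}
third = {"AccountNumber": "Z-999", "Name": "Globex"}
```

With these sample values, `is_duplicate(first, second, weights, 65)` is true because the records differ only in case, while `is_duplicate(first, third, weights, 65)` is false.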

Example of Records in Different Groups

Two records for the Salesforce Account object have the same values for all fields, except the BillingPostalCode field. Regardless of field weights, threshold, and matching algorithm, the Data Assessment task organizes the records in different groups. The Data Assessment task assumes the records are not duplicates because they are in different groups.

Example of Records in Same Group

You create a Data Assessment task to determine the percentage of duplicate records for the Salesforce Account object.
You configure the following custom weights for the Data Assessment task fields for the Salesforce Account object:
Field                Weight
AccountNumber        50
Name                 40
BillingPostalCode    10
You set the Threshold Value field to 65.
The Data Assessment task places two Account records in the same group because they have the same BillingPostalCode value. Next, the Data Assessment task compares the values of the AccountNumber, Name, and BillingPostalCode fields for the two records.
The Data Assessment task assigns the following scores to the fields:
Field                Score    Description
AccountNumber        1        Account numbers are the same.
Name                 0.3      Account names are similar.
BillingPostalCode    1        Billing postal codes are the same.
The Data Assessment task uses the following calculation to determine the matching score for the two records:
1*50 + 0.3*40 + 1*10 = 72
The Data Assessment task compares the matching score, 72, against the Threshold Value, 65. The Data Assessment task considers the two records duplicates because the matching score for the two records is higher than or equal to the threshold.
Note: You can set different weights or a different threshold and have a different outcome with the same two records.
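To make the note concrete, the short Python sketch below recomputes the matching score from the weights and field scores used in this example and then checks it against two thresholds. The threshold of 80 is a hypothetical alternative chosen only to show a different outcome; the variable and function names are also illustrative.

```python
# Weights and per-field scores taken from the example above.
weights = {"AccountNumber": 50, "Name": 40, "BillingPostalCode": 10}
scores = {"AccountNumber": 1.0, "Name": 0.3, "BillingPostalCode": 1.0}


def matching_score(field_scores, field_weights):
    """Add the product of each field score and its weight."""
    return sum(field_scores[name] * field_weights[name] for name in field_weights)


score = matching_score(scores, weights)

# Threshold Value from the example: the records are duplicates.
duplicate_at_65 = score >= 65

# Hypothetical stricter threshold: the same records are not duplicates.
duplicate_at_80 = score >= 80
```

The same pair of records, with a matching score of 72, is reported as a duplicate at a threshold of 65 but not at a threshold of 80.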