Association Transformation Overview
The Association transformation processes output data from a Match transformation. It creates links between duplicate records that are assigned to different match clusters, so that these records can be associated together in data consolidation and master data management operations.
The Association transformation generates an AssociationID value for each row in a group of associated records and writes the ID values to an output port.
The Consolidation transformation reads the output from the Association transformation. Use a Consolidation transformation to create a master record based on records with common association ID values.
The Association transformation accepts string and numerical data values on input ports. If you add an input port of another data type, the transformation converts the port data values to strings.
The AssociationID output port writes integer data. The transformation can write string data on an AssociationID port if the transformation was configured in an earlier version of Informatica Data Quality.
Example: Associating Match Transformation Outputs
The following table contains three records that could identify the same individual:
ID | Name | Address | City | State | ZIP | SSN |
---|---|---|---|---|---|---|
1 | David Jones | 100 Admiral Ave. | New York | NY | 10547 | 987-65-4321 |
2 | Dennis Jones | 1000 Alberta Ave. | New Jersey | NY | - | 987-65-4321 |
3 | D. Jones | Admiral Ave. | New York | NY | 10547-1521 | - |
A duplicate analysis operation defined in a Match transformation does not identify all three records as duplicates of each other, for the following reasons:
- If you define a duplicate search on name and address data, records 1 and 3 are identified as duplicates but record 2 is omitted.
- If you define a duplicate search on name and Social Security number data, records 1 and 2 are identified as duplicates but record 3 is omitted.
- If you define a duplicate search on all three attributes (name, address, and Social Security number), the Match transformation may identify none of the records as matches.
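The Association transformation links data from different match clusters, so that records that share a cluster ID in any cluster ID column are given a common AssociationID value. In this example, records 1 and 3 share a name and address cluster, and records 1 and 2 share a name and SSN cluster. Because record 1 belongs to both clusters, all three records are given the same AssociationID value, as shown in the following table: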
ID | Name | Address | City | State | ZIP | SSN | Name and Address Cluster ID | Name and SSN Cluster ID | Association ID |
---|---|---|---|---|---|---|---|---|---|
1 | David Jones | 100 Admiral Ave. | New York | NY | 10547 | 987-65-4321 | 1 | 1 | 1 |
2 | Dennis Jones | 1000 Alberta Ave. | New Jersey | NY | - | 987-65-4321 | 2 | 1 | 1 |
3 | D. Jones | Admiral Ave. | New York | NY | 10547-1521 | - | 1 | 2 | 1 |
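The linking behavior in this example can be illustrated with a short Python sketch. It is not the transformation's actual implementation: it assumes that association is equivalent to grouping records that share a cluster ID in any cluster column, directly or through an intermediate record, and it uses a union-find structure to do so. The function name `assign_association_ids` and the dictionary keys are hypothetical.

```python
def assign_association_ids(records, cluster_columns):
    """Return one association ID per record, in input order.

    Assumption: two records belong together when they share a cluster ID
    in any of the given columns, directly or through a chain of records.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Keep the lower index as the root so numbering is stable.
            parent[max(root_i, root_j)] = min(root_i, root_j)

    for column in cluster_columns:
        first_seen = {}  # cluster ID -> index of first record in that cluster
        for index, record in enumerate(records):
            cluster_id = record[column]
            if cluster_id in first_seen:
                union(first_seen[cluster_id], index)
            else:
                first_seen[cluster_id] = index

    # Number the groups 1, 2, 3, ... in order of first appearance.
    group_numbers = {}
    result = []
    for index in range(len(records)):
        root = find(index)
        if root not in group_numbers:
            group_numbers[root] = len(group_numbers) + 1
        result.append(group_numbers[root])
    return result


# Cluster IDs taken from the table above (hypothetical key names).
records = [
    {"id": 1, "name_address_cluster": 1, "name_ssn_cluster": 1},
    {"id": 2, "name_address_cluster": 2, "name_ssn_cluster": 1},
    {"id": 3, "name_address_cluster": 1, "name_ssn_cluster": 2},
]
association_ids = assign_association_ids(
    records, ["name_address_cluster", "name_ssn_cluster"]
)
print(association_ids)  # [1, 1, 1]
```

Record 1 acts as the bridge: it shares a name and address cluster with record 3 and a name and SSN cluster with record 2, so all three records end up in one group.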
You can consolidate the duplicate record data in the Consolidation transformation.
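As a rough illustration of what consolidation on a common association ID involves, the following Python sketch groups rows by association ID and builds one master record per group. The rule it applies (keep the first non-blank value seen for each field) is an assumption chosen for simplicity, not a description of the strategies that the Consolidation transformation offers. The function name `consolidate` and the field names are hypothetical, and only a few fields from the example are shown.

```python
def consolidate(records, id_field="association_id"):
    """Build one master record per association ID.

    Assumed rule for illustration only: for each field, keep the first
    non-blank value found among the records in the group.
    """
    masters = {}
    for record in records:
        master = masters.setdefault(record[id_field], {})
        for field, value in record.items():
            if field == id_field:
                continue
            # Fill the field only if the master has no non-blank value yet.
            if not master.get(field):
                master[field] = value
    return masters


# A subset of the example data; blank values are written as empty strings.
rows = [
    {"association_id": 1, "name": "David Jones", "city": "New York", "zip": "10547"},
    {"association_id": 1, "name": "Dennis Jones", "city": "New Jersey", "zip": ""},
    {"association_id": 1, "name": "D. Jones", "city": "New York", "zip": "10547-1521"},
]
masters = consolidate(rows)
print(masters[1])
```

Under this rule the master record takes every value from record 1, because record 1 has no blank fields. A different rule, such as preferring the most frequent or the most complete value, would produce a different master record.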