Match Transformation Ports
The Match transformation includes a set of predefined input ports and output ports that contain metadata for the match analysis operations that you define. The transformation selects or clears the ports when you configure the match type and match output options.
When you configure the transformation, review the metadata ports. When you add the transformation to a mapping, verify that you connect the metadata ports to the correct ports on the upstream and downstream mapping objects.
Match Transformation Input Ports
The predefined input ports contain the metadata that the transformation requires for match analysis.
After you create a Match transformation, you can configure the following input ports:
- SequenceId
- Unique identifier for each record in the mapping source data set. Every record in an input data set must include a unique sequence identifier. If a data set contains duplicate sequence identifiers, the Match transformation cannot identify duplicate records correctly. Use the Key Generator transformation to create unique identifiers if none exist in the data.
- When you create an index data store for identity data, the Match transformation adds the sequence identifier for each record to the data store. When you configure the transformation to compare a data source with the index data store, the transformation might find a common sequence identifier in both data sets. The transformation can analyze the sequence identifiers if they are unique in the respective data sets.
- GroupKey
- Key value that identifies the group to which the record belongs.
Note: To improve mapping performance, configure the GroupKey input port and the output port that connects to it with the same precision value.
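Because duplicate sequence identifiers prevent the transformation from identifying duplicate records correctly, it can help to check the source data before you run the mapping. The following Python sketch shows one way to do so outside the product. It assumes that the records are available as dictionaries with a SequenceId field. The function name and the sample rows are hypothetical and are not part of the Match transformation.

```python
from collections import Counter


def find_duplicate_sequence_ids(records, id_field="SequenceId"):
    """Return the sequence identifiers that occur more than once, sorted."""
    counts = Counter(record[id_field] for record in records)
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Hypothetical sample rows; only the SequenceId field matters here.
rows = [
    {"SequenceId": 1, "name": "Ann Lee"},
    {"SequenceId": 2, "name": "Anne Lee"},
    {"SequenceId": 2, "name": "Bob Ray"},
]

duplicates = find_duplicate_sequence_ids(rows)
if duplicates:
    print("Duplicate sequence identifiers:", duplicates)
```

If the check returns any values, regenerate the identifiers, for example with the Key Generator transformation, before you connect the data to the SequenceId port.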
Match Transformation Output Ports
The predefined output ports contain metadata about the analysis that the transformation performs.
After you create a Match transformation, you can configure the following output ports:
- GroupKey
- Key value that identifies the group to which the record belongs.
- Downstream transformations such as the Association transformation can read the group key value.
- ClusterId
- The identifier of the cluster to which the record belongs. Used in cluster output.
- ClusterSize
- The number of records in the cluster to which a record belongs. When a record matches no other record, it forms a cluster of its own and the cluster size is 1. Used in cluster output.
- RowId and RowId1
- A unique row identifier for the record. The Match transformation uses the row identifier to identify the row during the match analysis operations. The identifier might not match the row number in the input data.
- DriverId
- The row identifier of the driver record in a cluster. Used in cluster output. The driver record is the record in the cluster with the highest value on the SequenceId input port.
- DriverScore
- The transformation assigns a driver score in matched pair output and clustered output. In a matched pair, the driver score is the match score between the pair of records. In a cluster, the driver score is the match score between the current record and the driver record in the cluster.
- LinkId
- The row identifier of the record that matched with the current record and linked it to the cluster. Used in cluster output.
- LinkScore
- The match score between two records that results in the creation of a cluster or the addition of a record to a cluster. The LinkId port identifies the record with which the current record shares the link score. Used in cluster output.
- PersistenceStatus
- An eight-character code that represents the results of the match analysis on an input record. Used in single-source identity analysis when the transformation compares the data source to an index data store.
- The transformation populates the first three characters in the code. The transformation can return different characters at each position. The transformation returns 0 for positions four through eight.
- When you configure the transformation to generate matched pair output, the transformation creates a PersistenceStatus port and a PersistenceStatus1 port.
- PersistenceStatusDesc
- A text description of the persistence status code values. Used in single-source identity analysis when the transformation compares the data source to an index data store.
- When you configure the transformation to generate matched pair output, the transformation creates a PersistenceStatusDesc port and a PersistenceStatusDesc1 port.
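The relationship between the ClusterId, ClusterSize, and DriverId ports can be illustrated with a short Python sketch. The sketch does not perform match analysis and does not reproduce how the transformation works internally. It assumes that the cluster assignments already exist, and it derives the cluster size and the driver row from the port definitions above. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def summarize_clusters(rows):
    """Map each ClusterId to its ClusterSize and DriverId."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row["ClusterId"]].append(row)

    summary = {}
    for cluster_id, members in clusters.items():
        # The driver record has the highest SequenceId value in the cluster.
        driver = max(members, key=lambda r: r["SequenceId"])
        summary[cluster_id] = {
            "ClusterSize": len(members),
            "DriverId": driver["RowId"],  # DriverId is a row identifier
        }
    return summary


# Hypothetical cluster output rows.
rows = [
    {"RowId": 10, "SequenceId": 1, "ClusterId": 1},
    {"RowId": 11, "SequenceId": 2, "ClusterId": 1},
    {"RowId": 12, "SequenceId": 3, "ClusterId": 2},
]

summary = summarize_clusters(rows)
```

In the sample data, cluster 2 contains a single record, so its cluster size is 1 and the record is its own driver.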
Persistence Status Codes and Persistence Status Descriptions
The persistence status codes and the persistence status descriptions describe the relationship between the different types of index data that the Match transformation analyzes. The transformation generates the status codes and the status descriptions when you configure the transformation to read a persistent identity data store.
The transformation writes the persistence status code to the PersistenceStatus port. The code contains eight characters. The transformation populates the first three positions in the string with code values. The transformation returns 0 for positions four through eight.
The transformation writes the persistence status description to the PersistenceStatusDesc port. The description contains three comma-separated text strings that describe the values in the first three positions in the persistence status code.
The transformation uses the sequence identifier values from the source data records to compare the index data for the two data sets.
The following table describes the types of information that the transformation writes at each position in the status description and the status code:
Position | Description |
---|---|
1 | Identifies the data set that contains the record. |
2 | Indicates the duplicate status of the record. The transformation looks for common sequence identifiers between the transformation input data and the index data store. |
3 | Describes any action that the transformation performs on the data. |
4-8 | The status code contains 0 at each position. The status description does not contain text for the positions. |
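The positional layout in the table can be checked in downstream code. The following Python sketch splits a status code and a status description into their three populated positions. It assumes that the values are read from the PersistenceStatus and PersistenceStatusDesc ports as plain strings. The function names and the dictionary keys are hypothetical.

```python
def split_status_code(code):
    """Split an eight-character status code into its three populated positions."""
    if len(code) != 8:
        raise ValueError(f"expected 8 characters, got {len(code)}")
    if code[3:] != "00000":
        raise ValueError("positions four through eight must contain 0")
    return {
        "data_set": code[0],           # position 1
        "duplicate_status": code[1],   # position 2
        "data_store_action": code[2],  # position 3
    }


def split_status_description(description):
    """Split the comma-separated status description into its text strings."""
    return [part.strip() for part in description.split(",")]


code_parts = split_status_code("INA00000")
desc_parts = split_status_description("Input, New, Added")
```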
Status Code Values and Status Description Values
The persistence status codes and the persistence status descriptions describe the relationship between the transformation input records and the records that the data store represents. The transformation uses sequence identifier values to identify the records and to determine the relationship between the records in the data sets.
The persistence status codes and the persistence status descriptions have a common structure. The status codes and the status descriptions contain the same information at each position in the output data string.
Data Set Status
The first value in the status code and in the status description identifies the data set that contains the record.
The following table describes the status codes and the status descriptions that the transformation can return in the first position:
Status Code | Status Description |
---|---|
S | Store. The current record originates in the index data store. |
I | Input. The current record originates in the transformation input data. |
Duplicate Record Status
The second value in the status code and in the status description describes the relationship between the transformation index data and the persistent data store.
The following table describes the status codes and the status descriptions that the transformation can return in the second position:
Status Code | Status Description |
---|---|
A | Absent. The index data store does not contain data for the current record. |
E | Exists. The current record is present in the index data store and in the transformation input data. |
I | Invalid. The transformation cannot analyze the current record. For example, the transformation cannot generate index data for the record because the key field on the Match Type tab is not compatible with the record data. |
N | New. The record is present in the data source. |
0 | [Dash] The record is present in the index data store. |
Data Store Status
The third value in the status code and in the status description describes any action that the transformation performs on the index data tables.
The following table describes the status codes and the status descriptions that the transformation can return in the third position:
Status Code | Status Description |
---|---|
A | Added. The transformation adds the index data for the current input record to the persistent data store. The transformation input data and the persistent index data have different sequence identifiers. |
I | Ignored. The transformation does not add any index data for the current input record to the persistent data store. |
N | The transformation returns one of the following descriptions: No change. The current record originates in the persistent data store, and the transformation takes no action. Not added. The transformation does not update the persistent data store with any data for the current input record because of the match policy that you defined. |
R | Removed. The transformation removes the index data for the record from the index data store. |
U | Updated. The transformation updates the rows in the persistent data store with index data from the transformation input record. The transformation input data and the persistent index data have common sequence identifiers. |
Persistence Status Description Example
The persistence status code INA00000 has the following persistence status description:
Input, New, Added
The status code and the status description contain the following information about the record:
- The record originates in the transformation input data.
- The persistent data store does not contain a copy of the record.
- The transformation adds the index data for the record to the persistent data store.
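The mapping from a status code to a status description can be sketched as a lookup on the three status tables. The following Python example is an illustration only and is not the transformation's own logic. It makes two assumptions: the third-position code N is read as "No change" for a store record and as "Not added" for an input record, and the description for the second-position code 0, which the table shows as [Dash], is rendered as a hyphen. The names in the code are hypothetical.

```python
# Lookup tables that mirror the three status tables.
DATA_SET = {"S": "Store", "I": "Input"}
DUPLICATE_STATUS = {
    "A": "Absent",
    "E": "Exists",
    "I": "Invalid",
    "N": "New",
    "0": "-",  # assumption: the [Dash] description is rendered as a hyphen
}
DATA_STORE_ACTION = {"A": "Added", "I": "Ignored", "R": "Removed", "U": "Updated"}


def describe_status(code):
    """Build a status description from the first three positions of a code."""
    first, second, third = code[0], code[1], code[2]
    if third == "N":
        # Assumption: N means "No change" for a record that originates in
        # the data store and "Not added" for an input record.
        action = "No change" if first == "S" else "Not added"
    else:
        action = DATA_STORE_ACTION[third]
    return ", ".join([DATA_SET[first], DUPLICATE_STATUS[second], action])


description = describe_status("INA00000")
print(description)  # Input, New, Added
```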
Output Ports and Match Output Selection
The match output options that you select determine the output ports on the transformation. For example, the transformation creates a ClusterId port and ClusterSize port when you select a clustered output type.
Select the type of transformation output that you need, and review the ports on the transformation.
If you update the match output type, verify the output port configuration on the transformation after you do so. If you use the transformation in a mapping, you might need to reconnect the output ports to the downstream objects in the mapping.
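When you review the port configuration after a change to the output type, one approach is to compare the links that the mapping uses with the ports that the transformation currently exposes. The following Python sketch shows the comparison. It assumes that you can list the linked output ports and the current output ports by name. The function name, the target port names, and the port sets are illustrative and do not represent the complete port list for any output type.

```python
def find_broken_links(links, current_ports):
    """Return the linked output ports that the transformation no longer exposes."""
    return sorted(port for port in links if port not in current_ports)


# Hypothetical links from Match output ports to downstream ports.
links = {
    "ClusterId": "tgt_cluster_id",
    "ClusterSize": "tgt_cluster_size",
    "DriverScore": "tgt_score",
}

# Illustrative port set after a change from clustered output to matched pair output.
ports_after_change = {"RowId", "RowId1", "DriverScore"}

broken = find_broken_links(links, ports_after_change)
print(broken)  # ['ClusterId', 'ClusterSize']
```

Each port that the check returns identifies a link that you might need to remove or reconnect in the mapping.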