Primary Key Discovery
Primary key discovery generates primary key candidates from the columns you specify.
A primary key is a column or combination of columns that uniquely identifies a row in a data source. Primary key discovery identifies the columns and combinations of columns that meet a specific confidence level. You can edit the confidence level, as well as the maximum number of columns to combine for primary key identification.
Primary key discovery can highlight potential data quality issues by identifying the non-unique rows in a primary key candidate. This is especially useful in cases where primary key discovery combines many columns, since non-conforming records are likely to contain duplicate information.
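The idea of testing column combinations against a confidence level can be illustrated with a minimal Python sketch. It is not the product's implementation. The function name `find_key_candidates`, its parameters, and the sample rows are hypothetical, and the sketch assumes that conformance means the percentage of rows whose key value occurs exactly once.

```python
from collections import Counter
from itertools import combinations


def find_key_candidates(rows, columns, max_key_columns=2, min_conformance=100.0):
    """Return (column_combination, conformance_percent) pairs that meet the threshold.

    Assumption: conformance is the percentage of rows whose key value
    occurs exactly once across all profiled rows.
    """
    candidates = []
    total = len(rows)
    if total == 0:
        return candidates
    # Try single columns first, then wider combinations up to the limit.
    for size in range(1, max_key_columns + 1):
        for combo in combinations(columns, size):
            counts = Counter(tuple(row[c] for c in combo) for row in rows)
            unique_rows = sum(1 for n in counts.values() if n == 1)
            conformance = 100.0 * unique_rows / total
            if conformance >= min_conformance:
                candidates.append((combo, conformance))
    return candidates


# Hypothetical sample data for illustration only.
rows = [
    {"id": 1, "region": "EU", "code": "A"},
    {"id": 2, "region": "EU", "code": "B"},
    {"id": 3, "region": "US", "code": "A"},
    {"id": 4, "region": "US", "code": "B"},
]
candidates = find_key_candidates(rows, ["id", "region", "code"], max_key_columns=2)
for combo, conformance in candidates:
    print(combo, conformance)
```

In this sample, `region` and `code` each fail on their own but qualify as a combination. The number of combinations grows quickly with the column limit, which is one reason to keep the maximum number of key columns small.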
Primary Key Inference Properties
When you create a single data object profile, you can use the Primary Key Profiling view to configure the primary key inference properties.
The following table describes the primary key inference properties in the Primary Key Profiling view:
| Property | Description |
|---|---|
| Override the default inference options | Allows you to configure custom settings for primary key inference. |
| Max Key Columns | Maximum number of columns that can make up a primary key. |
| Max Rows | Number of rows to profile. |
| Conformance Criteria | Minimum percentage of conforming rows, or maximum number of rows with key violations, that the profile allows when determining primary keys. |
| Exclude data objects with documented, user defined key | Excludes data objects with documented primary keys or user-defined primary keys. |
| Excludes data objects with approved key | Excludes data objects with approved primary keys. |
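The two forms of the conformance criteria, a percentage or a row count, can be sketched as a simple check. This is an illustration only. The function `meets_conformance` and its parameter names are hypothetical, and it assumes the percentage form is a minimum share of conforming rows while the count form is a maximum number of violating rows.

```python
def meets_conformance(total_rows, violation_rows, min_percent=None, max_violation_rows=None):
    """Check a key candidate against one of two hypothetical criteria forms.

    Assumption: exactly one of min_percent or max_violation_rows is supplied.
    """
    if (min_percent is None) == (max_violation_rows is None):
        raise ValueError("Supply exactly one of min_percent or max_violation_rows")
    if total_rows == 0:
        # Nothing was profiled, so no key can be inferred.
        return False
    if min_percent is not None:
        conforming_percent = 100.0 * (total_rows - violation_rows) / total_rows
        return conforming_percent >= min_percent
    return violation_rows <= max_violation_rows


# 20 violating rows out of 1000 is 98 percent conforming.
print(meets_conformance(1000, 20, min_percent=98.0))
print(meets_conformance(1000, 20, max_violation_rows=10))
```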
Inferred Primary Key Properties
After you run a single data object profile, you can use the Primary Key Profiling view to view the details of the inferred primary keys in the data source.
The following table describes the inferred primary key properties in the Primary Key Profiling view:
| Property | Description |
|---|---|
| Column | Name of the column in the profile. |
| % Conforming | Percentage of unique values in the column. |
| % Duplicates | Percentage of duplicate values for the column. |
| % Null | Percentage of null values for the column. |
| Verified | Indicates whether the column is a primary key column. |
| Inference Status | Inference status of the column. |
| Last Run Time | The date and time that the primary key profile last ran. |
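The three percentage properties can be approximated for a single column with a short Python sketch. The exact formulas the profile uses are not stated here, so the sketch makes assumptions: all percentages are taken over the total row count, a conforming value is a non-null value that occurs exactly once, and a duplicate is a non-null value that occurs more than once. The function name `column_stats` is hypothetical.

```python
from collections import Counter


def column_stats(values):
    """Return assumed % Conforming, % Duplicates, and % Null for one column."""
    total = len(values)
    if total == 0:
        return {"conforming": 0.0, "duplicates": 0.0, "null": 0.0}
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    conforming_rows = sum(1 for n in counts.values() if n == 1)
    duplicate_rows = sum(n for n in counts.values() if n > 1)
    null_rows = total - len(non_null)
    return {
        "conforming": 100.0 * conforming_rows / total,
        "duplicates": 100.0 * duplicate_rows / total,
        "null": 100.0 * null_rows / total,
    }


# Hypothetical column values: 1 and 3 are unique, 2 and 4 repeat, one is null.
stats = column_stats([1, 2, 2, None, 3, 4, 4, 4])
print(stats)
```

Under these assumptions the three percentages always add up to 100, which makes it easy to see why a column falls short of the conformance threshold.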
Key Violations Properties
After you run a single data object profile, you can use the Primary Key Profiling view to view the details of the primary key violations in the data source.
The following table describes the key violations properties in the Primary Key Profiling view:
| Property | Description |
|---|---|
| Column(s) | Name of the column(s) from which the profile infers a candidate primary key. |
| Number of Key Violations | Number of key violations in the primary key candidate. |
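To show how key violations point to data quality issues, the following Python sketch groups rows by a candidate key and reports the groups that contain more than one row. It is a sketch under stated assumptions, not the product's algorithm. The function `key_violations`, the column names, and the sample rows are hypothetical, and the sketch counts every row in a non-unique group as a violation.

```python
from collections import defaultdict


def key_violations(rows, key_columns):
    """Map each non-unique key value to the indexes of the rows that share it."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[c] for c in key_columns)
        groups[key].append(index)
    # Keep only key values that more than one row shares.
    return {key: indexes for key, indexes in groups.items() if len(indexes) > 1}


# Hypothetical sample data: rows 0 and 1 are exact duplicates.
rows = [
    {"customer": "C1", "order_date": "2024-01-05", "amount": 10},
    {"customer": "C1", "order_date": "2024-01-05", "amount": 10},
    {"customer": "C2", "order_date": "2024-01-05", "amount": 7},
    {"customer": "C1", "order_date": "2024-01-06", "amount": 3},
]
violations = key_violations(rows, ["customer", "order_date"])
violation_count = sum(len(indexes) for indexes in violations.values())
print(violations, violation_count)
```

Reviewing the rows behind each violating key value shows whether they are true duplicates or whether the candidate key is missing a column.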