Data Domain Discovery Options in Informatica Analyst

Use the data domain discovery options to choose the columns, data domains, and inference options for data domain discovery. The inference options determine whether data domain discovery applies rules to column data, column names, or both.

Data Domain Column Selection in Informatica Analyst

On the Specify Settings screen, click Edit to choose the columns to include in data domain discovery. The Select Source screen of the profile wizard lists all the columns in the data source. You can choose different columns for the column profile and for data domain discovery.
The following table describes the Edit dialog box properties for data domain discovery:
| Option | Description |
|---|---|
| Name | Displays the column name. |
| Type | Displays the documented data type of the column. |
| Precision | Displays the maximum precision for the column. |
| Scale | Displays the scale of the column. |
| Nullable | Indicates a column that can have null values. |
| Key | Indicates whether the column is documented as a primary key or foreign key. |

Data Domain Selection in Informatica Analyst

The Data Domain pane in the Specify Settings screen lists all the data domains from the data domain glossary. You can choose the data domains you want to run as a part of data domain discovery.
The following table describes the Data Domain properties for data domain discovery:
| Option | Description |
|---|---|
| Name | Displays the data domain name. You can choose one or more data domains or data domain groups. |
| Description | Displays the description for the data domain. |
| DomainGroups | Displays the name of the data domain group to which the data domain belongs. |

Data Domain Inference Options in Informatica Analyst

Inference options determine whether data domain discovery runs on column data, column names, or both. You can also specify the maximum number of rows that the profile can analyze and the minimum conformance percentage for a data domain match. You set the data domain inference options on the Specify Settings screen of the profile wizard.
The following table describes the inference options for data domain discovery:
| Option | Description |
|---|---|
| Data | Runs the profile on column data. |
| Column name | Runs the profile on column names. |
| Data and column name | Runs the profile on both column data and column names. |
| Minimum Conformance Percentage | The minimum conformance percentage that the column data must reach to be eligible for a data domain match. The conformance percentage is the number of matching rows divided by the total number of rows, expressed as a percentage. Note: The Analyst tool considers null values as nonmatching rows. Columns that contain a high number of null values might not result in data domain inference unless you specify a low value for the minimum conformance percentage. |
| Edit | Selects the columns for data domain discovery. |
| All Rows | Runs the profile on all rows from the source. |
| Sample first | Runs the profile on up to the maximum number of rows that you specify. The Analyst tool chooses the rows starting from the first row in the source. |
| Random sample | Runs the profile on a random sample of rows from the data source. |
| Random sample (auto) | Runs the profile on a random sample of rows that the Analyst tool chooses based on the size of the data source. |
| Exclude approved data types and data domains from the data type and data domain inference in the subsequent profile runs | When you select this option, any data type or data domain that you approved in a previous profile run is excluded from data type and data domain inference in subsequent profile runs. |
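
The following Python sketch illustrates how the conformance percentage described above behaves when a column contains null values. It is not Informatica code and does not call any Analyst tool API. The function names, the sample column, the five-digit ZIP code rule, and the 80 percent threshold are all assumptions made for illustration.

```python
import random
import re
from typing import Callable, List, Optional, Sequence

Value = Optional[str]


def conformance_percentage(
    values: Sequence[Value], matches_rule: Callable[[str], bool]
) -> float:
    """Return matching rows / total rows as a percentage.

    Null values (None) count as nonmatching rows, mirroring the note
    in the Minimum Conformance Percentage description.
    """
    if not values:
        return 0.0
    matching = sum(1 for v in values if v is not None and matches_rule(v))
    return 100.0 * matching / len(values)


def sample_first(values: Sequence[Value], max_rows: int) -> List[Value]:
    """Illustrative 'Sample first': take rows starting from the first row."""
    return list(values[:max_rows])


def random_sample(
    values: Sequence[Value], max_rows: int, seed: Optional[int] = None
) -> List[Value]:
    """Illustrative 'Random sample': pick rows at random without replacement."""
    rows = list(values)
    if max_rows >= len(rows):
        return rows
    return random.Random(seed).sample(rows, max_rows)


# Hypothetical data domain rule: a value is exactly five digits.
ZIP_RULE = re.compile(r"\d{5}")


def is_zip_code(value: str) -> bool:
    return ZIP_RULE.fullmatch(value) is not None


# Hypothetical column: 5 matching values, 1 nonmatching value, 4 nulls.
column: List[Value] = [
    "94105", "10001", None, None, "abcde",
    "60601", None, "30301", None, "02139",
]

MIN_CONFORMANCE = 80.0  # assumed threshold, in percent

pct_all_rows = conformance_percentage(column, is_zip_code)
pct_non_null = conformance_percentage(
    [v for v in column if v is not None], is_zip_code
)

print(f"All rows: {pct_all_rows:.1f}%")        # nulls lower the result
print(f"Non-null rows only: {pct_non_null:.1f}%")
print("Eligible for match:", pct_all_rows >= MIN_CONFORMANCE)
```

In this sketch, five of the six non-null values match the rule, but the four null rows bring the conformance percentage for the whole column down to 50 percent. The column would therefore miss an 80 percent threshold and would only qualify if the minimum conformance percentage were set to 50 or lower.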