Profile Options for Enterprise Discovery

Set up profile options before you run a profile to perform enterprise discovery. Profile options include data domain discovery options, column profile sampling options, and inference options for primary keys and foreign keys.
After you set up the profile options, you can run the enterprise discovery profile. Alternatively, you can create the profile tasks without running the profile.

Data Domain Selection for Enterprise Discovery

Inference options determine whether data domain discovery runs on column data, column names, or both. You can also specify whether the profile processes all rows in the data source, and you can set a minimum conformance percentage.
The following table describes the data domain inference options that you configure for enterprise discovery:
| Option | Description |
| --- | --- |
| Override the default inference options | Changes the predefined inference options. |
| Data | The profile runs on column data. |
| Column name | The profile runs on column titles. |
| Data and column name | The profile runs on both column data and column titles. |
| All rows | The profile runs on all rows of the data source. |
| Maximum rows to profile | The maximum number of rows that the profile can run on. The Developer tool chooses rows starting from the first row in the source. |
| Minimum conformance percentage | The minimum conformance percentage that column data must meet to be eligible for a data domain match. The conformance percentage is the number of matching rows divided by the total number of rows, expressed as a percentage. Note: The Developer tool considers null values as nonmatching rows. |
| Exclude columns with approved data domains | Excludes columns with approved data domains from the data domain inference of the profile run. |
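
The conformance calculation described above can be illustrated with a minimal Python sketch. It is not the Developer tool's implementation. The function name, the sample column, the ZIP code pattern, and the 80 percent threshold are all hypothetical, and the sketch assumes that a column qualifies when its percentage is greater than or equal to the minimum.

```python
import re
from typing import Iterable, Optional


def conformance_percentage(values: Iterable[Optional[str]], pattern: str) -> float:
    """Return the percentage of rows whose value fully matches `pattern`.

    Null (None) values count toward the total number of rows but never
    match, mirroring the note that nulls are treated as nonmatching rows.
    """
    regex = re.compile(pattern)
    rows = list(values)
    if not rows:
        return 0.0
    matching = sum(1 for v in rows if v is not None and regex.fullmatch(v))
    return 100.0 * matching / len(rows)


# Hypothetical sample column and a hypothetical five-digit ZIP code rule.
zip_column = ["94105", "10001", None, "ABCDE", "60601"]
pct = conformance_percentage(zip_column, r"\d{5}")

# Hypothetical threshold; ">=" is an assumption about how the minimum applies.
minimum_conformance = 80.0
is_domain_match = pct >= minimum_conformance
```

In this sample, three of the five rows match, so the null row and the nonmatching row together pull the column below the hypothetical threshold.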

Column Profile Sampling Options for Enterprise Discovery

The sampling options determine whether the Developer tool runs a column profile on all rows of the data source or on a limited number of rows.
The following table describes the column profile sampling options that you configure for enterprise discovery:
| Option | Description |
| --- | --- |
| All Rows | Chooses all rows in the data source. |
| First <number> Rows | The number of rows that you want to run the column profile on. The Developer tool chooses the rows starting from the first row in the data source. |
| Exclude datatype inference for columns with an approved datatype | Excludes columns with an approved datatype from the datatype inference of the column profile run. |

Primary Key Inference Options for Enterprise Discovery

You can override the default primary key inference options for enterprise discovery. The options include the maximum number of rows that you can run the profile on and the minimum conformance percentage.
The following table describes the primary key inference options that you configure for enterprise discovery:
| Option | Description |
| --- | --- |
| Override the default inference options | Allows you to configure custom settings for primary key inference. |
| Max Key Columns | Maximum number of columns that can make up a primary key. |
| Max Rows | Maximum number of rows you can run the profile on. |
| Minimum Percent | The minimum conformance percentage of the column data to be eligible for primary key match. |
| Maximum Violation Rows | The maximum number of rows with key violations that the profile allows when determining primary keys. |
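
To show how these options could interact, the following Python sketch tests candidate column combinations against the thresholds. It is an illustration only, not the product's inference algorithm. The function and parameter names are hypothetical, and the sketch assumes that a row violates a candidate key when its key value is duplicated or contains a null, and that conformance is the percentage of rows without a violation.

```python
from collections import Counter
from itertools import combinations, islice


def infer_primary_keys(rows, columns, max_key_columns=2, max_rows=None,
                       minimum_percent=100.0, max_violation_rows=0):
    """Return (key_columns, conformance_pct, violation_rows) for each
    candidate key that passes both thresholds.

    Assumed definitions for illustration, not the product's algorithm.
    """
    sample = list(islice(rows, max_rows))  # max_rows=None keeps every row
    total = len(sample)
    results = []
    if total == 0:
        return results
    for size in range(1, max_key_columns + 1):
        for key in combinations(columns, size):
            values = [tuple(row[c] for c in key) for row in sample]
            counts = Counter(values)
            # Assumption: duplicated or null key values are violations.
            violations = sum(
                1 for v in values
                if counts[v] > 1 or any(x is None for x in v)
            )
            pct = 100.0 * (total - violations) / total
            if pct >= minimum_percent and violations <= max_violation_rows:
                results.append((key, pct, violations))
    return results


# Hypothetical sample data.
rows = [
    {"id": 1, "email": "a@example.com", "region": "EU"},
    {"id": 2, "email": "b@example.com", "region": "EU"},
    {"id": 3, "email": "b@example.com", "region": "US"},
    {"id": 4, "email": None, "region": "US"},
]
keys = infer_primary_keys(rows, ["id", "email", "region"], max_key_columns=1)
```

With the strict defaults, only the `id` column qualifies in this sample. Lowering `minimum_percent` or raising `max_violation_rows` admits weaker candidates, and raising `max_key_columns` adds composite keys to the search.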

Foreign Key Inference Options for Enterprise Discovery

Set up the foreign key inference options to define the column settings for discovering foreign key relationships between data objects. The foreign key inference results depend on the primary key inference options you set up for enterprise discovery, documented primary keys, and user-defined primary keys.
The following table describes the foreign key inference options that you configure for enterprise discovery:
| Option | Description |
| --- | --- |
| Override the default inference options | Changes the predefined inference options. |
| Datatypes used in comparisons | The datatype used in primary key and foreign key comparisons. Note: This option applies if you run a column profile on the data source before the foreign key inference. |
| Comparison case-sensitivity | Performs case-sensitive comparisons of column data. |
| Trim values before comparison | Determines whether the Developer tool trims leading and trailing spaces from column data before it compares values. |
| Inferred primary keys used in comparisons: Use top <number> ranked keys | The number of top-ranking primary keys used in foreign key inference when the Developer tool runs a foreign key profile across all the data sources. The Developer tool uses the top-ranking method along with documented primary keys and user-defined primary keys to infer the foreign key relationships. Top ranking of inferred keys is based on the descending conformance percentage rounded to one decimal place. For example, the Developer tool considers a conformance percentage of 99.75 as 99.8 and 99.74 as 99.7. The default value is 1. Set the value to -1 if you want the Developer tool to use all inferred keys in foreign key inference. Note: If the primary key data sources have approved primary keys, the Developer tool does not use inferred primary keys for foreign key inference. |
| Max foreign keys between data objects | The maximum number of inferred columns, eligible for foreign key discovery, that the Developer tool returns after the profile run. |
| Minimum conformance percentage | The minimum conformance percentage that a column must meet to be included in the foreign key results. |
| Regenerate signature | Reloads column signatures if the source data changes. |
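
The rounding and ranking rule for inferred primary keys can be sketched in Python as follows. This is a minimal illustration, not the Developer tool's implementation. The function names and sample key names are hypothetical. The sketch assumes half-up rounding, which agrees with the 99.75 and 99.74 example, and it assumes that keys with the same rounded percentage share a rank.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_conformance(pct: str) -> Decimal:
    """Round a conformance percentage to one decimal place, half up."""
    return Decimal(pct).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def select_top_ranked_keys(inferred, top_ranks=1):
    """Return the inferred keys that fall within the top `top_ranks` ranks.

    `inferred` maps a key name to its conformance percentage, given as a
    string so that Decimal parses it exactly. A value of -1 returns all keys.
    Assumption: keys that tie after rounding share the same rank.
    """
    rounded = {key: round_conformance(pct) for key, pct in inferred.items()}
    ordered = sorted(rounded, key=lambda k: rounded[k], reverse=True)
    if top_ranks == -1:
        return ordered
    top_values = sorted(set(rounded.values()), reverse=True)[:top_ranks]
    return [key for key in ordered if rounded[key] in top_values]


# Hypothetical inferred keys and conformance percentages.
inferred = {
    "CUSTOMER.ID": "99.75",     # rounds to 99.8
    "CUSTOMER.EMAIL": "99.84",  # rounds to 99.8
    "CUSTOMER.PHONE": "99.74",  # rounds to 99.7
}
top_one = select_top_ranked_keys(inferred, top_ranks=1)
all_keys = select_top_ranked_keys(inferred, top_ranks=-1)
```

Under these assumptions, the first two keys tie at 99.8 after rounding, so both fall within rank 1, while a value of -1 returns every inferred key.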