Creating a Profile to Perform Data Domain Discovery in Informatica Analyst
You must create at least one data domain before you can create a profile to perform data domain discovery in the Analyst tool. The profile can discover both column names and column data that match predefined data domains.
1. In the Discovery workspace, click Profile, or select New > Profile from anywhere in the Analyst tool.
The New Profile wizard appears.
2. The Single source option is selected by default. Click Next.
3. In the Specify General Properties screen, enter a name and an optional description for the profile. In the Location field, select the project or folder where you want to create the profile. Click Next.
4. In the Select Source screen, click Choose to select a data object, or click New to import a data object. Click Next.
5. In the Specify Settings screen, choose to run a column profile, data domain discovery, or a column profile with data domain discovery. By default, the column profile option is selected.
- - Choose Run column profile to run a column profile.
- - Choose Run data domain discovery to perform data domain discovery. In the Data domain pane, select the data domains that you want to discover, enter a minimum percentage match for data domains, and select the columns for data domain discovery in the Edit columns selection for data domain discovery dialog box.
- - Choose Run column profile and Run data domain discovery to run the column profile with data domain discovery. Select the data domain options in the Data domain pane.
Note: By default, the columns that you select for the column profile also apply to data domain discovery. Click Edit to select or deselect columns for data domain discovery independently of the columns that you select for the column profile.
- - Choose Data, Columns, or Data and Columns to specify what data domain discovery runs on.
- - Choose a sampling option in the Run profile on pane.
- - Choose a drilldown option in the Drilldown pane. Optionally, click Select Columns to select columns to drill down on. You can choose to omit data type and data domain inference for columns that already have an approved data type or data domain.
- - Choose Native or Hive as the connection type. If you choose Hive, click Choose to select a Hive connection in the Select a Hive Connection dialog box.
The Hive connection enables the Data Integration Service to communicate with the Hadoop cluster and push the profile execution down to the cluster.
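To make the minimum percentage match and the Data, Columns, and Data and Columns choices in step 5 more concrete, the following Python sketch shows one way such a check could work. It is an illustration only and is not how the Analyst tool is implemented. The domain rules in DOMAINS, the function names percent_match and discover, and the assumption that Data and Columns accepts a match on either the column name or the column data are all hypothetical.

```python
import re

# Hypothetical data domains: each has a column-name rule and a data rule.
# These are illustrative only, not Informatica's predefined data domains.
DOMAINS = {
    "Email": {
        "name": re.compile(r"e[-_]?mail", re.IGNORECASE),
        "data": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    },
    "US_ZIP": {
        "name": re.compile(r"zip|postal", re.IGNORECASE),
        "data": re.compile(r"\d{5}(-\d{4})?"),
    },
}


def percent_match(values, pattern):
    """Return the percentage of non-null values that fully match the pattern."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return 0.0
    hits = sum(1 for v in non_null if pattern.fullmatch(str(v)))
    return 100.0 * hits / len(non_null)


def discover(columns, min_percent, mode="data_and_columns"):
    """Map each column name to the list of domains inferred for it.

    mode is "data", "columns", or "data_and_columns", loosely mirroring
    the Data, Columns, and Data and Columns options (assumed semantics).
    """
    result = {}
    for col_name, values in columns.items():
        found = []
        for domain, rules in DOMAINS.items():
            # Column-name check is skipped when only data is inspected.
            by_name = mode != "data" and rules["name"].search(col_name) is not None
            # Data check is skipped when only column names are inspected.
            by_data = (
                mode != "columns"
                and percent_match(values, rules["data"]) >= min_percent
            )
            if by_name or by_data:
                found.append(domain)
        result[col_name] = found
    return result


sample = {
    "contact": ["a@x.com", "b@y.org", "not-an-email", "c@z.net"],
    "zip_code": ["94105", "1234", None, "10001-0001"],
}
print(discover(sample, min_percent=75))
```

In this sample, the contact column is inferred as Email from its data alone, because three of four values (75%) match, while zip_code is inferred as US_ZIP only from its column name, because just two of its three non-null values match the data rule. Raising the threshold above 75 would drop the Email match.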
6. In the Specify Rules and Filters screen, you can add, edit, or delete rules and filters for the profile.
7. Click Save and Finish to create the profile, or click Save and Run to create and run the profile.