Profile Options

Profile options include data sampling options and data drilldown options. You can configure these options when you create or edit a column profile for a data object.
You configure the profile options in the Discovery workspace. You can create a profile with the default options for columns, sampling, and drilldown. Use the drilldown option to choose between live data and staged data.

Sampling Options

Sampling options determine which rows, and how many, the Analyst tool runs a profile on. You can configure sampling options when you define a profile or when you run a profile.
The following table describes the sampling options for a profile:
| Option | Description |
| --- | --- |
| All rows | Runs a profile on all the rows in the data object. Supported on the Native, Blaze, and Spark run-time environments. |
| Sample first <number> rows | Runs a profile on the specified number of rows from the beginning of the data object. You can choose a maximum of 2,147,483,647 rows. Supported on the Native and Blaze run-time environments. |
| Random sample <number> rows | Runs a profile on the specified number of randomly selected rows in the data object. You can choose a maximum of 2,147,483,647 rows. Supported on the Native and Blaze run-time environments. |
| Random sample (auto) | Runs a profile on a sample whose size is computed from the number of rows in the data object. Supported on the Native and Blaze run-time environments. |
| Limit n <number> rows | Runs a profile based on the number of rows in the data object. When you run a profile in the Hadoop validation environment, the Spark engine collects samples from multiple partitions of the data object and pushes the samples to a single node to compute the sample size. The Limit n sampling option supports Oracle, SQL Server, and DB2 databases. You cannot apply the Advanced filter with the Limit n sampling option. Supported on the Spark run-time environment. |
| Random percentage | Runs a profile on a percentage of rows in the data object. Supported on the Spark run-time environment. |
| Exclude approved data types and data domains from the data type and data domain inference in the subsequent profile runs | Excludes the approved data types and data domains from data type and data domain inference in subsequent profile runs. |
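The run-time environment support listed in the table can be captured as a small lookup. The following is a minimal sketch in Python and is not part of the Analyst tool or any Informatica API. The names `SUPPORTED_ENVIRONMENTS`, `ROW_LIMITED_OPTIONS`, `MAX_SAMPLE_ROWS`, and `validate_sampling` are hypothetical and only mirror what the table states.

```python
# Hypothetical lookup that mirrors the sampling options table.
# Illustration only: it does not call any Informatica API.
MAX_SAMPLE_ROWS = 2_147_483_647  # maximum stated for the row-count options

SUPPORTED_ENVIRONMENTS = {
    "All rows": {"Native", "Blaze", "Spark"},
    "Sample first": {"Native", "Blaze"},
    "Random sample": {"Native", "Blaze"},
    "Random sample (auto)": {"Native", "Blaze"},
    "Limit n": {"Spark"},
    "Random percentage": {"Spark"},
}

# Options for which the table states the 2,147,483,647-row maximum.
ROW_LIMITED_OPTIONS = {"Sample first", "Random sample"}


def validate_sampling(option, environment, rows=None):
    """Return a list of problems; an empty list means the choice matches the table."""
    if option not in SUPPORTED_ENVIRONMENTS:
        return [f"unknown sampling option: {option}"]
    problems = []
    if environment not in SUPPORTED_ENVIRONMENTS[option]:
        problems.append(
            f"{option} is not listed for the {environment} run-time environment"
        )
    if option in ROW_LIMITED_OPTIONS and rows is not None and rows > MAX_SAMPLE_ROWS:
        problems.append(f"row count exceeds the maximum of {MAX_SAMPLE_ROWS}")
    return problems


print(validate_sampling("Limit n", "Native"))
print(validate_sampling("Sample first", "Blaze", rows=50_000))
```

A check like this could run before a profile is submitted, so that an unsupported combination is reported up front rather than at run time.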
When you run the profile on a random sample of rows, the random sample algorithm selects rows at random from the data object. When you choose a random sampling option for a column profile, the Analyst tool performs drilldown on the staged data, which can affect drilldown performance. When you choose a random sampling option for a data domain discovery profile, the Analyst tool performs drilldown on live data.
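To make the difference between the row-based sampling options concrete, the sketch below applies three of them to an in-memory list of rows. It is an illustration in Python that uses the standard `random` module. The function names are hypothetical, and this guide does not describe the actual algorithms that the Analyst tool and the Spark engine use. In particular, treating a percentage sample as an independent per-row selection is an assumption.

```python
import random


def sample_first(rows, n):
    """Take the first n rows, in source order."""
    return rows[:n]


def random_sample(rows, n, seed=None):
    """Pick n distinct rows at random (or all rows if fewer than n exist)."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def random_percentage(rows, percent, seed=None):
    """Keep each row independently with probability percent / 100 (assumed model)."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < percent / 100]


rows = list(range(1, 101))  # stand-in for 100 rows of a data object
print(sample_first(rows, 5))
print(len(random_sample(rows, 10, seed=1)))
print(len(random_percentage(rows, 25, seed=1)))
```

Under this model, a percentage sample returns approximately, not exactly, the requested share of rows, whereas the row-count options return a fixed number of rows.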

Drilldown Options

You can configure drilldown options when you define a profile or when you edit a profile.
The following table describes the drilldown options for a profile:
| Option | Description |
| --- | --- |
| Live | Drills down on live data to read current data in the data source. |
| Staged | Drills down on staged data to read profile data that is staged in the profiling warehouse. |
| Select Columns | Identifies columns for drilldown that you did not select for profiling. |
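The practical difference between the Live and Staged options is which copy of the data a drilldown reads. The following Python sketch models this with two hypothetical in-memory lists: `source_rows` stands in for the data source and `staged_rows` for the copy held in the profiling warehouse. It is an illustration under those assumptions, not a description of how the Analyst tool is implemented.

```python
import copy

# Hypothetical stand-ins for the data source and the profiling warehouse.
source_rows = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Oslo"}]

# Assume that running the profile stages a copy of the profiled rows.
staged_rows = copy.deepcopy(source_rows)

# The data source changes after the profile run.
source_rows.append({"id": 3, "city": "Lima"})


def drill_down(mode, column, value):
    """Return the rows where column == value from the live or the staged copy."""
    if mode not in ("live", "staged"):
        raise ValueError(f"unknown drilldown mode: {mode}")
    rows = source_rows if mode == "live" else staged_rows
    return [row for row in rows if row[column] == value]


print(drill_down("live", "city", "Lima"))    # includes the row added after the run
print(drill_down("staged", "city", "Lima"))  # empty: the staged copy predates it
```

In this model, a live drilldown reflects the current state of the source, while a staged drilldown stays consistent with the data that was profiled.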