Option | Description |
---|---|
All rows | Runs a profile on all the rows in the data object. Supported on the Native, Blaze, and Spark run-time environments. |
Sample first <number> rows | Runs a profile on the specified number of rows, starting from the first row in the data object. You can choose a maximum of 2,147,483,647 rows. Supported on the Native and Blaze run-time environments. |
Random sample <number> rows | Runs a profile on the specified number of randomly selected rows in the data object. You can choose a maximum of 2,147,483,647 rows. Supported on the Native and Blaze run-time environments. |
Random sample (auto) | Runs a profile on a sample whose size is computed from the number of rows in the data object. Supported on the Native and Blaze run-time environments. |
Limit n <number> rows | Runs a profile on a limited number of rows in the data object. When you run a profile in the Hadoop validation environment, the Spark engine collects samples from multiple partitions of the data object and pushes the samples to a single node to compute the sample size. The Limit n sampling option supports Oracle, SQL Server, and DB2 databases. You cannot apply the Advanced filter with the Limit n sampling option. Supported on the Spark run-time environment. |
Random percentage | Runs a profile on a percentage of rows in the data object. Supported on the Spark run-time environment. |
Exclude approved data types and data domains from the data type and data domain inference in the subsequent profile runs | Excludes approved data types and data domains from data type and data domain inference in subsequent profile runs. |
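
The difference between three of the sampling options above can be illustrated with a minimal Python sketch. This is not how the product implements sampling: the function names, the `seed` parameter, and the per-row coin flip used for the percentage option are assumptions made for illustration only. Random sample (auto) and Limit n are left out because the rules they use to size the sample are not stated here. The only value taken from the table is the 2,147,483,647-row maximum, which equals 2^31 - 1.

```python
import random
from itertools import islice

# Documented maximum for the row-count options; equals 2**31 - 1.
MAX_SAMPLE_ROWS = 2_147_483_647


def _check_row_count(n):
    """Reject row counts outside the documented range (hypothetical helper)."""
    if not 0 < n <= MAX_SAMPLE_ROWS:
        raise ValueError(f"row count must be between 1 and {MAX_SAMPLE_ROWS}")


def sample_first(rows, n):
    """'Sample first <number> rows': take n rows from the beginning."""
    _check_row_count(n)
    return list(islice(rows, n))


def sample_random(rows, n, seed=None):
    """'Random sample <number> rows': pick n rows at random, without repeats."""
    _check_row_count(n)
    rows = list(rows)
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def sample_percentage(rows, percent, seed=None):
    """'Random percentage': keep each row with probability percent / 100.

    Assumption: an independent coin flip per row, so the result holds
    roughly, not exactly, the requested percentage of rows.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < percent / 100]
```

For example, `sample_first(range(10), 3)` always returns the first three rows, while `sample_random(range(10), 3)` returns three rows drawn from anywhere in the data, so it is less sensitive to how the source happens to be ordered.
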

Option | Description |
---|---|
Live | Drills down on live data to read current data in the data source. |
Staged | Drills down on staged data to read profile data that is staged in the profiling warehouse. |
Select Columns | Identifies columns for drilldown that you did not select for profiling. |