
Compare profile runs

You can compare the results of two profile runs to analyze the differences in content and statistics. After you select the profile runs to compare, the comparison results appear on the Compare Runs tab.
The later profile run results are compared to the previous profile run results. If a column was added in the later run, the column name appears with the term Added. If a column was removed in the later run, the column name appears with the term Removed.
When you change the source object after multiple runs, Data Profiling retains the profile results for all the profile runs in the profiling warehouse. You can compare the profile results for the previous and current source objects. The columns of the previous source object appear as Removed and the columns of the current source object appear as Added on the Compare Runs tab.
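The Added and Removed labels can be thought of as a set difference over the columns of the two runs. The following Python sketch illustrates the idea only and is not how Data Profiling implements the comparison. It assumes that columns are matched by name, and the function name `label_columns` and the sample column names are hypothetical.

```python
def label_columns(earlier_columns, later_columns):
    """Label each column by comparing the later run against the earlier run."""
    earlier = set(earlier_columns)
    later = set(later_columns)
    labels = {}
    for name in sorted(earlier | later):
        if name not in earlier:
            labels[name] = "Added"      # present only in the later run
        elif name not in later:
            labels[name] = "Removed"    # present only in the earlier run
        else:
            labels[name] = "Present in both"
    return labels


# Hypothetical column lists for two runs of a profile on the Customer table.
run_1 = ["customer_id", "name", "fax"]
run_2 = ["customer_id", "name", "email"]
print(label_columns(run_1, run_2))
```

Under the same assumption, two source objects that share no column names produce only Removed and Added labels, which is consistent with the behavior described for a changed source object.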

Example

You are a data steward. You create a profile on the Customer table. You need to identify the customers who were added to or deleted from a subscription in a month.
To accomplish the task, perform the following steps:
  1. Run the profile on the Customer table on a monthly basis.
  2. Compare the latest profile results with the previous results, or with another run as required.
  3. Analyze the compare run results.
The Compare Runs tab displays a tree previewer to help you navigate to the profile runs of the nested columns for profiles that you create with Avro or Parquet source objects.
The following image displays a sample Compare Runs tab with a tree previewer:

Comparing profile runs

You can select two profile runs to compare the profile results.
  1. Open a profile and view the Results tab.
  2. Click Actions > Compare Profile Runs.
     The following sample image shows the Compare Profile Runs dialog box:
     The image shows the Compare Profile Runs dialog box.
  3. Choose two profile runs, and click Compare.

Compare run results

When you compare the results for two profile runs, the comparison results appear on the Compare Runs tab.
The following sample image shows the areas that you can view on the Compare Runs tab:
The image shows the compare runs tab.
  1. Header
  2. Filter or find
  3. Compare statistics
  4. Details

Header

The header area shows the profile run details, which include the profile run numbers, the profile run timestamps, and the number of rows in the earlier run compared to the later run.

Filter or find

The following table explains the options that appear in the Filter or find area:
Option
Description
View
Shows the following options:
  • Columns and Rules. View the results for all the columns and rules in the profile run.
  • Columns. View the results for the columns in the profile run.
  • Rules. View the results for the rules in the profile run.
With
Shows the following options:
  • Compare All Runs. View the comparison results for both runs.
  • Differences. View the results that differ between the two runs.
  • Matches. View the results that match in both runs.
  • Added. View the results for columns that were added in the later run.
  • Removed. View the results for columns that were removed in the later run.
Choose a filter in the With option after you choose a filter in the View option.
Find
Enter a keyword to view the relevant search results.
Menu
Choose Comfortable, Cozy, or Compact to adjust the row width in the profile results area.
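The With options can be read as a status assigned to each column or rule, followed by a filter on that status. The sketch below is a hypothetical Python model of that reading, not the product's logic. The names `classify` and `apply_with_filter`, the dictionary layout, and the sample statistics are assumptions. It also assumes that Differences excludes items that were added or removed, which the documentation does not state.

```python
def classify(earlier_stats, later_stats):
    """Return the comparison status for one column or rule.

    A value of None means the item does not exist in that run.
    """
    if earlier_stats is None:
        return "Added"
    if later_stats is None:
        return "Removed"
    return "Matches" if earlier_stats == later_stats else "Differences"


def apply_with_filter(comparison, with_option):
    """Keep only the items whose status fits the chosen With option."""
    if with_option == "Compare All Runs":
        return dict(comparison)
    return {
        name: pair
        for name, pair in comparison.items()
        if classify(*pair) == with_option
    }


# Hypothetical statistics: name -> (earlier run, later run).
comparison = {
    "customer_id": ({"nulls": 0}, {"nulls": 0}),
    "name": ({"nulls": 2}, {"nulls": 5}),
    "fax": ({"nulls": 40}, None),
    "email": (None, {"nulls": 1}),
}
print(sorted(apply_with_filter(comparison, "Differences")))  # ['name']
```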

Compare statistics

The compare statistics area shows the columns and rules in collapsible sections. The column statistics in both runs are compared and displayed in this area. An up arrow with a numeric count indicates an increase in the value of a statistic from the earlier run to the later run. A down arrow with a numeric count indicates a decrease in the value of a statistic. You can choose the statistics that you want to view in the area. To add or remove a statistic, right-click a statistic name and select or clear the statistic.
The following sample image shows the compare statistics area:
The image shows a sample compare statistics area where four columns are compared and the statistics for each column are displayed.
The compare statistics area shows column statistics, such as the value distribution, percentage and number of values, data types, patterns, and the minimum and maximum values.
When you click a column, the statistics for the column appear in the Details area for the later run.
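The up and down arrows amount to the sign and magnitude of the difference between the later and earlier values of a statistic. A minimal Python sketch of that arithmetic follows. The function name `delta_indicator` and the text labels are hypothetical stand-ins for the arrows shown in the interface.

```python
def delta_indicator(earlier_value, later_value):
    """Describe the change in a numeric statistic from the earlier run to the later run."""
    change = later_value - earlier_value
    if change > 0:
        return f"up {change}"       # corresponds to an up arrow with a count
    if change < 0:
        return f"down {-change}"    # corresponds to a down arrow with a count
    return "no change"


print(delta_indicator(120, 150))  # up 30
print(delta_indicator(150, 120))  # down 30
```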

Details

In the Details area, you can view the statistics and comparison results. The comparison results include the number of rows in both runs, and the difference in row count and row percentage in the later run.
The following sample image shows the Details area:
The sample image shows the details area where the values, data types, and patterns section shows sample values.
In this area, you can view the following statistics in collapsible sections:
Values in <later_run>
Shows the comparison results for null values, distinct values, and non-distinct values.
Data Types in <later_run>
Shows the comparison results for inferred data types.
Patterns in <later_run>
Shows the comparison results for inferred patterns.
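One way to reason about the row count and row percentage differences in the Details area is shown in the Python sketch below. It is an illustration under stated assumptions, not the product's calculation. It assumes that the percentage for a category, such as null values, is the category's row count divided by the total rows in that run, and that the difference is the later value minus the earlier value. The function name `compare_category` and the sample numbers are hypothetical.

```python
def compare_category(earlier_count, earlier_total, later_count, later_total):
    """Compare one value category (for example, null values) across two runs."""
    # Guard against empty runs to avoid dividing by zero.
    earlier_pct = 100.0 * earlier_count / earlier_total if earlier_total else 0.0
    later_pct = 100.0 * later_count / later_total if later_total else 0.0
    return {
        "earlier_rows": earlier_count,
        "later_rows": later_count,
        "row_difference": later_count - earlier_count,
        "percentage_difference": round(later_pct - earlier_pct, 2),
    }


# Hypothetical null counts for one column in two runs.
result = compare_category(
    earlier_count=20, earlier_total=1000,
    later_count=45, later_total=1500,
)
print(result)
```

In this example the null count rises by 25 rows while the share of null rows rises by only one percentage point, because the later run also has more rows overall.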