Consider the following rules and guidelines when working with the Create Uniqueness Data Quality Rule Occurrences in Data Governance and Catalog recipe:
• The score is calculated using profiling results retrieved through an API from CDGC.
NULL values are never counted as unique. If all the values are identical, for example, all the values are ABCD, exactly one of them is considered unique.
Examples:
- If all the profiled values are NULL, the total number of rows equals the number of NULLs, resulting in a score of 0%.
- If all the profiled values are non-NULL and identical, for example, if the total number of rows is 10 and all the values are ABC, then one ABC value is treated as unique and the remaining nine are treated as non-unique, giving a score of 10%.
NULL values are always treated as non-distinct.
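The scoring behavior described above can be sketched as follows. This is a minimal illustration, not the recipe's actual implementation. It assumes that the score is the number of distinct non-NULL values divided by the total number of rows, which is consistent with both examples. The function name uniqueness_score and the treatment of an empty input as 0% are hypothetical.

```python
from typing import Optional, Sequence


def uniqueness_score(values: Sequence[Optional[str]]) -> float:
    """Return a uniqueness score as a percentage (assumed formula).

    Assumption: score = distinct non-NULL values / total rows * 100.
    NULL is represented here by Python's None.
    """
    total = len(values)
    if total == 0:
        # Assumption: an empty profile scores 0%.
        return 0.0
    # NULLs are never counted as unique; each distinct non-NULL
    # value contributes exactly one unique row.
    distinct = {v for v in values if v is not None}
    return round(100.0 * len(distinct) / total, 2)


all_null = uniqueness_score([None] * 5)       # every row is NULL
all_same = uniqueness_score(["ABC"] * 10)     # ten identical values
```

Under this assumption, all_null evaluates to 0.0 and all_same evaluates to 10.0, which matches the two examples.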
• When you invoke the process using a web browser or the Run Using option, you must enter the input field values correctly. For example, if you provide an incorrect catalog source name or specify a data element that is not profiled or does not exist, the process will complete without any errors, but it will not create or update the rule occurrence.
However, if you enter an incorrect Target or Threshold value, the process will return an appropriate error.
Consider the following guidelines when entering the Target or Threshold values:
- Neither value can be negative or greater than 100.
- The Threshold value cannot exceed the Target value.
- Both values can have up to two decimal places.
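To check values before you invoke the process, the three guidelines above can be expressed as a small validation routine. This is a sketch under stated assumptions: the function name validate_target_threshold and the error messages are hypothetical and do not reproduce the errors that the process returns. It also assumes that 0 and 100 are both allowed.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List


def validate_target_threshold(target: str, threshold: str) -> List[str]:
    """Return a list of guideline violations (empty if both values pass)."""
    errors: List[str] = []
    parsed: Dict[str, Decimal] = {}
    for name, raw in (("Target", target), ("Threshold", threshold)):
        try:
            value = Decimal(raw)
        except InvalidOperation:
            errors.append(f"{name} is not a number")
            continue
        if not value.is_finite():
            errors.append(f"{name} is not a finite number")
            continue
        # Guideline 1: not negative and not greater than 100.
        if value < 0 or value > 100:
            errors.append(f"{name} must be between 0 and 100")
        # Guideline 3: at most two decimal places.
        if -value.as_tuple().exponent > 2:
            errors.append(f"{name} has more than two decimal places")
        parsed[name] = value
    # Guideline 2: Threshold cannot exceed Target.
    if len(parsed) == 2 and parsed["Threshold"] > parsed["Target"]:
        errors.append("Threshold cannot exceed Target")
    return errors


ok = validate_target_threshold("90", "80.25")    # passes all guidelines
bad = validate_target_threshold("80", "90.125")  # two violations
```

Decimal is used instead of float so that the number of decimal places is counted from the text as entered.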
If you provide a data element that does not exist, the recipe ignores the value and does not create a rule occurrence for it. It does not return an error.
If you enter an invalid value for Criticality, the value defaults to Medium.
It is possible for a catalog source to contain a dataset with the same name as the catalog source, and both might include a data element with the same name. If both the data elements are profiled, the process creates a rule occurrence for each data element. This does not occur when using the guide, as the user selects a specific dataset from the list.
• The process runs on the Cloud Server. If it takes longer than 95 seconds, the session will time out and display an error message.
The default timeout value for the Application Integration Cloud Server is 95 seconds. This value is standardized across all organizations hosted in the cloud and cannot be modified by users. For more information, see the Knowledge Base article 552590.
Even if a timeout occurs, the process continues to run and completes the creation or update of rule occurrences in CDGC. Only the display of the results is affected.
If you change the tracing level in the process from None to Verbose and run the process using a web browser or the Run Using option, you can view the result in Application Integration Console.
To view the result in Application Integration Console, perform the following steps:
1. In Application Integration Console, on the Processes tab, search for the most recent run of the Process Selected Data Elements process.
2. Click the process run ID to open it.
3. Click Advanced View in the top-right corner to access the Active Process Detail page.
4. From the left panel, select variables > output.
You can view the result in the Variable Instance Data section on the right panel.