Consider the following rules and guidelines when you configure data observability:
•Data observability identifies anomalies in the data that you configure and filter for the catalog source. If you apply metadata extraction filters and then apply profiling filters, data observability identifies anomalies in only a subset of the entire data.
•Before you enable data observability for a catalog source, you must enable data profiling.
•Each time a data observability job runs, Metadata Command Center profiles the data from which metadata is extracted and then detects anomalies in the profiled data.
•To generate accurate data observability results, set the Profiling Scope option to Full when you configure data profiling.
•Perform a minimum of three profile runs for data observability to detect anomalies in the data.
•However, the following types of anomalies require only one prior profiling run:
- Drop from Maximum anomalies
- Surge from Minimum anomalies
- Schema-based anomalies
For these anomaly types, data observability detects anomalies starting from the second profiling run. For a conceptual illustration of this difference, see the sketch after this list.
•If you modify the profiling filters after data observability has run a few jobs, the resulting profiled data changes. Historical profiled data and historical anomalies are lost, and data observability then runs a job on the new data. To detect anomalies accurately, keep the profiling filters constant over several runs.
•Data observability can observe data that contains up to 50,000 profiled data elements.
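The following sketch illustrates, in plain Python, why drop from maximum and surge from minimum anomalies can be reported from the second profiling run, while statistics-based anomalies need a longer profile history. It is purely conceptual and is not Metadata Command Center code or an Informatica API; the function names, thresholds, and sample values are assumptions chosen for the example.

```python
# Illustrative sketch only. Not Metadata Command Center code or an Informatica API;
# the function names, thresholds, and sample values are assumptions for the example.
from statistics import mean, stdev

def drop_from_maximum(history, current, drop_ratio=0.3):
    """Needs only one prior run: compare the current value to the historical maximum."""
    return bool(history) and current < max(history) * (1 - drop_ratio)

def surge_from_minimum(history, current, surge_ratio=0.3):
    """Needs only one prior run: compare the current value to the historical minimum."""
    return bool(history) and current > min(history) * (1 + surge_ratio)

def statistical_anomaly(history, current, threshold=3.0):
    """Needs several prior runs (three or more) to establish a stable baseline."""
    if len(history) < 3:
        return False  # not enough profile run history to judge a trend
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(current - mu) > threshold * sigma

# Hypothetical row counts captured by four successive profile runs.
runs = [1000, 1020, 980, 450]
history, current = runs[:-1], runs[-1]
print(drop_from_maximum(history, current))    # True: one prior run already provides a maximum
print(statistical_anomaly(history, current))  # True only because three prior runs exist
```

The design point of the sketch is that maximum-based and minimum-based checks need only a single earlier value to compare against, whereas baseline statistics such as a mean and standard deviation are unreliable until several profile runs have accumulated.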