•Explore. Manage projects and project folders. Find and open data quality assets, data profiling tasks, and exception tasks.
•My Jobs. View the status of the jobs that run in your organization.
•My Import/Export Logs. View the status of your imports and exports.
When you switch from Data Quality to another service, the panels and the options in the navigation bar change to suit the service.
Data quality life cycle
The assets that you configure for your data quality projects constitute a set of operations that you can perform across Informatica Intelligent Cloud Services.
To understand and improve the quality of your data, you can move the data through the following stages:
1. Discover. Analyze the content and structure of your source data.
To analyze the content and structure, create a profile in Data Profiling.
Note: You can open and run profiles from the Explore page in both Data Profiling and Data Quality.
2. Design. Create assets to address the issues that you find in the source data.
Create the assets in Data Quality.
3. Apply. Add the assets to one or more mappings, and run the mappings on the data.
Design and run the mappings in Data Integration.
4. Measure. Run profiles to review the results of the mappings.
Optionally, update the assets that you created in Data Quality and run the mappings again to optimize the quality of the data.
Data Quality dimensions
Your organization may target a range of objectives when you build data quality initiatives into your data systems. For example, you may need to eradicate duplicate records in order to comply with regulatory standards. Or, you might recognize that postal address accuracy is sub-optimal across your records. Or, you might decide to mine additional information and value from your current data.
The needs of every organization are unique, but the data quality issues that your data may demonstrate can fall into a range of common categories. Data Quality assets can identify these categories as dimensions.
You can set the Dimension option in data quality assets that correspond to passive transformations in Data Integration. Set the option to specify the data quality issue that you want the asset to address. You can set the Dimension option in cleanse, labeler, parse, rule specification, and verifier assets. A scorecard can read the dimension that you set on a rule specification asset.
The following dimensions are built into Data Quality:
Accuracy
Select Accuracy when the asset logic is primarily concerned with establishing the accuracy of data values. Data is accurate when it matches a known data fact that the asset can verify.
For example, a business rule may require that each employee in an organization has the correct data security clearance for their role. The organization maintains a set of personnel records that includes the security clearance level and job title of each employee. You can configure an asset to compare the security clearance data to the job title data in each record and to verify that the values match accurately.
You might use dictionaries that contain the job titles and the security clearance levels to verify that the respective data values are correct.
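As an illustration only, the dictionary check described above can be sketched in plain Python, outside of Data Quality. The dictionary contents, the field names, and the `is_accurate` helper are hypothetical and are not part of any Informatica API.

```python
# Hypothetical reference dictionary: job title -> required clearance level.
REQUIRED_CLEARANCE = {
    "Analyst": "Confidential",
    "Engineer": "Secret",
    "Director": "Top Secret",
}


def is_accurate(record: dict) -> bool:
    """Return True when the clearance matches the known value for the job title."""
    expected = REQUIRED_CLEARANCE.get(record.get("job_title"))
    # An unknown job title cannot be verified, so it is not counted as accurate.
    return expected is not None and record.get("clearance") == expected


records = [
    {"job_title": "Analyst", "clearance": "Confidential"},
    {"job_title": "Engineer", "clearance": "Confidential"},
    {"job_title": "Intern", "clearance": "Secret"},
]
accuracy_flags = [is_accurate(r) for r in records]
```

In this sketch, a record with a job title that is missing from the dictionary is treated as not accurate, because there is no known data fact to verify it against. That choice is an assumption.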
Validity
Select Validity when the asset logic is primarily concerned with establishing the validity of the data. Data is valid when it meets the formal and structural requirements of a business rule that your organization defines. For example, valid data might use the data type and conform to the character length that the business rule expects.
Note: Validity and consistency are similar dimensions. However, data values can be consistent but not valid. Consistency is a measure of the similarity in form between the data values in a column. Validity is a measure of the correspondence between the formal aspects of the column data and the format that your organization requires.
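The difference between validity and consistency can be sketched in plain Python. The business rule below (two uppercase letters followed by six digits) and the `is_valid` helper are hypothetical and are only meant to show a data type check combined with a length and format check.

```python
import re

# Hypothetical business rule: a product code is a string of exactly
# eight characters, two uppercase letters followed by six digits.
PRODUCT_CODE = re.compile(r"[A-Z]{2}[0-9]{6}")


def is_valid(value) -> bool:
    """Return True when the value has the data type and format that the rule expects."""
    return isinstance(value, str) and PRODUCT_CODE.fullmatch(value) is not None


codes = ["AB123456", "ab123456", "AB12345", 12345678]
validity_flags = [is_valid(c) for c in codes]

# These values share one form, so the column is consistent,
# but the form is not the one that the rule requires.
consistent_but_invalid = ["123456AB", "654321CD"]
```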
Completeness
Select Completeness when the asset logic is primarily concerned with establishing the completeness of the data.
For example, a business rule in your organization might require that one or more data columns do not contain null data. You can configure a rule specification with one or more rule statements that search the relevant columns for null data.
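A null check of this kind can be sketched in plain Python as follows. The `completeness` function, the column names, and the decision to treat empty strings as null data are assumptions made for illustration.

```python
def completeness(rows, required_columns):
    """Return the fraction of rows whose required columns all contain data.

    Assumption: both None and the empty string count as null data.
    """
    if not rows:
        return 1.0
    complete = sum(
        all(row.get(col) not in (None, "") for col in required_columns)
        for row in rows
    )
    return complete / len(rows)


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": ""},
    {"id": 4, "email": "d@example.com"},
]
completeness_score = completeness(rows, ["id", "email"])
```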
Consistency
Select Consistency when the asset logic is primarily concerned with establishing the consistency of the data within one or more columns. The data in a column is consistent when the column values conform to a uniform character format. Additionally, column data can be consistent in the use of an agreed set of terms for different pieces of information. For example, you might configure a cleanse asset to standardize street descriptors such as Street and Road to ST and RD.
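The standardization in this example can be sketched in plain Python. The lookup table and the `standardize_address` helper are hypothetical, and the choice to uppercase the whole address is an assumption. A cleanse asset is configured in Data Quality rather than written as code.

```python
# Hypothetical standardization table: full descriptor -> agreed abbreviation.
STREET_DESCRIPTORS = {"STREET": "ST", "ROAD": "RD"}


def standardize_address(address: str) -> str:
    """Uppercase the address and replace known street descriptors."""
    words = address.upper().split()
    return " ".join(STREET_DESCRIPTORS.get(word, word) for word in words)


addresses = ["12 Main Street", "4 Mill road", "7 Oak St"]
standardized = [standardize_address(a) for a in addresses]
```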
Uniqueness
Select Uniqueness when the asset logic is primarily concerned with verifying that a data set does not contain duplicate records. Two or more records are duplicates of each other when they refer to the same data entity with substantially the same data. To report on the uniqueness of the records, use a deduplicate asset.
A deduplicate asset applies a threshold score to the results of the comparisons that it makes between pairs of records in a data set. You can feed the output from a Deduplicate transformation to a Rule Specification transformation in a mapping, and you can configure the Rule Specification transformation to apply a status value to records according to their threshold scores. You can assign the Uniqueness dimension to the rule specification asset in the Rule Specification transformation.
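The idea of scoring record pairs against a threshold and then assigning a status can be sketched in plain Python. This sketch uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure. It does not reproduce the comparison logic of a deduplicate asset, and the threshold value and status labels are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

THRESHOLD = 0.85  # hypothetical threshold score


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0 and 1 for two record strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplicate_status(records, threshold=THRESHOLD):
    """Label each record "Duplicate" if any pair that includes it meets the threshold."""
    status = ["Unique"] * len(records)
    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            status[i] = status[j] = "Duplicate"
    return status


records = [
    "John Smith, 12 Main St",
    "Jon Smith, 12 Main St",
    "Maria Lopez, 4 Mill Rd",
]
statuses = duplicate_status(records)
```

Comparing every pair of records is quadratic in the number of records, so this approach only suits small illustrative data sets.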
Timeliness
Select Timeliness when the primary purpose of the asset is to verify that the record data is current. Current data represents the most recent version of a data fact.
For example, a retail organization might require that warehouse inventory records are updated every day. You can define a rule specification to check that the date stamp on each inventory record matches the current date.
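The date check in this example can be sketched in plain Python. The `is_current` helper and the inventory records are hypothetical. The optional `today` parameter exists only so that the example produces the same result whenever it runs.

```python
from datetime import date
from typing import Optional


def is_current(date_stamp: date, today: Optional[date] = None) -> bool:
    """Return True when the record's date stamp equals the current date."""
    if today is None:
        today = date.today()
    return date_stamp == today


# Hypothetical inventory records: (item, date stamp of the last update).
inventory = [
    ("widget", date(2024, 5, 2)),
    ("gasket", date(2024, 5, 1)),
]
# A fixed "today" keeps the example reproducible.
stale_items = [
    item for item, stamp in inventory
    if not is_current(stamp, today=date(2024, 5, 2))
]
```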
In addition to built-in dimensions, you might see additional custom dimensions. Custom dimensions that you create in Metadata Command Center also appear in Data Quality.
Rules and guidelines for dimensions
Consider the following rules and guidelines when you add a dimension to an asset:
•You select a dimension as an optional step when you configure an asset. By default, an asset does not specify a dimension.
•An asset may contain steps that examine data in multiple dimensions. Select the dimension that best describes the purpose of the asset.
•The dimension that you set is a metadata value. It does not affect the asset logic or the analysis that the asset performs.
•If a custom dimension is deleted in Metadata Command Center, a warning message appears when you review a Data Quality asset that uses the deleted dimension.
•The list of issues on the Dimensions menu does not represent a complete range of the data quality issues that you can investigate. Likewise, the meaning and impact of the dimensions may be different in your organization. Select the dimensions that are right for your data.