Out-of-the-Box DQ rules for Product 360
The Out-of-the-Box data quality rule package enables customers to use pre-built templates in order to get started quickly.
This document describes the use cases and setup for all standard data quality rules that are part of the Informatica Data Quality for Product 360 delivery.
Project Structure within Informatica Data Quality (Design Environment)
The out-of-the-box DQ rules package for Product 360 consists of two files that together make up the Informatica Data Quality project used in Product 360:
Informatica_PIM_Content.xml
This file contains the rules themselves in the defined structure. For details on the rules and their functionality, see "Rule Types".
Informatica_PIM_Content.zip
This file contains all the metadata, such as demo classifiers for the auto-classification of items and the reference tables used by some of the rules. For details on the reference tables and their functionality, see "Reference Tables".
Importing the Product 360 rules into Informatica Data Quality is only necessary if you want to add additional custom rules for Product 360.
After you import the XML and the ZIP file into Informatica Data Quality, the project should reflect the following structure:
Only mapplets located in the following two paths are surfaced in the Product 360 UI:
Informatica_PIM_Content\Custom_Rules: Used for custom rules built by customer
Informatica_PIM_Content\PreDefined_Rules: Contains all pre-built / shipped rules for Product 360
Always place custom rules under Custom_Rules. Rules stored anywhere else are not available through the UI.
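The folder restriction above can be sketched as a simple path check. This is only an illustration of the rule as stated in the text, not how Product 360 implements it. The function name `is_surfaced` is hypothetical, and treating mapplets in nested sub-folders as surfaced is an assumption.

```python
from pathlib import PureWindowsPath

# Project and folder names are taken from the text above.
PROJECT = "Informatica_PIM_Content"
SURFACED_FOLDERS = {"Custom_Rules", "PreDefined_Rules"}


def is_surfaced(mapplet_path: str) -> bool:
    """Return True if the mapplet path lies under one of the two surfaced folders.

    Assumption: mapplets in nested sub-folders below these folders also count.
    """
    parts = PureWindowsPath(mapplet_path).parts
    return (
        len(parts) >= 3
        and parts[0] == PROJECT
        and parts[1] in SURFACED_FOLDERS
    )
```

For example, a mapplet at `Informatica_PIM_Content\Custom_Rules\my_color_standardization` passes the check, while anything under `Rule_Dependencies` or in another project does not.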
Without this filtering, a Product 360 user would be overwhelmed by the number of mapplets displayed and would have no indication which ones are ready for use in Product 360 and which ones are nested mapplets.
Place mapplets and other objects that should not appear on the rule selection screen in Product 360 in the Rule_Dependencies folder or in another project.
When the ZIP file is imported, all content of the sub-folder Rule_Dependencies is transferred into the Informatica Data Quality project.
The Rule_Dependencies folder is used for Product 360 specific rule dependencies: mapplets not shown to users, reference tables, content sets, etc.
Use of other projects (e.g. Informatica_DQ_Content, custom project, etc.):
Allowed and recommended, because it supports a single source of truth for rule design.
The restriction in Product 360 applies only to which rules from the XML are shown in the Product 360 user interface. All other dependencies are required during execution and are included in exports.
Make sure custom rules are exported with a different XML filename than the standard rules (Informatica_PIM_Content.xml). You can use either one XML for all custom rules (e.g. “Custom_Rules_of_MyEnterprise.xml”) or, even better, one XML per rule (“my_color_standardization.xml”, etc.). The export of custom rules should not include the PreDefined_Rules folder of the project. That way you can update the pre-built rules with any Product 360 upgrade without additional effort, simply by using the XML shipped with the install packages.
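A quick check of export filenames before upload might look like the following. It is a minimal sketch: the helper name `reserved_name_conflicts` is hypothetical, and the case-insensitive comparison is an assumption made because Windows file systems usually ignore case.

```python
STANDARD_XML = "Informatica_PIM_Content.xml"  # filename from the text above


def reserved_name_conflicts(export_filenames):
    """Return the export filenames that collide with the shipped rules XML.

    Assumption: names are compared without regard to case.
    """
    return [
        name for name in export_filenames
        if name.lower() == STANDARD_XML.lower()
    ]
```

An empty result means none of the custom exports would overwrite the standard rules file.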
Upgrading from one IDQ Version to another
If the embedded IDQ SDK deployed inside Product 360 has received a version upgrade, the IDQ design environment that contains your custom DQ rules (Informatica Developer) must be upgraded as well. To do so, upgrade the IDQ instance itself to the higher version while the rules used for Product 360 are in the IDQ repository. The upgrade of Informatica Developer automatically upgrades the rules to that version as well.
For more information on upgrading Informatica Developer (IDQ Server) to a higher version, refer to the standard documentation of Informatica Data Quality.
Next, export your custom rules, and any additional reference data, as XML and ZIP.
As described for the project structure, export custom rules with a different XML filename than the standard rules (Informatica_PIM_Content.xml) and do not include the PreDefined_Rules folder in the export.
Now back up the "dataquality" folder on the Product 360 server.
To upload the new XML version of your rules, open the Desktop UI, navigate to the DQ perspective and open the rule configuration dialog via the “…” button:
The dialog lets you upload the new XML files:
If you have updated metadata as well, use the “Add reference data…” button in the same dialog to import it. Otherwise you can skip this step; it is only needed if the reference data has changed.
Upload all custom rule XML files from the upgraded IDQ environment.
Do the same with the pre-built rules XML (“Informatica_PIM_Content.xml”) shipped with the version of Product 360 that carries the IDQ SDK update, so that you also have the latest out-of-the-box rules.
Lastly, go to the "dataquality" folder on the server, check for duplicate or outdated XML files and remove them to avoid duplicate rules (remember to back up first). The "Date modified" value may serve as a hint on whether an XML file is new or outdated. If you found duplicates or outdated rules, a server restart at the end is recommended so that any rules loaded into server memory are cleaned up as well.
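The backup and the "Date modified" review can be scripted. The sketch below is one possible approach, not part of the product. The function names are hypothetical, and searching sub-folders of "dataquality" is an assumption, since the text does not describe the folder layout. The script only lists files; deciding which ones to remove stays a manual step.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_folder(source, backup_root):
    """Copy the whole folder to a timestamped copy under backup_root."""
    source = Path(source)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = Path(backup_root) / f"{source.name}_backup_{stamp}"
    shutil.copytree(source, target)  # raises if the target already exists
    return target


def xml_files_by_age(folder):
    """Return (relative path, modified time) for each XML file, oldest first.

    Assumption: rule files may also sit in sub-folders, so the search is recursive.
    """
    folder = Path(folder)
    found = [
        p for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() == ".xml"
    ]
    found.sort(key=lambda p: p.stat().st_mtime)
    return [
        (str(p.relative_to(folder)), datetime.fromtimestamp(p.stat().st_mtime))
        for p in found
    ]
```

Calling `backup_folder` first and then reviewing the output of `xml_files_by_age` follows the order recommended above: back up, then look for outdated files.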
Importing Rules into Product 360
To use the rules in Product 360, you must upload them through the UI. To upload the files, create a Rule Configuration in the "Data quality configuration" view. When you select a rule, the following dialog pops up:
Add the XML file via the "Add rules from file" button and add the ZIP file via the "Add reference data from file" button.
Rule types and reference tables
Generic rules
Basic data quality rules that can be used in multiple use cases and scenarios to validate any data of products, variants and items in Product 360.
Check_DependentFields
Check_Equal
Check_isEmpty
Check_isEmpty_SubentityLevel (since 8.1)
Check_MaxLength
Check_MinLength
Structure Feature / Attribute rules
Data quality rules that can be used specifically for feature / attribute relational checks of products, variants and items stored in Product 360.
Check_DataTypes
Check_MandatoryValues
Check_PresetValues
Check_MissingAttributes
Product code validation rules
Data quality rules that can be used specifically to validate product codes against the norms of the respective code type.
Validate_GTIN
Validate_UPC
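To illustrate the kind of check such rules typically cover, the following sketch validates the check digit of a GTIN using the standard GS1 modulo-10 scheme (a UPC-A code is a 12-digit GTIN). It is not the implementation of Validate_GTIN or Validate_UPC, whose internals are not described here, and the function names are hypothetical.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the GS1 check digit for the digits preceding it.

    Starting from the rightmost digit of the body, weights alternate 3, 1, 3, ...
    """
    total = 0
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Check length (GTIN-8, -12, -13 or -14) and the trailing check digit."""
    if not (code.isascii() and code.isdigit()):
        return False
    if len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])
```

A real rule may apply further checks, such as prefix ranges, that this sketch leaves out.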
Classification rules
Data quality rules that can be used specifically to classify objects based on the individual input values.
Classify Product
Identify_Language
Lookup and standardization rules
Data quality rules that can be used specifically to standardize input values based on reference table lookups.
Check_Profanity
Parse_Color
Standardize_Color
Standardize_CompanyName
Standardize_UOM
Reference tables
Reference tables store lookup values and static information that the DQ rules access at runtime. Reference tables that consist of one or two columns can be imported directly into Product 360 and maintained through the "Dictionary" perspective of the Desktop UI. Reference tables that consist of more than two columns do not show up in the Product 360 UI, but the DQ rules still use them directly at runtime.
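The idea behind a two-column reference table can be sketched as a plain lookup from a raw value to its standardized form. The table content and the function name below are hypothetical, and the matching behavior (trimming whitespace, ignoring case, returning unknown values unchanged) is an assumption for illustration, not a description of how the shipped rules behave.

```python
# Hypothetical two-column reference table: raw value -> standardized value.
COLOR_TABLE = {
    "navy": "Blue",
    "crimson": "Red",
    "scarlet": "Red",
}


def standardize(value: str, table: dict) -> str:
    """Look up the standardized form of a value.

    Assumption: lookups ignore case and surrounding whitespace, and values
    without a table entry are returned unchanged.
    """
    key = value.strip().lower()
    return table.get(key, value)
```

Keeping the values in a table rather than in the rule logic is what allows them to be maintained separately, for example through the "Dictionary" perspective mentioned above.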