Classification Rules

Data quality rules that classify objects based on their individual input values.

Classify_Product

General description

Assigns a product to a structure group based on its "Long Description". A threshold (default: 70%) defines the minimum score required for the classification to be applied.
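
The threshold decision can be pictured with a minimal Python sketch. It is not the rule's actual implementation: the names Classification and apply_threshold are hypothetical, and both the inclusive comparison (>=) and the empty result on failure are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_THRESHOLD = 0.70  # the 70% default named in the description


@dataclass
class Classification:
    structure_group_id: Optional[str]
    structure_group_name: Optional[str]
    score: float
    status_code: str  # "OK" or "Failed"


def apply_threshold(group_id, group_name, score, threshold=DEFAULT_THRESHOLD):
    """Accept the proposed structure group only if the score reaches the threshold."""
    # Assumption: the comparison is inclusive; the rule may use a strict one.
    if score >= threshold:
        return Classification(group_id, group_name, score, "OK")
    # Assumption: below the threshold no structure group is returned.
    return Classification(None, None, score, "Failed")


accepted = apply_threshold("1267688379455", "HDTV Series", 1.0)
rejected = apply_threshold("1267688379455", "HDTV Series", 0.55)
```

Here accepted keeps the proposed structure group, while rejected carries only the score and a Failed status.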

NOTE: This is a DEMO rule that classifies based on the "Sample Data Consumer Electronics" demo data. For your own implementation, create a new classifier and change the configuration of the classifier transform (cls_get_category) to point to your classification model.

NLP classifiers need to be trained with actual live data. Training maps sample texts, for example long descriptions of products, to structure groups of Product 360. The broader the variety of long descriptions mapped to a specific structure group, the better the classification of new items works.
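
As a rough illustration of what "variety per structure group" means, the following Python sketch counts distinct long descriptions per structure group in a set of training pairs. The pairs, the group name "Digital Cameras" and the helper descriptions_per_group are hypothetical and are not part of the demo data or of the product's training tooling.

```python
from collections import defaultdict

# Hypothetical training pairs: (long description, structure group name).
training_pairs = [
    ("46 inch Full-HD television with integrated hard drive", "HDTV Series"),
    ("LED TV with Dual HDTV multi-tuner", "HDTV Series"),
    ("led tv with dual hdtv multi-tuner ", "HDTV Series"),  # duplicate after normalization
    ("Compact digital camera with 10x optical zoom", "Digital Cameras"),
]


def descriptions_per_group(pairs):
    """Count distinct (case- and whitespace-normalized) descriptions per structure group."""
    groups = defaultdict(set)
    for description, group in pairs:
        groups[group].add(description.strip().lower())
    return {group: len(texts) for group, texts in groups.items()}


coverage = descriptions_per_group(training_pairs)
```

A structure group with very few distinct descriptions is a candidate for more training samples before the classifier is used on new items.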

Input ports

Long_Description

Long description that will be used to classify the object against a certain structure group.

Output ports

StructureGroup_ID

Returns the identifier of the structure group found for the product.

StructureGroup_Name

Returns the English name of the structure group found for the product.

StructureGroup_Score

Returns the probability score for the classification to the structure group found. (QualityStatusEntry.Score)

Out_Status_Code

Returns the overall Status Code after the rule execution (OK or Failed). (QualityStatusEntry.Status)

Out_Status_Message

Returns the overall Status Message after the rule execution. (QualityStatusEntry.Message)

Metadata and reference tables

cs_classify_products_demo

Sample classifier based on the "Sample Data Consumer Electronics" demo data.

Error_messages_by_Language

Contains a four-digit error code plus a language, which together select the error message to output.
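
A lookup of this kind can be sketched in Python as a table keyed by error code and language. The codes, the German text and the English fallback below are assumptions for illustration only; the actual contents and behavior of Error_messages_by_Language may differ.

```python
# Hypothetical rows: (four-digit error code, language) -> message text.
ERROR_MESSAGES = {
    ("0000", "en"): "No Error",
    ("0000", "de"): "Kein Fehler",
    ("1001", "en"): "Score below threshold",
}


def lookup_message(code, language, fallback_language="en"):
    """Return the message for the code in the preferred language.

    Assumption: if no row exists for the preferred language, the English row is used.
    """
    for lang in (language, fallback_language):
        message = ERROR_MESSAGES.get((code, lang))
        if message is not None:
            return message
    return f"Unknown error code {code}"


preferred = lookup_message("0000", "de")
fallback = lookup_message("1001", "de")
```

In this sketch preferred resolves to the German row, while fallback resolves to the English row because no German row exists for code 1001.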

Example usage

Check an item for a proposed structure group of the "Sample Data Consumer Electronics" structure system.

The "English Long Description" (Long_Description) of the item is:

"46 inch multimedia television set (Full-HD) with high-quality aluminum front frame and contrast pane - comes equipped with Dual HDTV multi-tuner and digital video recorder (integrated hard drive)"

Because the rule's classifier has been trained with multiple long descriptions for each structure group of the system, it suggests assigning the item to the structure group "HDTV Series".

Example output

StructureGroup_ID

1267688379455

StructureGroup_Name

HDTV Series

StructureGroup_Score

1.0000

Out_Status_Code

OK

Out_Status_Message

No Error

Identify_Language

General description

Identifies the language of a text field and returns a probability score for the identification. A threshold (default: 70%) defines the minimum score required for the language identification to count as successful.

Input ports

Text_Value

Text field that will be checked for its language.

Output ports

Language

Returns the language identified for the text field.

Language_Score

Returns the probability score for the identified language. (QualityStatusEntry.Score)

Out_Status_Code

Returns the overall Status Code after the rule execution (OK or Failed). (QualityStatusEntry.Status)

Out_Status_Message

Returns the overall Status Message after the rule execution. (QualityStatusEntry.Message)

Metadata and reference tables

Error_messages_by_Language

Contains a four-digit error code plus a language, which together select the error message to output.

Example usage

Check whether the content of the "English Long Description" is actually written in English.

The "English Long Description" (Text_Value) of the item is:

"46 inch multimedia television set (Full-HD) with high-quality aluminum front frame and contrast pane - comes equipped with Dual HDTV multi-tuner and digital video recorder (integrated hard drive)"

Example output

Language

en

Language_Score

0.9487

Out_Status_Code

OK

Out_Status_Message

No Error