Classifier Models Overview
A classifier model is a reference data object in a content set. Use a classifier model to analyze long text strings that contain multiple values. A classifier model identifies the most common type of information in each string.
You add a classifier model to a Classifier transformation. The transformation searches for common values between the classifier model data and the data in each input row. The transformation uses the common values to categorize the type of information that each row represents.
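The matching idea can be illustrated with a minimal Python sketch. This is an analogy only, not the Classifier transformation's actual algorithm: the `MODEL` dictionary, the labels, and the `classify` function are hypothetical names invented for the example.

```python
import re
from typing import Optional

# Hypothetical model data: each label maps to sample values for that label.
MODEL = {
    "billing": {"invoice", "payment", "refund", "charge"},
    "shipping": {"delivery", "package", "tracking", "courier"},
}

def classify(row: str) -> Optional[str]:
    """Return the label whose sample values overlap most with the row."""
    # Split the row into lowercase words so matching ignores case and punctuation.
    tokens = set(re.findall(r"[a-z]+", row.lower()))
    # Count the values that the row shares with each label in the model.
    scores = {label: len(tokens & values) for label, values in MODEL.items()}
    best = max(scores, key=scores.get)
    # Report no category when the row shares nothing with the model.
    return best if scores[best] > 0 else None

print(classify("My package has no tracking number and the courier never came."))
```

In this sketch, a row that shares no values with the model is left uncategorized rather than forced into a label.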
You use a classifier model when the input data has the following characteristics:
- The input data contains text. Classifier models apply natural language processes to text data to identify the types of information in the text. Natural language processes detect relevant words in the input string. Natural language processes disregard words that are not relevant.
- The input data strings contain multiple values. For example, you can create a data column that contains the contents of an email message in each field.
The Classifier transformation reads string datatypes. The transformation imposes no limit on the length of the input strings.
You compile classifier models in the Developer tool. When you compile a model, you create associations between similar data values in the model. The Classifier transformation uses the compiled data to search for information in the input data.
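As a loose analogy for compilation, the following Python sketch turns labeled sample text into a lookup index that associates each word with the labels it appears under. The Developer tool's compiled format is not described here, so the `compile_model` function, the index layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

def compile_model(samples):
    """Build a word -> {label: count} index from (label, text) pairs."""
    index = defaultdict(lambda: defaultdict(int))
    for label, text in samples:
        for token in text.lower().split():
            # Record that this word was seen in sample text for this label.
            index[token][label] += 1
    # Convert to plain dictionaries so the result is easy to inspect.
    return {token: dict(labels) for token, labels in index.items()}

# Hypothetical model data: a label and a sample string for that label.
samples = [
    ("greeting", "hello there"),
    ("greeting", "hello again"),
    ("farewell", "goodbye for now"),
]
compiled = compile_model(samples)
print(compiled["hello"])
```

Building the index once, ahead of time, means that a later search only needs a lookup for each word in the input string.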
Classifier Transformation Example
You can use a classifier model and a Classifier transformation to categorize email messages based on the text that they contain.
For example, you work in a customer support center, and you review the email messages that the organization receives from customers. The organization has customers in many countries, and it receives emails in many languages. You want to sort the emails by language, so that you can send each email to the center that can best reply to the customer.
You complete the following steps to sort the emails:
1. You write the email messages to a single file or a database table.
2. You create a classifier model that contains sample text for each language.
Note: You can use sample text from the email messages as source data for the model. Copy the email message text to a file or database table, and create a data source from the file or table in the Model repository.
3. You add the classifier model to a Classifier transformation.
4. You add the transformation to a mapping, and you connect the transformation ports to the data source and data targets. You create a data target for each language.
When you run the mapping, the Classifier transformation analyzes the email messages and writes the email text to the correct data target. You can share the data target with the team members in the appropriate support center.
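The outcome of the mapping can be approximated in a short Python sketch, assuming a toy word-overlap detector in place of the classifier model. The `LANGUAGE_SAMPLES` data, the `detect_language` and `route` functions, and the `Unclassified` fallback target are hypothetical and do not reflect how the product works internally.

```python
from collections import defaultdict

# Hypothetical sample text for each language, standing in for the classifier model.
LANGUAGE_SAMPLES = {
    "English": "the and is are you your we our please thank",
    "German": "der die das und ist sie wir bitte danke nicht",
    "French": "le la les et est vous nous merci pour pas",
}
VOCAB = {lang: set(text.split()) for lang, text in LANGUAGE_SAMPLES.items()}

def detect_language(email: str) -> str:
    """Pick the language whose sample words occur most often in the email."""
    words = [w.strip(".,!?;:").lower() for w in email.split()]
    scores = {lang: sum(w in vocab for w in words) for lang, vocab in VOCAB.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unclassified"

def route(emails):
    """Group emails into one list per language, like one data target per language."""
    targets = defaultdict(list)
    for email in emails:
        targets[detect_language(email)].append(email)
    return dict(targets)

emails = [
    "Please help, we are locked out of our account.",
    "Bitte helfen Sie, das Konto ist gesperrt.",
    "Merci pour votre aide, nous attendons la facture.",
]
targets = route(emails)
for language, messages in targets.items():
    print(language, len(messages))
```

Each key in `targets` plays the role of a data target that you could hand to the matching support center.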