Probabilistic Models Overview
A probabilistic model is a reference data object that you create in a content set. Use a probabilistic model to analyze a data string that contains multiple data values. A probabilistic model identifies the type of information in each value in the string. You can add a probabilistic model to a Labeler transformation and a Parser transformation.
Use a probabilistic model in a Labeler transformation to assign a descriptive label to each value in an input string. The Labeler transformation writes the labels to a single output port. Use a probabilistic model in a Parser transformation to write each value in an input string to a port that represents the information in the value. The Parser transformation creates an output port for each type of information.
You design and compile a probabilistic model in the Developer tool. When you define a probabilistic model, you add a series of data rows to the model and you assign a label to each value in each row. When you compile a probabilistic model, the Developer tool creates associations between the data values and the labels that you added. The Labeler transformation and Parser transformation use natural language processes to compare the probabilistic model data to the input port data.
Natural language processes use the following techniques to identify the types of information in data values:
- •Natural language processes can recognize similar data values and assign the same label to the values.
- •Natural language processes can compare a data value to the adjacent values in the string. Natural language processes analyze the sequence of values to understand the usage of each string and to verify the types of information that the strings represent.