Probabilistic Models > Probabilistic Model Structure

Probabilistic Model Structure

A probabilistic model contains a set of data rows and a corresponding set of labels. The data rows contain examples of the different values that might appear in the transformation input data. The labels identify the types of information that you expect the input values to contain.

When you define a probabilistic model, you assign each label to one or more values in the data rows. When you compile the model, the Developer tool generates a reference data object that represents the relationships between the label values and the data values.

A data row can contain a single value or multiple values. Each data row can have a different structure. You can assign the same label to multiple values in a data rows. Alternatively, you can assign a different label to identical values that appear in different positions in a row. Assign each label to at least one data value before you compile the probabilistic model.

Note: To optimize the capabilities of the probabilistic model, verify that each data row contains multiple values. The order of the values in each row must correspond as closely as possible to the order in which the values occur in the source data. If the data rows contain single values, the Labeler transformation or Parser transformation cannot apply natural language processes in the input data analysis.