Probabilistic Model Label Data

The label values in a probabilistic model represent the types of information that the reference data values might contain. When you add reference data rows to a model, assign a label to each value in each row. The labels that you add to the model appear in the Label view and in the menu options in the Data view.

You can assign any label in the model to any reference data value. If the same value has different meanings in different rows of reference data, you can assign a different label to that value in each row.

The range of label values can correspond to the range of input ports that the Labeler transformation or the Parser transformation reads during probabilistic analysis. The probabilistic model must contain at least one label value that the transformation can apply to the data values on each input port.

For example, a warehouse might store inventory data in a comma-delimited file that defines seven columns. You design a mapping that parses the inventory data into a database table. You create a probabilistic model with a label value for each data column. When you run the mapping, the Parser transformation writes each value in the input data to the correct column in the target table.

The following table shows the columns of inventory data and the label values that you might create in a probabilistic model:

Inventory Column Name	Label Name
Product_Name	Product_Name
Quantity	Quantity
Location	Location
Barcode	Barcode
SKU	Stock_Keeping_Unit
Arrival_Date	Arrival_Date
Cost_Price	Cost_Price

Note: You can use the input column names, or you can use other names. The names do not need to match.
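The relationship between column names and label names can be sketched in a few lines of Python. This is an illustration only, not how the Parser transformation works internally: the sketch assigns labels by column position, whereas a probabilistic model infers a label from the data value itself. The names COLUMN_TO_LABEL and label_row and the sample row are hypothetical.

```python
import csv
import io

# Hypothetical mapping from inventory column names to model label names.
# Only the SKU column uses a label name that differs from the column name.
COLUMN_TO_LABEL = {
    "Product_Name": "Product_Name",
    "Quantity": "Quantity",
    "Location": "Location",
    "Barcode": "Barcode",
    "SKU": "Stock_Keeping_Unit",
    "Arrival_Date": "Arrival_Date",
    "Cost_Price": "Cost_Price",
}


def label_row(line, columns=tuple(COLUMN_TO_LABEL)):
    """Pair each value in a comma-delimited line with its label name.

    Assumption: values appear in the same order as the columns above.
    """
    values = next(csv.reader(io.StringIO(line)))
    return {COLUMN_TO_LABEL[col]: val for col, val in zip(columns, values)}


# Made-up sample row, for illustration only.
row = label_row("Widget,40,Aisle 7,0123456789012,WID-001,2024-01-15,2.50")
```

In this sketch, the value from the SKU column is stored under the label Stock_Keeping_Unit, and every other value is stored under a label that matches its column name.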

Overflow Label

When a transformation cannot apply a label to an input data value, the transformation treats the data value as overflow data. The Labeler transformation applies an overflow label to any data value that it cannot identify. The Parser transformation writes any data value that it cannot identify to an overflow port.
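As a rough sketch of the overflow label, the following Python code labels each value in a string and falls back to an overflow label for any value that it does not recognize. The exact-match lookup is a stand-in for probabilistic matching, and REFERENCE_LABELS, OVERFLOW_LABEL, and label_values are hypothetical names, not part of the Labeler transformation.

```python
# Hypothetical reference data: value -> label, as might be assigned in a model.
REFERENCE_LABELS = {
    "Park": "Street_Name",
    "Madison": "Street_Name",
    "Avenue": "Street_Descriptor",
    "Place": "Street_Descriptor",
}

# Assumed name for the overflow label; the real label name may differ.
OVERFLOW_LABEL = "OVERFLOW"


def label_values(text):
    """Return (value, label) pairs; unrecognized values get the overflow label."""
    return [
        (token, REFERENCE_LABELS.get(token, OVERFLOW_LABEL))
        for token in text.split()
    ]


# "Annex" is not in the reference data, so it receives the overflow label.
labelled = label_values("Madison Avenue Annex")
```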

The following table shows how a Parser transformation might write address data elements that a probabilistic model does not recognize to an overflow port:

Input Data	Street_Name port	Street_Descriptor port	Overflow port
Park Place	Park	Place	No overflow data
Park Avenue	Park	Avenue	No overflow data
Madison Avenue	Madison	Avenue	No overflow data
Central Park	Central	Park	No overflow data
Washington Square Park	Washington	Square	Park
Madison Square Garden	Madison	Square	Garden

The Parser transformation also writes values to an overflow port when the number of input values is greater than the number of labels in the model. Before you use a probabilistic model in a transformation, review the input data and verify that the model contains the correct number of label values.
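The effect of having more input values than labels can be sketched as follows. The code is a simplified, positional illustration and not the Parser transformation's algorithm: it fills one output port per label in order and sends any remaining values to an overflow list. The names LABELS, parse_to_ports, and fits_model are hypothetical.

```python
# Labels in the hypothetical model, one output port per label.
LABELS = ["Street_Name", "Street_Descriptor"]


def parse_to_ports(text, labels=LABELS):
    """Assign values to ports in order; extra values go to the overflow port."""
    tokens = text.split()
    ports = dict(zip(labels, tokens))
    ports["Overflow"] = tokens[len(labels):]
    return ports


def fits_model(text, labels=LABELS):
    """Check whether the input has no more values than the model has labels."""
    return len(text.split()) <= len(labels)


# Two values fit the two labels; a third value overflows.
two_values = parse_to_ports("Park Place")
three_values = parse_to_ports("Washington Square Park")
```

A check such as fits_model, run over a sample of the input data, is one way to find rows that would produce overflow before you use the model in a transformation.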