A rule specification asset contains options on a Definition tab and a Configuration tab. Use the Definition tab options to name the asset, optionally describe it, and select the folder in which to store it. Use the Configuration tab options to configure the rule specification logic.
Configuration tab options in basic mode
In basic mode, the Configuration tab contains the Design panel and the Properties panel. You can expand or collapse the Design panel to show or hide the rule sets that you add for configuration. You might collapse the Design panel when you configure a rule set with a rule statement, so that the rule set configuration options are easier to view.
When you create a rule specification, you configure a series of shapes in the Configuration tab. The shapes are called rule sets. Each rule set describes an aspect of the business rule.
A rule set contains one or more rule statements that define the business rule requirements at a low level. Each rule statement reads a column of input data and verifies that the input data meets the conditions that you specify.
The top-level rule set is the primary rule set. The primary rule set summarizes the business rule. The output from the primary rule set specifies whether each row of input data meets the overall requirement of the business rule.
The following image shows a rule specification in the Configuration tab in basic mode:
The rule specification contains the following elements:
1. Primary rule set.
2. Child rule sets of the primary rule set.
3. Property options on the rule set that you select.
4. Rule statements in the rule set that you select.
5. Rule statement that the system defines.
Configuration tab options in advanced mode
In advanced mode, use the Configuration tab to write rule logic in simple expressions or in IF-THEN-ELSE syntax.
The Configuration tab in advanced mode contains an editor that allows you to directly write your business rule in expression logic.
The following image shows a rule specification in the Configuration tab in advanced mode:
The Configuration tab includes the following options:
1. Manage Inputs and Outputs.
Opens the input and output configuration options. Use the options to add one or more inputs and outputs.
2. Rule Editor.
Provides a canvas in which you can write the rule logic.
3. Search and replace.
Searches and replaces a value that you specify.
4. Select all.
Selects the entire script in the editor panel.
5. Test input field.
Contains the input data that the Secure Agent uses to test the expression logic that you enter.
6. Test output field.
Contains the result of the test.
Rule sets
Rule sets define the logical flow of data through the rule specification. You organize the rule logic in one or more rule sets when you configure a rule specification in basic mode. Data flows upward through the rule specification from the lowest rule set to the primary rule set. The number of rule sets in the rule specification depends on the logical requirements of the business rule. A valid rule specification might have a single rule set.
You can add a rule set below any rule set in the rule specification. The rule sets have a parent-to-child relationship. When you add a rule set, the output of the child rule set becomes an input to the parent rule set.
Note: You must select the output from any child rule set as an input in a rule statement in the parent rule set.
Rule sets contain the rule statements that analyze and update the input data. You can configure a rule set with a single rule statement, or you can add multiple rule statements to the rule set. Within a rule set, data flows from the first rule statement to the final rule statement.
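The upward flow from child rule sets to the primary rule set can be pictured with a small Python sketch. It is an illustration only, not the product's implementation: the `RuleSet` class, its `logic` callable, and the example rule set names (`age_ok`, `email_ok`, `customer_ok`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class RuleSet:
    """Hypothetical stand-in for a rule set; not a product API."""
    name: str
    # Reads the inputs (including child outputs) and returns the rule set output.
    logic: Callable[[Row], object]
    children: List["RuleSet"] = field(default_factory=list)

    def evaluate(self, row: Row) -> object:
        # Evaluate child rule sets first so that data flows upward.
        inputs = dict(row)
        for child in self.children:
            # The child's output becomes an input to the parent rule set.
            inputs[child.name] = child.evaluate(row)
        return self.logic(inputs)


# Two child rule sets, each describing one aspect of the business rule.
age_check = RuleSet("age_ok", lambda r: r["age"] >= 18)
email_check = RuleSet("email_ok", lambda r: "@" in r["email"])

# The primary rule set consumes both child outputs and summarizes the rule.
primary = RuleSet(
    "customer_ok",
    lambda r: "Valid" if r["age_ok"] and r["email_ok"] else "Invalid",
    children=[age_check, email_check],
)

result = primary.evaluate({"age": 30, "email": "a@example.com"})
```

In this sketch the primary rule set reads both child outputs, which mirrors the requirement that the parent must use each child output as an input.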
You can copy or move a rule set to another location in a rule specification, and you can copy or move a rule set to another rule specification.
Inputs and outputs
An input describes a field of data that a rule statement can analyze in basic mode or that expression logic can analyze in advanced mode. An output represents the result of an operation within the rule specification. Inputs and outputs are specific to the rule specification in which you create them.
Define an input to represent a column in a data set that the rule specification will analyze at run time. Define an output to represent a column of data that the rule specification will generate at run time. In basic mode, each rule set generates an output. In advanced mode, you define the outputs yourself. The outputs that you define in advanced mode correspond to the results of a rule set in basic mode.
When you define an input or output, you can set the following properties:
• The input or output name.
• The data type of the data that the input or output represents. You can create an input with a string, date/time, float, or integer data type. You can additionally create an input of double data type in advanced mode.
• The maximum number of characters that a value in the input or output field can contain.
You can specify an integer data type for numbers in the range -2147483648 through 2147483647. To read numbers that are outside the integer range, use the float or double data type.
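The stated integer range matches a 32-bit signed integer (-2^31 through 2^31 - 1). A quick way to check sample values before you choose a data type might look like the following sketch. The helper names `fits_integer` and `suggest_numeric_type` are hypothetical and are not part of the product.

```python
INT_MIN = -2147483648  # lower bound stated above, equal to -2**31
INT_MAX = 2147483647   # upper bound stated above, equal to 2**31 - 1


def fits_integer(value: int) -> bool:
    """Return True if the value lies inside the documented integer range."""
    return INT_MIN <= value <= INT_MAX


def suggest_numeric_type(values) -> str:
    """Suggest 'integer' only when every sample is a whole number in range.

    Otherwise suggest 'float'. The double data type is an alternative
    in advanced mode.
    """
    for v in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(v, bool) or not isinstance(v, int) or not fits_integer(v):
            return "float"
    return "integer"


in_range = suggest_numeric_type([0, -2147483648, 2147483647])
out_of_range = suggest_numeric_type([2147483648])
```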
You can set the scale, or the number of digits that can follow a decimal point, in basic mode. In a float value, the default scale is 4. In a date/time value, the scale is preset to 9. You cannot change the scale in a date/time value.
The properties in basic mode also include a Usage value. The Usage value indicates the number of times that the input appears in a rule statement in the rule specification.
To view the inputs that a rule set uses in basic mode, select the rule set on the Configuration tab.
Rules and guidelines for inputs
Consider the following rules and guidelines for rule specification inputs:
• Create an input in basic mode to add the input to a rule statement.
• When you create a child rule set, the output from the rule set becomes an input to the parent rule set. You must use the input in a rule statement in the parent rule set.
• An input in a rule specification asset does not store information about business data, such as the name of a source column, table, or file.
• When you add a rule specification asset to a Rule Specification transformation in a mapping, you can connect a rule specification input to any column of data that is compatible with the input properties.
Input data types and Amazon S3 connections
You might use a Rule Specification transformation in a mapping that connects to a file source over an Amazon S3 V2 connection. For example, you might run such a mapping in Data Integration Elastic.
Consider the following factors before you configure the rule specification asset that the Rule Specification transformation will use:
• A mapping that uses an Amazon S3 V2 connector cannot process a date/time input on a rule specification.
• A mapping that reads a source file over an Amazon S3 V2 connector can read a string data type without additional configuration, because the string data type corresponds to the Amazon S3 STRING data type.
For more information on connectors, see the documentation for Connectors in the Data Integration online help.
Rule statements
In basic mode, a rule statement is a set of operators, conditions, and actions that analyze a column of data and generate an output based on the result of the analysis. You add a rule statement to a rule set.
A condition is a data operation that determines a single fact about a data value. You can add multiple conditions to a rule statement. An action is a data operation that generates a potential output from the rule set. An action generates data when the input that you add to the rule statement satisfies the conditions that you define.
A rule specification reads the rule statements in a rule set from top to bottom. For a given row of input data, the rule set accepts the output from the first rule statement that generates output data.
Each rule set contains a system-defined rule statement that specifies the action to take if no other rule statement generates output data. The system-defined rule statement is the final rule statement in the rule set. You can edit the action in the system-defined rule statement. By default, the rule statement specifies that the rule set does not generate any output data if the other rule statements do not generate output data.
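The evaluation order described above can be sketched in Python as a first-match loop with a fallback. This is an illustration under assumptions, not the product's implementation: the `(condition, action)` tuple model, the `run_rule_set` function, and the five-digit example rule are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A rule statement is modelled as a (condition, action) pair.
Statement = Tuple[Callable[[str], bool], Callable[[str], str]]


def run_rule_set(
    value: str,
    statements: List[Statement],
    default: Optional[str] = None,
) -> Optional[str]:
    """Evaluate the statements from top to bottom for one input value."""
    for condition, action in statements:
        if condition(value):
            # The rule set accepts the first statement that generates output.
            return action(value)
    # Stand-in for the system-defined final statement. It generates no
    # output (None) unless the caller edits the default action.
    return default


statements: List[Statement] = [
    (lambda v: v.strip() == "", lambda v: "Invalid"),
    (lambda v: v.isdigit() and len(v) == 5, lambda v: "Valid"),
]

matched = run_rule_set("12345", statements)
unmatched = run_rule_set("abc", statements)
edited_default = run_rule_set("abc", statements, default="Invalid")
```

Passing a value for `default` corresponds to editing the action in the system-defined rule statement.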
In advanced mode, you can write the equivalent rule in expression logic.
Status values in rule statements
The actions that a rule statement can perform include the generation of status values. A status value is a predefined value that a rule statement can generate as the output from an action. You can configure an action to generate a status value as an alternative to a user-defined value.
A rule statement can return Valid or Invalid as status values. You cannot modify the status values.
You can use status values to achieve the following objectives:
Provide data to profiles
Scorecards in Data Profiling can recognize status values. When a rule specification returns a status value as an output, the scorecard can report the number of status values as a data quality category. To enable the scorecard to read the rule specification output, add the rule specification as a rule to the profile that generates the scorecard.
Provide information to downstream users about exception records
You can configure a rule specification to identify a record as an exception. An exception is a record that contains unresolved data quality issues.
To identify records as exceptions in basic mode, configure a rule statement to return the status value Invalid. Define the exception properties on a rule statement in the primary rule set.
To identify records as exceptions in advanced mode, configure exception indicators for one or more status values that you define in the rule logic that you write. In advanced mode, you can define custom status values that more closely describe the data quality issue in each exception record. You can associate each status value with a corresponding data quality issue. You can configure a different set of exception properties for each status value.
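As a rough illustration of how status values might feed both objectives, the following Python sketch assigns a status to each record, counts the statuses in the way that a scorecard category might, and collects the records flagged as exceptions. The custom status names (`MissingPostcode`, `BadFormat`), the `classify` function, and the sample data are hypothetical. The sketch does not use the product's expression syntax.

```python
from collections import Counter

# Hypothetical status values mapped to an exception indicator.
# "Valid" is a predefined status; the other two are custom examples.
EXCEPTION_STATUSES = {
    "Valid": False,
    "MissingPostcode": True,
    "BadFormat": True,
}


def classify(postcode: str) -> str:
    """Return a status value that describes the data quality issue, if any."""
    if not postcode:
        return "MissingPostcode"
    if not (postcode.isdigit() and len(postcode) == 5):
        return "BadFormat"
    return "Valid"


rows = ["12345", "", "12A45", "54321"]
statuses = [classify(p) for p in rows]

# Count each status value across the records.
counts = Counter(statuses)

# Records whose status carries an exception indicator.
exceptions = [p for p, s in zip(rows, statuses) if EXCEPTION_STATUSES[s]]
```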