REST Data Quality API

Rule Execution (since 8.0.03)

Executes all specified rules for a given set of items. The rules to be executed can be defined by a list of single rule configurations, rule configuration groups, channels, or any combination of these. Note that this is a synchronous call: no job is triggered, and the call returns as soon as the execution has finished. This method is not recommended for large data sets, because a long-running call may run into HTTP connection issues. For small data sets, however, it is the recommended method, since it avoids the overhead of the job framework and is easier to handle (especially in workflow situations).

URL Pattern

/manage/dataquality/executions

Method

POST

Content types

application/json, application/xml

Media types

application/json, application/xml

Result

A rule execution result object containing the protocol of the execution as well as the detail status information for each item and rule

Content

The content has to be a DataQualityProfile JSON object which is described in the following table

| Field | Required | Default | Datatype | Parameter description |
|---|---|---|---|---|
| rules | no | | String Array | Individual rule names as a string array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| ruleGroups | no | | String Array | Rule group names as a string array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| channels | no | | ENTITY_ITEM | Channels as an entity item array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
| entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |

Result

A data quality result object which contains all relevant information about the execution for each item.

Properties of the returned object

| Field | Data type | Description |
|---|---|---|
| ruleIds | Map | A map of rule names to rule IDs. The IDs are only valid within this result object and are needed to reference the results of each item. |
| numberOfSuccessfulItems | Integer | The number of items which completed all rules successfully. |
| numberOfFailedItems | Integer | The number of items which failed at least one rule. |
| items | | An array of item objects which provide the results of all executed rules per item. |
| protocol | | The protocol (also known as problem log) of the execution. |

Properties of each object in items

| Field | Data type | Description |
|---|---|---|
| entityItem | ENTITY_ITEM | The reference to the entity item for which the rules have been executed. |
| status | String | The overall status for all rules: SUCCESSFUL, FAILED or UNKNOWN. SUCCESSFUL means that all rules completed successfully. FAILED means that at least one rule failed. UNKNOWN means that no results are available for at least one rule; this usually only happens after an unexpected error, so check the protocol in this case. |
| failedRuleIds | Array of Integer | The IDs of all rules which failed for this item (see ruleIds above). |
| successfulRuleIds | Array of Integer | The IDs of all rules which succeeded for this item (see ruleIds above). |

Properties of the protocol object

| Field | Data type | Description |
|---|---|---|
| infoCounter | Integer | Number of protocol entries with the INFO severity. |
| warningCounter | Integer | Number of protocol entries with the WARNING severity. |
| errorCounter | Integer | Number of protocol entries with the ERROR severity. |
| entries | | Array of protocol entries. |

Properties of each object in entries

| Field | Data type | Description |
|---|---|---|
| severity | String | The severity of the protocol entry: INFO, WARNING or ERROR. |
| category | String | The category of the protocol entry. |
| message | String | The message of the protocol entry. |
| logDate | Date | The date when the protocol entry was created. |
| logTime | Time | The time when the protocol entry was created. |
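Because failedRuleIds and successfulRuleIds only carry the result-local IDs, a client usually has to translate them back into rule names via ruleIds. The following is a minimal sketch of that lookup using plain Java collections. The class RuleIdResolver and its methods are hypothetical helpers, not part of the REST client, and it assumes the result has already been deserialized into a Map and a List.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the product's REST client.
final class RuleIdResolver {

    /** Inverts the ruleIds map (rule name -> ID) into ID -> rule name. */
    static Map<Integer, String> invert(Map<String, Integer> ruleIds) {
        Map<Integer, String> namesById = new HashMap<>();
        for (Map.Entry<String, Integer> entry : ruleIds.entrySet()) {
            namesById.put(entry.getValue(), entry.getKey());
        }
        return namesById;
    }

    /** Translates the IDs of failedRuleIds or successfulRuleIds into rule names. */
    static List<String> toRuleNames(Map<String, Integer> ruleIds, List<Integer> ids) {
        Map<Integer, String> namesById = invert(ruleIds);
        List<String> names = new ArrayList<>();
        for (Integer id : ids) {
            String name = namesById.get(id);
            if (name == null) {
                throw new IllegalArgumentException("Unknown rule ID: " + id);
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Values taken from the JSON result example in this document.
        Map<String, Integer> ruleIds = Map.of("CheckGtinMED", 12);
        System.out.println(toRuleNames(ruleIds, List.of(12))); // prints [CheckGtinMED]
    }
}
```

Since the IDs are only valid within one result object, the inverted map should be rebuilt for every response rather than cached.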

Examples

In the following examples we assume that there are several rule configurations configured in the PIM system.

Java Rest Client Example

Rest Client Java Code
// References to the catalog and the channels, looked up by their identifiers
EntityItemReference toolsCatalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );
EntityItemReference webShopChannel = EntityItemReferenceFactory.createByIdentifier( "WebShop" );
EntityItemReference erpChannel = EntityItemReferenceFactory.createByIdentifier( "ERP" );

// Report query which selects the items of the TOOLS catalog as input data set
ReportQuery reportQuery = new ReportQuery( "byCatalog" );
reportQuery.addParameterValue( "catalog", toolsCatalog );

// Profile which combines individual rules, rule groups and channels
DataQualityProfile profile = new DataQualityProfile();
profile.setReportQuery( reportQuery );
profile.setEntityIdentifier( "Article" );
profile.addRuleConfigurations( "item_description_rule1", "item_description_rule3" );
profile.addRuleConfigurationGroups( "Item Texts", "Item Attributes" );
profile.addChannels( webShopChannel, erpChannel );

// Synchronous call: returns once all rules have been executed
DataQualityRequest request = getRestClient().createDataQualityRequest();
DataQualityResult result = request.execute( profile );

JSON Examples

Execute individual rules, rule groups and channels in one call
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/executions
{
  "rules": ["item_description_rule1", "item_description_rule3"],
  "ruleGroups": ["Item Texts", "Item Attributes"],
  "channels": ["'WebShop'", "'ERP'"],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      {
        "key": "Catalog",
        "value": "'TOOLS'"
      }
    ]
  }
}
DataQuality Result object in JSON format
{
"ruleIds": {
"CheckGtinMED": 12
},
"numberOfSuccessfulItems": 5,
"numberOfFailedItems": 0,
"items": [
{
"entityItem": {
"id": "15@1"
},
"status": "SUCCESSFUL",
"failedRuleIds": [],
"successfulRuleIds": [
12
]
},
{
"entityItem": {
"id": "137@1"
},
"status": "SUCCESSFUL",
"failedRuleIds": [],
"successfulRuleIds": [
12
]
},
{
"entityItem": {
"id": "121@1"
},
"status": "SUCCESSFUL",
"failedRuleIds": [],
"successfulRuleIds": [
12
]
},
{
"entityItem": {
"id": "19@1"
},
"status": "SUCCESSFUL",
"failedRuleIds": [],
"successfulRuleIds": [
12
]
},
{
"entityItem": {
"id": "149@1"
},
"status": "SUCCESSFUL",
"failedRuleIds": [],
"successfulRuleIds": [
12
]
}
],
"protocol": {
"infoCounter": 3,
"warningCounter": 0,
"errorCounter": 0,
"entries": [
{
"severity": "INFO",
"category": "SUMMARY",
"message": "1 Regel wird auf 5 Objekte des Typs 'Artikel' angewendet",
"logDate": "2016-01-04",
"logTime": "14:52:00"
},
{
"severity": "INFO",
"category": "SUMMARY",
"message": "Ausgeführte Regeln: CheckGtinMED",
"logDate": "2016-01-04",
"logTime": "14:52:00"
},
{
"severity": "INFO",
"category": "SUMMARY",
"message": "Verarbeitung der Regeln beendet.",
"logDate": "2016-01-04",
"logTime": "14:52:00"
}
]
}
}
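For clients that do not use the Java REST client, the synchronous call can be issued with any HTTP library. The sketch below builds the POST request with the JDK's java.net.http API (Java 11 or later). The class name SyncExecutionRequest and the 60 second timeout are assumptions made for illustration, and authentication is left out because this page does not describe it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper; builds the request but does not send it.
final class SyncExecutionRequest {

    /** Builds the POST request for a synchronous rule execution. */
    static HttpRequest build(String baseUrl, String profileJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/manage/dataquality/executions"))
                // Assumed value: the call blocks until all rules have run,
                // so pick a timeout that fits the size of the data set.
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(profileJson))
                .build();
    }

    public static void main(String[] args) {
        // DataQualityProfile payload, reduced to a single rule for brevity.
        String profileJson = "{"
                + "\"rules\":[\"item_description_rule1\"],"
                + "\"entityIdentifier\":\"Article\","
                + "\"reportQuery\":{\"identifier\":\"byCatalog\","
                + "\"parameterList\":[{\"key\":\"Catalog\",\"value\":\"'TOOLS'\"}]}"
                + "}";
        HttpRequest request = build("http://localhost:1501/rest/V1.0", profileJson);
        System.out.println(request.method() + " " + request.uri());
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers and body before the call is made.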

Schedule Rule Execution

Schedules a job which executes all specified rules for a given set of items. The rules to be executed can be defined by a list of single rule configurations, rule configuration groups, channels, or any combination of these. Note that this method replaces all older methods, which are now deprecated (see below).

URL Pattern

/manage/dataquality/jobs

Method

POST

Content types

application/json, application/xml

Media types

application/json, application/xml

Result

The job object of the scheduled data quality job

Query Parameters

| Parameter | Required | Default | Datatype | Parameter description |
|---|---|---|---|---|
| workflowServiceEndpoint | no | | String | Informatica BPM callback parameter. Defines the name of the service endpoint, which must be available in an attached Informatica BPM instance. |
| workflowCorrelationId | no | | String | Informatica BPM callback parameter. An arbitrary ID which the Informatica BPM workflow uses to identify the correct workflow process. |
| workflowCommunicationMode | no | REST | REST/QUEUE | Informatica BPM callback parameter. Defines the communication mode: either REST communication or a JMS message queue. |
| workflowQueueId | no | First trigger queue id in server.properties | String | Informatica BPM callback parameter. A queue ID defined in the message queue section of the server properties, which is used as response queue. |
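The callback parameters are passed as ordinary query parameters on the jobs URL, so their values should be URL-encoded. The following sketch assembles such a URL with standard JDK classes only. JobUrlBuilder is a hypothetical helper, and the endpoint name, correlation ID and queue ID in main are made-up sample values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the product's REST client.
final class JobUrlBuilder {

    /** Appends the given query parameters, URL-encoded, to the jobs resource. */
    static String build(String baseUrl, Map<String, String> queryParams) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> param : queryParams.entrySet()) {
            query.add(URLEncoder.encode(param.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(param.getValue(), StandardCharsets.UTF_8));
        }
        String url = baseUrl + "/manage/dataquality/jobs";
        return query.length() == 0 ? url : url + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("workflowServiceEndpoint", "DataQualityCallback"); // made-up endpoint name
        params.put("workflowCorrelationId", "4711");                  // made-up correlation ID
        params.put("workflowCommunicationMode", "QUEUE");
        params.put("workflowQueueId", "dqResponse");                  // made-up queue ID
        System.out.println(build("http://localhost:1501/rest/V1.0", params));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, which makes the generated URL predictable in logs and tests.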

Content

The content has to be a DataQualityProfile JSON object which is described in the following table

| Field | Required | Default | Datatype | Parameter description |
|---|---|---|---|---|
| rules | no | | String Array | Individual rule names as a string array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| ruleGroups | no | | String Array | Rule group names as a string array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| channels | no | | ENTITY_ITEM | Channels as an entity item array. At least one of rules, ruleGroups or channels must be specified; they may be combined. |
| reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
| entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |

Result

An object reference to the data quality job.

Properties of the returned object

| Field | Data type | Description |
|---|---|---|
| id | Integer | The job ID |

Workflow Callback (since 8.0.03)

If a data quality run is executed with the workflowServiceEndpoint parameter, the job sends a callback request with additional information to the Informatica BPM server when it finishes.

The additional information includes job information as well as report information about the filtered entity objects. Based on the report IDs, the List API can be used to retrieve the items that belong to the different status groups.

Workflow Callback (since 10.0)

The query parameters workflowQueueId and workflowCommunicationMode can be added alongside workflowServiceEndpoint to specify that the response is sent via the message queue. The workflowServiceEndpoint is sent back as the JMS property P360TargetService, which allows BPM to call workflow endpoints. The workflowQueueId parameter specifies a queue configured in server.properties with the syntax "queue.[queueId].name".

Workflow Callback Example

In the following example two items have been processed with two rules (GTINRule, ManufacturerRule).

The report results list reports for the following entry keys:

  • SUCCESSFUL
  • FAILED
  • ManufacturerRule_FAILED
  • GTINRule_SUCCESSFUL
  • GTINRule_FAILED

The first two categories (SUCCESSFUL, FAILED) show how many items failed at least one rule (FAILED) or were successful for all rules (SUCCESSFUL).

The remaining categories are rule specific (ManufacturerRule_FAILED, GTINRule_SUCCESSFUL, GTINRule_FAILED). For example, the GTINRule_FAILED entry shows how many items failed the 'GTINRule' rule specifically.

With the report result ID it is possible to retrieve the affected items via the byReport query of the List API.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobFinished>
<jobId>24</jobId>
<stateIdentifier>finished.info</stateIdentifier>
<stateLabel>Completed</stateLabel>
<reportResults>
<entry key="SUCCESSFUL"/>
<entry key="FAILED">
<reportResult>
<id>30</id>
<dataSource>PCM_MASTER</dataSource>
<type>1</type>
<purpose>1</purpose>
<resultTableName>ReportStoreTempB7</resultTableName>
<count>2</count>
<entityIdentifier>Article</entityIdentifier>
</reportResult>
</entry>
<entry key="UNKNOWN"/>
<entry key="GTINRule_FAILED">
<reportResult>
<id>21</id>
<dataSource>PCM_MASTER</dataSource>
<type>1</type>
<purpose>1</purpose>
<resultTableName>ReportStoreTempB8</resultTableName>
<count>1</count>
<entityIdentifier>Article</entityIdentifier>
</reportResult>
</entry>
<entry key="ManufacturerRule_FAILED">
<reportResult>
<id>30</id>
<dataSource>PCM_MASTER</dataSource>
<type>1</type>
<purpose>1</purpose>
<resultTableName>ReportStoreTempB7</resultTableName>
<count>2</count>
<entityIdentifier>Article</entityIdentifier>
</reportResult>
</entry>
<entry key="GTINRule_SUCCESSFUL">
<reportResult>
<id>32</id>
<dataSource>PCM_MASTER</dataSource>
<type>1</type>
<purpose>1</purpose>
<resultTableName>ReportStoreTempB9</resultTableName>
<count>1</count>
<entityIdentifier>Article</entityIdentifier>
</reportResult>
</entry>
</reportResults>
</jobFinished>
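A receiver of this callback typically needs the report result ID per entry key, so that it can pass the ID on to the byReport query of the List API. The sketch below extracts these IDs with the JDK's DOM parser. JobFinishedParser is a hypothetical helper, and it assumes the callback body has exactly the structure of the example above; entries without a reportResult element are skipped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the product's REST client.
final class JobFinishedParser {

    /** Returns entry key -> report result ID for all entries that contain a reportResult. */
    static Map<String, Integer> reportIdsByKey(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Map<String, Integer> idsByKey = new LinkedHashMap<>();
            NodeList entries = doc.getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                // The <id> element sits inside <reportResult>; empty entries have none.
                NodeList ids = entry.getElementsByTagName("id");
                if (ids.getLength() > 0) {
                    String idText = ids.item(0).getTextContent().trim();
                    idsByKey.put(entry.getAttribute("key"), Integer.valueOf(idText));
                }
            }
            return idsByKey;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable jobFinished document", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the callback example in this document.
        String xml = "<jobFinished><jobId>24</jobId><reportResults>"
                + "<entry key=\"SUCCESSFUL\"/>"
                + "<entry key=\"FAILED\"><reportResult><id>30</id><count>2</count></reportResult></entry>"
                + "<entry key=\"GTINRule_FAILED\"><reportResult><id>21</id><count>1</count></reportResult></entry>"
                + "</reportResults></jobFinished>";
        System.out.println(reportIdsByKey(xml)); // prints {FAILED=30, GTINRule_FAILED=21}
    }
}
```

An entry key that is missing from the returned map, such as SUCCESSFUL here, means that the callback carried no report result for that group.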

Examples

Executing a DQ check for the catalog TOOLS

In the following examples we assume that there are several rule configurations configured in the PIM system.

Java Rest Client Example

Rest Client Java Code
// References to the catalog and the channels, looked up by their identifiers
EntityItemReference toolsCatalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );
EntityItemReference webShopChannel = EntityItemReferenceFactory.createByIdentifier( "WebShop" );
EntityItemReference erpChannel = EntityItemReferenceFactory.createByIdentifier( "ERP" );

// Report query which selects the items of the TOOLS catalog as input data set
ReportQuery reportQuery = new ReportQuery( "byCatalog" );
reportQuery.addParameterValue( "catalog", toolsCatalog );

// Profile which combines individual rules, rule groups and channels
DataQualityProfile profile = new DataQualityProfile();
profile.setReportQuery( reportQuery );
profile.setEntityIdentifier( "Article" );
profile.addRuleConfigurations( "item_description_rule1", "item_description_rule3" );
profile.addRuleConfigurationGroups( "Item Texts", "Item Attributes" );
profile.addChannels( webShopChannel, erpChannel );

// Schedules the job and returns a reference to it instead of waiting for the result
DataQualityRequest request = getRestClient().createDataQualityRequest();
EntityItemReference job = request.scheduleExecution( profile );

JSON Examples

Execute individual rules, rule groups and channels in one job
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "rules": ["item_description_rule1", "item_description_rule3"],
  "ruleGroups": ["Item Texts", "Item Attributes"],
  "channels": ["'WebShop'", "'ERP'"],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      {
        "key": "Catalog",
        "value": "'TOOLS'"
      }
    ]
  }
}
Execute all rules for the rule groups
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "ruleGroups": ["Item Texts", "Item Attributes"],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      {
        "key": "Catalog",
        "value": "'TOOLS'"
      }
    ]
  }
}
Execute all rules which are mapped to the channels
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "channels": ["'WebShop'", "'ERP'"],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      {
        "key": "Catalog",
        "value": "'TOOLS'"
      }
    ]
  }
}

Deprecated

Please use /manage/dataquality/jobs instead of the following old URLs.

Execute a data quality rule configuration group immediately

General Info

URL Pattern

/manage/dataquality/job/{rule-configuration-group-name}

Method

POST

Parameters

-

Content types

application/json

Media types

application/json

Result

The job object of the scheduled data quality job

Content

The content has to be a JSON object which includes the properties listed below. It's called DataQualityProfile.

DataQualityProfile Properties

| Field | Required | Default | Datatype | Parameter description |
|---|---|---|---|---|
| reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
| entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |

Result

An object reference to the data quality job.

Properties of the returned object

| Field | Data type | Description |
|---|---|---|
| id | Integer | The job ID |

Execute a set of data quality rule configurations

General Info

URL Pattern

/manage/dataquality/execution

Method

POST

Parameters

rules=<rule conf name1>,<conf name 2>,...

Content types

application/json

Media types

application/json

Result

The job object of the scheduled data quality job

Content and result are the same as for the configuration group.

Execute channel's set of data quality rule configurations

General Info

URL Pattern

/manage/dataquality/execution

Method

POST

Parameters

channel=<channelId>

Content types

application/json

Media types

application/json

Result

The job object of the scheduled data quality job

Content and result are the same as for the configuration group. The channelId is either an internal ID as it appears in

http://localhost:1501/rest/V1.0/list/Channel/bySearch?query=%20not%20(Channel.Identifier%20is%20empty)&fields=ChannelLang.Name(en)

or an external ID in single quotes.

Examples

Executing a DQ check on the item descriptions of the catalog TOOLS

In the following example we assume that a rule configuration group called "item_descriptions" is configured within the PIM system. To run this rule configuration group immediately against the catalog "TOOLS", execute the following code snippet.

Rest Client Java Code
// Reference to the catalog, looked up by its identifier
EntityItemReference catalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );

// Report query which selects the items of the TOOLS catalog as input data set
ReportQuery reportQuery = new ReportQuery( "byCatalog" );
reportQuery.addParameterValue( "catalog", catalog );

DataQualityProfile dataQualityProfile = new DataQualityProfile();
dataQualityProfile.setReportQuery( reportQuery );
dataQualityProfile.setEntityIdentifier( "Article" );

// Schedules the rule configuration group and returns a reference to the job
EntityItemReference result = getRestClient().createDataQualityRequest().scheduleRuleConfigurationGroup( "item_descriptions",
dataQualityProfile );

To run the individual rules item_description_rule1 and item_description_rule3 on the same catalog with JSON, send the following request:

JSON example
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/execution?rules=item_description_rule1,item_description_rule3
{
"reportQuery":{
"identifier":"byCatalog",
"parameterList":[{"key":"Catalog","value":"'TOOLS'"}]
},
"entityIdentifier":"Article"
}