REST Data Quality API
Rule Execution (since 8.0.03)
Executes all specified rules for a given set of items. The rules to execute can be defined by a list of individual rule configurations, rule configuration groups, channels, or any combination of these. Note that this is a synchronous call: no job is triggered, and the call returns as soon as the execution has finished. This method is not recommended for large data sets, because HTTP connection issues may occur if the call takes too long. For small data sets, however, it is the recommended method: it avoids the overhead of the job framework and is easier to handle, especially in workflow situations.
URL Pattern | /manage/dataquality/executions |
Method | POST |
Content types | application/json, application/xml |
Media types | application/json, application/xml |
Result | A rule execution result object containing the protocol of the execution as well as the detailed status information for each item and rule |
Content
The content has to be a DataQualityProfile JSON object, which is described in the following table.
Field | Required | Default | Datatype | Parameter description |
rules | no | | String Array | Optional parameter which specifies individual rule names as a string array. At least one of rules, ruleGroups or channels must be specified. |
ruleGroups | no | | String Array | Optional parameter which specifies rule group names as a string array. At least one of rules, ruleGroups or channels must be specified. |
channels | no | | ENTITY_ITEM | Optional parameter which specifies channels as an entity item array. At least one of rules, ruleGroups or channels must be specified. |
reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |
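The constraints in the table above can be checked on the client before the request is sent. The following is a minimal sketch, assuming the profile has already been assembled as a plain map that mirrors the JSON structure. The class `ProfileCheck` and its method are hypothetical helpers for illustration and are not part of the PIM REST client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check; not part of the PIM REST client. */
final class ProfileCheck {

    /** Returns human-readable problems; an empty list means the profile looks complete. */
    static List<String> problems(Map<String, Object> profile) {
        List<String> problems = new ArrayList<>();

        // reportQuery and entityIdentifier are marked as required in the table.
        for (String required : List.of("reportQuery", "entityIdentifier")) {
            if (profile.get(required) == null) {
                problems.add("missing required field: " + required);
            }
        }

        // At least one of the three rule selectors must be present and non-empty.
        boolean hasSelector = false;
        for (String selector : List.of("rules", "ruleGroups", "channels")) {
            Object value = profile.get(selector);
            if (value instanceof List && !((List<?>) value).isEmpty()) {
                hasSelector = true;
            }
        }
        if (!hasSelector) {
            problems.add("specify at least one of rules, ruleGroups, channels");
        }
        return problems;
    }
}
```

Calling `ProfileCheck.problems(profile)` before the POST gives an early, local error message instead of a failed server round trip. How the server itself reports an incomplete profile is not covered here.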
Result
A data quality result object which contains all relevant information about the execution for each item.
Properties of the returned object |
Field | Data type | Description |
ruleIds | Map | A map of rule names to rule IDs. The IDs are only valid within this result object and are needed to reference the results of each item |
numberOfSuccessfulItems | Integer | The number of items which completed all rules successfully |
numberOfFailedItems | Integer | The number of items which failed for at least one rule |
items | | An array of item objects which provide the results for all executed rules per item |
Fields of each item object |
entityItem | ENTITY_ITEM | The reference to the entity item for which the rules have been executed |
status | String | The overall status for all rules. Might be SUCCESSFUL, FAILED or UNKNOWN. |
failedRuleIds | Array of Integer | The IDs of all rules which have failed for this item (see ruleIds above) |
successfulRuleIds | Array of Integer | The IDs of all rules which have succeeded for this item (see ruleIds above) |
protocol | | The protocol (also known as problem log) of the execution. |
Fields of the protocol object |
infoCounter | Integer | Number of protocol entries with the INFO severity |
warningCounter | Integer | Number of protocol entries with the WARNING severity |
errorCounter | Integer | Number of protocol entries with the ERROR severity |
entries | Array of protocol entries | |
Fields of each protocol entry |
severity | String | The severity of the protocol entry. Might be INFO, WARNING or ERROR |
category | String | The category of the protocol entry |
message | String | The message of the protocol entry |
logDate | Date | The date when the protocol entry has been created |
logTime | Time | The time when the protocol entry has been created |
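Because failedRuleIds and successfulRuleIds carry only numeric IDs, a client usually has to invert the ruleIds map to get back to rule names. The following sketch shows one way to do that, assuming the result has already been parsed into standard Java collections. The class `ResultReader` is a hypothetical helper and not part of the PIM REST client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for reading a parsed result; not part of the PIM REST client. */
final class ResultReader {

    /** Inverts ruleIds (rule name to ID) so that per-item IDs can be resolved to names. */
    static Map<Integer, String> namesById(Map<String, Integer> ruleIds) {
        Map<Integer, String> byId = new HashMap<>();
        ruleIds.forEach((name, id) -> byId.put(id, name));
        return byId;
    }

    /** Resolves the failedRuleIds of one item to rule names. */
    static List<String> failedRuleNames(Map<String, Integer> ruleIds, List<Integer> failedRuleIds) {
        Map<Integer, String> byId = namesById(ruleIds);
        List<String> names = new ArrayList<>();
        for (Integer id : failedRuleIds) {
            // IDs are only valid within one result object, so an unknown ID is flagged, not guessed.
            names.add(byId.getOrDefault(id, "unknown rule id " + id));
        }
        return names;
    }
}
```

The lookup has to be rebuilt for every result object, since the IDs are not stable across calls.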
Examples
In the following examples we assume that there are several rule configurations configured in the PIM system.
Java Rest Client Example
// References to the catalog and the channels, resolved by their identifiers
EntityItemReference toolsCatalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );
EntityItemReference webShopChannel = EntityItemReferenceFactory.createByIdentifier( "WebShop" );
EntityItemReference erpChannel = EntityItemReferenceFactory.createByIdentifier( "ERP" );

// The report query defines the input data set: all items of the catalog TOOLS
ReportQuery reportQuery = new ReportQuery( "byCatalog" ); //$NON-NLS-1$
reportQuery.addParameterValue( "catalog", toolsCatalog );

// The profile combines single rules, rule groups and channels
DataQualityProfile profile = new DataQualityProfile();
profile.setReportQuery( reportQuery );
profile.setEntityIdentifier( "Article" );
profile.addRuleConfigurations( "item_description_rule1", "item_description_rule3" );
profile.addRuleConfigurationGroups( "Item Texts", "Item Attributes" );
profile.addChannels( webShopChannel, erpChannel );

// Synchronous execution: the call returns the result directly
DataQualityRequest request = getRestClient().createDataQualityRequest();
DataQualityResult result = request.execute( profile );
JSON Examples
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/executions
{
  "rules": [ "item_description_rule1", "item_description_rule3" ],
  "ruleGroups": [ "Item Texts", "Item Attributes" ],
  "channels": [ "'WebShop'", "'ERP'" ],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      { "key": "Catalog", "value": "'TOOLS'" }
    ]
  }
}
{
  "ruleIds": { "CheckGtinMED": 12 },
  "numberOfSuccessfulItems": 5,
  "numberOfFailedItems": 0,
  "items": [
    { "entityItem": { "id": "15@1" }, "status": "SUCCESSFUL", "failedRuleIds": [], "successfulRuleIds": [ 12 ] },
    { "entityItem": { "id": "137@1" }, "status": "SUCCESSFUL", "failedRuleIds": [], "successfulRuleIds": [ 12 ] },
    { "entityItem": { "id": "121@1" }, "status": "SUCCESSFUL", "failedRuleIds": [], "successfulRuleIds": [ 12 ] },
    { "entityItem": { "id": "19@1" }, "status": "SUCCESSFUL", "failedRuleIds": [], "successfulRuleIds": [ 12 ] },
    { "entityItem": { "id": "149@1" }, "status": "SUCCESSFUL", "failedRuleIds": [], "successfulRuleIds": [ 12 ] }
  ],
  "protocol": {
    "infoCounter": 3,
    "warningCounter": 0,
    "errorCounter": 0,
    "entries": [
      {
        "severity": "INFO",
        "category": "SUMMARY",
        "message": "1 rule is applied to 5 objects of type 'Article'",
        "logDate": "2016-01-04",
        "logTime": "14:52:00"
      },
      {
        "severity": "INFO",
        "category": "SUMMARY",
        "message": "Executed rules: CheckGtinMED",
        "logDate": "2016-01-04",
        "logTime": "14:52:00"
      },
      {
        "severity": "INFO",
        "category": "SUMMARY",
        "message": "Processing of the rules finished.",
        "logDate": "2016-01-04",
        "logTime": "14:52:00"
      }
    ]
  }
}
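For clients that do not use the Java REST client, the same request can be sent with the JDK's built-in HTTP client (Java 11 or later). The following is a minimal sketch. The class name `ExecutionRequestSketch` and the 60 second timeout are assumptions chosen for illustration, and authentication is left out because it depends on the server setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical sketch of the synchronous execution call using java.net.http. */
final class ExecutionRequestSketch {

    /** Builds the POST request; baseUrl is e.g. "http://localhost:1501/rest/V1.0". */
    static HttpRequest build(String baseUrl, String profileJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/manage/dataquality/executions"))
                // The call is synchronous, so an explicit timeout (assumed value) bounds the wait.
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(profileJson))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Since the server only answers once all rules have run, the explicit timeout is the main design choice here: for data sets that may exceed it, scheduling a job is the safer route.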
Schedule Rule Execution
Schedules a job which executes all specified rules for a given set of items. The rules to execute can be defined by a list of individual rule configurations, rule configuration groups, channels, or any combination of these. Note that this method replaces all other methods, which are now deprecated (see below).
URL Pattern | /manage/dataquality/jobs |
Method | POST |
Content types | application/json, application/xml |
Media types | application/json, application/xml |
Result | The job object of the scheduled data quality job |
Query Parameters
Parameter | Required | Default | Datatype | Parameter description |
workflowServiceEndpoint | no | | String | Informatica BPM callback parameter. Defines the name of the service endpoint, which must be available in an attached Informatica BPM instance. |
workflowCorrelationId | no | | String | Informatica BPM callback parameter. An arbitrary ID which is used by the Informatica BPM workflow to identify the correct workflow process. |
workflowCommunicationMode | no | REST | REST/QUEUE | Informatica BPM callback parameter. Defines the communication mode, which can be either a JMS message queue (QUEUE) or REST communication (REST). |
workflowQueueId | no | First trigger queue id in server.properties | String | Informatica BPM callback parameter. A queue ID defined in the message queue section of the server properties, which is used as the response queue. |
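The callback parameters are passed as query parameters on the jobs URL, so their values need URL encoding. The following is a minimal sketch of building such a URL. The class `JobUrlBuilder` is a hypothetical helper, and any endpoint name or correlation ID passed to it is an example value, not one defined by the product.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that appends workflow callback parameters to the jobs URL. */
final class JobUrlBuilder {

    /** baseUrl is e.g. "http://localhost:1501/rest/V1.0"; queryParams may be empty. */
    static String jobsUrl(String baseUrl, Map<String, String> queryParams) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> param : queryParams.entrySet()) {
            // Encode names and values so that spaces and reserved characters survive transport.
            query.add(URLEncoder.encode(param.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(param.getValue(), StandardCharsets.UTF_8));
        }
        String url = baseUrl + "/manage/dataquality/jobs";
        return query.length() == 0 ? url : url + "?" + query;
    }
}
```

Passing an insertion-ordered map such as `LinkedHashMap` keeps the parameters in a predictable order, which makes the resulting URL easier to log and compare.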
Content
The content has to be a DataQualityProfile JSON object, which is described in the following table.
Field | Required | Default | Datatype | Parameter description |
rules | no | | String Array | Optional parameter which specifies individual rule names as a string array. At least one of rules, ruleGroups or channels must be specified. |
ruleGroups | no | | String Array | Optional parameter which specifies rule group names as a string array. At least one of rules, ruleGroups or channels must be specified. |
channels | no | | ENTITY_ITEM | Optional parameter which specifies channels as an entity item array. At least one of rules, ruleGroups or channels must be specified. |
reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |
Result
An object reference to the data quality job.
Properties of the returned object |
Field | Data type | Description |
id | Integer | The job ID |
Workflow Callback (since 8.0.03)
If a data quality run is executed with the workflowServiceEndpoint parameter, the job sends a callback request with additional information to the Informatica BPM server when it finishes.
The additional information includes job information as well as report information about the filtered entity objects. Based on the report IDs, the List API can be used to retrieve the items belonging to the different status groups.
Workflow Callback (since 10.0)
The query parameters workflowServiceEndpoint, workflowQueueId and workflowCommunicationMode can be added to specify that the response is sent via the message queue. The workflowServiceEndpoint value is sent back as the JMS property P360TargetService, which allows BPM to call workflow endpoints. The workflowQueueId parameter specifies a queue configured in server.properties using the syntax "queue.[queueId].name".
Workflow Callback Example
In the following example, two items have been processed with two rules (GTINRule, ManufacturerRule).
The report results list reports for the following entry keys:
SUCCESSFUL
FAILED
ManufacturerRule_FAILED
GTINRule_SUCCESSFUL
GTINRule_FAILED
The first two categories (SUCCESSFUL, FAILED) show how many items failed at least one rule (FAILED) or were successful for all rules (SUCCESSFUL).
The remaining categories are rule-specific (ManufacturerRule_FAILED, GTINRule_SUCCESSFUL, GTINRule_FAILED). For example, the GTINRule_FAILED entry shows how many items failed the 'GTINRule' rule specifically.
With the report result ID, the affected items can be retrieved via the byReport query of the List API.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobFinished>
  <jobId>24</jobId>
  <stateIdentifier>finished.info</stateIdentifier>
  <stateLabel>Completed</stateLabel>
  <reportResults>
    <entry key="SUCCESSFUL"/>
    <entry key="FAILED">
      <reportResult>
        <id>30</id>
        <dataSource>PCM_MASTER</dataSource>
        <type>1</type>
        <purpose>1</purpose>
        <resultTableName>ReportStoreTempB7</resultTableName>
        <count>2</count>
        <entityIdentifier>Article</entityIdentifier>
      </reportResult>
    </entry>
    <entry key="UNKNOWN"/>
    <entry key="GTINRule_FAILED">
      <reportResult>
        <id>21</id>
        <dataSource>PCM_MASTER</dataSource>
        <type>1</type>
        <purpose>1</purpose>
        <resultTableName>ReportStoreTempB8</resultTableName>
        <count>1</count>
        <entityIdentifier>Article</entityIdentifier>
      </reportResult>
    </entry>
    <entry key="ManufacturerRule_FAILED">
      <reportResult>
        <id>30</id>
        <dataSource>PCM_MASTER</dataSource>
        <type>1</type>
        <purpose>1</purpose>
        <resultTableName>ReportStoreTempB7</resultTableName>
        <count>2</count>
        <entityIdentifier>Article</entityIdentifier>
      </reportResult>
    </entry>
    <entry key="GTINRule_SUCCESSFUL">
      <reportResult>
        <id>32</id>
        <dataSource>PCM_MASTER</dataSource>
        <type>1</type>
        <purpose>1</purpose>
        <resultTableName>ReportStoreTempB9</resultTableName>
        <count>1</count>
        <entityIdentifier>Article</entityIdentifier>
      </reportResult>
    </entry>
  </reportResults>
</jobFinished>
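A receiver of this callback typically needs the report result ID per entry key, so that it can query the affected items afterwards. The following sketch extracts these IDs with the JDK's DOM parser. It assumes the callback body has exactly the structure shown above. The class `CallbackReader` is a hypothetical helper and not part of the product.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the jobFinished callback body. */
final class CallbackReader {

    /** Maps each entry key that carries a reportResult to that report's ID. */
    static Map<String, Integer> reportIdsByKey(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, Integer> ids = new LinkedHashMap<>();
            NodeList entries = doc.getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                NodeList idNodes = entry.getElementsByTagName("id");
                // Empty entries such as <entry key="SUCCESSFUL"/> carry no report and are skipped.
                if (idNodes.getLength() > 0) {
                    ids.put(entry.getAttribute("key"),
                            Integer.parseInt(idNodes.item(0).getTextContent().trim()));
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("callback body is not valid XML", e);
        }
    }
}
```

An entry key that is missing from the returned map means that no report was created for that status group in this run.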
Examples
Executing DQ checks for the catalog TOOLS
In the following examples we assume that there are several rule configurations configured in the PIM system.
Java Rest Client Example
// References to the catalog and the channels, resolved by their identifiers
EntityItemReference toolsCatalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );
EntityItemReference webShopChannel = EntityItemReferenceFactory.createByIdentifier( "WebShop" );
EntityItemReference erpChannel = EntityItemReferenceFactory.createByIdentifier( "ERP" );

// The report query defines the input data set: all items of the catalog TOOLS
ReportQuery reportQuery = new ReportQuery( "byCatalog" ); //$NON-NLS-1$
reportQuery.addParameterValue( "catalog", toolsCatalog );

// The profile combines single rules, rule groups and channels
DataQualityProfile profile = new DataQualityProfile();
profile.setReportQuery( reportQuery );
profile.setEntityIdentifier( "Article" );
profile.addRuleConfigurations( "item_description_rule1", "item_description_rule3" );
profile.addRuleConfigurationGroups( "Item Texts", "Item Attributes" );
profile.addChannels( webShopChannel, erpChannel );

// Schedules a job and returns a reference to it instead of the result
DataQualityRequest request = getRestClient().createDataQualityRequest();
EntityItemReference job = request.scheduleExecution( profile );
JSON Examples
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "rules": [ "item_description_rule1", "item_description_rule3" ],
  "ruleGroups": [ "Item Texts", "Item Attributes" ],
  "channels": [ "'WebShop'", "'ERP'" ],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      { "key": "Catalog", "value": "'TOOLS'" }
    ]
  }
}
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "ruleGroups": [ "Item Texts", "Item Attributes" ],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      { "key": "Catalog", "value": "'TOOLS'" }
    ]
  }
}
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/jobs
{
  "channels": [ "'WebShop'", "'ERP'" ],
  "entityIdentifier": "Article",
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      { "key": "Catalog", "value": "'TOOLS'" }
    ]
  }
}
Deprecated
Please use /manage/dataquality/jobs instead of the following old URLs.
Execute a data quality rule configuration group immediately
General Info
URL Pattern | /manage/dataquality/job/{rule-configuration-group-name} |
Method | POST |
Parameters | - |
Content types | application/json |
Media types | application/json |
Result | The job object of the scheduled data quality job |
Content
The content has to be a JSON object called DataQualityProfile, which includes the properties listed below.
DataQualityProfile Properties |
Field | Required | Default | Datatype | Parameter description |
reportQuery | yes | | ReportQuery | The report query which defines the input data set for the data quality check. |
entityIdentifier | yes | | String | The entity identifier which describes the data type for which the data quality check will be executed. It must correspond to the entity identifier of the rule configuration group. |
Result
An object reference to the data quality job.
Properties of the returned object |
Field | Data type | Description |
id | Integer | The job ID |
Execute a set of data quality rule configurations
General Info
URL Pattern | /manage/dataquality/execution |
Method | POST |
Parameters | rules=<rule conf name1>,<conf name 2>,... |
Content types | application/json |
Media types | application/json |
Result | The job object of the scheduled data quality job |
Content and result are the same as for the configuration group.
Execute channel's set of data quality rule configurations
General Info
URL Pattern | /manage/dataquality/execution |
Method | POST |
Parameters | channel=<channelId> |
Content types | application/json |
Media types | application/json |
Result | The job object of the scheduled data quality job |
Content and result are the same as for the configuration group. The channelId is either an internal ID, as it appears in
http://localhost:1501/rest/V1.0/list/Channel/bySearch?query=%20not%20(Channel.Identifier%20is%20empty)&fields=ChannelLang.Name(en)
or an external ID in single quotes.
Examples
Executing a DQ check on the item descriptions of the catalog TOOLS
In the following example we assume that a rule configuration group called "item_descriptions" is configured within the PIM system. To run this rule configuration group immediately against the catalog "TOOLS", execute the following code snippet.
// The report query defines the input data set: all items of the catalog TOOLS
EntityItemReference catalog = EntityItemReferenceFactory.createByIdentifier( "TOOLS" );
ReportQuery reportQuery = new ReportQuery( "byCatalog" );
reportQuery.addParameterValue( "catalog", catalog );

DataQualityProfile dataQualityProfile = new DataQualityProfile();
dataQualityProfile.setReportQuery( reportQuery );
dataQualityProfile.setEntityIdentifier( "Article" );

// Schedules the rule configuration group and returns a reference to the job
EntityItemReference result = getRestClient().createDataQualityRequest()
    .scheduleRuleConfigurationGroup( "item_descriptions", dataQualityProfile );
To run the individual rules item_description_rule1 and item_description_rule3 on the same catalog with JSON:
//POST to http://localhost:1501/rest/V1.0/manage/dataquality/execution?rules=item_description_rule1,item_description_rule3
{
  "reportQuery": {
    "identifier": "byCatalog",
    "parameterList": [
      { "key": "Catalog", "value": "'TOOLS'" }
    ]
  },
  "entityIdentifier": "Article"
}