Web Search Configuration How to

Short descriptions of how to accomplish common Web Search Configuration tasks.

Tip

Use the sample export templates as the starting point for defining full-text search indices. If you run into issues, see the PIM - Web Search Troubleshooting page.

How to perform delta indexing

You can schedule the Full-text search profiles to run at regular intervals.

E.g.

images/download/attachments/342818863/image2020-2-25_11-38-40.png

How to add new items to the search index

If you update the search index by delta indexing, new items will be added automatically.

How to deal with deleted supplier catalogs?

If you want to delete a supplier catalog that is part of an existing multi-catalog search index, you need to rebuild the whole index. However, if that search index contains all items from all supplier catalogs and you use the "Use all supplier catalogs" option in the corresponding export template, all items of the deleted supplier catalog are deleted from the search index automatically.

images/download/attachments/342818863/image2021-11-16_18-51-58.png

How to add further entity fields

  • Best practice: open the repository and copy and paste the exact entity, sub-entity, and field names.

  • Define a new export format template.

  • Add the correct entity, sub-entity, and field names to the configuration json, together with the required parameters.

  • Add the same entity, sub-entity, and field names to the data modules and sub-modules.

  • Rebuild the index by creating and executing a Full-text search profile in PIM - Desktop.

E.g.

Entity and Sub-entity Fields in Configuration JSON
{
"rootEntities": [{
"identifier": "Article",
"fields": [{
"identifier": "SupplierAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "EAN",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "DeliveryTime",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "ArticleLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": [
"english"
]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": [
"english"
]
},
{
"identifier": "Keyword",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": [
"english"
]
}
]
},
{
"identifier": "ArticlePriceValueSales",
"fields": [{
"identifier": "Amount#1",
"dataType": "double",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["Public", "3", "EUR", "US", "2013-03-27", "1.0"]
},
{
"identifier": "Amount#2",
"dataType": "double",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["Public", "3", "EUR", "DE", "2013-03-27", "1.0"]
}
]
}
]
}]
}
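If you prepare the configuration json with a script before pasting it into the export template, adding a field can be sketched as follows. This is a minimal illustration in Python; the helper name add_field and its defaults are hypothetical and not part of the product.

```python
import json

def add_field(config, entity_id, identifier, data_type="text",
              searchable=True, sortable=True, facetable=False):
    """Append a field definition to the root entity named entity_id."""
    for entity in config["rootEntities"]:
        if entity["identifier"] == entity_id:
            props = {"searchable": searchable, "sortable": sortable}
            if facetable:
                props["facetable"] = True
            entity.setdefault("fields", []).append({
                "identifier": identifier,
                "dataType": data_type,
                "searchProperties": props,
            })
            return config
    raise KeyError(f"root entity {entity_id!r} not found")

# Hypothetical starting point: an Article entity without fields.
config = {"rootEntities": [{"identifier": "Article", "fields": []}]}
add_field(config, "Article", "ManufacturerName", facetable=True)
print(json.dumps(config, indent=2))
```

The same names still have to be added to the data modules and sub-modules, and the index has to be rebuilt afterwards.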

How to make fields sortable, searchable, facetable

Every field (entity field or sub-entity field) has a searchProperties object, which supports the following properties:

  • sortable

  • searchable

  • facetable

E.g.

{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
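To review which fields of a larger configuration have a given property switched on, a small script can walk the entity and sub-entity fields. The sketch below is in Python; the function name fields_with is hypothetical, and it assumes that a missing property means the property is not enabled, which this page does not state.

```python
def fields_with(config, prop):
    """List 'Entity.Field' names whose searchProperties[prop] is true."""
    names = []
    for entity in config.get("rootEntities", []):
        # Root entity fields and sub-entity fields share the same layout.
        for group in [entity] + entity.get("subEntities", []):
            for field in group.get("fields", []):
                if field.get("searchProperties", {}).get(prop) is True:
                    names.append(f'{group["identifier"]}.{field["identifier"]}')
    return names

config = {
    "rootEntities": [{
        "identifier": "Article",
        "fields": [
            {"identifier": "EAN", "dataType": "text",
             "searchProperties": {"searchable": True, "sortable": True}},
            {"identifier": "ManufacturerName", "dataType": "text",
             "searchProperties": {"searchable": True, "sortable": True,
                                  "facetable": True}},
        ],
    }]
}
print(fields_with(config, "facetable"))
```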

How to add facet ordering

Every field (entity field or sub-entity field) has a searchProperties object, which accepts a facetordervalue property to define the facet order.

E.g.

{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true,
"facetordervalue":2
}
}
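To preview the resulting facet order of a configuration, the facetable fields can be sorted by facetordervalue. The Python sketch below assumes that lower values come first and that fields without a facetordervalue come last; neither is stated on this page, so treat both as assumptions. The function name facet_order is hypothetical.

```python
def facet_order(config):
    """Return facetable field identifiers sorted by facetordervalue."""
    facets = []
    for entity in config.get("rootEntities", []):
        for group in [entity] + entity.get("subEntities", []):
            for field in group.get("fields", []):
                props = field.get("searchProperties", {})
                if props.get("facetable"):
                    # Assumption: fields without facetordervalue sort last.
                    order = props.get("facetordervalue", float("inf"))
                    facets.append((order, field["identifier"]))
    # sorted() is stable, so equal values keep their configuration order.
    return [name for _, name in sorted(facets, key=lambda item: item[0])]

config = {"rootEntities": [{
    "identifier": "Article",
    "fields": [
        {"identifier": "CurrentStatus",
         "searchProperties": {"facetable": True, "facetordervalue": 2}},
        {"identifier": "ManufacturerName",
         "searchProperties": {"facetable": True}},
        {"identifier": "DeliveryTime",
         "searchProperties": {"facetable": True, "facetordervalue": 1}},
        {"identifier": "EAN", "searchProperties": {"searchable": True}},
    ],
}]}
print(facet_order(config))
```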

How to add attributes

The configuration json should have a keyValue entry in the subEntities.

E.g.

Attribute Configuration JSON
{
"rootEntities": [
{
"identifier": "Product2G",
"fields": [
{
"identifier": "ProductNo",
"dataType": "keyword",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [
{
"identifier": "Product2GAttribute",
"keyValue": {
"dataType": "keyword",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
}
]
}
]
}
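Hand-edited configuration json easily ends up with a missing comma or bracket between field entries. Before pasting a configuration into the export template, it can be checked for syntax errors with any JSON parser. The following Python sketch uses the standard json module; the function name check_config is hypothetical.

```python
import json

def check_config(text):
    """Return None if text parses as JSON, else a short error location."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None

# Two field objects without a comma between them are rejected.
broken = '{"fields": [{"identifier": "A"} {"identifier": "B"}]}'
print(check_config(broken))
print(check_config('{"fields": [{"identifier": "A"}, {"identifier": "B"}]}'))
```

This only checks the JSON syntax. It does not check whether the identifiers exist in the repository.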

The data sub-module for Attributes should have a Key field and a Value field.

E.g.

images/download/attachments/342818863/image2020-2-25_12-8-53.png

How to configure and index a field for a sub-entity with different qualifications

  • In the config file, add the desired field to the sub-entity once per required qualification, so that each qualification has its own field entry. Add the first entry with the plain field identifier and fill in its qualification. For each further qualification of the same field, reuse the identifier and append #1, #2, and so on. For example, the first entry is "identifier": "Field", the next one is "identifier": "Field#1", and so on.

  • In the data file, add a sub-module for each qualification. Then call all sub-modules from the main module, taking care of the formatting and of the separator between their outputs.

Example:

Adding structure groups for more than one structure system

Config file :

Entity and Sub-entity Fields in Configuration JSON
{
"identifier": "Product2GStructureMap",
"fields": [{
"identifier": "ManualMap",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["'Structure System 1'"]
},
{
"identifier": "ManualMap#1",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["'Structure System 2'"]
}]
}
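When a field needs many qualifications, the repeated entries can be generated instead of written by hand. The Python sketch below follows the naming scheme described above (plain identifier first, then #1, #2, ...). The function name qualified_fields and its parameters are hypothetical.

```python
import json

def qualified_fields(identifier, qualification_sets, data_type="text",
                     search_properties=None):
    """Build one field entry per qualification set: Field, Field#1, ..."""
    props = search_properties or {"searchable": True, "sortable": True}
    fields = []
    for n, qualifications in enumerate(qualification_sets):
        name = identifier if n == 0 else f"{identifier}#{n}"
        fields.append({
            "identifier": name,
            "dataType": data_type,
            "searchProperties": dict(props),  # copy, so entries stay independent
            "qualifications": list(qualifications),
        })
    return fields

fields = qualified_fields(
    "ManualMap",
    [["'Structure System 1'"], ["'Structure System 2'"]],
    search_properties={"searchable": True, "sortable": True, "facetable": True},
)
print(json.dumps({"identifier": "Product2GStructureMap", "fields": fields}, indent=2))
```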

Data file :

Main module

Entity and Sub-entity Fields in Configuration JSON
{?JSONObjectElement "Product2GStructureMap",
{?JSONArray {$Product2GStructureMap1}{?IfNotEmptyThen {$Product2GStructureMap1}, {?IfNotEmptyThen {$Product2GStructureMap2}, ","}}{$Product2GStructureMap2}}
}

Sub module

Product2GStructureMap1

Entity and Sub-entity Fields in Configuration JSON
{?Compare {?LoopCounter}, 0 ,"", "," }{!
}{?JSONObject
{?JSONStringElement "ManualMap",{&Structure assignments.Structure groups.Name (English)}}
}

Product2GStructureMap2

Entity and Sub-entity Fields in Configuration JSON
{?Compare {?LoopCounter}, 0 ,"", "," }{!
}{?JSONObject
{?JSONStringElement "ManualMap#1",{&Structure assignments.Structure groups.Name (English)}}
}
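The separator handling in the main module can be easier to follow in ordinary code. The Python sketch below mirrors our reading of the main module above: the output of the two sub-modules is concatenated, and a comma is inserted only if both outputs are non-empty. That reading of IfNotEmptyThen, and the square brackets written by JSONArray, are assumptions and not confirmed by this page.

```python
import json

def join_submodules(first, second):
    """Join two already-serialized JSON fragments into one JSON array."""
    # A comma is only valid between two non-empty fragments.
    separator = "," if first and second else ""
    return "[" + first + separator + second + "]"

map1 = '{"ManualMap": "Tools"}'
map2 = '{"ManualMap#1": "Hand tools"}'

print(join_submodules(map1, map2))
print(join_submodules(map1, ""))
print(join_submodules("", map2))
```

Without the non-empty check, an item that is mapped in only one structure system would produce a leading or trailing comma and therefore invalid JSON.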

How to configure a multi value field

Prerequisite: the upper-bound property of the field is set to -1 in the repository. This means that the field can hold multiple values.

To activate multi-value fields, use the SplitMultiValuesToCSV function in the modules or sub-modules for that field.

E.g.

images/download/attachments/342818863/Multivaluefield.png

How to configure fields based on different logical key combinations

The configuration json should contain field identifiers with a "#<number>" suffix in the subEntities, one per logical key combination.

E.g.

Configuration JSON for different logical key combinations
{
"rootEntities": [{
"identifier": "Article",
"fields": [{
"identifier": "SupplierAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "EAN",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "ArticlePriceValueSales",
"fields": [{
"identifier": "Amount#1",
"dataType": "double",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["Public", "3", "EUR", "US", "2013-03-27", "1.0"]
},
{
"identifier": "Amount#2",
"dataType": "double",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": ["Public", "3", "EUR", "DE", "2013-03-27", "1.0"]
}
]
}]
}]
}

The same field identifier entry is used in the data modules and sub-modules.

E.g. Field#1 (note that the sub-module concerned is qualified in the same way as in the configuration)

images/download/attachments/342818863/image2020-2-25_13-20-30.png

E.g. Field#2 (note that the sub-module concerned is qualified in the same way as in the configuration)

images/download/attachments/342818863/image2020-2-25_13-21-8.png

How to configure language analyzers for fields

Elasticsearch provides many built-in language analyzers, such as german and english.

These can be configured in the configuration json.

E.g.

Configuration JSON having language analyzer
{
"indexSettings": {
"analysis": {
"analyzer": {
"tri_gram_analyzer": {
"tokenizer": "tri_gram_tokenizer"
}
},
"filter": {
"filter_stop": {
"type": "stop"
}
},
"tokenizer": {
"tri_gram_tokenizer": {
"type": "ngram",
"min_gram": "3",
"max_gram": "4",
"token_chars": [
"letter",
"digit"
]
}
}
}
},
"rootEntities": [{
"identifier": "Article",
"fields": [{
"identifier": "SupplierAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "EAN",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "ArticleLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": [
"7"
],
"analyzers": [{
"name": "german"
},
{
"name": "tri_gram_analyzer",
"dataType": "text",
"boostFactor": "0.5"
}
]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": [
"7"
],
"analyzers": [{
"name": "german"
},
{
"name": "tri_gram_analyzer",
"dataType": "text",
"boostFactor": "0.5"
}
]
},
{
"identifier": "Keyword",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
},
"qualifications": [
"7"
],
"analyzers": [{
"name": "german"
},
{
"name": "tri_gram_analyzer",
"dataType": "text",
"boostFactor": "0.5"
}
]
}
]
}]
}]
}
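To get a feeling for what an ngram tokenizer with "min_gram": 3 and "max_gram": 4 puts into the index, the tokenization can be approximated in a few lines. The Python sketch below is an illustration only and not the Elasticsearch implementation: it splits the text on everything that is not a letter or digit (as "token_chars" requests) and emits every substring of length 3 to 4. The function name ngram_tokens is hypothetical.

```python
import re

def ngram_tokens(text, min_gram=3, max_gram=4):
    """Approximate an ngram tokenizer restricted to letters and digits."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits (word characters minus "_").
    for word in re.findall(r"[^\W_]+", text):
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                if start + size <= len(word):
                    tokens.append(word[start:start + size])
    return tokens

print(ngram_tokens("pump-42"))  # ['pum', 'pump', 'ump']
```

Runs shorter than min_gram, such as "42" here, produce no tokens at all, which is worth keeping in mind for short article numbers.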

How to configure the number of shards, replicas, and other index-related settings

The configuration json has an indexSettings entry, where you can add any Elasticsearch index-level settings.

E.g.

IndexSettings in the Configuration JSON
{
"indexSettings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "1"
},
"analysis": {
"analyzer": {
"bi_gram_analyzer": {
"tokenizer": "bi_gram_tokenizer"
}
},
"tokenizer": {
"bi_gram_tokenizer": {
"type": "ngram",
"min_gram": 2,
"max_gram": 2,
"token_chars": [
"letter",
"digit"
]
}
}
}
},
"rootEntities": [{
"identifier": "Product2G",
"fields": [{
"identifier": "ProductNo",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": false
}
}],
"subEntities": [{
"identifier": "Product2GLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true,
"facetordervalue": 5
},
"qualifications": ["9"],
"analyzers": [{
"name": "english"
}, {
"name": "bi_gram_analyzer"
}]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": false
},
"qualifications": ["9"],
"analyzers": [{
"name": "english"
}, {
"name": "bi_gram_analyzer"
}]
}
]
}]
}]
}
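One index-level setting that matters for custom ngram tokenizers is index.max_ngram_diff, which in Elasticsearch limits the difference between max_gram and min_gram and defaults to 1. The Python sketch below checks a configuration against that limit before the index is built. The function name ngram_diff_problems is hypothetical, and reading the limit from "indexSettings" > "index" > "max_ngram_diff" is an assumption about how the setting would be placed in this configuration.

```python
def ngram_diff_problems(config, default_limit=1):
    """Report ngram tokenizers whose max_gram - min_gram exceeds the limit."""
    settings = config.get("indexSettings", {})
    limit = int(settings.get("index", {}).get("max_ngram_diff", default_limit))
    problems = []
    for name, tok in settings.get("analysis", {}).get("tokenizer", {}).items():
        if tok.get("type") != "ngram":
            continue
        # Elasticsearch defaults: min_gram 1, max_gram 2. Values may be strings.
        diff = int(tok.get("max_gram", 2)) - int(tok.get("min_gram", 1))
        if diff > limit:
            problems.append(f"{name}: difference {diff} exceeds limit {limit}")
    return problems

config = {"indexSettings": {"analysis": {"tokenizer": {
    "bi_gram_tokenizer": {"type": "ngram", "min_gram": 2, "max_gram": 2},
    "wide_tokenizer": {"type": "ngram", "min_gram": "2", "max_gram": "5"},
}}}}
print(ngram_diff_problems(config))
```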

How to add an entity like Variant

Keep in mind that the 3PPD hierarchy has to be considered.

  • Define a new export format template

  • In the configuration json, define the parentEntityIdentifier

Configuration JSON for a 3PPD Hierarchy Search Index
{
"rootEntities": [{
"identifier": "Product2G",
"fields": [{
"identifier": "ProductNo",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "ManufacturerAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "Product2GLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
}
]
}]
},
{
"identifier": "Variant",
"parentEntityIdentifier": "Product2G",
"fields": [{
"identifier": "VariantNo",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "ManufacturerAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "VariantLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
}
]
}]
},
{
"identifier": "Article",
"parentEntityIdentifier": "Variant",
"fields": [{
"identifier": "SupplierAID",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "EAN",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
}
},
{
"identifier": "CurrentStatus",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
},
{
"identifier": "ManufacturerName",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true,
"facetable": true
}
}
],
"subEntities": [{
"identifier": "ArticleLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
},
{
"identifier": "DescriptionLong",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": true
},
"qualifications": ["9"]
}
]
}]
}
]
}
  • In the data modules and sub-modules, two things are very important:

    • Routing

    • RecordJoin

The variants of a Product and the items of those variants must use the same routing key as the parent Product.

  • Rebuild the index by creating and executing a Full-text search profile in PIM - Desktop.
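Because all documents of one hierarchy share the routing key of the top-level parent, it can help to check that every parentEntityIdentifier in the configuration resolves to a defined entity and to see which top-level entity it ends at. The Python sketch below does this for the configuration layout shown above. The function name hierarchy_roots is hypothetical.

```python
def hierarchy_roots(config):
    """Map each entity identifier to the top-level entity of its hierarchy."""
    parents = {entity["identifier"]: entity.get("parentEntityIdentifier")
               for entity in config["rootEntities"]}
    roots = {}
    for name in parents:
        current, seen = name, []
        while parents[current] is not None:
            if current in seen:
                raise ValueError(f"cyclic parent chain at {current!r}")
            seen.append(current)
            parent = parents[current]
            if parent not in parents:
                raise KeyError(f"{current!r} refers to undefined parent {parent!r}")
            current = parent
        roots[name] = current
    return roots

config = {"rootEntities": [
    {"identifier": "Product2G"},
    {"identifier": "Variant", "parentEntityIdentifier": "Product2G"},
    {"identifier": "Article", "parentEntityIdentifier": "Variant"},
]}
print(hierarchy_roots(config))
```

For the 3PPD example, all three entities resolve to Product2G, so variants and items would take their routing key from the parent product.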

How to change the number of search results

Elasticsearch limits the maximum number of search results for an index through the 'max_result_window' setting, which defaults to 10000. We recommend using 100000 for 'max_result_window'. Search requests take heap memory and time proportional to the 'max_result_window' value, so do not raise it more than necessary.

The 'max_result_window' value can be changed in the config section of the export template, under indexSettings.

Example: 'max_result_window' value is set to 500000:

{
"indexSettings": {
"index": {
"number_of_shards": "4",
"max_result_window" : "500000"
},
"analysis": {
...
}
...
}
}
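In Elasticsearch, a paged search request is rejected when its offset plus its page size exceeds max_result_window. A client can check this before sending the request. The Python sketch below shows the arithmetic; the function name fits_result_window and the zero-based page numbering are assumptions made for the illustration.

```python
def fits_result_window(page, page_size, max_result_window=10000):
    """True if the zero-based page can be fetched with from/size paging."""
    offset = page * page_size  # the "from" value of the search request
    return offset + page_size <= max_result_window

print(fits_result_window(99, 100))           # True: 9900 + 100 = 10000
print(fits_result_window(100, 100))          # False: 10000 + 100 > 10000
print(fits_result_window(100, 100, 100000))  # True with a larger window
```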

How to configure and index a field for a sub-entity with an Enumeration in the qualifications

You can use export functions to retrieve Enumeration values and assign them to the field properties.

Example: To assign the ISO code of a language, use the IsoCodeLanguage function. '?IsoCodeLanguage Portuguese (Brazil)' returns 'pbr'.

Config file :

Sub-entity Fields in Configuration JSON
{
"identifier": "ArticleLang",
"fields": [{
"identifier": "DescriptionShort",
"dataType": "text",
"searchProperties": {
"searchable": true,
"sortable": false,
"facetable": false
},
"qualifications": ["{?IsoCodeLanguage {%language}}"],
"analyzers": [{
"name": "ngram_analyzer",
"dataType": "text",
"boostFactor": "0.5"
}]
}