Analyze requirements

Introduction

This chapter describes how to analyze your GDSN requirements. First, compare the GDSN attributes you need with "Appendix E: Field List" to get a rough overview of how many new attributes have to be added. Be aware that you might need to adjust existing fields as well. This applies to new GDSN versions as well as to implementing a new module and moving existing parts or fields into that module. As a result of the analysis, you should be able to name the changes that are needed and, in case of data model changes, how the corresponding entity and fields must be designed. Based on this analysis, the chapter "Data model" describes how to add new GDSN modules and GDSN attributes and how to create or adjust GDSN valid value lists.

In order to be able to analyze your customer's GDSN requirements, you need to have a good understanding of the Product 360 repository and the architecture it is based on.

Questions

The first step is to ask the right questions:

  • Which scenario is used?

    • Data source or data recipient?

    • IM or DSE?

  • Which modules are already available?

  • What kind of change has to be implemented?

    • Is it a missing data field of an existing module?

    • Is it a missing module?

    • Is it a missing validation?

    • Is it a missing value of a valid value list?

    • ...

You can find all supported GDSN attributes by the GDSN Accelerator in the chapter "GDSN Accelerator field list".
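The comparison between the attributes you need and the attributes that are already supported boils down to a set difference. The following Python snippet is only a minimal sketch of that idea. The content of `supported` is a placeholder and not the real field list, and `missing_attributes` is a hypothetical helper name.

```python
# Attributes the customer needs, taken from the GDSN documentation.
required = {
    "organismCode",
    "organismMaximumValue",
    "organismReferenceValue",
    "organismWarningValue",
}

# Attributes that are already supported.
# Placeholder content: in practice, copy the names from the field list chapter.
supported = {"grossWeight", "netWeight"}


def missing_attributes(required_names, supported_names):
    """Return the required attributes that are not supported yet, sorted by name."""
    return sorted(set(required_names) - set(supported_names))


missing = missing_attributes(required, supported)
print(f"{len(missing)} attribute(s) to add: {missing}")
```

The resulting list is a good starting point for the questions above, because every entry is either a missing field of an existing module or part of a missing module.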

Resources

It is essential to have all necessary documents available. These are the GDSN specification documents you can get from the corresponding GS1 home page.

There are several PDF and Excel files describing fields, data types and validations. In addition, there are XSD files you should use to get detailed information about the structure of the GDSN message files, especially of the files containing item data.

In the following examples we use these files:

  • XML Schemas

  • IM_Participant_Dictionary_R7.1.0_v1 (especially tabs IM Participant dictionary and IM Valid Values)

  • GDSN module PDF files

  • Data_Source_1WS_XML_Guide_IM7.0v6.pdf called "Data Source XML Guide" in the rest of the document

  • IM_Validations_Document_R7.1.0_v1.xlsx

Data model - analyze the module

So let's work together on a first "module". The customer has told us that they need to store microbiological information and want to make it available using GDSN IM. Where do we start?

Collect information

The customer is a manufacturer and uses the data source scenario, so let's have a look at "1WorldSync Item Management - Data Source 1WS XML Guide".

When we search for "Microbiological" we first find this line:

1WS Catalogue Request Attribute | 1WS XML Structure Type | 1WS Item Structure Type | Module | FLX
foodAndBevMicrobiological | AGM | O | FoodAndBeveragePropertiesInformation | Y

What does it mean?

1WS Catalogue Request Attribute: This means that a structure element with the name foodAndBevMicrobiological exists.

1WS XML Structure Type: The structure type is "AGM", which stands for "attribute group many". It indicates that the element contains a group of attributes and can occur multiple times in the XML of the Catalogue Request file that is sent to the 1WS pool. The structure type "AGM" goes together with the information "FLX: Y", which means that the attribute is a flex attribute. Flex attributes are a generic way to include attributes in the XML structure.

This is an example of what a flex attribute looks like in the XML of the Catalogue Request file:

Flex attribute
<flex>
  . . .
  <attrGroupMany name="shipFromPartyInformation">
    <row>
      <attr name="glnOfShipFromParty">6701115112308</attr>
      <attrQual name="nameOfShipFromParty" qual="USD">GLN Name</attrQual>
    </row>
    <row>
      <attr name="glnOfShipFromParty">6701115112308</attr>
      <attrQual name="nameOfShipFromParty" qual="USD">GLN Name</attrQual>
    </row>
  </attrGroupMany>
  . . .
</flex>
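To get a feeling for the flex structure, it can help to read such a fragment programmatically. The following Python sketch uses only the standard library. It assumes well-formed markup in which rows are closed with `</row>` and qualified attributes are written as `attrQual` elements; the helper name `read_group_rows` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the flex fragment shown above (assumption, see lead-in).
FLEX_XML = """
<flex>
  <attrGroupMany name="shipFromPartyInformation">
    <row>
      <attr name="glnOfShipFromParty">6701115112308</attr>
      <attrQual name="nameOfShipFromParty" qual="USD">GLN Name</attrQual>
    </row>
    <row>
      <attr name="glnOfShipFromParty">6701115112308</attr>
      <attrQual name="nameOfShipFromParty" qual="USD">GLN Name</attrQual>
    </row>
  </attrGroupMany>
</flex>
"""


def read_group_rows(flex_xml, group_name):
    """Return one dict per <row> of the attrGroupMany element with the given name."""
    root = ET.fromstring(flex_xml)
    rows = []
    for group in root.iter("attrGroupMany"):
        if group.get("name") != group_name:
            continue
        for row in group.findall("row"):
            values = {}
            for element in row:
                if element.tag == "attrQual":
                    # Qualified attributes become a (value, qualifier) pair.
                    values[element.get("name")] = (element.text, element.get("qual"))
                else:
                    # Plain attributes map directly to their text content.
                    values[element.get("name")] = element.text
            rows.append(values)
    return rows


rows = read_group_rows(FLEX_XML, "shipFromPartyInformation")
for row in rows:
    print(row)
```

Each row of an "attribute group many" ends up as one dictionary, which already hints at the entity with multiple entries that is designed later in this chapter.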

Here's a list of other structure types, just to give you an idea:

  • A - attribute

  • AM - attribute many

  • AQ - attribute qualified

  • AGM - attribute group many

For further information on flex attributes and structure types see explanations and examples in "1WorldSync Item Management - Data Source 1WS XML Guide".

1WS Item Structure Type: The "O" tells us that the attribute group is optional.

Module: The "Module" column says "FoodAndBeveragePropertiesInformation", so there is no separate module for microbiological information. We should look at what else is contained in the "FoodAndBeveragePropertiesInformation" module in order to decide how to design our entities. We note that down and move on for now.

Next we find four lines that seem to be fields:

1WS Catalogue Request Attribute | 1WS XML Structure Type | 1WS Item Structure Type | 1WS Item Data Type | 1WS Item Data Length (Min) | 1WS Item Data Length (Max) | Qualifier | Module | FLX
foodAndBevMicrobiological/organismCode | A | O | string | 1 | 80 | - | FoodAndBeveragePropertiesInformation | Y
foodAndBevMicrobiological/organismMaximumValue | AQ | O | ufloat | 15 | 15 | uom | FoodAndBeveragePropertiesInformation | Y
foodAndBevMicrobiological/organismReferenceValue | AQ | O | ufloat | 33 | 2 | uom | FoodAndBeveragePropertiesInformation | Y
foodAndBevMicrobiological/organismWarningValue | AQ | O | ufloat | 33 | 2 | uom | FoodAndBeveragePropertiesInformation | Y

What information do we get here?

In the group "foodAndBevMicrobiological" there are four fields:

  1. The attribute "organismCode", which is a simple, optional string attribute with a max. length of 80.

  2. The qualified attribute "organismMaximumValue", which is a decimal value with a min. length of 15 and a max. length of 15. The qualifier "uom" indicates that we need another field to store the value of the qualifier. In some cases, for example for the language, the qualifier can be a logical key.

  3. and 4. The qualified fields "organismReferenceValue" and "organismWarningValue", both qualified with a unit, both decimal, both optional, both with a min. length of 33 and a max. length of 2. Well, these lengths can't be correct. We should check that later in the Participant dictionary. Write that down and go on.

"Uom" stands for "unit of measure". We use the terms "uom" and "unit" interchangeably.

Measurement values always consist of a pair of value and uom.
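The rule that every qualified attribute needs a companion field for its qualifier can be written down as a small Python sketch. `AttributeSpec` and `required_fields` are hypothetical names, and appending the upper-case qualifier to the attribute name is only an assumed naming convention for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AttributeSpec:
    """One row of the XML guide table, reduced to the columns used here."""
    name: str
    structure_type: str  # "A", "AQ", ...
    data_type: str
    qualifier: Optional[str] = None


# The four fields of the group "foodAndBevMicrobiological".
SPECS = [
    AttributeSpec("organismCode", "A", "string"),
    AttributeSpec("organismMaximumValue", "AQ", "ufloat", "uom"),
    AttributeSpec("organismReferenceValue", "AQ", "ufloat", "uom"),
    AttributeSpec("organismWarningValue", "AQ", "ufloat", "uom"),
]


def required_fields(specs: List[AttributeSpec]) -> List[str]:
    """List the fields to model: one per attribute plus one per qualifier."""
    fields = []
    for spec in specs:
        fields.append(spec.name)
        if spec.qualifier is not None:
            # Assumed convention: attribute name + qualifier in upper case.
            fields.append(spec.name + spec.qualifier.upper())
    return fields


for field_name in required_fields(SPECS):
    print(field_name)
```

For the four attributes above this yields seven fields: the organism code plus a value and a unit field for each of the three measurement values.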

If we keep searching we find these lines:

1WS Catalogue Request Attribute | 1WS XML Structure Type | 1WS Item Structure Type | 1WS Item Data Type | 1WS Item Data Length (Min) | 1WS Item Data Length (Max) | Qualifier | Module | FLX
componentInformation/foodAndBeveragePropertiesInformation/foodAndBevMicrobiological/organismCode | A | O | string | 1 | 80 | - | ComponentInformation | Y
componentInformation/foodAndBeveragePropertiesInformation/foodAndBevMicrobiological/organismMaximumValue | AQ | O | ufloat | 15 | 15 | uom | ComponentInformation | Y
componentInformation/foodAndBeveragePropertiesInformation/foodAndBevMicrobiological/organismReferenceValue | AQ | O | ufloat | 15 | 15 | uom | ComponentInformation | Y
componentInformation/foodAndBeveragePropertiesInformation/foodAndBevMicrobiological/organismWarningValue | AQ | O | ufloat | 15 | 15 | uom | ComponentInformation | Y

If you compare this set of fields with the ones we found before you will notice that the attribute names are the same but the paths are different. The module is different, too. It's "ComponentInformation".

Components are a MjR3 feature that we don't support at the moment, so ignore these fields. If you want to know more about components have a look at the GDSN homepage (http://www.gs1.org/gdsn).

Get a better idea of the structure

Since the fields we found are flex attributes, we won't see much of them in the IM XSDs. Sometimes it's hard to imagine the complete structure of a module based only on the textual information given in the table. But the GDSN XSDs can help us get a better idea of what the structure might look like.

images/download/attachments/109982758/DSE_XSD_FABMI_1.PNG images/download/attachments/109982758/DSE_XSD_FABMI_2.PNG images/download/attachments/109982758/DSE_XSD_FABMI_3.PNG

If we look at the "FoodAndBeverageInformationType", marked in green, we see that the occurrence is "0..*", so we can have multiple entries for "FoodAndBeverageMicrobiologicalInformation". This means we have to find a logical key for our data model. We have the organismCode. Since the rest are measurement values, it seems to make sense to have one set of values for each organism. So organismCode is a good candidate for a logical key.

In GDSN there is only the "organismMaximumValue". We also found that field in the "1WorldSync Item Management - Data Source 1WS XML Guide" but additionally there were "organismReferenceValue" and "organismWarningValue". In the XSD we can see that there is an additional "unitOfMeasure" belonging to the "organismMaximumValue". This is no surprise, we already saw that the fields are qualified with a unit and knew that we needed to store this information somewhere.

So let's recap:

We have the "foodAndBevMicrobiological" group, which is an attribute group many. This sounds like an entity, doesn't it?

We have identified the fields

  • organismCode

  • organismMaximumValue

  • organismMaximumValueUOM

  • organismReferenceValue

  • organismReferenceValueUOM

  • organismWarningValue

  • organismWarningValueUOM

We also know that "organismCode" is a candidate for a logical key.

Design the entity

As the section "Data model" will describe in detail, the entity type ArticleDomainType is suitable for implementing new modules and has a sub entity for measurement values. This sub entity has an additional key, UOMType, with the possible values METRIC and IMPERIAL. We need to be able to store multiple values, but since units are convertible, we don't need to store each and every value we might use in an output. That's the reason we don't use the unit as the logical key.

So what we have to do is to create an entity like this:

  • Entity: ArticleMicrobiological (based on ArticleDomainType)

    • Logical key: organismCode

    • Field: organismCode

    • Entity: ArticleMicrobiologicalUOM (based on ArticleDomainUOMType)

      • Logical key: UOMType

      • Field: UOMType

      • Field: organismMaximumValue

      • Field: organismMaximumValueUOM

      • Field: organismReferenceValue

      • Field: organismReferenceValueUOM

      • Field: organismWarningValue

      • Field: organismWarningValueUOM

Now, we have the basic structure of our sub entity.
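To make the role of the two logical keys more tangible, here is a plain Python sketch of the nesting. It is not Product 360 repository configuration. The class names, the organism code "ORGANISM_A" and the unit code "UNIT_X" are placeholders chosen for this illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Measurement = Tuple[float, str]  # (value, unit code)


@dataclass
class MicrobiologicalUOM:
    """Sub entity entry; logical key: uom_type ("METRIC" or "IMPERIAL")."""
    uom_type: str
    organism_maximum_value: Optional[Measurement] = None
    organism_reference_value: Optional[Measurement] = None
    organism_warning_value: Optional[Measurement] = None


@dataclass
class Microbiological:
    """Entity entry; logical key: organism_code."""
    organism_code: str
    by_uom_type: Dict[str, MicrobiologicalUOM] = field(default_factory=dict)

    def put(self, entry: MicrobiologicalUOM) -> None:
        if entry.uom_type not in ("METRIC", "IMPERIAL"):
            raise ValueError(f"unknown UOMType: {entry.uom_type}")
        # An entry with the same logical key replaces the previous one.
        self.by_uom_type[entry.uom_type] = entry


@dataclass
class Article:
    microbiological: Dict[str, Microbiological] = field(default_factory=dict)

    def entry_for(self, organism_code: str) -> Microbiological:
        """Return the entry for the organism code, creating it if necessary."""
        return self.microbiological.setdefault(
            organism_code, Microbiological(organism_code)
        )


article = Article()
entry = article.entry_for("ORGANISM_A")  # placeholder organism code
entry.put(MicrobiologicalUOM("METRIC", organism_maximum_value=(10.0, "UNIT_X")))
entry.put(MicrobiologicalUOM("METRIC", organism_maximum_value=(20.0, "UNIT_X")))
print(len(article.microbiological), len(entry.by_uom_type))
```

Because both levels are keyed, storing a second METRIC entry for the same organism overwrites the first one instead of creating a duplicate. That is the behavior a logical key is meant to guarantee.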

Check the details

Now, we have to check for the details. The details can be found in the Participant dictionary.

images/download/attachments/109982758/Participant_dictionary.PNG

Search for the attributes we found earlier:

foodAndBevMicrobiological/organismCode

GUI Name: Organism Code

This is your English display label. By convention in Product 360, the first word starts with an upper case letter and all following words start with a lower case letter, unless a word is a name or is otherwise correctly spelled with an upper case letter in English.

IM XML Name: foodAndBevMicrobiological/organismCode

This is where you find the path given in the Data Source XML Guide.

GDSN XML Name:

This is where you find the attribute in the XSDs of the GDSN XML structure.

Mandatory/Optional:

Most of the fields are optional, some of the fields are "M within O group" which means they are mandatory if you form the group. These fields can be set to mandatory in the repository (upper and lower bound = 1). Check if these fields are suitable as logical keys.

Definition:

This is your description in English. Read it and correct it. If it is a whole sentence, add a '.' at the end. If it contains no valuable descriptive content, don't use it.

Datatype: VV/FNBOrganismCode

"VV" means there is a valid value list. "FNBOrganismCode" is the name of the list. The values will be found on the tab "IM Valid Values". Valid value lists most likely contain strings.
Other common data types apart from valid value lists are dateTime, uinteger and ufloat.
→ We need to add an enumeration to the repository as well.

Min. length and max. length:

The field can hold strings with a length of up to 80. Even though the documentation gives a min. length of 1, I would use a lower bound of 0 because the value is optional.

Global/Target market specific: Target Market

Defines if it is possible to maintain different values for different target markets.
→ We need an additional target market key.

Occurrence: 0..1

This can be a problem if we want to use organism code as a logical key. Logical keys are mandatory.

foodAndBevMicrobiological/organismMaximumValue

Most of the information is similar to the above.

The data type is ufloat and the columns of the min./max. length are called "Length (All floats), Min. Length (All non-float Data Types)" and "Precision (All floats), Max. Length (All non-float)". This explains the values 33/2 which made no sense earlier.

Example: Imagine an attribute defined with 15/15. What this means is that 1234567890,12345 is valid and 12345,1234567890 is valid but 12345678,12345678 is not valid because the complete length is greater than 15.
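The length and precision rule from this example can be sketched in a few lines of Python. The sketch assumes that "length" counts all digits without the decimal separator and that "precision" limits the number of decimal places, which is how the three example values behave. `fits_ufloat` is a hypothetical helper name.

```python
def fits_ufloat(text, length, precision):
    """Check an unsigned decimal string against a length/precision definition.

    Assumption: `length` is the total number of digits (separator excluded)
    and `precision` is the maximum number of decimal places.
    """
    # Accept both "," and "." as decimal separator.
    integer_part, _, fraction_part = text.replace(",", ".").partition(".")
    if not integer_part.isdigit():
        return False
    if fraction_part and not fraction_part.isdigit():
        return False
    total_digits = len(integer_part) + len(fraction_part)
    return total_digits <= length and len(fraction_part) <= precision


for candidate in ("1234567890,12345", "12345,1234567890", "12345678,12345678"):
    print(candidate, fits_ufloat(candidate, 15, 15))
```

The first two values pass with 15 digits each, while the third one fails because it has 16 digits in total.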

Product 360 can't persist such huge numbers. A BigDecimal16/6 is always used which means the complete number is at most 16 places long - 10 places before the decimal separator and 6 decimal places. However, if the definition is for example 5/2 the max. range should be set to 99999,99 with scale 2. Be aware that this means you can store 88888,888888 in the database anyway because the scale is just a matter of formatting.

foodAndBevMicrobiological/organismReferenceValue and foodAndBevMicrobiological/organismWarningValue

Most of the information is similar to the above.

Have a look at the length and precision. Here in the Participant dictionary it says 15/15, not 33/2 as it did in the Data Source XML Guide. This is an example of conflicting documentation. Note this on your test list and send dummy data to the data pool later. Use the error messages you get back from the data pool to determine which value is correct.
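If you collect the length and precision values from both documents, conflicts like this one can be listed automatically for your test list. The following Python sketch only contains the two attributes and values mentioned above; `find_discrepancies` is a hypothetical helper name.

```python
# (length, precision) as stated in each document.
XML_GUIDE = {
    "organismReferenceValue": (33, 2),
    "organismWarningValue": (33, 2),
}
PARTICIPANT_DICTIONARY = {
    "organismReferenceValue": (15, 15),
    "organismWarningValue": (15, 15),
}


def find_discrepancies(first, second):
    """Return (name, value in first, value in second) for every mismatch."""
    issues = []
    for name in sorted(set(first) | set(second)):
        first_value = first.get(name)    # None if the attribute is missing
        second_value = second.get(name)
        if first_value != second_value:
            issues.append((name, first_value, second_value))
    return issues


for name, guide_value, dictionary_value in find_discrepancies(
    XML_GUIDE, PARTICIPANT_DICTIONARY
):
    print(f"{name}: XML guide {guide_value}, Participant dictionary {dictionary_value}")
```

Attributes that appear in only one of the documents are reported as well, with `None` for the missing side.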

UOM

We know we need the unit fields as well. We won't find them as separate lines in the Participant dictionary, only as a qualifier in the line of the attribute they belong to. At first glance you might wonder how you will be able to create the field with so little information. However, since units are always stored as UnitProxies and the corresponding field types are already provided in the ArticleDomainUOMType, there is not much left to configure.

The only question we have to answer is, which units should be available to the user or in other words which enumeration do we need to add to the unit field. If we go back to the Participant dictionary there is no VV entry in the datatype column, which makes sense because this line is about the measurement value that is a numeric value.

images/download/attachments/109982758/qualifier.PNG

But there are two other columns that give us the answer we need: the qualifier type "uom" and the "Qual Valid Value List" "uom". If you go to the tab "IM Valid Values", you will find a list with that name. In the "Attribute Name" column of that tab you will also find the "organismWarningValue" attribute and the other value fields. Check whether the existing enumerations already include one with matching values, or create your own. See the section "Data model" for information on how to do that.

It has proven useful to create an Excel sheet with all the information relevant to you. This may include:

  • GDSN attribute name

  • PIM display label

  • Field identifier

  • Data type in GDSN

  • Data type in PIM

  • Valid values

  • Is field mandatory?

  • ...
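If you prefer to generate such a sheet instead of typing it, a CSV file that Excel can open is sufficient. The following Python sketch writes one example row for "organismCode". The values in the columns "Data type in PIM" and "Is field mandatory?" are illustrative wording, and `to_csv` is a hypothetical helper name.

```python
import csv
import io

COLUMNS = [
    "GDSN attribute name",
    "PIM display label",
    "Field identifier",
    "Data type in GDSN",
    "Data type in PIM",
    "Valid values",
    "Is field mandatory?",
]

analysis_rows = [
    {
        "GDSN attribute name": "foodAndBevMicrobiological/organismCode",
        "PIM display label": "Organism code",
        "Field identifier": "ArticleMicrobiologics.OrganismCode",
        "Data type in GDSN": "VV/FNBOrganismCode",
        "Data type in PIM": "enumeration",           # illustrative wording
        "Valid values": "FNBOrganismCode",
        "Is field mandatory?": "optional in GDSN",   # illustrative wording
    },
]


def to_csv(rows):
    """Render the analysis rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(analysis_rows)
print(csv_text)
```

To create a file, write `csv_text` to a path of your choice and open it in Excel.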

Logical keys

Let's come back to the hardest decision. What do we use as logical key(s)?

  • Do we need a logical key?
    Yes, foodAndBevMicrobiological is AGM, so we need to be able to store more than one set of values per target market.

  • Why should we use organismCode as logical key?
    It makes sense to have one set of measurement values per organism. It doesn't make much sense to have multiple warning values for the same organism from a business point of view.

  • Why shouldn't we use organismCode as logical key?
    Because organismCode is optional. This means GDSN allows measurement values that do not belong to any of the organisms in the valid value list.

  • Is there an alternative?
    Can't think of one.

What is the solution then?

The solution is to use "organismCode" as the logical key but tweak the valid values a little: create an enumeration "with optional code". This enumeration has one or more additional values. Most standard enumerations with optional code have one additional value, for example "NONE". This allows the user to store additional value sets and works with all the generic mechanisms in Product 360. In the export, a mechanism ensures that this value is not sent to the GDSN data pool. How many additional entries you need, if any, depends on the requirements of the customer.

See how to create an enumeration with optional code in the chapter "Data model".

See how to handle enumerations with optional codes in the export in section "Technical details".
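The idea behind the optional code can be illustrated with a short Python sketch. It does not show the actual export mechanism of Product 360. It only assumes that the additional value is "NONE" and that an entry is a simple dictionary; `to_export_group` and the organism code "ORGANISM_A" are hypothetical.

```python
# Additional enumeration values that exist only in the repository,
# not in the GDSN valid value list (assumption: a single value "NONE").
OPTIONAL_CODES = {"NONE"}


def to_export_group(entry):
    """Return a copy of the entry that omits organismCode for optional codes."""
    exported = dict(entry)
    if exported.get("organismCode") in OPTIONAL_CODES:
        # The placeholder key must never reach the data pool.
        del exported["organismCode"]
    return exported


with_real_code = {"organismCode": "ORGANISM_A", "organismMaximumValue": "10"}
with_optional_code = {"organismCode": "NONE", "organismMaximumValue": "10"}

print(to_export_group(with_real_code))
print(to_export_group(with_optional_code))
```

Inside the repository the entry keeps its mandatory logical key, while the exported group contains only the measurement values when the key is one of the optional codes.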

Compacting the structure

At the beginning we saw that the attributes related to microbiological information belong to the module "FoodAndBeveragePropertiesInformation". We ignored that up to now. But you might ask yourself if you have to fit a complete module into one Article sub entity.

The answer is definitely 'no'.

Let's see what else is contained in the module "FoodAndBeveragePropertiesInformation": What we find are physiochemical properties.

1WS Catalogue Request Attribute | 1WS XML Structure Type | 1WS Item Structure Type | Module | FLX
physioChemicalProperties | AGM | O | FoodAndBeveragePropertiesInformation | Y

1WS Catalogue Request Attribute | 1WS XML Structure Type | 1WS Item Structure Type | 1WS Item Data Type | 1WS Item Data Length (Min) | 1WS Item Data Length (Max) | Qualifier | Module | FLX
physioChemicalProperties/physioChemicalCharacteristicCode | A | O | string | 1 | 80 | - | FoodAndBeveragePropertiesInformation | Y
physioChemicalProperties/physioChemicalCharacteristicValue | AQM | O | ufloat | 15 | 15 | uom | FoodAndBeveragePropertiesInformation | Y

Does this information have a relation to microbiological information?
No. So we can probably implement the physiochemical properties in their own sub entity. When we look at the GDSN XSDs, our assumption is confirmed: FoodAndBeverageMicrobiologicalInformationType and FoodAndBeveragePhysioChemicalCharacteristicType are two separate types on the same level as FoodAndBeverageAllergyRelatedInformation and FoodAndBeverageDietRelatedInformation.

images/download/attachments/109982758/DSE_XSD_FAB.png

Deep structures

Some modules, for example the ingredient information, have a pretty deep structure with up to about 10 nested levels of XML tags. The ArticleDomainType has a depth of 3 (+1 for the item itself).

If you encounter a module with a deep structure you have to get creative and shrink it down to 3 levels.

To get you started here are two possible ways to do that:

  1. Extraction
    Certifications are an example of extraction. You can add a certification for the item itself and you can add a certification for a specific diet. The XML sub structure that stores the certification information is the same in both cases. Furthermore, from an object-oriented point of view, a certification is a self-contained, complete object. So in Product 360 that part was extracted into its own root entity, and only the entity proxy is stored in the corresponding sub entities of the item.

  2. Compacting

    images/download/attachments/109982758/CutUpperLevels.PNG

    As you can see "fishMeatPoultryContentInformation" is contained in "FoodAndBeverageMarketingInformationExtension". But it has no relation to the rest of the information in the marketing information. This is a case where the container is not really needed for the logical structure of the information and we can skip this layer. The occurrence of "0..1" confirms that no sub entity layer with logical keys is needed at this point.

Ways to ensure data consistency

In the course of creating the data model, you should start thinking about data consistency or in other words validations.

There are two sources of information about validations. The Participant dictionary contains a lot of basic validations like the max. field length. Complex validations that you have to take into account for the design of your entity are listed in the "IM Validations Document".

Data model

There are lower level validations only affecting a single field that can be configured in the repository.

Valid value lists

Description: Some fields only allow a certain set of values.

Where to find: Information can be found in the Participant dictionary in column "DataType", for qualifiers in column "Qual Valid Value List". List entries are found in tab "IM Valid Values".

How to implement: In Product 360 this kind of validation is ensured by the enumeration you add to a field.

Example: "organismCode" has valid value list "FNBOrganismCode" and is implemented as Enum.OrganismCode.WithOptionalCode at logical key ArticleMicrobiologics.LK.OrganismCode and field ArticleMicrobiologics.OrganismCode

Min. and max. values

Description: Numeric values, especially measurement values can be restricted to a certain range.

Where to find: Information can be found either in the Participant dictionary, in the columns "Length (All floats), Min. Length (All non-float Data Types)" and "Precision (All floats), Max. Length (All non-float)" or in the Validations document.

How to implement: In Product 360 this kind of validation is ensured by the entries in the field properties "Min. Range" and "Max. Range"

Example: "OrganismMaximumValue" has a max. length of 15 places

Other examples from the validations document:

  • The value in - Qty of Next Level Item(s) (formerly Pack) is greater than 1 and less than 999999.

  • If fatPercentageInDryMatter is not empty then value must be greater than or equal to 0 and less than or equal to 100.00.

Closely related to min. and max. values is the max. length of string values.

Example: GTIN Name: Value must be between 1 and 40 characters.

In this case use properties min. and max. length.
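In Product 360 these checks are configured as field properties, not programmed. Purely to illustrate what the two kinds of limits mean, here is a small Python sketch that applies them to the examples above. `check_range` and `check_length` are hypothetical helper names.

```python
def check_range(value, minimum, maximum):
    """Range check for numeric fields; an empty (None) value passes."""
    return value is None or minimum <= value <= maximum


def check_length(text, min_length, max_length):
    """Length check for string fields."""
    return min_length <= len(text) <= max_length


# fatPercentageInDryMatter: if not empty, the value must be between 0 and 100.00.
print(check_range(42.5, 0, 100.00))
print(check_range(None, 0, 100.00))

# GTIN Name: the value must be between 1 and 40 characters long.
print(check_length("Example item name", 1, 40))
```

The distinction matters when you fill in the field properties: numeric limits belong in the range properties, character limits in the length properties.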

Mandatory fields

Description: Some fields are mandatory globally or mandatory in an optional or mandatory group.

Where to find: Information can either be found in the Participant dictionary, in column "Mandatory/Optional", or in the validations document.

How to implement: Set the lower bound to 1 in order to make a field mandatory within an entry of a sub entity.

Hint: Logical keys are always mandatory in Product 360.

Examples: GTINName is a required field.

More information on the configuration of the repository can be found in the section "Data model".

You should think about this kind of validation now!

Data Quality

Then there are more complex validations that affect multiple fields at once or depend on a specific value or target market. Most likely they will be implemented using DQ rule configurations.

Examples:

  • If promotionalTypeCode is populated, then isConsumerUnit must be true.

  • For each occurrence of the Loopgroup “promotional”, attributes freeQtyOfNextLowerLevel and freeQtyOfProduct cannot both be populated.

  • If targetMarketCountryCode is equal to '752' then packagingMaterialTypeCode and packagingMaterialCompositionQuantity are used in pairs. I.e. if one is populated the other one must be populated, too.

  • If grossWeight and netWeight are provided on the same record, grossWeight must be greater than or equal to netWeight

  • There must be at most one iteration of minimumFishMeatPoultryContent per Unit Of Measure

For further information see chapter "Data validations"
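To show what distinguishes these rules from single-field checks, the following Python sketch implements three of the examples on a simple dictionary. It is not a DQ rule configuration, only an illustration of the logic. `check_record` and the value "PLACEHOLDER" are hypothetical.

```python
def check_record(record):
    """Return a list of error messages for three of the example rules."""
    errors = []

    # If promotionalTypeCode is populated, isConsumerUnit must be true.
    if record.get("promotionalTypeCode") and record.get("isConsumerUnit") is not True:
        errors.append("isConsumerUnit must be true if promotionalTypeCode is populated")

    # grossWeight must be greater than or equal to netWeight if both are provided.
    gross = record.get("grossWeight")
    net = record.get("netWeight")
    if gross is not None and net is not None and gross < net:
        errors.append("grossWeight must be greater than or equal to netWeight")

    # For target market 752 the two packaging material fields are used in pairs.
    if record.get("targetMarketCountryCode") == "752":
        has_type = record.get("packagingMaterialTypeCode") is not None
        has_quantity = record.get("packagingMaterialCompositionQuantity") is not None
        if has_type != has_quantity:
            errors.append(
                "packagingMaterialTypeCode and packagingMaterialCompositionQuantity "
                "must be populated together"
            )
    return errors


valid_record = {
    "promotionalTypeCode": "PLACEHOLDER",
    "isConsumerUnit": True,
    "grossWeight": 2.0,
    "netWeight": 1.5,
}
invalid_record = {
    "promotionalTypeCode": "PLACEHOLDER",
    "isConsumerUnit": False,
    "grossWeight": 1.0,
    "netWeight": 1.5,
    "targetMarketCountryCode": "752",
    "packagingMaterialTypeCode": "PLACEHOLDER",
}

print(check_record(valid_record))
for message in check_record(invalid_record):
    print(message)
```

Every rule reads more than one field or depends on the target market, which is why these checks cannot be expressed with the field properties described in the previous section.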

Export

When you output data into export files, make sure to create well-formatted values. Details can be found in the chapter "Data validations".

Summary

Analyze Module Summary

  1. Gather your documentation

  2. Collect the fields of your module

  3. Get an idea of the structure intended by GDSN

  4. Try to fit the structure in an existing entity type

  5. Find your logical keys

  6. Implement the entity with the information from the section "Data Model"

  7. Think about validations

  8. In the process note all assumptions, discrepancies and open questions for later testing.