Richtext conversion

NOTE: This article applies to PIM version 7.1.03.

Media-neutral format

To support "media-neutral" storage of rich text, the text edited in the RichText-Editor is converted before it is saved.

Internal markup

During the conversion, some of the HTML tags output by the RichText-Editor are replaced by our own tags, which are more "media-neutral". The following internal tags are available:

HTML tag | Internal markup | Description | Remarks
<br>     | <CRLF>          | Line break  | see also Rich Text editing
<p>      | <PAR>           | Paragraph   | see also Rich Text editing

Conversely, when a saved rich text is loaded into the RichText-Editor, the internal tags are replaced by the HTML tags again so that the text is rendered properly.

Overview of single conversion steps

The conversion consists of the following steps, executed in the order listed:

  • From XHTML to NEUTRAL:

    1. Carriage returns

      1. all "\n&nbsp;\n" are replaced with "<CRLF>" (related to the CKEditor bug http://dev.ckeditor.com/ticket/12879)

      2. all "\r\n" are replaced with ""

      3. all "\n\t" are replaced with ""

      4. all "\n" are replaced with ""

    2. TABs

      1. every run of four non-breaking spaces (&nbsp;&nbsp;&nbsp;&nbsp;) is replaced with "\t"

    3. Non-breaking spaces

      1. all remaining &nbsp; are removed (if the corresponding setting com.heiler.ppm.richtext.server/remove-nbsp is set to true)

    4. Paragraphs

      1. if the paragraph support is enabled (the server-side setting com.heiler.ppm.richtext.server/enable-paragraphs is true):

        1. all "<p>" tags are replaced with "<PAR>"

        2. all "</p>" are replaced with "</PAR>"

      2. if the paragraph support is disabled:

        1. the "<p>" tag at the beginning is removed

        2. the "</p>" at the end is removed

        3. where a paragraph is directly followed by another paragraph, the boundary between them is replaced by "<CRLF>"

        4. finally all remaining "<p>" and "</p>" tags are removed

    5. BRs

      1. all "<br />" tags are replaced with "<CRLF>" tags

    6. Ampersands

      1. nothing happens by default (this step is only a "hook" for customizations)

  • From NEUTRAL to XHTML

    1. Ampersands

      1. nothing happens by default (this step is only a "hook" for customizations)

    2. BRs

      1. all "<CRLF>" tags are replaced with "<br />" tags

    3. Paragraphs

      1. all "<PAR>" tags are replaced with "<p>"

      2. all "</PAR>" are replaced with "</p>"

      3. all empty paragraphs are replaced with "<p>&nbsp;</p>"

    4. Non-breaking spaces

      1. nothing happens by default (this step is only a "hook" for customizations)

    5. TABs

      1. all "\t" are replaced with four non-breaking spaces (&nbsp;&nbsp;&nbsp;&nbsp;)

    6. Carriage returns

      1. nothing happens by default (this step is only a "hook" for customizations)
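The steps above can be sketched as plain string replacements. The following is only an illustration under assumptions, not the code of the DefaultRichTextMarkupConversionHandler: the class name ConversionSketch, the method names and the two boolean parameters (standing in for the remove-nbsp and enable-paragraphs settings) are invented for this example, and the literal "</p><p>" used for the "paragraph followed by another paragraph" step is an interpretation of the description.

```java
/** Illustrative stand-in only; not the PIM default conversion handler. */
final class ConversionSketch {

    /** XHTML to NEUTRAL, following the documented order of steps. */
    static String toNeutral(String xhtml, boolean removeNbsp, boolean enableParagraphs) {
        String s = xhtml;
        // 1. Carriage returns
        s = s.replace("\n&nbsp;\n", "<CRLF>");
        s = s.replace("\r\n", "");
        s = s.replace("\n\t", "");
        s = s.replace("\n", "");
        // 2. TABs: four non-breaking spaces become one tab
        s = s.replace("&nbsp;&nbsp;&nbsp;&nbsp;", "\t");
        // 3. Non-breaking spaces: remaining ones are dropped if the setting is on
        if (removeNbsp) {
            s = s.replace("&nbsp;", "");
        }
        // 4. Paragraphs
        if (enableParagraphs) {
            s = s.replace("<p>", "<PAR>").replace("</p>", "</PAR>");
        } else {
            if (s.startsWith("<p>")) {
                s = s.substring("<p>".length());
            }
            if (s.endsWith("</p>")) {
                s = s.substring(0, s.length() - "</p>".length());
            }
            // Assumption: adjacent paragraphs appear as "</p><p>" at this point.
            s = s.replace("</p><p>", "<CRLF>");
            s = s.replace("<p>", "").replace("</p>", "");
        }
        // 5. BRs
        s = s.replace("<br />", "<CRLF>");
        // 6. Ampersands: no-op by default (hook only)
        return s;
    }

    /** NEUTRAL to XHTML, following the documented order of steps. */
    static String toXhtml(String neutral) {
        String s = neutral;
        // 1. Ampersands: no-op by default (hook only)
        // 2. BRs
        s = s.replace("<CRLF>", "<br />");
        // 3. Paragraphs, including the placeholder for empty paragraphs
        s = s.replace("<PAR>", "<p>").replace("</PAR>", "</p>");
        s = s.replace("<p></p>", "<p>&nbsp;</p>");
        // 4. Non-breaking spaces: no-op by default (hook only)
        // 5. TABs
        s = s.replace("\t", "&nbsp;&nbsp;&nbsp;&nbsp;");
        // 6. Carriage returns: no-op by default (hook only)
        return s;
    }

    public static void main(String[] args) {
        System.out.println(toNeutral("<p>Hello<br />World</p>\n<p>Next</p>", true, false));
        System.out.println(toXhtml("<PAR>a\tb</PAR><PAR></PAR>"));
    }
}
```

Note that the order matters: the TAB step has to run before the remaining non-breaking spaces are removed, otherwise the four-NBSP runs would be lost instead of becoming tabs.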

Customization of individual conversion steps

Each of the conversion steps listed above can be customized. To customize the rich-text conversion, implement and contribute a RichTextMarkupConversionHandler. The conversion handler is used each time a rich text is converted from "XHTML" to "NEUTRAL" and vice versa. There is a default implementation of the conversion handler, DefaultRichTextMarkupConversionHandler, which contains only the standard conversion logic described above. It is recommended to use the DefaultRichTextMarkupConversionHandler as the base class for a custom implementation. That way the custom conversion handler only has to override the conversion steps that should be modified. The default conversion handler is located in a public package, com.heiler.ppm.richtext.core.conversion, and can be used as a template for customizations.
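The idea of overriding a single step can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins: SketchDefaultHandler is not the DefaultRichTextMarkupConversionHandler, and the method names convertToNeutral, convertLineBreaks and convertAmpersands are invented for this example. Check the real class in com.heiler.ppm.richtext.core.conversion for the actual methods to override. The ampersand customization itself is also just an example.

```java
/** Hypothetical stand-in for a default handler; names are invented for this sketch. */
class SketchDefaultHandler {

    /** Runs the steps in a fixed order; subclasses change individual steps only. */
    final String convertToNeutral(String xhtml) {
        String s = convertLineBreaks(xhtml);
        return convertAmpersands(s);
    }

    /** Standard step: "<br />" becomes "<CRLF>". */
    protected String convertLineBreaks(String s) {
        return s.replace("<br />", "<CRLF>");
    }

    /** Hook step: does nothing by default. */
    protected String convertAmpersands(String s) {
        return s;
    }
}

/** Custom handler that overrides only the ampersand hook. */
class SketchCustomHandler extends SketchDefaultHandler {

    @Override
    protected String convertAmpersands(String s) {
        // Example customization: store a plain ampersand instead of the entity.
        return s.replace("&amp;", "&");
    }
}
```

With these stand-ins, new SketchCustomHandler().convertToNeutral("A &amp; B<br />C") keeps the default line-break handling and changes only the ampersand step.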

To contribute a custom conversion handler just use the extension point com.heiler.ppm.richtext.core.richTextMarkupConversionExtensions:

<extension
      point="com.heiler.ppm.richtext.core.richTextMarkupConversionExtensions">
   <conversionHandler
         class="com.heiler.ppm.customizing.ckeditor.internal.CustomRichTextMarkupConversionHandler"
         id="hlr.customizing.ckeditor.internal.customRichTextMarkupConversionHandler"
         rank="100">
   </conversionHandler>
</extension>

If more than one conversion handler is registered, the one with the highest rank is used. The DefaultRichTextMarkupConversionHandler is contributed with the rank "1", so give your custom conversion handler a higher rank to make sure it is used.
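The rank rule can be pictured with the following minimal sketch. It is not the PIM extension registry code: the Contribution class and the select method are hypothetical, and how the product resolves two handlers with the same rank is not described here, so the sketch makes no promise about ties.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative only; models "the contribution with the highest rank wins". */
final class RankSelectionSketch {

    /** Hypothetical view of one conversionHandler element (id and rank only). */
    static final class Contribution {
        final String id;
        final int rank;

        Contribution(String id, int rank) {
            this.id = id;
            this.rank = rank;
        }
    }

    /** Returns the contribution with the highest rank, or empty if none is registered. */
    static Optional<Contribution> select(List<Contribution> contributions) {
        return contributions.stream().max(Comparator.comparingInt(c -> c.rank));
    }
}
```

Given a default contribution with rank 1 and a custom one with rank 100, select returns the custom contribution.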

IMPORTANT: Deploy the custom plug-in to the client (to make it work in PIM Desktop) and/or to the server (to make it work in PIM Web).

Sample implementation of a custom conversion handler

As a sample implementation of a custom conversion handler, and as a potential solution for customers who reported the issue HPM-20457 ("Problem with non-breaking spaces in rich text"), the CustomRichTextMarkupConversionHandler was implemented. It is located in the SDK examples (com.heiler.ppm.customizing.ckeditor).

This sample conversion handler implements special treatment for non-breaking spaces (NBSPs). It first differentiates between "wanted" and "unwanted" NBSPs. Wanted NBSPs are real non-breaking spaces that were put between two words to prevent a line break. These NBSPs might have been copied and pasted from other software (such as a PDF reader) or added in some other way. The important point is that the user wants to keep them. Unwanted NBSPs are those added by the CKEditor itself, for example to escape leading or trailing spaces. The sample custom conversion handler tries to find such unwanted NBSPs and either removes them or replaces them with regular spaces. For this purpose there are two modes: "REMOVE" and "REPLACE". Set the mode that fits the customer's needs. See the mentioned class in the SDK examples for more implementation details.
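To make the two modes concrete, here is a minimal sketch of one possible heuristic. It is not the code of the CustomRichTextMarkupConversionHandler from the SDK examples: the class NbspSketch, the clean method and, above all, the rule used to decide what counts as "wanted" (an &nbsp; directly between two letters or digits) are assumptions made for this illustration.

```java
/** Illustrative heuristic only; the SDK sample class may decide differently. */
final class NbspSketch {

    enum Mode { REMOVE, REPLACE }

    private static final String NBSP = "&nbsp;";

    /** Keeps "wanted" NBSPs; removes or replaces the others depending on the mode. */
    static String clean(String text, Mode mode) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith(NBSP, i)) {
                int next = i + NBSP.length();
                // Assumption: wanted means a letter or digit on both sides.
                boolean wanted = isWordChar(text, i - 1) && isWordChar(text, next);
                if (wanted) {
                    out.append(NBSP);
                } else if (mode == Mode.REPLACE) {
                    out.append(' ');
                }
                // In REMOVE mode an unwanted NBSP is simply skipped.
                i = next;
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** True if the index is inside the text and points at a letter or digit. */
    private static boolean isWordChar(String text, int index) {
        return index >= 0 && index < text.length()
                && Character.isLetterOrDigit(text.charAt(index));
    }

    public static void main(String[] args) {
        System.out.println(clean("&nbsp;10&nbsp;kg&nbsp;", Mode.REMOVE));
        System.out.println(clean("&nbsp;10&nbsp;kg&nbsp;", Mode.REPLACE));
    }
}
```

In this sketch the NBSP between "10" and "kg" is kept in both modes, while the leading and trailing ones are dropped in REMOVE mode and turned into regular spaces in REPLACE mode.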