Case Study: Processing ISO 8859-1 Data

This case study describes how you might set up an environment to process ISO 8859-1 data. ISO 8859-1 is a single-byte code page, so this configuration suits data from different Western European languages whose character sets are contained in the ISO 8859-1 code page. This example describes an environment that processes English and German language data.
For this case study, the ISO 8859-1 environment consists of the following elements:

Configuring the ISO 8859-1 Environment

Use the following guidelines when you configure an environment similar to this case study for ISO 8859-1 data processing:
  1. Verify code page compatibility between the PowerCenter repository database client and the database server.
  2. Verify code page compatibility between the PowerCenter Client and the PowerCenter repository, and between the PowerCenter Integration Service process and the PowerCenter repository.
  3. Set the PowerCenter Integration Service data movement mode to ASCII.
  4. Verify session code page compatibility.
  5. Verify lookup and stored procedure database code page compatibility.
  6. Verify External Procedure or Custom transformation procedure code page compatibility.
  7. Configure session sort order.

Step 1. Verify PowerCenter Repository Database Client and Server Compatibility

The database client and server hosting the PowerCenter repository must be able to communicate without data loss.
The PowerCenter repository resides in an Oracle database. Use NLS_LANG to set the locale (language, territory, and character set) you want the database client and server to use with your login:
NLS_LANG = LANGUAGE_TERRITORY.CHARACTERSET
By default, Oracle configures NLS_LANG for the U.S. English language, the U.S. territory, and the 7-bit ASCII character set:
NLS_LANG = AMERICAN_AMERICA.US7ASCII
Change the default configuration to write ISO 8859-1 data to the PowerCenter repository using the Oracle WE8ISO8859P1 code page. For example:
NLS_LANG = AMERICAN_AMERICA.WE8ISO8859P1
For more information about verifying and changing the PowerCenter repository database code page, see your database documentation.
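The NLS_LANG format above can be checked with a short script before you start the database client. The following is a minimal sketch in Python, not part of PowerCenter or Oracle tooling. The function names parse_nls_lang and check_nls_lang are hypothetical, and the sketch assumes the value always includes the character set component after the period.

```python
import os


def parse_nls_lang(value):
    """Split an NLS_LANG value of the form LANGUAGE_TERRITORY.CHARACTERSET."""
    locale_part, dot, charset = value.strip().partition(".")
    if not dot or not charset:
        # Assumption of this sketch: the character set must be present.
        raise ValueError(f"no character set in NLS_LANG value: {value!r}")
    language, _, territory = locale_part.partition("_")
    return language, territory, charset


# Oracle character set name used in this case study.
EXPECTED_CHARSET = "WE8ISO8859P1"


def check_nls_lang(value):
    """Return True if the character set component is the expected one."""
    _, _, charset = parse_nls_lang(value)
    return charset.upper() == EXPECTED_CHARSET


if __name__ == "__main__":
    current = os.environ.get("NLS_LANG")
    if current is None:
        print("NLS_LANG is not set")
    elif check_nls_lang(current):
        print(current, "uses", EXPECTED_CHARSET)
    else:
        print(current, "does not use", EXPECTED_CHARSET)
```

Run the script in the same shell session as the database client so that it reads the same environment.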

Step 2. Verify PowerCenter Code Page Compatibility

The PowerCenter Integration Service and PowerCenter Client code pages must be subsets of the PowerCenter repository code page. Because the PowerCenter Client and PowerCenter Integration Service each use the system code pages of the machines they are installed on, you must verify that the system code pages are subsets of the PowerCenter repository code page.
In this case, the PowerCenter Client is installed on Windows systems that were purchased in the United States, so the system code pages for the PowerCenter Client machines are set to MS Windows Latin1 by default. To verify system input and display languages, open the Regional Options dialog box from the Windows Control Panel. For systems purchased in the United States, the Regional Settings and Input Locale must be configured for English (United States).
The PowerCenter Integration Service is installed on a UNIX machine. The default code page for UNIX operating systems is ASCII. In this environment, change the UNIX system code page to ISO 8859-1 Western European so that it is a subset of the PowerCenter repository code page.
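To make the subset relationship concrete, the following Python sketch compares the printable characters of three single-byte code pages. It is an illustration only: PowerCenter applies its own compatibility rules, and the Python codec names ascii, latin-1, and cp1252 are assumed here to stand in for ASCII, ISO 8859-1 Western European, and MS Windows Latin1. The helper names repertoire and is_subset are hypothetical.

```python
def repertoire(codec):
    """Collect the printable characters a single-byte codec can decode."""
    chars = set()
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # byte value is not assigned in this code page
        if ch.isprintable():
            chars.add(ch)
    return chars


def is_subset(candidate, reference):
    """Return True if every printable character of candidate is in reference."""
    return repertoire(candidate) <= repertoire(reference)


if __name__ == "__main__":
    for candidate, reference in [
        ("ascii", "latin-1"),
        ("latin-1", "cp1252"),
        ("cp1252", "latin-1"),
    ]:
        print(candidate, "subset of", reference, ":", is_subset(candidate, reference))
```

The sketch ignores control characters and works only for single-byte codecs, so it cannot be used as written to compare against UTF-8.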

Step 3. Configure the PowerCenter Integration Service for ASCII Data Movement Mode

Configure the PowerCenter Integration Service to process ISO 8859-1 data. In the Administrator tool, set the Data Movement Mode to ASCII for the PowerCenter Integration Service.

Step 4. Verify Session Code Page Compatibility

When you run a workflow in ASCII data movement mode, the PowerCenter Integration Service enforces source and target code page relationships. To guarantee accurate data conversion, the source code page must be a subset of the target code page.
In this case, the environment includes source databases that contain German and English data. When you configure a source database connection in the PowerCenter Workflow Manager, the code page for the connection must be identical to the source database code page and must be a subset of the target code page. Because both the MS Windows Latin1 and the ISO 8859-1 Western European code pages contain German characters, you would most likely use one of these code pages for source database connections.
Because the target code page must be a superset of the source code page, use MS Windows Latin1, ISO 8859-1 Western European, or UTF-8 for target database connection or flat file code pages. To ensure data consistency, the configured target code page must match the code page of the target database or flat file system.
If you configure the PowerCenter Integration Service for relaxed code page validation, the PowerCenter Integration Service removes restrictions on source and target code page compatibility. You can select any supported code page for source and target data. However, you must ensure that the targets only receive character data encoded in the target code page.
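With relaxed code page validation, checking that data fits the target code page becomes your responsibility. One way to approach this is to scan sample data for characters that the target code page cannot represent. The sketch below does this in Python, outside PowerCenter. The function name unencodable_chars and the sample values are hypothetical, and the Python codec latin-1 is assumed to correspond to ISO 8859-1 Western European.

```python
def unencodable_chars(text, target_codec):
    """Return the characters in text that target_codec cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(target_codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


if __name__ == "__main__":
    # Hypothetical sample values; the last one contains Polish characters.
    rows = ["Müller", "Straße", "Smith", "Łódź"]
    for row in rows:
        bad = unencodable_chars(row, "latin-1")
        if bad:
            print(row, "has characters outside the target code page:", bad)
        else:
            print(row, "fits the target code page")
```

German and English values pass this check for ISO 8859-1, while values that contain characters from other scripts are reported.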

Step 5. Verify Lookup and Stored Procedure Database Code Page Compatibility

Lookup and stored procedure database code pages must be supersets of the source code pages and subsets of the target code pages. In this case, all lookup and stored procedure database connections must use a code page compatible with the ISO 8859-1 Western European or MS Windows Latin1 code pages.

Step 6. Verify External Procedure or Custom Transformation Procedure Compatibility

External Procedure and Custom transformation procedures must be able to process character data from the source code pages, and they must pass characters that are compatible in the target code pages. In this case, all data processed by the External Procedure or Custom transformations must be in the ISO 8859-1 Western European or MS Windows Latin1 code pages.

Step 7. Configure Session Sort Order

When you run the PowerCenter Integration Service in ASCII mode, it uses a binary sort order for all sessions. In the session properties, the PowerCenter Workflow Manager lists all sort orders associated with the PowerCenter Integration Service code page. You can select a sort order for the session, but in ASCII mode the PowerCenter Integration Service ignores the selection and sorts character data in binary order.
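A binary sort order compares character data by encoded byte values and not by language rules, which matters for German data. The following Python sketch illustrates the effect. It assumes that binary order means byte-by-byte comparison of ISO 8859-1 encoded strings, and it does not reproduce PowerCenter internals. The function name binary_sort and the sample words are hypothetical.

```python
def binary_sort(values, codec="latin-1"):
    """Sort strings by the byte values of their encoded form."""
    return sorted(values, key=lambda s: s.encode(codec))


if __name__ == "__main__":
    words = ["Zebra", "apfel", "Äpfel", "Apfel"]
    # Uppercase letters sort before lowercase letters, and accented
    # letters sort after all unaccented letters.
    print(binary_sort(words))
```

In this order, a word that begins with an umlaut sorts after every word that begins with an unaccented letter, which differs from German dictionary order.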