Code Page Compatibility

Compatibility between code pages is essential for accurate data movement when the PowerCenter Integration Service runs in the Unicode data movement mode.
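A code page can be compatible with another code page, or it can be a subset or a superset of another. Two code pages are compatible when they encode virtually the same characters. A code page is a subset of another when every character it encodes is also encoded in the other, and it is a superset when it encodes every character of the other plus additional characters.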
For accurate data movement, the target code page must be a superset of the source code page. If the target code page is not a superset of the source code page, the PowerCenter Integration Service may not process all characters, resulting in incorrect or missing data. For example, Latin1 is a superset of US-ASCII. If you select Latin1 as the source code page and US-ASCII as the target code page, you might lose character data if the source contains characters that are not included in US-ASCII.
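The data loss described above can be reproduced outside PowerCenter. The following minimal Python sketch uses Python's built-in `latin-1` and `ascii` codecs as stand-ins for the Latin1 and US-ASCII code pages. The sample strings and the `move_data` helper are illustrative assumptions, and PowerCenter may handle unconvertible characters differently from the `?` substitution shown here.

```python
def move_data(text: str, source_codec: str, target_codec: str) -> str:
    """Simulate moving character data from a source code page to a target code page."""
    # Bytes as they would be stored in the source code page.
    source_bytes = text.encode(source_codec)
    # Decode from the source, then re-encode for the target.
    # errors="replace" substitutes "?" for characters the target cannot encode.
    target_bytes = source_bytes.decode(source_codec).encode(target_codec, errors="replace")
    return target_bytes.decode(target_codec)


# Latin1 -> US-ASCII: the target is not a superset of the source,
# so the accented characters cannot be represented and are lost.
lossy = move_data("Café Zürich", "latin-1", "ascii")

# US-ASCII -> Latin1: the target is a superset of the source, so nothing is lost.
safe = move_data("Cafe Zurich", "ascii", "latin-1")

print(lossy)
print(safe)
```

Only the first conversion alters the data, because the target code page in that direction lacks characters that the source contains.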
When you install or upgrade a PowerCenter Integration Service to run in Unicode mode, you must ensure code page compatibility among the domain configuration database, the Administrator tool, PowerCenter Clients, PowerCenter Integration Service process nodes, the PowerCenter repository, the Metadata Manager repository, and the machines hosting pmrep and pmcmd. In Unicode mode, the PowerCenter Integration Service enforces code page compatibility between the PowerCenter Client and the PowerCenter repository, and between the PowerCenter Integration Service process and the PowerCenter repository. In addition, when you run the PowerCenter Integration Service in Unicode mode, code pages associated with sessions must have the appropriate relationships. The Code Page Compatibility Summary lists these relationships for sources, targets, lookup and stored procedure databases, and External Procedure and Custom transformation procedures.
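Informatica uses code pages for the following components: the domain configuration database, the Administrator tool, PowerCenter Clients, PowerCenter Integration Service processes, the PowerCenter repository, the Metadata Manager repository, PowerCenter sources and targets, and the pmcmd and pmrep command line programs.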
Most database servers use two code pages, a client code page to receive data from client applications and a server code page to store the data. When the database server is running, it converts data between the two code pages if they are different. In this type of database configuration, the PowerCenter Integration Service process interacts with the database client code page. Thus, code pages used by the PowerCenter Integration Service process, such as the PowerCenter repository, source, or target code pages, must be identical to the database client code page. The database client code page is usually identical to the operating system code page on which the PowerCenter Integration Service process runs. The database client code page is a subset of the database server code page.
For more information about specific database client and server code pages, see your database documentation.

Domain Configuration Database Code Page

The domain configuration database must be compatible with the code pages of the PowerCenter repository, Metadata Manager repository, and Model repository.
The Service Manager synchronizes the list of users in the domain with the list of users and groups in each application service. If a user name in the domain has characters that the code page of the application service does not recognize, characters do not convert correctly and inconsistencies occur.

Administrator Tool Code Page

The Administrator tool can run on any node in an Informatica domain. The Administrator tool code page is the code page of the operating system of the node. Each node in the domain must use the same code page.
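The Administrator tool code page must be a subset of the PowerCenter repository code page and a subset of the Metadata Manager repository code page.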

PowerCenter Client Code Page

The PowerCenter Client code page is the code page of the operating system of the PowerCenter Client. To communicate with the PowerCenter repository, the PowerCenter Client code page must be a subset of the PowerCenter repository code page.

PowerCenter Integration Service Process Code Page

The code page of a PowerCenter Integration Service process is the code page of the node that runs the PowerCenter Integration Service process. Define the code page for each PowerCenter Integration Service process in the Administrator tool on the Processes tab.
However, on UNIX, you can change the code page of the PowerCenter Integration Service process by changing the LANG, LC_CTYPE or LC_ALL environment variable for the user that starts the process.
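Before changing these variables, it can help to confirm which one currently takes effect. The following Python sketch applies the standard POSIX precedence (LC_ALL overrides LC_CTYPE, which overrides LANG) to a set of environment variables. The function names are hypothetical, and the sketch only inspects the environment; it does not query PowerCenter.

```python
import os
from typing import Mapping, Optional


def effective_ctype_locale(env: Mapping[str, str]) -> Optional[str]:
    """Return the locale string that governs character encoding.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Empty values are treated as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


def codeset_of(locale_name: str) -> Optional[str]:
    """Extract the codeset from 'language_territory.codeset@modifier', if present."""
    if "." not in locale_name:
        return None
    codeset = locale_name.split(".", 1)[1]
    return codeset.split("@", 1)[0]


# Inspect the environment of the current user.
current = effective_ctype_locale(os.environ)
print("Effective locale:", current)
print("Codeset:", codeset_of(current) if current else None)
```

Run the check as the user that starts the PowerCenter Integration Service process, because the variables are read from that user's environment.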
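The code page of the PowerCenter Integration Service process must be compatible with the operating system of its node, a subset of the PowerCenter repository code page, a subset of the Metadata Manager repository code page, and a superset of the code page of the machine hosting pmcmd.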
The code pages of all PowerCenter Integration Service processes must be compatible with each other. For example, you can use MS Windows Latin1 for a node on Windows and ISO-8859-1 for a node on UNIX.
A PowerCenter Integration Service configured for Unicode mode validates code pages when you start a session to ensure accurate data movement. It uses session code pages to convert character data. When the PowerCenter Integration Service runs in ASCII mode, it does not validate session code pages. It reads all character data as ASCII characters and does not perform code page conversions.
Each code page has associated sort orders. When you configure a session, you can select one of the sort orders associated with the code page of the PowerCenter Integration Service process. When you run the PowerCenter Integration Service in Unicode mode, it uses the selected session sort order to sort character data. When you run the PowerCenter Integration Service in ASCII mode, it sorts all character data using a binary sort order.
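To illustrate what a binary sort order means in practice, the following Python sketch sorts strings by their encoded byte values. It uses Python's `latin-1` codec as a stand-in for the process code page; the word list and the `binary_sort` helper are illustrative assumptions and not part of PowerCenter.

```python
def binary_sort(values, codec="latin-1"):
    """Sort strings by the numeric value of their encoded bytes."""
    return sorted(values, key=lambda s: s.encode(codec))


words = ["zebra", "Apple", "éclair", "apple", "Zulu"]
binary_sorted = binary_sort(words)

# Uppercase letters (0x41-0x5A) sort before lowercase letters (0x61-0x7A),
# and accented letters such as é (0xE9) sort after both.
print(binary_sorted)
```

A linguistic sort order for the same code page would typically place "éclair" near the other words beginning with "e" instead of at the end, which is why the session sort order matters in Unicode mode.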
If you run the PowerCenter Integration Service in the United States on Windows, consider using MS Windows Latin1 (ANSI) as the code page of the PowerCenter Integration Service process.
If you run the PowerCenter Integration Service in the United States on UNIX, consider using ISO 8859-1 as the code page for the PowerCenter Integration Service process.
If you use pmcmd to communicate with the PowerCenter Integration Service, the code page of the operating system hosting pmcmd must be identical to the code page of the PowerCenter Integration Service process.
The PowerCenter Integration Service generates the names of session log files, reject files, caches and cache files, and performance detail files based on the code page of the PowerCenter Integration Service process.

PowerCenter Repository Code Page

The PowerCenter repository code page is the code page of the data in the repository. The PowerCenter Repository Service uses the PowerCenter repository code page to save metadata in and retrieve metadata from the PowerCenter repository database. Choose the PowerCenter repository code page when you create or upgrade a PowerCenter repository. When the PowerCenter repository database code page is UTF-8, you can create a PowerCenter repository using UTF-8 as its code page.
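The PowerCenter repository code page must be compatible with the domain configuration database code page and a superset of the code pages of the PowerCenter Client, the nodes running the PowerCenter Integration Service process, the machine running pmrep, and the Metadata Manager repository.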
A global PowerCenter repository code page must be a subset of the local PowerCenter repository code page if you want to create shortcuts in the local PowerCenter repository that reference an object in a global PowerCenter repository.
If you copy objects from one PowerCenter repository to another PowerCenter repository, the code page for the target PowerCenter repository must be a superset of the code page for the source PowerCenter repository.

Metadata Manager Repository Code Page

The Metadata Manager repository code page is the code page of the data in the repository. The Metadata Manager Service uses the Metadata Manager repository code page to save metadata to and retrieve metadata from the repository database. The Administrator tool writes user and group information to the Metadata Manager Service. The Administrator tool also writes domain information in the repository database. The PowerCenter Integration Service process writes metadata to the repository database. Choose the repository code page when you create or upgrade a Metadata Manager repository. When the repository database code page is UTF-8, you can create a repository using UTF-8 as its code page.
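The Metadata Manager repository code page must be compatible with the domain configuration database code page, a subset of the PowerCenter repository code page, and a superset of the code pages of the Administrator tool and the PowerCenter Integration Service process.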

PowerCenter Source Code Page

The source code page depends on the type of source, such as a relational database, flat file, or XML file.
Regardless of the type of source, the source code page must be a subset of the code page of transformations and targets that receive data from the source. The source code page does not need to be a subset of the code pages of transformations or targets that do not receive data from the source.
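The subset relationship can be checked mechanically for simple cases. The following Python sketch compares the character repertoires of two single-byte Python codecs, which stand in for code pages here. The function names are hypothetical, the approach does not cover multibyte code pages, and PowerCenter validates sessions with its own code page relationship definitions, which this strict set comparison does not reproduce.

```python
def encodable_characters(codec: str) -> set:
    """Collect every character that a single-byte codec can represent."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # This byte value is unassigned in the code page.
    return chars


def is_subset(candidate: str, other: str) -> bool:
    """Return True if every character in `candidate` also exists in `other`."""
    return encodable_characters(candidate) <= encodable_characters(other)


# A US-ASCII source can feed a Latin1 target without loss.
print(is_subset("ascii", "latin-1"))

# A Latin1 source is not a subset of a US-ASCII target.
print(is_subset("latin-1", "ascii"))
```

A check like this is only a quick sanity test for a proposed source and target pairing.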
Note: Select IBM EBCDIC as the source database connection code page only if you access EBCDIC data, such as data from a mainframe extract file.

PowerCenter Target Code Page

The target code page depends on the type of target, such as a relational database, XML file, or flat file.
The target code page must be a superset of the code page of transformations and sources that provide data to the target. The target code page does not need to be a superset of the code pages of transformations or sources that do not provide data to the target.
The PowerCenter Integration Service creates session indicator files, session output files, and external loader control and data files using the target flat file code page.
Note: Select IBM EBCDIC as the target database connection code page only if you access EBCDIC data, such as data from a mainframe extract file.

Command Line Program Code Pages

The pmcmd and pmrep command line programs require code page compatibility. pmcmd and pmrep use code pages when sending commands in Unicode. Other command line programs do not require code pages.
The code page compatibility for pmcmd and pmrep depends on whether you configured the code page environment variable INFA_CODEPAGENAME for pmcmd or pmrep. You can set this variable for either command line program or for both.
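If you did not set this variable for a command line program, ensure that the code page of the machine hosting pmcmd is a subset of the code page of the PowerCenter Integration Service process, and that the code page of the machine hosting pmrep is a subset of the PowerCenter repository code page.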
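If you set the code page environment variable INFA_CODEPAGENAME for pmcmd or pmrep, ensure that the code page defined in the variable is a subset of the code page of the PowerCenter Integration Service process for pmcmd, and a subset of the PowerCenter repository code page for pmrep.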
If the code pages are not compatible, the PowerCenter Integration Service process may not fetch the workflow, session, or task from the PowerCenter repository.
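When troubleshooting, the first step is to determine which code page the compatibility requirements apply to. The following Python sketch assumes that a value set in INFA_CODEPAGENAME takes the place of the operating system code page for the command line program, which is an interpretation of the rules above and not a documented algorithm. The function name and the code page strings in the comments are hypothetical placeholders, not verified Informatica code page names.

```python
import os
from typing import Mapping


def command_line_code_page(env: Mapping[str, str], os_code_page: str) -> str:
    """Return the code page to check against the compatibility requirements.

    Assumption: a non-empty INFA_CODEPAGENAME replaces the operating system
    code page of the machine hosting pmcmd or pmrep.
    """
    configured = env.get("INFA_CODEPAGENAME", "").strip()
    return configured if configured else os_code_page


# Placeholder value for the operating system code page of this machine.
# "ISO-8859-1" is illustrative only; substitute the real name for your system.
print(command_line_code_page(os.environ, "ISO-8859-1"))
```

Whichever code page the function reports is the one to compare against the PowerCenter Integration Service process (for pmcmd) or the PowerCenter repository (for pmrep).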

Code Page Compatibility Summary

The following image shows code page compatibility in the Informatica environment:
The following table summarizes code page compatibility between sources, targets, repositories, the Informatica Administrator, PowerCenter Client, and Integration Service process:
| Component Code Page | Code Page Compatibility |
| --- | --- |
| Source (including relational, flat file, and XML file) | Subset of target. Subset of lookup data. Subset of stored procedures. Subset of External Procedure or Custom transformation procedure code page. |
| Target (including relational, XML files, and flat files) | Superset of source. Superset of lookup data. Superset of stored procedures. Superset of External Procedure or Custom transformation procedure code page. Integration Service process creates external loader data and control files using the target flat file code page. |
| Lookup and stored procedure database | Subset of target. Superset of source. |
| External Procedure and Custom transformation procedures | Subset of target. Superset of source. |
| Domain Configuration Database | Compatible with the PowerCenter Repository Service. Compatible with the Metadata Manager repository. |
| PowerCenter Integration Service process | Compatible with its operating system. Subset of the PowerCenter repository. Subset of the Metadata Manager repository. Superset of the machine hosting pmcmd. Identical to other nodes running the PowerCenter Integration Service processes. |
| PowerCenter repository | Compatible with the domain configuration database. Superset of PowerCenter Client. Superset of the nodes running the PowerCenter Integration Service process. Superset of the Metadata Manager repository. A global PowerCenter repository code page must be a subset of a local PowerCenter repository. |
| PowerCenter Client | Subset of the PowerCenter repository. |
| Machine running pmcmd | Subset of the PowerCenter Integration Service process. |
| Machine running pmrep | Subset of the PowerCenter repository. |
| Administrator Tool | Subset of the PowerCenter repository. Subset of the Metadata Manager repository. |
| Metadata Manager repository | Compatible with the domain configuration database. Subset of the PowerCenter repository. Superset of the Administrator tool. Superset of the PowerCenter Integration Service process. |