Globalization Overview

Informatica can process data in different languages. Some languages require single-byte data, while other languages require multibyte data. To process data correctly in Informatica, you must set up the following items:
To ensure data passes accurately through your environment, the following components must work together:
You can configure the PowerCenter Integration Service for relaxed code page validation. Relaxed validation removes restrictions on source and target code pages.

Unicode

The Unicode Standard is the work of the Unicode Consortium, an international body that promotes the interchange of data in all languages. The Unicode Standard is designed to support any language, no matter how many bytes each character in that language may require. Currently, it supports all common languages and provides limited support for other less common languages. The Unicode Consortium is continually enhancing the Unicode Standard with new character encodings. For more information about the Unicode Standard, see http://www.unicode.org.
The Unicode Standard includes multiple character sets. Informatica uses the following Unicode standards:
Informatica is a Unicode application. The PowerCenter Client, PowerCenter Integration Service, and Data Integration Service use UCS-2 internally. The PowerCenter Client converts user input from any language to UCS-2 and converts it from UCS-2 before writing to the PowerCenter repository. The PowerCenter Integration Service and Data Integration Service convert source data to UCS-2 before processing and convert it from UCS-2 after processing. The PowerCenter repository, Model repository, PowerCenter Integration Service, and Data Integration Service support UTF-8. You can use Informatica to process data in any language.
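The conversion path described above can be sketched in Python. This is only an illustration of the idea, not Informatica code: the function names `to_internal` and `from_internal` are hypothetical, the Shift-JIS source and UTF-8 target are arbitrary choices, and Python's `utf-16-le` codec stands in for UCS-2 (the two produce the same bytes for characters in the Basic Multilingual Plane).

```python
def to_internal(source_bytes: bytes, source_code_page: str) -> bytes:
    """Decode bytes from the source code page and re-encode them as UTF-16LE.

    UTF-16LE is used here as a stand-in for UCS-2 (an assumption of this sketch).
    """
    return source_bytes.decode(source_code_page).encode("utf-16-le")


def from_internal(internal_bytes: bytes, target_code_page: str) -> bytes:
    """Convert the internal representation to the target code page."""
    return internal_bytes.decode("utf-16-le").encode(target_code_page)


# Three Japanese characters stored in a multibyte source code page.
source = "日本語".encode("shift_jis")
internal = to_internal(source, "shift_jis")
target = from_internal(internal, "utf-8")

print(len(source), len(internal), len(target))
```

The same three characters occupy a different number of bytes in each encoding, which is why the source, internal, and target code pages must each be identified correctly for data to pass through unchanged.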

Working with a Unicode PowerCenter Repository

The PowerCenter repository code page is the code page of the data in the PowerCenter repository. You choose the PowerCenter repository code page when you create or upgrade a PowerCenter repository. When the PowerCenter repository database code page is UTF-8, you can create a PowerCenter repository using the UTF-8 code page.
The domain configuration database uses the UTF-8 code page. If you need to store metadata in multiple languages, such as Chinese, Japanese, and Arabic, you must use the UTF-8 code page for all services in that domain.
The Service Manager synchronizes the list of users in the domain with the list of users and groups in each application service. If a user name in the domain contains characters that the code page of an application service does not recognize, those characters do not convert correctly and inconsistencies occur.
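As a rough illustration of why such inconsistencies occur, the following Python sketch checks which characters in a user name cannot be represented in a given code page. The helper `unrepresentable_chars` and the sample names are hypothetical, and Python codec names are used in place of Informatica code page names.

```python
from typing import List


def unrepresentable_chars(name: str, code_page: str) -> List[str]:
    """Return the characters in name that the code page cannot encode."""
    missing = []
    for ch in name:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# A Latin-1 code page has no Japanese characters, so every character is reported.
print(unrepresentable_chars("田中", "latin-1"))

# UTF-8 can encode any Unicode character, so nothing is reported.
print(unrepresentable_chars("田中", "utf-8"))
```

A check of this kind returns an empty list for any name when the code page is UTF-8, which is consistent with the recommendation to use UTF-8 when metadata spans multiple languages.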
Use the following guidelines when you use UTF-8 as the PowerCenter repository code page: