Data Integration Service Properties

To view the Data Integration Service properties, select the service in the Domain Navigator and click the Properties view. You can change the properties while the service is running, but you must restart the service for the properties to take effect.

General Properties

The general properties of a Data Integration Service include the name, license, and node assignment.
The following table describes the general properties for the service:
General Property
Description
Name
Name of the service. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters:
` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [
You cannot change the name of the service after you create it.
Description
Description of the service. The description cannot exceed 765 characters.
License
License object that allows use of the service.
Assign
Node or grid on which the Data Integration Service runs.
Node
Node on which the service runs.
Grid
Name of the grid on which the Data Integration Service runs if the service runs on a grid. Click the grid name to view the grid configuration.
Backup Nodes
If your license includes high availability, nodes on which the service can run if the primary node is unavailable.
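
The naming rules for the Name property can be checked before you create a service. The following Python sketch is illustrative only and is not part of any Informatica tool. The function name validate_service_name and the existing_names argument are hypothetical, and the check assumes the forbidden character list in the table above is complete.

```python
# Characters the Name property disallows, copied from the table above.
FORBIDDEN_CHARS = set("`~%^*+={}\\;:'\"/?.,<>|!()][")


def validate_service_name(name, existing_names=()):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > 128:
        problems.append("name exceeds 128 characters")
    if name.startswith("@"):
        problems.append("name begins with @")
    if " " in name:
        problems.append("name contains a space")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("name contains forbidden characters: " + " ".join(bad))
    # Names are not case sensitive, so uniqueness is compared case-insensitively.
    if name.lower() in {n.lower() for n in existing_names}:
        problems.append("name is not unique within the domain")
    return problems


print(validate_service_name("DIS_Prod"))
print(validate_service_name("dis_prod", existing_names=["DIS_PROD"]))
```

Because the name cannot be changed after the service is created, a check like this is most useful before the service exists.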

Model Repository Properties

The following table describes the Model repository properties for the Data Integration Service:
Property
Description
Model Repository Service
Service that stores run-time metadata required to run mappings and SQL data services.
User Name
User name to access the Model repository. The user must have the Create Project privilege for the Model Repository Service.
Not available for a domain with Kerberos authentication.
Password
User password to access the Model repository.
Not available for a domain with Kerberos authentication.

Execution Options

The following table describes the execution options for the Data Integration Service:
Property
Description
Use Operating System Profiles and Impersonation
Runs mappings, workflows, and profiling jobs with operating system profiles.
In a Hadoop environment, the Data Integration Service uses the Hadoop impersonation user to run mappings, workflows, and profiling jobs.
You can select this option if the Data Integration Service runs on UNIX or Linux. To apply changes, restart the Data Integration Service.
Launch Job Options
Runs jobs in the Data Integration Service process, in separate DTM processes on the local node, or in separate DTM processes on remote nodes. Configure the property based on whether the Data Integration Service runs on a single node or a grid and based on the types of jobs that the service runs.
Choose one of the following options:
  • - In the service process. Configure when you run SQL data service and web service jobs on a single node or on a grid where each node has both the service and compute roles.
  • - In separate local processes. Configure when you run mapping, profile, and workflow jobs on a single node or on a grid where each node has both the service and compute roles.
  • - In separate remote processes. Configure when you run mapping, profile, and workflow jobs on a grid where nodes have a different combination of roles. If you choose this option when the Data Integration Service runs on a single node, then the service runs jobs in separate local processes.
Default is in separate local processes.
If the Data Integration Service uses operating system profiles, configure the service to run jobs in separate local processes.
Note: If the Data Integration Service runs on UNIX and is configured to run jobs in separate local or remote processes, verify that the host file on each node with the compute role contains a localhost entry. Otherwise, jobs that run in separate processes fail.
Maximum Execution Pool Size
Maximum number of jobs that each Data Integration Service process can run concurrently. Jobs include data previews, mappings, profiling jobs, SQL queries, and web service requests. For example, a Data Integration Service grid includes three running service processes. If you set the value to 10, each Data Integration Service process can run up to 10 jobs concurrently. A total of 30 jobs can run concurrently on the grid. Default is 10.
Maximum Memory Size
Maximum amount of memory, in bytes, that the Data Integration Service can allocate for running all requests concurrently when the service runs jobs in the Data Integration Service process. When the Data Integration Service runs jobs in separate local or remote processes, the service ignores this value. If you do not want to limit the amount of memory the Data Integration Service can allocate, set this property to 0.
If the value is greater than 0, the Data Integration Service uses the property to calculate the maximum total memory allowed for running all requests concurrently. The Data Integration Service calculates the maximum total memory as follows:
Maximum Memory Size + Maximum Heap Size + memory required for loading program components
Default is 0.
Note: If you run profiles or data quality mappings, set this property to 0.
Maximum Parallelism
Maximum number of parallel threads that process a single mapping pipeline stage.
When you set the value greater than 1, the Data Integration Service enables partitioning for mappings, column profiling, and data domain discovery. The service dynamically scales the number of partitions for a mapping pipeline at run time. Increase the value based on the number of CPUs available on the nodes where jobs run.
In the Developer tool, developers can change the maximum parallelism value for each mapping. When maximum parallelism is set for both the Data Integration Service and the mapping, the Data Integration Service uses the minimum value when it runs the mapping.
Default is 1. Maximum is 64.
Note: Developers cannot change the maximum parallelism value for each profile. When the Data Integration Service converts a profile job into one or more mappings, the mappings always use Auto for the mapping maximum parallelism.
Hadoop Kerberos Service Principal Name
Service Principal Name (SPN) of the Data Integration Service to connect to a Hadoop cluster that uses Kerberos authentication.
Hadoop Kerberos Keytab
The file path to the Kerberos keytab file on the machine on which the Data Integration Service runs.
Temporary Directories
Directory for temporary files created when jobs are run. Default is <home directory>/disTemp.
Enter a list of directories separated by semicolons to optimize performance during profile operations and during cache partitioning for Sorter transformations.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Home Directory
Root directory accessible by the node. This is the root directory for other service directories. Default is <Informatica installation directory>/tomcat/bin. If you change the default value, verify that the directory exists.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Cache Directory
Directory for index and data cache files for transformations. Default is <home directory>/cache.
Enter a list of directories separated by semicolons to increase performance during cache partitioning for Aggregator, Joiner, or Rank transformations.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Source Directory
Directory for source flat files used in a mapping. Default is <home directory>/source.
If the Data Integration Service runs on a grid, you can use a shared directory to create one directory for source files. If you configure a different directory for each node with the compute role, ensure that the source files are consistent among all source directories.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Target Directory
Default directory for target flat files used in a mapping. Default is <home directory>/target.
Enter a list of directories separated by semicolons to increase performance when multiple partitions write to the flat file target.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Rejected Files Directory
Directory for reject files. Reject files contain rows that were rejected when running a mapping. Default is <home directory>/reject.
You cannot use the following characters in the directory path:
* ? < > " | , [ ]
Informatica Home Directory on Hadoop
The PowerCenter Big Data Edition home directory on every data node created by the Hadoop RPM install. Type /<PowerCenterBigDataEditionInstallationDirectory>/Informatica.
Hadoop Distribution Directory
The directory containing a collection of Hive and Hadoop JARS on the cluster from the RPM Install locations. The directory contains the minimum set of JARS required to process Informatica mappings in a Hadoop environment. Type /<PowerCenterBigDataEditionInstallationDirectory>/Informatica/services/shared/hadoop/[Hadoop_distribution_name].
Data Integration Service Hadoop Distribution Directory
The Hadoop distribution directory on the Data Integration Service node. The contents of the Data Integration Service Hadoop distribution directory must be identical to the Hadoop distribution directory on the data nodes. Type <Informatica Installation directory>/Informatica/services/shared/hadoop/[Hadoop_distribution_name].
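
The arithmetic behind Maximum Execution Pool Size, Maximum Memory Size, and Maximum Parallelism can be sketched in a few lines of Python. This is a sketch for capacity planning under stated assumptions, not an Informatica API. All function and parameter names are hypothetical. The heap size and the memory for program components are values you must supply from your own environment, and the parallelism helper handles integer values only, not the Auto setting.

```python
def grid_job_capacity(running_service_processes, max_execution_pool_size=10):
    """Total jobs that can run concurrently across all service processes."""
    return running_service_processes * max_execution_pool_size


def max_total_memory(max_memory_size, max_heap_size, program_components_memory):
    """Maximum total memory, in bytes, for running all requests concurrently.

    Applies only when jobs run in the Data Integration Service process.
    Returns None when Maximum Memory Size is 0, which means no limit.
    """
    if max_memory_size == 0:
        return None
    return max_memory_size + max_heap_size + program_components_memory


def effective_parallelism(service_max_parallelism, mapping_max_parallelism):
    """The service uses the smaller of the service and mapping values."""
    return min(service_max_parallelism, mapping_max_parallelism)


# Three service processes on a grid, each with the default pool size of 10.
print(grid_job_capacity(3))
# Service allows 4 threads, mapping asks for 8: the mapping runs with 4.
print(effective_parallelism(4, 8))
```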

Logical Data Object/Virtual Table Cache Properties

The following table describes the data object and virtual table cache properties:
Property
Description
Cache Removal Time
The number of milliseconds that the Data Integration Service waits before cleaning up cache storage after a refresh. Default is 3,600,000.
Cache Connection
The database connection name for the database that stores the data object cache. Select a valid connection object name.
Maximum Concurrent Refresh Requests
Maximum number of cache refreshes that can occur at the same time. Limit the concurrent cache refreshes to maintain system resources.
Enable Nested LDO Cache
Indicates that the Data Integration Service can use cache data for a logical data object used as a source or a lookup in another logical data object during a cache refresh. If false, the Data Integration Service accesses the source resources even if you enabled caching for the logical data object used as a source or a lookup.
For example, logical data object LDO3 joins data from logical data objects LDO1 and LDO2. A developer creates a mapping that uses LDO3 as the input and includes the mapping in an application. You enable caching for LDO1, LDO2, and LDO3. If you enable nested logical data object caching, the Data Integration Service uses cache data for LDO1 and LDO2 when it refreshes the cache table for LDO3. If you do not enable nested logical data object caching, the Data Integration Service accesses the source resources for LDO1 and LDO2 when it refreshes the cache table for LDO3.
Default is False.

Logging Properties

The following table describes the log level properties:
Property
Description
Log Level
Configure the Log Level property to set the logging level. The following values are valid:
  • - Fatal. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that cause the service to shut down or become unavailable.
  • - Error. Writes FATAL and ERROR messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
  • - Warning. Writes FATAL, WARNING, and ERROR messages to the log. WARNING messages include recoverable system failures or warnings.
  • - Info. Writes FATAL, INFO, WARNING, and ERROR messages to the log. INFO messages include system and service change messages.
  • - Trace. Writes FATAL, TRACE, INFO, WARNING, and ERROR messages to the log. TRACE messages log user request failures.
  • - Debug. Writes FATAL, DEBUG, TRACE, INFO, WARNING, and ERROR messages to the log. DEBUG messages are user request logs.
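
The log levels are cumulative: each level writes its own message type plus every type written by the levels listed before it. The following Python sketch models that relationship. It only illustrates the list above and does not read or change any service setting; the names LEVELS, messages_written, and is_logged are hypothetical.

```python
# Log levels from least to most verbose, in the order listed above.
LEVELS = ["Fatal", "Error", "Warning", "Info", "Trace", "Debug"]


def messages_written(log_level):
    """Return the message types written to the log at the given Log Level."""
    index = LEVELS.index(log_level)
    return [level.upper() for level in LEVELS[: index + 1]]


def is_logged(message_type, log_level):
    """True if a message of the given type is written at the given Log Level."""
    return message_type.upper() in messages_written(log_level)


print(messages_written("Warning"))
print(is_logged("DEBUG", "Info"))
```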

Deployment Options

The following table describes the deployment options for the Data Integration Service:
Property
Description
Default Deployment Mode
Determines whether to enable and start each application after you deploy it to a Data Integration Service. The Default Deployment Mode property affects applications that you deploy from the Developer tool, command line, and Administrator tool.
Choose one of the following options:
  • - Enable and Start. Enable the application and start the application.
  • - Enable Only. Enable the application but do not start the application.
  • - Disable. Do not enable the application.

Pass-through Security Properties

The following table describes the pass-through security properties:
Property
Description
Allow Caching
Allows data object caching for all pass-through connections in the Data Integration Service. Populates the data object cache using the credentials from the connection object.
Note: When you enable data object caching with pass-through security, you might allow users access to data in the cache database that they might not have in an uncached environment.

Modules

By default, all Data Integration Service modules are enabled. You can disable some of the modules.
You might want to disable a module if you are testing and you have limited resources on the computer. You can save memory by limiting the Data Integration Service functionality. Before you disable a module, you must disable the Data Integration Service.
The following table describes the Data Integration Service modules:
Module
Description
Web Service Module
Runs web service operation mappings.
Mapping Service Module
Runs mappings and previews.
Profiling Service Module
Runs profiles and generates scorecards.
SQL Service Module
Runs SQL queries from a third-party client tool to an SQL data service.
Workflow Orchestration Service Module
Runs workflows.

HTTP Proxy Server Properties

The following table describes the HTTP proxy server properties:
Property
Description
HTTP Proxy Server Host
Name of the HTTP proxy server.
HTTP Proxy Server Port
Port number of the HTTP proxy server.
Default is 8080.
HTTP Proxy Server User
Authenticated user name for the HTTP proxy server. This is required if the proxy server requires authentication.
HTTP Proxy Server Password
Password for the authenticated user. The Service Manager encrypts the password. This is required if the proxy server requires authentication.
HTTP Proxy Server Domain
Domain for authentication.

HTTP Configuration Properties

The following table describes the HTTP configuration properties:
Property
Description
Allowed IP Addresses
List of constants or Java regular expression patterns compared to the IP address of the requesting machine. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from IP addresses that match the allowed address pattern. If you do not configure this property, the Data Integration Service uses the Denied IP Addresses property to determine which clients can send requests.
Allowed Host Names
List of constants or Java regular expression patterns compared to the host name of the requesting machine. The host names are case sensitive. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from host names that match the allowed host name pattern. If you do not configure this property, the Data Integration Service uses the Denied Host Names property to determine which clients can send requests.
Denied IP Addresses
List of constants or Java regular expression patterns compared to the IP address of the requesting machine. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from IP addresses that do not match the denied IP address pattern. If you do not configure this property, the Data Integration Service uses the Allowed IP Addresses property to determine which clients can send requests.
Denied Host Names
List of constants or Java regular expression patterns compared to the host name of the requesting machine. The host names are case sensitive. Use a space to separate multiple constants or expressions.
If you configure this property, the Data Integration Service accepts requests from host names that do not match the denied host name pattern. If you do not configure this property, the Data Integration Service uses the Allowed Host Names property to determine which clients can send requests.
HTTP Protocol Type
Security protocol that the Data Integration Service uses. Select one of the following values:
  • - HTTP. Requests to the service must use an HTTP URL.
  • - HTTPS. Requests to the service must use an HTTPS URL.
  • - HTTP&HTTPS. Requests to the service can use either an HTTP or an HTTPS URL.
When you set the HTTP protocol type to HTTPS or HTTP&HTTPS, you enable Transport Layer Security (TLS) for the service.
You can also enable TLS for each web service deployed to an application. When you enable HTTPS for the Data Integration Service and enable TLS for the web service, the web service uses an HTTPS URL. When you enable HTTPS for the Data Integration Service and do not enable TLS for the web service, the web service can use an HTTP URL or an HTTPS URL. If you enable TLS for a web service and do not enable HTTPS for the Data Integration Service, the web service does not start.
Default is HTTP.
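
The interaction between the allowed and denied IP address lists can be sketched as follows. This Python sketch rests on several assumptions that the table does not state: Python's re module stands in for Java regular expressions, each pattern must match the whole address, a request is accepted when neither list is configured, and when both lists are configured the address must match an allowed pattern and must not match a denied pattern. The function names are hypothetical. The same logic would apply to the host name properties.

```python
import re


def _matches_any(patterns, value):
    """True if value fully matches any space-separated constant or pattern."""
    # A constant is treated as a pattern, so an unescaped dot matches any character.
    return any(re.fullmatch(p, value) for p in patterns.split())


def accepts_request(ip_address, allowed="", denied=""):
    """Decide whether a request from ip_address would be accepted (assumed logic)."""
    if allowed and not _matches_any(allowed, ip_address):
        return False
    if denied and _matches_any(denied, ip_address):
        return False
    return True


# Hypothetical patterns: allow one subnet plus a single host.
print(accepts_request("10.1.2.3", allowed=r"10\.1\..* 192\.168\.0\.5"))
print(accepts_request("172.16.0.1", allowed=r"10\.1\..*"))
```

Verify the actual matching behavior in a test environment before you rely on a pattern in production.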

Result Set Cache Properties

The following table describes the result set cache properties:
Property
Description
File Name Prefix
The prefix for the names of all result set cache files stored on disk. Default is RSCACHE.
Enable Encryption
Indicates whether result set cache files are encrypted using 128-bit AES encryption. Valid values are true or false. Default is true.

Mapping Service Properties

The following table describes Mapping Service Module properties for the Data Integration Service:
Property
Description
Maximum Notification Thread Pool Size
Maximum number of concurrent job completion notifications that the Mapping Service Module sends to external clients after the Data Integration Service completes jobs. The Mapping Service Module is a component in the Data Integration Service that manages requests sent to run mappings. Default is 5.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
  • - The service runs jobs in separate local or remote processes, or the service property Maximum Memory Size is 0 (default).
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to all transformations that use auto cache mode in a single request. The service allocates memory separately to transformations that have a specific cache size. The total memory used by the request can exceed the value of Maximum Memory Per Request.
  • - The service runs jobs in the Data Integration Service process, and the service property Maximum Memory Size is greater than 0.
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to a single request. The total memory used by the request cannot exceed the value of Maximum Memory Per Request.
Default is 536,870,912.
Requests include mappings and mappings run from Mapping tasks within a workflow.
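
The two behaviors of Maximum Memory Per Request depend on the Launch Job Options setting and on Maximum Memory Size. The following Python sketch restates that decision as a function. The labels "service", "local", and "remote" and the returned strings are hypothetical shorthand for this example, not values that the Administrator tool uses.

```python
AUTO_CACHE_ONLY = "limit applies to auto cache transformations; request total can exceed it"
WHOLE_REQUEST = "limit applies to the whole request; request total cannot exceed it"


def memory_per_request_scope(launch_job_option, max_memory_size):
    """Return how Maximum Memory Per Request is applied for a configuration.

    launch_job_option: "service", "local", or "remote" (hypothetical labels for
    in the service process, separate local processes, separate remote processes).
    max_memory_size: value of the Maximum Memory Size property, in bytes.
    """
    if launch_job_option not in ("service", "local", "remote"):
        raise ValueError("unknown launch job option: " + launch_job_option)
    if launch_job_option in ("local", "remote") or max_memory_size == 0:
        return AUTO_CACHE_ONLY
    return WHOLE_REQUEST


print(memory_per_request_scope("local", 0))
print(memory_per_request_scope("service", 2_000_000_000))
```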

Profiling Warehouse Database Properties

The following table describes the profiling warehouse database properties:
Property
Description
Profiling Warehouse Database
The connection to the profiling warehouse.
Select the connection object name.
Maximum Ranks
Number of minimum and maximum values to display for a profile. Default is 5.
Maximum Patterns
Maximum number of patterns to display for a profile. Default is 10.
Maximum Profile Execution Pool Size
Maximum number of threads to run profiling. Default is 10.
Maximum DB Connections
Maximum number of database connections for each profiling job. Default is 5.
Profile Results Export Path
Location where the Data Integration Service exports the profile results file.
If the Data Integration Service and Analyst Service run on different nodes, both services must be able to access this location. Otherwise, the export fails.
Maximum Memory Per Request
Maximum amount of memory, in bytes, that the Data Integration Service can allocate for each mapping run for a single profile request.
Default is 536,870,912.

Advanced Profiling Properties

The following table describes the advanced profiling properties:
Property
Description
Pattern Threshold Percentage
Maximum number of values required to derive a pattern. Default is 5.
Maximum # Value Frequency Pairs
Maximum number of value-frequency pairs to store in the profiling warehouse. Default is 16,000.
Maximum String Length
Maximum length of a string that the Profiling Service can process. Default is 255.
Maximum Numeric Precision
Maximum number of digits for a numeric value. Default is 38.
Maximum Concurrent Profile Jobs
The maximum number of concurrent profile threads used to run a profile on flat files and relational sources. If left blank, the Profiling Service plug-in determines the best number based on the set of running jobs and other environment factors.
Maximum Concurrent Columns
Maximum number of columns that you can combine for profiling flat files in a single execution pool thread. Default is 5.
Maximum Concurrent Profile Threads
The maximum number of concurrent execution pool threads used to run a profile on flat files. Default is 1.
Maximum Column Heap Size
Amount of memory to allow each column for column profiling. Default is 64 megabytes.
Reserved Profile Threads
Number of threads of the Maximum Execution Pool Size that are for priority requests. Default is 1.

SQL Properties

The following table describes the SQL properties:
Property
Description
DTM Keep Alive Time
Number of milliseconds that the DTM instance stays open after it completes the last request. Identical SQL queries can reuse the open instance. Use the keep alive time to increase performance when the time required to process the SQL query is small compared to the initialization time for the DTM instance. If the query fails, the DTM instance terminates.
Must be greater than or equal to 0. 0 means that the Data Integration Service does not keep the DTM instance in memory. Default is 0.
You can also set this property for each SQL data service that is deployed to the Data Integration Service. If you set this property for a deployed SQL data service, the value for the deployed SQL data service overrides the value you set for the Data Integration Service.
Table Storage Connection
Relational database connection that stores temporary tables for SQL data services. By default, no connection is selected.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
  • - The service runs jobs in separate local or remote processes, or the service property Maximum Memory Size is 0 (default).
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to all transformations that use auto cache mode in a single request. The service allocates memory separately to transformations that have a specific cache size. The total memory used by the request can exceed the value of Maximum Memory Per Request.
  • - The service runs jobs in the Data Integration Service process, and the service property Maximum Memory Size is greater than 0.
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to a single request. The total memory used by the request cannot exceed the value of Maximum Memory Per Request.
Default is 50,000,000.
Skip Log Files
Prevents the Data Integration Service from generating log files when the SQL data service request completes successfully and the tracing level is set to INFO or higher. Default is false.
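
The DTM Keep Alive Time override rule can be summarized in a short Python sketch: a value set on a deployed SQL data service takes precedence over the value set on the Data Integration Service. The function and parameter names are hypothetical, and None is used here to mean that no value is set on the deployed SQL data service.

```python
def effective_keep_alive_ms(service_value_ms=0, deployed_value_ms=None):
    """Return the DTM keep alive time, in milliseconds, that applies to a request.

    service_value_ms: value set on the Data Integration Service (default 0).
    deployed_value_ms: value set on the deployed SQL data service, or None
    if no value is set there (an assumption of this sketch).
    """
    value = service_value_ms if deployed_value_ms is None else deployed_value_ms
    if value < 0:
        raise ValueError("DTM Keep Alive Time must be greater than or equal to 0")
    return value


# The service-level value is 0, but the deployed SQL data service sets 3000.
print(effective_keep_alive_ms(0, 3000))
```

A result of 0 means that the Data Integration Service does not keep the DTM instance in memory after the request completes.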

Workflow Orchestration Service Properties

The following table describes the Workflow Orchestration Service properties for the Data Integration Service:
Property
Description
Workflow Connection
The connection name of the database that stores the run-time configuration data for the workflows that the Data Integration Service runs. Select a database on the Connections view.
Create the workflow database contents before you run a workflow. To create the contents, use the Actions menu options for the Data Integration Service in the Administrator tool.
Note: Recycle the Data Integration Service after you configure the workflow database connection and before you create the workflow database contents.

Web Service Properties

The following table describes the web service properties:
Property
Description
DTM Keep Alive Time
Number of milliseconds that the DTM instance stays open after it completes the last request. Web service requests that are issued against the same operation can reuse the open instance. Use the keep alive time to increase performance when the time required to process the request is small compared to the initialization time for the DTM instance. If the request fails, the DTM instance terminates.
Must be greater than or equal to 0. 0 means that the Data Integration Service does not keep the DTM instance in memory. Default is 5000.
You can also set this property for each web service that is deployed to the Data Integration Service. If you set this property for a deployed web service, the value for the deployed web service overrides the value you set for the Data Integration Service.
Logical URL
Prefix for the WSDL URL if you use an external HTTP load balancer. For example,
http://loadbalancer:8080
The Data Integration Service requires an external HTTP load balancer to run a web service on a grid. If you run the Data Integration Service on a single node, you do not need to specify the logical URL.
Maximum Memory Per Request
The behavior of Maximum Memory Per Request depends on the following Data Integration Service configurations:
  • - The service runs jobs in separate local or remote processes, or the service property Maximum Memory Size is 0 (default).
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to all transformations that use auto cache mode in a single request. The service allocates memory separately to transformations that have a specific cache size. The total memory used by the request can exceed the value of Maximum Memory Per Request.
  • - The service runs jobs in the Data Integration Service process, and the service property Maximum Memory Size is greater than 0.
      Maximum Memory Per Request is the maximum amount of memory, in bytes, that the Data Integration Service can allocate to a single request. The total memory used by the request cannot exceed the value of Maximum Memory Per Request.
Default is 50,000,000.
Skip Log Files
Prevents the Data Integration Service from generating log files when the web service request completes successfully and the tracing level is set to INFO or higher. Default is false.

Custom Properties for the Data Integration Service

Configure custom properties that are unique to specific environments.
You might need to apply custom properties in special cases. When you define a custom property, enter the property name and an initial value. Define custom properties only at the request of Informatica Global Customer Support.