Directories for Data Integration Service Files

The Data Integration Service accesses file directories when it reads source files, reads control files, writes output files, and writes log files.

When the Data Integration Service runs on multiple nodes, you might need to configure some of the directory properties to use a single shared directory to ensure that the processes running on each node can access all files.

When the Data Integration Service uses operating system profiles, the operating system user specified in the profile must have access to the directories that the Data Integration Service accesses at run time.

Source and Output File Directories

Configure the directories for source and output files in the Execution Options on the Properties view for the Data Integration Service.

The Data Integration Service accesses source files when it runs a mapping or web service operation mapping that reads from a flat file source. The service generates output files when it runs mappings, mappings included in a workflow, profiles, SQL queries to an SQL data service, or web service operation requests. Based on transformation cache settings and target types, the Data Integration Service can generate cache, reject, target, and temporary files.

When you configure directories for the source and output files, you configure the paths for the home directory and its subdirectories. The default value of the Home Directory property is <Informatica installation directory>/tomcat/bin. If you change the default value, verify that the directory exists.

By default, the following directories have values relative to the home directory:

•Temporary directories
•Cache directory
•Source directory
•Target directory
•Rejected files directory

You can define a different directory relative to the home directory, or you can define an absolute directory outside the home directory.

If you define a different absolute directory, use the correct syntax for the operating system:

•On Windows, enter an absolute path beginning with a drive letter, colon, and backslash. For example:

C:\<Informatica installation directory>\tomcat\bin\MyHomeDir

•On UNIX, enter an absolute path beginning with a slash. For example:

/<Informatica installation directory>/tomcat/bin/MyHomeDir
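The relative and absolute path rules above can be illustrated with a minimal Python sketch. It is not part of the product. The function name `resolve_directory` and the sample paths are hypothetical, and the sketch assumes that a relative value is joined to the home directory while an absolute value is used as entered. It uses the pure-path classes from the standard `pathlib` module, so it gives the same result on any platform.

```python
from pathlib import PurePosixPath, PureWindowsPath


def resolve_directory(home_dir: str, value: str, windows: bool) -> str:
    """Hypothetical helper: resolve a directory value against the home directory.

    An absolute value is returned as entered. A relative value is joined
    to the home directory. This mirrors the rules described in the text,
    not the product's internal logic.
    """
    path_type = PureWindowsPath if windows else PurePosixPath
    candidate = path_type(value)
    if candidate.is_absolute():
        # Windows: drive letter, colon, and backslash. UNIX: leading slash.
        return str(candidate)
    return str(path_type(home_dir) / candidate)


# Sample values are assumptions chosen for illustration only.
print(resolve_directory("/opt/infa/tomcat/bin", "cache", windows=False))
print(resolve_directory("/opt/infa/tomcat/bin", "/data/cache", windows=False))
print(resolve_directory(r"C:\infa\tomcat\bin", "cache", windows=True))
```

Note that `pathlib` treats a Windows path as absolute only when it includes both a drive letter and a root, which matches the drive letter, colon, and backslash rule above.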

Data objects and transformations in the Developer tool use system parameters to access the values of these Data Integration Service directories. By default, the system parameters are assigned to flat file directory, cache file directory, and temporary file directory fields.

For example, when a developer creates an Aggregator transformation in the Developer tool, the CacheDir system parameter is the default value assigned to the cache directory field. The value of the CacheDir system parameter is defined in the Cache Directory property for the Data Integration Service. Developers can remove the default system parameter and enter a different value for the cache directory. However, jobs fail to run if the Data Integration Service cannot access the directory.

Configure Source and Output File Directories for Multiple Nodes

When the Data Integration Service runs on primary and back-up nodes or on a grid, DTM instances can run jobs on each node with the compute role. Each DTM instance must be able to access the source and output file directories. To run mappings that manage metadata changes in flat file sources, each Data Integration Service process must be able to access the source file directories.

When you configure the source and output file directories for a Data Integration Service that runs on multiple nodes, consider the following guidelines:

•You can configure the Source Directory property to use a shared directory so that all nodes read source files from a single location.

If you run mappings that manage metadata changes in flat file sources and if the Data Integration Service grid is configured to run jobs in separate remote processes, you must configure the Source Directory property to use a shared directory.

If you run other types of mappings or if you run mappings that manage metadata changes in flat file sources on any other Data Integration Service grid configuration, you can configure different source directories for each node with the compute role. In that case, replicate all source files to every source directory.

•If you run mappings that use a persistent lookup cache, you must configure the Cache Directory property to use a shared directory. If no mappings use a persistent lookup cache, you can configure a different cache directory for each node with the compute role.
•You can configure different directories for each node with the compute role in the Target Directory, Temporary Directories, and Reject File Directory properties.

To configure a shared directory, configure the directory in the Execution Options on the Properties view. You can configure a shared directory for the home directory so that all source and output file directories use the same shared home directory, or you can configure a shared directory for a specific source or output file directory. In either case, remove any overridden values for the same execution option on the Compute view.

To configure different directories for each node with the compute role, configure the directory in the Execution Options on the Compute view.
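The guidelines in this section can be summarized as a small decision function. The following Python sketch is illustrative only: the `GridUsage` class, its field names, and the function name are hypothetical and do not correspond to any product API. It encodes only the rules stated above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class GridUsage:
    """Hypothetical description of how a multi-node service is used."""
    manages_flat_file_metadata_changes: bool
    runs_jobs_in_separate_remote_processes: bool
    uses_persistent_lookup_cache: bool


def directories_requiring_shared_storage(usage: GridUsage) -> set[str]:
    """Return the directory properties that must use a shared directory."""
    required: set[str] = set()
    if (usage.manages_flat_file_metadata_changes
            and usage.runs_jobs_in_separate_remote_processes):
        required.add("Source Directory")
    if usage.uses_persistent_lookup_cache:
        required.add("Cache Directory")
    # Target Directory, Temporary Directories, and Reject File Directory
    # can differ for each node with the compute role, so they are never required.
    return required


usage = GridUsage(
    manages_flat_file_metadata_changes=True,
    runs_jobs_in_separate_remote_processes=True,
    uses_persistent_lookup_cache=False,
)
print(directories_requiring_shared_storage(usage))
```

A property that is not in the returned set can still use a shared directory. The function reports only where a shared directory is mandatory.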

Control File Directories

The Data Integration Service accesses control files when it runs mappings that generate columns for flat file sources based on control files. When the Data Integration Service runs the mapping, it fetches metadata from the control file of the flat file source.

Use the Developer tool to configure the control file directory for each flat file data object that is configured to generate run-time column names from a control file. You cannot use the Administrator tool to configure a single control file directory used by the Data Integration Service.

Configure Control File Directories for Multiple Nodes

When the Data Integration Service runs on primary and back-up nodes or on a grid, Data Integration Service processes can run on each node with the service role. Each Data Integration Service process must be able to access the control file directories.

Use the Developer tool to configure the Control File Directory property for each flat file data object that is configured to generate run-time column names from a control file. Configure the Control File Directory property in the Advanced properties for the flat file data object. Find the property in the Runtime: Read section.

When the Data Integration Service runs on multiple nodes, use one of the following methods to ensure that each Data Integration Service process can access the directories:

•Configure the Control File Directory property for each flat file data object to use a shared directory so that all control files reside in a single location.
•Configure the Control File Directory property for each flat file data object to use an identical directory path that is local to each node with the service role. Replicate all control files in the identical directory on each node with the service role.
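If you choose the second method, the replicated directories must stay in sync. The following Python sketch shows one way to spot a file that is missing from a replica. It is an illustration under stated assumptions, not a product utility: the function name is hypothetical, and the sketch assumes that the directory for each node is reachable from the machine that runs the script, for example through a mount. It compares file names only, not file contents.

```python
from __future__ import annotations

from pathlib import Path


def missing_files_by_directory(directories: list[str]) -> dict[str, set[str]]:
    """Report, for each directory, the file names found elsewhere but not here.

    Directories that already contain every file name are omitted
    from the result, so an empty result means the replicas match by name.
    """
    names = {
        d: {p.name for p in Path(d).iterdir() if p.is_file()}
        for d in directories
    }
    all_names: set[str] = set().union(*names.values())
    return {
        d: all_names - found
        for d, found in names.items()
        if all_names - found
    }
```

For example, if one replica contains a control file that another replica lacks, the result maps the second directory to the set that contains the missing file name.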

Log Directory

Configure the directory for log files on the Processes view for the Data Integration Service. Data Integration Service log files include files that contain service log events and files that contain job log events.

By default, the log directory for each Data Integration Service process is within the Informatica installation directory on the node.

Configure the Log Directory for Multiple Nodes

When the Data Integration Service runs on primary and back-up nodes or on a grid, a Data Integration Service process can run on each node with the service role. Configure each service process to use the same shared directory for log files.

A shared log directory ensures that if the master service process fails over to another node, the new master service process can access previous log files.

Configure each service process with identical absolute paths to the shared directories. If you use a mapped or mounted drive, the absolute path to the shared location must also be identical.

For example, a newly elected master service process cannot access previous log files when nodes use the following drives for the log directory:

•Mapped drive on node1: F:\shared\<Informatica installation directory>\logs\<node_name>\services\DataIntegrationService\disLogs
•Mapped drive on node2: G:\shared\<Informatica installation directory>\logs\<node_name>\services\DataIntegrationService\disLogs

A newly elected master service process also cannot access previous log files when nodes use the following drives for the log directory:

•Mounted drive on node1: /mnt/shared/<Informatica installation directory>/logs/<node_name>/services/DataIntegrationService/disLogs
•Mounted drive on node2: /mnt/shared_filesystem/<Informatica installation directory>/logs/<node_name>/services/DataIntegrationService/disLogs
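The identical-path requirement can be checked before failover occurs. The following Python sketch is illustrative only: the function name, the node names, and the shortened sample paths are hypothetical. It groups nodes by the exact log directory string configured for each service process. More than one group means that the paths differ.

```python
from __future__ import annotations

from collections import defaultdict


def group_nodes_by_log_path(log_dirs: dict[str, str]) -> dict[str, list[str]]:
    """Group node names by their configured log directory path.

    The comparison is an exact string match, because the text requires
    the absolute paths to be identical on every node.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for node, path in log_dirs.items():
        groups[path].append(node)
    return dict(groups)


# Shortened sample paths, modeled on the mapped drive example above.
configured = {
    "node1": r"F:\shared\logs\disLogs",
    "node2": r"G:\shared\logs\disLogs",
}
groups = group_nodes_by_log_path(configured)
if len(groups) > 1:
    print("Log directory paths differ:", groups)
```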

Output and Log File Permissions

When a Data Integration Service process generates output or log files, it sets file permissions based on the operating system.

When a Data Integration Service process on UNIX generates an output or log file, it sets the file permissions according to the umask of the shell that starts the Data Integration Service process. For example, when the umask of the shell that starts the Data Integration Service process is 022, the Data Integration Service process creates files with rw-r--r-- permissions. To change the file permissions, you must change the umask of the shell that starts the Data Integration Service process and then restart the Data Integration Service process.
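The relationship between the umask and the resulting permissions can be worked through in a few lines. The following Python sketch is illustrative only, and the function name is hypothetical. It assumes that the process requests mode 666 when it creates a regular file, which is the usual default on UNIX and is consistent with the 022 example above. The umask clears the bits that it names from the requested mode.

```python
import stat


def file_permissions_for_umask(umask: int) -> str:
    """Return the permission string for a new regular file under a umask.

    Assumes the file is created with a requested mode of 0o666.
    """
    mode = 0o666 & ~umask
    # stat.filemode prefixes the file type character, so drop it.
    return stat.filemode(stat.S_IFREG | mode)[1:]


print(file_permissions_for_umask(0o022))  # rw-r--r--
print(file_permissions_for_umask(0o027))  # rw-r-----
```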

A Data Integration Service process on Windows generates output and log files with read and write permissions.