
Verify System Requirements

Verify that your environment meets the minimum system requirements for the installation process, temporary disk space, port availability, databases, and application service hardware.
For more information about product requirements and supported platforms, see the Product Availability Matrix on Informatica Network: https://network.informatica.com/community/informatica-network/product-availability-matrices

Verify Temporary Disk Space and Permissions

Verify that your environment meets the minimum system requirements for the temporary disk space.
Disk space for the temporary files
The installer writes temporary files to the hard disk. Verify that you have at least 1 GB of free disk space on the machine to support the installation. When the installation completes, the installer deletes the temporary files and releases the disk space.
Permissions for the temporary files
Verify that you have read, write, and execute permissions on the /tmp directory.
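The checks above can be scripted as a quick shell snippet. The path and the 1 GB threshold come from this section; run it on the machine where you plan to install:

```shell
# Check free space in the temporary directory and confirm read, write,
# and execute permissions on /tmp before you start the installer.
df -h /tmp    # verify that at least 1 GB is free
if [ -r /tmp ] && [ -w /tmp ] && [ -x /tmp ]; then
  echo "/tmp permissions OK"
else
  echo "/tmp permissions MISSING"
fi
```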

Verify the Distribution

Informatica big data products integrate with the Hadoop environment. You must integrate the domain with the Hadoop environment. The integration varies by product, as do the requirements at installation.
The following table lists the supported Hadoop distribution versions for the big data products:

Distribution       Supported Version
Amazon EMR         5.10 *
Azure HDInsight    3.6.x
Cloudera CDH       5.13 (deferred support for versions 5.11.x and 5.12.x)
Hortonworks HDP    2.6.x (deferred support for version 2.5.x)
MapR               6.x MEP 4.0.x *

* Enterprise Data Catalog does not support Amazon EMR or MapR.
* Enterprise Data Lake does not support MapR.
The following list describes the installer dependency on the Hadoop environment for each product:

Informatica domain services *
The Hadoop environment is not required at install time. Integrate the environments after installation.

Enterprise Data Catalog
If you choose to use an external cluster, the Hadoop environment is required at install time. If you choose to use an embedded cluster, the Hadoop environment is not required at install time.

Enterprise Data Lake
The Hadoop environment is required at install time if you want to create and enable the Data Preparation Service and the Enterprise Data Lake Service when you run the installer. You complete the environment integration after installation.

* The Informatica domain services installation includes the following big data products: Big Data Management, Big Data Parser, Big Data Quality, and Big Data Streaming.
In each release, Informatica adds, defers, and drops support for Hadoop distribution versions. Informatica might reinstate support for deferred versions in a future release. To see a list of the latest supported versions, see the Product Availability Matrix on the Informatica Customer Portal: https://network.informatica.com/community/informatica-network/product-availability-matrices.

Verify Sizing Requirements

Allocate resources for installation and deployment of services based on the expected deployment type of your environment.
Before you allocate resources, you need to identify the deployment type based on your requirements for the volume of processing and the level of concurrency. Based on the deployment type, you can allocate resources for disk space, cores, and RAM. You can also choose to tune services when you run the installer.

Determine the Installation and Service Deployment Type

The following list describes the environment for each deployment type:
  • Sandbox. Used for proofs of concept or as a sandbox environment with minimal users.
  • Basic. Used for low volume processing with low levels of concurrency.
  • Standard. Used for high volume processing with low levels of concurrency.
  • Advanced. Used for high volume processing with high levels of concurrency.

Identify Sizing Requirements

The following table provides the minimum sizing requirements for each deployment type:

Deployment Type    Disk Space per Node    Total Cores    RAM per Node
Sandbox            50 GB *                16             32 GB
Basic              100 GB                 24             64 GB
Standard           100 GB                 48             64 GB
Advanced           100 GB                 96             128 GB

* Enterprise Data Catalog requires 100 GB of disk space for a Sandbox deployment type.
Note: The sizing numbers do not account for operational processing and object caching requirements.
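To compare a candidate node against the sizing requirements above, you can query its resources from the shell. This is a minimal sketch for Linux; `nproc` and `free` are standard Linux utilities:

```shell
# Report this node's cores, RAM, and root disk space so you can compare
# them against the minimum sizing requirements for your deployment type.
echo "cores:    $(nproc)"
echo "RAM (GB): $(free -g | awk '/^Mem:/ {print $2}')"
echo "disk:"
df -h /
```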

Tune During Installation

When you run the installer, you can choose to tune the services based on the deployment size. If you create a Model Repository Service, a Data Integration Service, or a Content Management Service during installation, the installer can tune the services based on the deployment type that you enter. The installer configures properties such as maximum heap size and execution pool size.
You can tune services at any time after you install the services by using the infacmd autotune command. When you run the command, you can tune properties for other services as well as the Hadoop run-time engine properties.

Review Patch Requirements

Before you install the Informatica services, verify that the machine has the required operating system patches and libraries.
The following list describes the operating system patches and libraries required for installation on each platform:

Red Hat Enterprise Linux 6.5 (Linux-x64)
All of the following packages, where <version> is any version of the package:
  • e2fsprogs-libs-<version>.el6
  • keyutils-libs-<version>.el6
  • libselinux-<version>.el6
  • libsepol-<version>.el6

Red Hat Enterprise Linux 7.0 (Linux-x64)
All of the following packages, where <version> is any version of the package:
  • e2fsprogs-libs-<version>.el7
  • keyutils-libs-<version>.el7
  • libselinux-<version>.el7
  • libsepol-<version>.el7

SUSE Linux Enterprise Server 11 (Linux-x64)
Service Pack 2

SUSE Linux Enterprise Server 12 (Linux-x64)
Service Pack 2
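On Red Hat Enterprise Linux, you can confirm that the required packages are installed with `rpm -q`. This is a sketch: the package names come from the list above, and the loop is skipped on systems without rpm:

```shell
# Query the RPM database for each required library; report anything missing.
if command -v rpm >/dev/null 2>&1; then
  for pkg in e2fsprogs-libs keyutils-libs libselinux libsepol; do
    rpm -q "$pkg" >/dev/null 2>&1 && echo "found:   $pkg" || echo "MISSING: $pkg"
  done
else
  echo "rpm not available; check packages with your platform's package manager"
fi
```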

Verify Port Requirements

The installer sets up the ports for components in the Informatica domain, and it designates a range of dynamic ports to use for some application services.
You can specify the port numbers to use for the components and a range of dynamic port numbers to use for the application services. Or you can use the default port numbers provided by the installer. Verify that the port numbers are available on the machines where you install the Informatica domain services, Enterprise Data Catalog, or Enterprise Data Lake.
The following list describes the port requirements for installation:
Node port
Port number for the node created during installation. Default is 6005.
Service Manager port
Port number used by the Service Manager on the node. The Service Manager listens for incoming connection requests on this port. Client applications and the Informatica command line programs use this port to communicate with the services in the domain. This is also the port for the SQL data service JDBC/ODBC driver. Default is 6006.
Service Manager Shutdown port
Port number that controls server shutdown for the domain Service Manager. The Service Manager listens for shutdown commands on this port. Default is 6007.
Informatica Administrator port
Port number used by Informatica Administrator. Default is 6008.
Informatica Administrator HTTPS port
No default port. Enter the required port number when you create the service. Setting this port to 0 disables an HTTPS connection to the Administrator tool.
Informatica Administrator shutdown port
Port number that controls server shutdown for Informatica Administrator. Informatica Administrator listens for shutdown commands on this port. Default is 6009.
Minimum port number
Lowest port number in the range of dynamic port numbers that can be assigned to the application service processes that run on this node. Default is 6014.
Maximum port number
Highest port number in the range of dynamic port numbers that can be assigned to the application service processes that run on this node. Default is 6114.
Range of dynamic ports for application services
Range of port numbers that can be dynamically assigned to application service processes as they start up. When you start an application service that uses a dynamic port, the Service Manager dynamically assigns the first available port in this range to the service process. The number of ports in the range must be at least twice the number of application service processes that run on the node. Default is 6014 to 6114.
The Service Manager dynamically assigns port numbers from this range to the Model Repository Service.
HTTPS port for Hadoop distributions
If you deploy Enterprise Data Catalog in an HTTPS-enabled Hadoop distribution, the following are the default port numbers:
  • Cloudera. 7183
  • Hortonworks. 8443
  • Azure HDInsight. 8443
Required only if you install Enterprise Data Catalog.
Static ports for application services
Static ports have dedicated port numbers assigned that do not change. When you create the application service, you can accept the default port number, or you can manually assign the port number.
The following services use static port numbers:
  • Catalog Service. Default is 9085 for HTTP.
  • Content Management Service. Default is 8105 for HTTP.
  • Data Integration Service. Default is 8095 for HTTP.
  • Data Preparation Service. Default is 8099 for HTTP.
  • Enterprise Data Lake Service. Default is 9045 for HTTP.
  • Informatica Cluster Service. Default is 9075 for HTTP.
  • Mass Ingestion Service. Default is 9050 for HTTP.
  • Metadata Access Service. Default is 7080 for HTTP.
Note: Services and nodes can fail to start if there is a port conflict.
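Before you run the installer, you can probe the default domain ports for conflicts. The sketch below uses Bash's built-in /dev/tcp feature; tools such as `ss` or `netstat` work as well:

```shell
# A port that accepts a connection is already in use; a refused
# connection means the port is free for the installer to claim.
for port in 6005 6006 6007 6008 6009; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
    echo "port $port is IN USE"
  else
    echo "port $port is free"
  fi
done
```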

Guidelines for Port Configuration

The installer validates the port numbers that you specify to ensure that there will be no port conflicts in the domain.
Use the following guidelines to determine the port numbers:

Verify the File Descriptor Limit

Verify that the operating system meets the file descriptor requirement.
Informatica service processes can use a large number of files. To prevent errors that result from the large number of files and processes, you can change system settings with the limit command if you use a C shell, or the ulimit command if you use a Bash shell.
To get a list of the operating system settings, including the file descriptor limit, run the following command:
C Shell
limit
Bash Shell
ulimit -a
Set the file descriptor limit per process to 16,000 or higher. The recommended limit is 32,000 file descriptors per process.
To change system settings, run the limit or ulimit command with the pertinent flag and value. For example, to set the file descriptor limit, run the following command:
C Shell
limit descriptors <value>
Bash Shell
ulimit -n <value>
Informatica services use a large number of user processes. Use the ulimit -u command to adjust the max user processes setting to a level that is high enough to account for all the processes required by Blaze. Depending on the number of mappings and transformations that might run concurrently, set the max user processes limit to 16,000 or higher.
Run the following command to set the max user processes setting:
C Shell
limit maxproc <value>
Bash Shell
ulimit -u <value>
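You can confirm the limits currently in effect for your session before you start the services. The sketch below uses Bash syntax; in a C shell, run `limit` with no arguments instead:

```shell
# Report the soft and hard limits relevant to Informatica services.
echo "open files, soft limit:  $(ulimit -Sn)"   # recommended: 32000
echo "open files, hard limit:  $(ulimit -Hn)"
echo "max user processes:      $(ulimit -Su)"
```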