
Create and Configure the Data Preparation Service

The Data Preparation Service manages data preparation within Enterprise Data Lake. When an analyst prepares data in a project, the Data Preparation Service stores worksheet metadata in the Data Preparation repository.
The service connects to the Hadoop cluster to read sample data from Hive tables. The service connects to the HDFS system in the Hadoop cluster to store the sample data being prepared in the worksheet.
Create the Data Preparation Service before you create the Enterprise Data Lake Service. You must associate the Enterprise Data Lake Service with a Data Preparation Service.

Creating the Data Preparation Service

Use the service creation wizard in the Administrator tool to create the service.
    1. In the Administrator tool, click the Manage tab.
    2. Click the Services and Nodes view.
    3. In the Domain Navigator, select the domain.
    4. Click Actions > New > Data Preparation Service.
    5. Enter the following properties:
    Property
    Description
    Name
    Name of the Data Preparation service.
    The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters: ` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [
    Description
    Description of the Data Preparation service. The description cannot exceed 765 characters.
    Location
    Location of the Data Preparation Service in the Informatica domain. You can create the service within a folder in the domain.
    License
    License object with the data lake option that allows the use of the Data Preparation Service.
    Node Assignment
    Type of node in the Informatica domain on which the Data Preparation Service runs. Select Single Node if a single service process runs on the node, or Primary and Backup Nodes if a service process is enabled on each node for high availability. Only one process runs at any given time; the other processes remain on standby.
    The Primary and Backup Nodes option is available only if your license configuration includes it.
    Select Grid to run the Data Preparation Service on a grid of multiple nodes for horizontal scalability. The added capacity supports high-performance, interactive data preparation as data volumes and the number of users increase. Each user is assigned to a node in the grid in round-robin order to distribute the load across the nodes.
    Default is Single Node.
    Node
    Name of the node on which the Data Preparation Service runs.
    Backup Nodes
    If your license includes high availability, nodes on which the service can run if the primary node is unavailable.
    Select each backup node on which the service runs.
    Grid
    Select the grid that you want to use for the Data Preparation Service.
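    The naming rules in the table above can be checked before you run the wizard. The following Python sketch is illustrative only (the function name and structure are not part of the product); it encodes the length, prefix, space, and special-character rules listed under the Name property:

    ```python
    # Characters the wizard rejects in a service name, per the table above.
    FORBIDDEN = set("`~%^*+={}\\;:'\"/?.,<>|!()][")

    def is_valid_service_name(name: str) -> bool:
        """Check a candidate Data Preparation Service name against the wizard rules."""
        if not name or len(name) > 128:
            return False
        if name.startswith("@"):
            return False
        if " " in name:
            return False
        return not any(ch in FORBIDDEN for ch in name)

    print(is_valid_service_name("DPS_Prod_01"))  # prints True
    print(is_valid_service_name("@dps"))         # prints False: begins with @
    ```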
    6. Click Next.
    7. If you plan to use rules, you must associate the Data Preparation Service with the Model Repository Service that manages the Model repository that contains the rule objects and metadata. You must also associate a Data Integration Service with the Data Preparation Service that runs rules during data preparation.
    Enter the following properties for the Model Repository Service and the Data Integration Service required to enable rules:
    Property
    Description
    Model Repository Service Name
    Name of the Model Repository Service.
    The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters: ` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [
    You cannot change the name of the service after you create it.
    Model Repository Service User Name
    User name to access the Model Repository Service.
    Model Repository Service Password
    Password to access the Model Repository Service.
    Data Integration Service Name
    Name of the Data Integration Service.
    8. Click Next.
    9. Enter the following communication properties:
    Property
    Description
    HTTP Port
    Port number for the HTTP connection to the Data Preparation Service.
    Enable Secure Communication
    Use a secure connection to connect to the Data Preparation Service. If you enable secure communication, you must set all required HTTPS properties, including the keystore and truststore properties.
    HTTPS Port
    Port number for the HTTPS connection to the Data Preparation Service.
    Keystore File
    Path and file name of the keystore file that contains the keys and certificates required for HTTPS communication.
    Keystore Password
    Password for the keystore file.
    Truststore File
    Path and file name of the truststore file that contains authentication certificates for the HTTPS connection.
    Truststore Password
    Password for the truststore file.
    10. Click Next.
    11. Enter the following Data Preparation repository database connection properties:
    Property
    Description
    Database Type
    Type of database to use for the Data Preparation repository.
    Host Name
    Host name of the machine that hosts a MySQL database.
    Port Number
    Port number for a MySQL database.
    Connection String
    Connection string used to access an Oracle database.
    Use the following connection string format:
    jdbc:informatica:oracle://<database host name>:<port>;ServiceName=<database name>
    Secure JDBC Parameters
    Secure JDBC parameters required to access a secure Oracle database.
    If the database is secure, include information such as TrustStore and TrustStorePassword in this field. The information is saved in an encrypted format. Commonly configured parameters include the following:
    EncryptionMethod=<encryption method>;HostNameInCertificate=<host name>;TrustStore=<truststore file name and path>;TrustStorePassword=<truststore password>;KeyStore=<keystore file name and path>;KeyStorePassword=<keystore password>;ValidateServerCertificate=<true|false>
    Database User Name
    Database user account to use to connect to the database.
    Database User Password
    Password for the database user account.
    Schema Name
    Schema or database name for a MySQL database.
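    For the Oracle case, the connection string and the secure JDBC parameters are plain semicolon-delimited strings in the formats shown above, so they can be assembled from their parts. A minimal Python sketch (the host, port, database, and truststore values are placeholders, and the secure-parameter example shows only a common subset of keys):

    ```python
    def oracle_connection_string(host: str, port: int, database: str) -> str:
        """Build an Oracle connection string in the format the wizard expects."""
        return f"jdbc:informatica:oracle://{host}:{port};ServiceName={database}"

    def secure_jdbc_parameters(truststore: str, password: str) -> str:
        """Build a minimal secure JDBC parameter string; add key=value pairs as needed."""
        return (f"EncryptionMethod=SSL;TrustStore={truststore};"
                f"TrustStorePassword={password};ValidateServerCertificate=true")

    print(oracle_connection_string("dbhost.example.com", 1521, "dpreco"))
    # prints jdbc:informatica:oracle://dbhost.example.com:1521;ServiceName=dpreco
    ```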
    12. Click Next.
    13. Enter the following rules execution property:
    Property
    Description
    Rules Server Port
    Port used by the rules server managed by the Data Preparation Service. Set the value to an available port on the node where the Data Preparation Service runs.
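    One way to confirm that a candidate rules server port is available is to try binding to it on the node where the Data Preparation Service will run. A hedged Python sketch (not product tooling; a successful bind only shows the port is free at that moment):

    ```python
    import socket

    def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
        """Return True if nothing is currently listening on the given TCP port."""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind((host, port))
                return True
            except OSError:
                return False

    # Example: check a candidate port before entering it in the wizard.
    print(port_is_free(18600))
    ```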
    14. Click Next.
    15. Enter the following Solr property:
    Property
    Description
    Solr Port
    Port number for the Apache Solr server used to provide data preparation recommendations.
    16. Click Next.
    17. Enter the following data preparation properties:
    Property
    Description
    Local Storage Location
    Directory for data preparation file storage on the node where the Data Preparation Service runs.
    HDFS Connection
    HDFS connection for data preparation file storage.
    HDFS Storage Location
    HDFS location for data preparation file storage. If the connection to the local storage fails, the Data Preparation Service recovers data preparation files from the HDFS location.
    18. Click Next.
    19. Enter the following Hive security properties:
    Property
    Description
    Hadoop Authentication Mode
    Security mode enabled for the Hadoop cluster for data preparation storage. If the Hadoop cluster uses Kerberos authentication, you must set the required Hadoop security properties for the cluster.
    HDFS Service Principal Name
    Service Principal Name (SPN) for the data preparation Hadoop cluster. Specify the service principal name in the following format:
    user/_HOST@REALM
    Hadoop Impersonation User Name
    User name to use in Hadoop impersonation as set in the Hadoop connection properties. Use the Administrator tool to view Hadoop connection properties.
    SPN Keytab File for User Impersonation
    Path and file name of the SPN keytab file for the user account to impersonate when connecting to the Hadoop cluster. The keytab file must be in a directory on the machine where the Data Preparation Service runs.
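    The SPN format above can be sanity-checked with a simple pattern before you enter it. A Python sketch (the regex reflects only the user/_HOST@REALM shape shown above, not the full Kerberos principal syntax):

    ```python
    import re

    # Matches the user/_HOST@REALM shape, e.g. "hdfs/_HOST@EXAMPLE.COM".
    SPN_PATTERN = re.compile(r"^[^/@\s]+/_HOST@[A-Z0-9.\-]+$")

    def looks_like_spn(principal: str) -> bool:
        """Loose check that a string follows the user/_HOST@REALM format."""
        return bool(SPN_PATTERN.match(principal))

    print(looks_like_spn("hdfs/_HOST@EXAMPLE.COM"))  # prints True
    print(looks_like_spn("hdfs@EXAMPLE.COM"))        # prints False: no /_HOST part
    ```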
    20. Click Next.
    21. Enter the following logging configuration property:
    Property
    Description
    Log Severity
    Severity of messages to include in the logs. Select from the following values:
    • FATAL. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that cause the service to shut down or become unavailable.
    • ERROR. Writes FATAL and ERROR messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
    • WARNING. Writes FATAL, ERROR, and WARNING messages to the log. WARNING messages include recoverable system failures or warnings.
    • INFO. Writes FATAL, ERROR, WARNING, and INFO messages to the log. INFO messages include system and service change messages.
    • TRACE. Writes FATAL, ERROR, WARNING, INFO, and TRACE messages to the log. TRACE messages log user request failures.
    • DEBUG. Writes FATAL, ERROR, WARNING, INFO, TRACE, and DEBUG messages to the log. DEBUG messages are user request logs.
    Default is INFO.
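    The severity levels above are cumulative: each setting writes its own messages plus those of every less verbose level. That containment can be sketched as a small lookup in Python (illustrative only, not product code):

    ```python
    # Order from least to most verbose; each level writes its own messages
    # plus those of every level before it in this list.
    LEVELS = ["FATAL", "ERROR", "WARNING", "INFO", "TRACE", "DEBUG"]

    def messages_written(level: str) -> list:
        """Return the message severities that appear in the log at a given setting."""
        return LEVELS[: LEVELS.index(level) + 1]

    print(messages_written("WARNING"))  # prints ['FATAL', 'ERROR', 'WARNING']
    ```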
    22. Click Finish.
    23. Select the Data Preparation Service in the Domain Navigator, and then select Actions > Create Repository to create the Data Preparation repository contents.
    24. Select Actions > Enable Service to enable the Data Preparation Service.