Creating and Managing the Intelligent Data Lake Service

Use the Administrator tool to create and manage the Intelligent Data Lake Service. When you change a service property, you must recycle the service, or disable and then enable the service, for the changes to take effect.

Creating the Intelligent Data Lake Service

Use the service creation wizard in the Administrator tool to create the service.
    1. In the Administrator tool, click the Manage tab.
    2. Click the Services and Nodes view.
    3. In the Domain Navigator, select the domain.
    4. Click Actions > New > Intelligent Data Lake Service.
    5. On the New Intelligent Data Lake Service - Step 1 of 7 page, enter the following properties:
    Property
    Description
    Name
    Name of the Intelligent Data Lake Service. The name is not case sensitive and must be unique within the domain. It cannot exceed 128 characters or begin with @. It also cannot contain spaces or the following special characters: ` ~ % ^ * + = { } \ ; : ' " / ? . , < > | ! ( ) ] [
    Description
    Description of the Intelligent Data Lake Service. The description cannot exceed 765 characters.
    Location
    Location of the Intelligent Data Lake Service in the Informatica domain. You can create the service within a folder in the domain.
    License
    License object with the data lake option that allows the use of the Intelligent Data Lake Service.
    Node Assignment
    Type of node in the Informatica domain on which the Intelligent Data Lake Service runs. Select Single Node if a single service process runs on the node, or Primary and Backup Nodes if a service process is enabled on each node for high availability. Only a single process runs at any given time; the other processes maintain standby status.
    The Primary and Backup Nodes option is available only if your license includes high availability.
    Default is Single Node.
    Node
    Name of the node on which the Intelligent Data Lake Service runs.
    Backup Nodes
    If your license includes high availability, nodes on which the service can run if the primary node is unavailable.
    6. Click Next.
    7. On the New Intelligent Data Lake Service - Step 2 of 7 page, enter the following properties for the Model Repository Service:
    Property
    Description
    Model Repository Service
    Name of the Model Repository Service associated with the Intelligent Data Lake Service.
    Model Repository Service User Name
    User account to use to log in to the Model Repository Service.
    Model Repository Service User Password
    Password for the Model Repository Service user account.
    8. Click Next.
    9. On the New Intelligent Data Lake Service - Step 3 of 7 page, enter the following properties for the Data Preparation Service, Data Integration Service, and Catalog Service:
    Property
    Description
    Data Preparation Service
    Name of the Data Preparation Service associated with the Intelligent Data Lake Service.
    Data Integration Service
    Name of the Data Integration Service associated with the Intelligent Data Lake Service.
    Catalog Service
    Name of the Catalog Service associated with the Intelligent Data Lake Service.
    Catalog Service User Name
    User account to use to log in to the Catalog Service.
    Catalog Service User Password
    Password for the Catalog Service user account.
    Data Lake Resource Name
    Hive resource for the data lake. You configure the resource in Live Data Map Administrator.
    10. Click Next.
    11. On the New Intelligent Data Lake Service - Step 4 of 7 page, enter the following properties:
    Property
    Description
    Hadoop Authentication Mode
    Security mode of the Hadoop cluster for the data lake. If the Hadoop cluster uses Kerberos authentication, you must set the required Hadoop security properties for the cluster.
    HDFS Service Principal Name
    Service principal name (SPN) of the data lake Hadoop cluster.
    Principal Name for User Impersonation
    Service principal name (SPN) of the user account to impersonate when connecting to the data lake Hadoop cluster. The user account for impersonation must be set in the core-site.xml file.
    SPN Keytab File for User Impersonation
    Path and file name of the SPN keytab file for the user account to impersonate when connecting to the Hadoop cluster. The keytab file must be in a directory on the machine where the Intelligent Data Lake Service runs.
    HBase Master Service Principal Name
    Service principal name (SPN) of the HBase Master service. Use the value set in the /etc/hbase/conf/hbase-site.xml file.
    HBase RegionServer Service Principal Name
    Service principal name (SPN) of the HBase RegionServer service. Use the value set in the /etc/hbase/conf/hbase-site.xml file. For a scripted way to look up these values, see the example after this procedure.
    HBase User Name
    User name with permissions to access the HBase database.
    12. Click Next.
    13. On the New Intelligent Data Lake Service - Step 5 of 7 page, enter the following properties:
    Property
    Description
    HDFS Connection
    HDFS connection for the data lake.
    HDFS Working Directory
    HDFS directory where the Intelligent Data Lake Service copies temporary data and files necessary for the service to run.
    Hadoop Distribution Directory
    Directory that contains the Hadoop distribution files on the machine where the Intelligent Data Lake Service runs.
    Hive Connection
    Hive connection for the data lake.
    Hive Table Storage Format
    Data storage format for the Hive tables. Select from the following options:
    - DefaultFormat
    - Parquet
    - ORC
    14. Click Next.
    15. On the New Intelligent Data Lake Service - Step 6 of 7 page, enter the following properties:
    Property
    Description
    Log User Activity Events
    Indicates whether the Intelligent Data Lake Service logs user activity events for auditing. The user activity logs are stored in an HBase instance.
    HBase ZooKeeper Quorum
    List of host names and port numbers of the ZooKeeper quorum used to log events. Specify the entries as a comma-separated list of <hostname>:<port> pairs. For example: <hostname1>:<port1>,<hostname2>:<port2>. A format check is sketched after this procedure.
    HBase ZooKeeper Client Port
    Port number on which the ZooKeeper server listens for client connections. Default value is 2181.
    ZooKeeper Parent Znode
    Name of the ZooKeeper znode where the Intelligent Data Lake configuration details are stored.
    HBase Namespace
    Namespace for the HBase tables.
    Number of Rows to Export
    Number of rows to export to a .csv file. You can specify a maximum of 2,000,000,000 rows. Enter a value of -1 to export all rows.
    Number of Recommendations to Display
    The number of recommended data assets to display on the Projects page. You can specify a maximum of 50 recommendations. A value of 0 means no recommendations will be displayed. You can use recommended alternate or additional data assets to improve productivity.
    Data Preparation Sample Size
    Number of sample rows to fetch for data preparation. You can specify a minimum of 1,000 rows and a maximum of 1,000,000 rows.
    16. Click Next.
    17. On the New Intelligent Data Lake Service - Step 7 of 7 page, enter the following properties:
    Property
    Description
    Log Severity
    Severity of messages to include in the logs. Select one of the following values:
    - FATAL. Writes FATAL messages to the log. FATAL messages include nonrecoverable system failures that cause the service to shut down or become unavailable.
    - ERROR. Writes FATAL and ERROR messages to the log. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors.
    - WARNING. Writes FATAL, ERROR, and WARNING messages to the log. WARNING messages include recoverable system failures or warnings.
    - INFO. Writes FATAL, ERROR, WARNING, and INFO messages to the log. INFO messages include system and service change messages.
    - TRACE. Writes FATAL, ERROR, WARNING, INFO, and TRACE messages to the log. TRACE messages log user request failures.
    - DEBUG. Writes FATAL, ERROR, WARNING, INFO, TRACE, and DEBUG messages to the log. DEBUG messages are user request logs.
    Default value is INFO.
    HTTP Port
    Port number for the HTTP connection to the Intelligent Data Lake Service.
    Enable Secure Communication
    Use a secure connection to connect to the Intelligent Data Lake Service. If you enable secure communication, you must enter all required HTTPS options.
    HTTPS Port
    Port number for the HTTPS connection to the Intelligent Data Lake Service.
    Keystore File
    Path and file name of the keystore file that contains the keys and certificates required for the HTTPS connection.
    Keystore Password
    Password for the keystore file.
    Truststore File
    Path and file name of the truststore file that contains the authentication certificates for the HTTPS connection.
    Truststore Password
    Password for the truststore file.
    Enable Service
    Select this option to enable the service immediately after you create it. To enable the service later, select the service in the Domain Navigator and then select Actions > Enable Service.
    18. Click Finish.
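The Kerberos-related values in step 11 come from the cluster configuration files rather than from the Informatica domain. The following Python sketch shows one way to look them up for reference before you run the wizard. It assumes the standard Hadoop and HBase property names and file locations; the core-site.xml path and the impersonation user name (idluser) are illustrative assumptions, so adjust them to your cluster.

    import xml.etree.ElementTree as ET

    def read_properties(path):
        """Return a dict of name/value pairs from a Hadoop-style *-site.xml file."""
        props = {}
        for prop in ET.parse(path).getroot().findall("property"):
            name = prop.findtext("name")
            if name is not None:
                props[name] = prop.findtext("value")
        return props

    # Standard HBase property names for the Master and RegionServer SPN fields.
    hbase_site = read_properties("/etc/hbase/conf/hbase-site.xml")
    print("HBase Master SPN:      ", hbase_site.get("hbase.master.kerberos.principal"))
    print("HBase RegionServer SPN:", hbase_site.get("hbase.regionserver.kerberos.principal"))

    # The impersonation user must be set up in core-site.xml on the cluster.
    # The path and the user name below are assumptions for illustration.
    core_site = read_properties("/etc/hadoop/conf/core-site.xml")
    impersonation_user = "idluser"  # hypothetical user name
    for key in ("hadoop.proxyuser.%s.hosts" % impersonation_user,
                "hadoop.proxyuser.%s.groups" % impersonation_user):
        print(key, "=", core_site.get(key, "<not set>"))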

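The HBase ZooKeeper Quorum value in step 15 must follow the <hostname1>:<port1>,<hostname2>:<port2> format. The short Python sketch below is an illustrative pre-check of that format only; it is not part of the product, and the service performs its own validation.

    def validate_quorum(quorum):
        """Parse a quorum string and raise ValueError if any entry is malformed."""
        entries = []
        for entry in quorum.split(","):
            host, sep, port = entry.strip().partition(":")
            if not host or not sep or not port.isdigit():
                raise ValueError("Malformed quorum entry: %r" % entry)
            entries.append((host, int(port)))
        return entries

    print(validate_quorum("zk-node1.example.com:2181,zk-node2.example.com:2181"))
    # [('zk-node1.example.com', 2181), ('zk-node2.example.com', 2181)]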
Enabling, Disabling, and Recycling the Intelligent Data Lake Service

You can enable, disable, and recycle the service from the Administrator tool.
    1. In the Administrator tool, click the Manage tab > Services and Nodes view.
    2. In the Domain Navigator, select the service.
    3. On the Actions tab, select one of the following options:
        a. Enable Service to enable the service.
        b. Disable Service to disable the service. Choose the mode in which to disable the service. Optionally, specify whether the action is planned or unplanned, and enter comments about the action. If you complete these options, the information appears in the Events and Command History panels in the Domain view on the Manage tab.
        c. Recycle Service to recycle the service.
    If you prefer to script these actions instead of using the Administrator tool, see the example after this procedure.
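The steps above use the Administrator tool. Informatica installations also include the infacmd command line program, which can perform the same actions. The Python sketch below is an assumption-based illustration of calling infacmd isp EnableService; verify the subcommand, option names, and installation path against the infacmd Command Reference for your release before using it.

    import subprocess

    INFACMD = "/opt/informatica/isp/bin/infacmd.sh"  # hypothetical install path

    def run_service_action(action, domain, user, password, service):
        """Invoke an infacmd isp service command; option names are assumptions."""
        subprocess.run(
            [INFACMD, "isp", action,
             "-dn", domain, "-un", user, "-pd", password, "-sn", service],
            check=True,  # raise CalledProcessError if infacmd reports a failure
        )

    # Example usage with placeholder values.
    run_service_action("EnableService", "ExampleDomain", "Administrator",
                       "password", "Intelligent_Data_Lake_Service")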

Editing the Intelligent Data Lake Service

To edit the Intelligent Data Lake Service, select the service in the Domain Navigator and click the Properties view. You can change the properties while the service is running, but you must restart the service for the properties to take effect.
To edit the Intelligent Data Lake Service:
    1. To edit specific properties, click the pencil icon in the selected properties area.
    2. In the Edit Properties window, edit the required fields.
    3. Click OK.
    4. Click Actions > Recycle Service.
    5. In the Recycle Service window, select the required options.
    6. Click OK to restart the service.

Deleting the Intelligent Data Lake Service

Only users with ADMIN or WRITE permissions for the Intelligent Data Lake Service can delete the service.
To delete the Intelligent Data Lake Service:
    1. On the Manage tab, select the Services and Nodes view.
    2. In the Domain Navigator, select the Intelligent Data Lake Service.
    3. Disable the Intelligent Data Lake Service by clicking Actions > Disable Service.
    4. To delete the Intelligent Data Lake Service, click Actions > Delete.