Architecture and Components

Intelligent Data Lake uses several components that enable analysts to search for, discover, and prepare data.
The following image shows the components that Intelligent Data Lake uses and how they interact:
The clients include Live Data Map Administrator, the Intelligent Data Lake application, Informatica Developer, and Informatica Administrator. The application services include the Catalog Service, the Intelligent Data Lake Service, the Data Preparation Service, the Data Integration Service, and the Model Repository Service. The repositories include the Model repository and the Data Preparation repository. The Hadoop services include HiveServer2, HBase, and HDFS.
Note: You must install and configure Live Data Map before you install Intelligent Data Lake. Live Data Map requires additional clients, Informatica services, repositories, and Hadoop services. For more information about Live Data Map architecture, see the Live Data Map Administrator Guide.

Clients

Administrators and analysts use several clients to make data available for analysis in Intelligent Data Lake.
Intelligent Data Lake uses the following clients:
Informatica Live Data Map Administrator
Administrators use Informatica Live Data Map Administrator to administer the resources, scanners, schedules, attributes, and connections that are used to create the catalog. The catalog represents an indexed inventory of all the information assets in an enterprise.
Informatica Administrator
Administrators use Informatica Administrator (the Administrator tool) to manage the application services that Intelligent Data Lake requires. They also use the Administrator tool to administer the Informatica domain and security and to monitor the mappings that run during the upload and publish processes.
Intelligent Data Lake application
Analysts use the Intelligent Data Lake application to search, discover, and prepare data that resides in the data lake. Analysts combine, cleanse, transform, and structure the data to prepare it for analysis. When analysts finish preparing the data, they publish the transformed data back to the data lake to make it available to other analysts.
Informatica Developer
Administrators use Informatica Developer (the Developer tool) to view the mappings created when analysts publish prepared data in the Intelligent Data Lake application. They can operationalize the mappings so that data is regularly written to the data lake.

Application Services

Intelligent Data Lake requires application services to complete operations. Use the Administrator tool to create and manage the application services.
Intelligent Data Lake requires the following application services:
Intelligent Data Lake Service
The Intelligent Data Lake Service is an application service that runs the Intelligent Data Lake application in the Informatica domain. When an analyst publishes prepared data, the Intelligent Data Lake Service converts each recipe into a mapping.
When an analyst uploads data, the Intelligent Data Lake Service connects to the HDFS system in the Hadoop cluster to temporarily stage the data. When an analyst previews data, the Intelligent Data Lake Service connects to HiveServer2 in the Hadoop cluster to read from the Hive table.
As analysts complete actions in the Intelligent Data Lake application, the Intelligent Data Lake Service connects to HBase in the Hadoop cluster to store events that you can use to audit user activity.
Data Preparation Service
The Data Preparation Service is an application service that manages data preparation within the Intelligent Data Lake application. When an analyst prepares data in a project, the Data Preparation Service connects to the Data Preparation repository to store worksheet metadata. The service connects to HiveServer2 in the Hadoop cluster to read either sample data or all data from the Hive table, depending on the size of the data. The service connects to the HDFS system in the Hadoop cluster to store the sample data being prepared in the worksheet, as illustrated in the example after this entry.
When you create the Intelligent Data Lake Service, you must associate it with a Data Preparation Service.
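The split between sample reads and full reads can be pictured with a short sketch. The following Python example is illustrative only and is not Informatica code. It assumes the open-source pyhive and hdfs client libraries, a reachable HiveServer2 and WebHDFS endpoint, and hypothetical host names, user names, paths, and row-count threshold.

    # Illustrative sketch only -- not Informatica code. Host names, users,
    # paths, and the sampling threshold below are hypothetical.
    import csv
    import io

    from pyhive import hive          # open-source HiveServer2 client
    from hdfs import InsecureClient  # open-source WebHDFS client

    SAMPLE_SIZE = 50000  # hypothetical row threshold for switching to a sample

    def stage_worksheet_sample(table_name):
        """Read a sample (or all rows) of a Hive table and store it in HDFS."""
        conn = hive.connect(host='hiveserver2.example.com', port=10000,
                            username='idl_user', database='default')
        cursor = conn.cursor()

        # Choose between a full read and a sample read based on table size.
        cursor.execute(f'SELECT COUNT(*) FROM {table_name}')
        row_count = cursor.fetchone()[0]
        if row_count > SAMPLE_SIZE:
            cursor.execute(f'SELECT * FROM {table_name} LIMIT {SAMPLE_SIZE}')
        else:
            cursor.execute(f'SELECT * FROM {table_name}')
        rows = cursor.fetchall()

        # Persist the worksheet sample to an HDFS location as a CSV file.
        buffer = io.StringIO()
        csv.writer(buffer).writerows(rows)
        hdfs_client = InsecureClient('http://namenode.example.com:50070',
                                     user='idl_user')
        sample_path = f'/idl/worksheets/{table_name}_sample.csv'
        hdfs_client.write(sample_path, buffer.getvalue(), overwrite=True)
        return sample_path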
Catalog Service
The Catalog Service is an application service that runs Live Data Map in the Informatica domain. The Catalog Service manages the catalog of information assets in the Hadoop cluster.
When an analyst searches for assets in the Intelligent Data Lake application, the Intelligent Data Lake Service connects to the Catalog Service to return search results from the metadata stored in the catalog.
When you create the Intelligent Data Lake Service, you must associate it with a Catalog Service.
Model Repository Service
The Model Repository Service is an application service that manages the Model repository. When an analyst creates projects, the Intelligent Data Lake Service connects to the Model Repository Service to store the project metadata in the Model repository. When an analyst publishes prepared data, the Intelligent Data Lake Service connects to the Model Repository Service to store the converted mappings in the Model repository.
When you create the Intelligent Data Lake Service, you must associate it with a Model Repository Service.
Data Integration Service
The Data Integration Service is an application service that performs data integration tasks for Intelligent Data Lake. When an analyst uploads data or publishes prepared data, the Intelligent Data Lake Service connects to the Data Integration Service to write the data to a Hive table in the Hadoop cluster.
When you create the Intelligent Data Lake Service, you must associate it with a Data Integration Service.

Repositories

The Intelligent Data Lake Service connects to other application services in the Informatica domain that access data from repositories. The Intelligent Data Lake Service does not directly access any repositories.
Intelligent Data Lake requires the following repositories:
Data Preparation repository
When an analyst prepares data in a project, the Data Preparation Service stores worksheet metadata in the Data Preparation repository.
Model repository
When an analyst creates a project, the Intelligent Data Lake Service connects to the Model Repository Service to store the project metadata in the Model repository. When an analyst publishes prepared data, the Intelligent Data Lake Service converts each recipe to a mapping. The Intelligent Data Lake Service connects to the Model Repository Service to store the converted mappings in the Model repository.

Hadoop Services

Intelligent Data Lake connects to several Hadoop services on a Hadoop cluster to read from and write to Hive tables, to write events, and to store sample preparation data.
Intelligent Data Lake connects to the following services in the Hadoop cluster:
HBase
As analysts complete actions in the Intelligent Data Lake application, the Intelligent Data Lake Service writes events to HBase. You can view the events to audit user activity.
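The event store itself is plain HBase, so the write can be pictured with a short sketch. The following Python example is illustrative only; it uses the open-source happybase client, and the table name, column family, and row-key layout are hypothetical, not Intelligent Data Lake internals.

    # Illustrative sketch only -- table name, column family, and row-key
    # layout are hypothetical, not Intelligent Data Lake internals.
    import time

    import happybase  # open-source Thrift-based HBase client

    connection = happybase.Connection('hbase-master.example.com')
    events = connection.table('idl_audit_events')  # hypothetical table name

    # A user ID plus a reversed timestamp is a common HBase row-key pattern
    # for scanning a user's most recent events first.
    row_key = 'analyst1:{0}'.format(2**63 - time.time_ns()).encode()
    events.put(row_key, {
        b'e:user':   b'analyst1',
        b'e:action': b'PUBLISH',
        b'e:asset':  b'default.sales_prepared',
    })
    connection.close()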
Hadoop Distributed File System (HDFS)
When an analyst uploads data to the data lake, the Intelligent Data Lake Service connects to the HDFS system to stage the data in HDFS files.
When an analyst prepares data, the Data Preparation Service connects to the HDFS system to store the sample data being prepared in worksheets as HDFS files.
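Both interactions are ordinary HDFS writes. As a rough illustration of the upload-staging step, the following Python sketch uses the open-source hdfs (WebHDFS) client; the endpoint, user, and paths are placeholders, not the locations the service actually uses.

    # Illustrative sketch only -- endpoint, user, and paths are placeholders.
    from hdfs import InsecureClient  # open-source WebHDFS client

    client = InsecureClient('http://namenode.example.com:50070', user='idl_user')

    # Stage an uploaded file in a temporary HDFS location. A later step reads
    # the staged file and writes the data to a Hive table.
    client.upload('/idl/staging/uploads/customers.csv',  # HDFS destination
                  '/tmp/customers.csv',                  # local source file
                  overwrite=True)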
HiveServer2
When an analyst previews data, the Intelligent Data Lake Service connects to HiveServer2 and reads the first 100 rows from the Hive table.
When an analyst prepares data, the Data Preparation Service connects to HiveServer2. Depending on the size of the data, the Data Preparation Service reads sample data or all data from the Hive table and displays the data in the worksheet.
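Functionally, the preview is a bounded read over HiveServer2. The following minimal Python sketch uses the open-source pyhive client; the host, credentials, and table name are placeholders.

    # Illustrative sketch only -- a bounded preview read over HiveServer2.
    from pyhive import hive  # open-source HiveServer2 client

    conn = hive.connect(host='hiveserver2.example.com', port=10000,
                        username='idl_user', database='default')
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM sales LIMIT 100')  # preview: first 100 rows
    for row in cursor.fetchall():
        print(row)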
When an analyst uploads data, the Intelligent Data Lake Service connects to the Data Integration Service to read the temporary data staged in the HDFS system and write the data to a Hive table. When an analyst publishes prepared data, the Intelligent Data Lake Service connects to the Data Integration Service to run the converted mappings in the Hadoop environment. The Data Integration Service pushes the processing to nodes in the Hadoop cluster. The service applies the mapping to the data in the input source and writes the transformed data to a Hive table.
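In Hive terms, the final step of an upload (moving staged HDFS data into a Hive table) can be pictured with the following sketch. It is illustrative only: the actual write is performed by Data Integration Service mappings, and the table schema, file format, and paths are hypothetical.

    # Illustrative sketch only -- the real write is performed by Data
    # Integration Service mappings; this shows equivalent HiveQL steps.
    from pyhive import hive  # open-source HiveServer2 client

    conn = hive.connect(host='hiveserver2.example.com', port=10000,
                        username='idl_user', database='default')
    cursor = conn.cursor()

    # Create the target Hive table if it does not exist (schema is hypothetical).
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS customers (
            id INT, name STRING, country STRING
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    """)

    # Move the staged HDFS file into the table's warehouse location.
    cursor.execute(
        "LOAD DATA INPATH '/idl/staging/uploads/customers.csv' "
        "INTO TABLE customers"
    )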