Component Architecture

The Big Data Streaming components for a streaming mapping include client tools, application services, repositories, and third-party tools.
The following image shows the components that Big Data Streaming uses for streaming mappings:
[Image: Big Data Streaming component architecture, showing client tools, application services, repositories, and third-party tools]

Clients and Tools

Based on your product license, you can use multiple Informatica tools and clients to manage streaming mappings.
Use the following tools to manage streaming mappings:
Informatica Administrator
Monitor the status of mappings on the Monitoring tab of the Administrator tool. This tab is also called the Monitoring tool.
Informatica Developer
Create and run mappings on the Spark engine from the Developer tool.
Informatica Analyst
Create rules in Informatica Analyst and run the rules as mapplets in a streaming mapping.

Application Services

Big Data Streaming uses application services in the Informatica domain to process data. The application services depend on the task you perform.
Big Data Streaming uses the following application services when you create and run streaming mappings:
Data Integration Service
The Data Integration Service processes mappings on the Spark engine in the Hadoop environment. The Data Integration Service retrieves metadata from the Model repository when you run a Developer tool mapping. The Developer tool connects to the Data Integration Service to run mappings.
Metadata Access Service
The Metadata Access Service is a user-managed service that provides metadata from a Hadoop cluster to the Developer tool at design time. HBase, HDFS, Hive, and MapR-DB connections use the Metadata Access Service when you import an object from a Hadoop cluster. Create and configure a Metadata Access Service before you create HBase, HDFS, Hive, MapR Streams, and MapR-DB connections.
Model Repository Service
The Model Repository Service manages the Model repository. The Model Repository Service connects to the Model repository when you run a mapping.
Analyst Service
The Analyst Service runs the Analyst tool in the Informatica domain. The Analyst Service manages the connections between service components and the users that have access to the Analyst tool.
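
The run-time relationship between the Data Integration Service and the Model Repository Service can be pictured with a small toy model. This is only an illustrative sketch: every class, method, and mapping name below is hypothetical and none of them is part of an Informatica API. It shows the order of the steps described above, in which metadata is retrieved from the Model repository and the mapping is then handed to the Spark engine.

```python
# Toy model of the run-time flow described above. All class, method, and
# mapping names are hypothetical; none of them is an Informatica API.
from dataclasses import dataclass, field


@dataclass
class ModelRepositoryService:
    """Stands in for the service that manages the Model repository."""
    mappings: dict = field(default_factory=dict)

    def get_mapping(self, name: str) -> dict:
        # Look up the stored metadata for a mapping by name.
        return self.mappings[name]


@dataclass
class DataIntegrationService:
    """Stands in for the service that runs mappings on the Spark engine."""
    repository_service: ModelRepositoryService
    log: list = field(default_factory=list)

    def run_mapping(self, name: str) -> str:
        # Step 1: retrieve mapping metadata from the Model repository.
        metadata = self.repository_service.get_mapping(name)
        self.log.append(f"retrieved metadata for {name}")
        # Step 2: hand the mapping to the engine named in the metadata.
        self.log.append(f"submitted {name} to the {metadata['engine']} engine")
        return "SUBMITTED"


# The Developer tool would play the role of the caller here.
mrs = ModelRepositoryService(mappings={"m_clickstream": {"engine": "Spark"}})
dis = DataIntegrationService(repository_service=mrs)
status = dis.run_mapping("m_clickstream")
```

In the actual product these interactions happen inside the Informatica domain; the sketch only makes the sequence of responsibilities explicit.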

Repository

Big Data Streaming includes a repository to store data related to connections and source metadata. Big Data Streaming uses application services in the Informatica domain to access data in the repository.
Big Data Streaming stores Spark streaming mappings in the Model repository. You can manage the Model repository in the Developer tool.

Third-Party Applications

Big Data Streaming uses third-party distributions to connect to a Spark engine on a Hadoop cluster.
Big Data Streaming pushes job processing to the Spark engine. It uses YARN to manage the resources on a Spark cluster more efficiently.
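
For readers unfamiliar with Spark on YARN, the following sketch shows what a generic Spark job submission to a YARN-managed cluster looks like. It is not how Big Data Streaming submits jobs internally, which this guide does not describe. The helper name `build_spark_submit`, the application path, and the queue name are assumptions made for illustration. The `spark-submit` options used (`--master yarn`, `--deploy-mode`, `--queue`, `--conf`) are standard Apache Spark options.

```python
# Builds a generic spark-submit command line for a YARN cluster.
# The helper name, application path, and queue are illustrative assumptions.
from typing import Dict, List, Optional


def build_spark_submit(
    app_path: str,
    queue: str = "default",
    conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the argument list for submitting app_path to YARN in cluster mode."""
    cmd = [
        "spark-submit",
        "--master", "yarn",          # let YARN allocate cluster resources
        "--deploy-mode", "cluster",  # run the driver inside the cluster
        "--queue", queue,            # YARN queue that receives the job
    ]
    # Append Spark properties in a stable (sorted) order.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_path)
    return cmd


command = build_spark_submit(
    "streaming_job.py",
    queue="streaming",
    conf={"spark.executor.memory": "2g"},
)
```

The resulting list could be passed to `subprocess.run` on a host where Spark is installed. YARN then decides where the driver and executors run, which is the resource-management role described above.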