Hadoop Environment Logs
The Data Integration Service generates log events when you run a mapping in the Hadoop environment.
You can view logs for the Blaze engine, the Spark engine, or the Hive engine run on MapReduce or Tez. You can view log events related to different types of errors, such as Hadoop connection failures, Hive query failures, Hive command failures, or other Hadoop job failures.
When you run a mapping on the Spark engine, you can view the Scala code in logs that the Logical Data Translation Generator generates from the Informatica mapping.
You can view reject files in the reject file directory specified for the Data Integration Service.
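If you need to check for rejected rows programmatically, a minimal sketch such as the following can scan the reject file directory. The directory path below is an assumption for illustration; substitute the reject file directory configured for your Data Integration Service.

```python
import os

# Assumption: the reject file directory configured for the Data Integration Service.
REJECT_DIR = "/opt/informatica/rejectfiles"

def list_reject_files(reject_dir):
    """Return (mtime, size, path) tuples for reject files, newest first."""
    entries = []
    for root, _dirs, files in os.walk(reject_dir):
        for name in files:
            path = os.path.join(root, name)
            entries.append((os.path.getmtime(path), os.path.getsize(path), path))
    return sorted(entries, reverse=True)

if __name__ == "__main__":
    for mtime, size, path in list_reject_files(REJECT_DIR):
        print(f"{size:>12} bytes  {path}")
```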
YARN Web User Interface
You can view the applications that ran on a cluster in the YARN web user interface. Click the Monitoring URL for Blaze, Hive, or Spark jobs to access the YARN web user interface.
Blaze, Spark, and Hive engines run on the Hadoop cluster that you configure in the Hadoop connection. The YARN web user interface shows each job that the engine runs as a YARN application.
The following image shows the Application Monitoring page of the YARN web user interface:
The Application Type indicates which engine submitted the YARN application.
The application ID is the unique identifier for the application. The application ID is a link to the application summary. The URL is the same as the Monitoring URL in the Administrator tool.
Click the Logs link in the application summary to view the application logs on the Hadoop cluster.
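The information that the YARN web user interface shows is also exposed by the YARN ResourceManager REST API, which can be convenient for scripted monitoring. The sketch below lists the applications on the cluster with their type, state, and tracking URL; the ResourceManager host and port are assumptions for your cluster.

```python
import json
import urllib.request

# Assumption: the ResourceManager web address for your cluster.
RM_URL = "http://resourcemanager.example.com:8088"

def list_yarn_applications(rm_url):
    """Fetch all applications from the ResourceManager REST API."""
    with urllib.request.urlopen(f"{rm_url}/ws/v1/cluster/apps") as response:
        apps = json.load(response).get("apps") or {}
    return apps.get("app", [])

if __name__ == "__main__":
    for app in list_yarn_applications(RM_URL):
        # applicationType indicates which engine submitted the application.
        print(app["id"], app["applicationType"], app["state"], app["trackingUrl"])
```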
The amount of information in the application logs depends on the tracing level that you configure for a mapping in the Developer tool. The following table describes the amount of information that appears in the application logs for each tracing level:
| Tracing Level | Messages |
|---|---|
| None | The log displays FATAL messages. FATAL messages include non-recoverable system failures that cause the service to shut down or become unavailable. |
| Terse | The log displays FATAL and ERROR messages. ERROR messages include connection failures, failures to save or retrieve metadata, and service errors. |
| Normal | The log displays FATAL, ERROR, and WARNING messages. WARNING messages include recoverable system failures or warnings. |
| Verbose initialization | The log displays FATAL, ERROR, WARNING, and INFO messages. INFO messages include system and service change messages. |
| Verbose data | The log displays FATAL, ERROR, WARNING, INFO, and DEBUG messages. DEBUG messages are user request logs. |
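The tracing levels map closely onto standard logging severities. The following sketch is only an illustration of how each level gates the messages that appear in the application logs, using Python's logging thresholds; the level names are the Developer tool tracing levels from the table above.

```python
import logging

# Illustrative mapping of Developer tool tracing levels to logging thresholds.
# FATAL corresponds to CRITICAL in Python's logging module.
TRACING_LEVELS = {
    "None": logging.CRITICAL,                 # FATAL only
    "Terse": logging.ERROR,                   # FATAL, ERROR
    "Normal": logging.WARNING,                # FATAL, ERROR, WARNING
    "Verbose initialization": logging.INFO,   # FATAL, ERROR, WARNING, INFO
    "Verbose data": logging.DEBUG,            # FATAL, ERROR, WARNING, INFO, DEBUG
}

def messages_shown(tracing_level):
    """Return the message severities that appear in the log at a tracing level."""
    threshold = TRACING_LEVELS[tracing_level]
    names = ["DEBUG", "INFO", "WARNING", "ERROR", "FATAL"]
    levels = [logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL]
    return [name for name, level in zip(names, levels) if level >= threshold]

if __name__ == "__main__":
    for tracing_level in TRACING_LEVELS:
        print(f"{tracing_level}: {', '.join(messages_shown(tracing_level))}")
```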
Accessing the Monitoring URL
The Monitoring URL opens the Blaze Job Monitor web application or the YARN web user interface. Access the Monitoring URL from the Execution Statistics view in the Administrator tool.
1. In the Monitor tab of the Administrator tool, click the Execution Statistics view.
2. Select Ad Hoc Jobs or select a deployed mapping job or workflow from an application in the Navigator.
The list of jobs appears in the contents panel.
3. Select a mapping job and expand the mapping to select a grid task for the mapping.
The Monitoring URL appears in the Properties view.
Viewing Hadoop Environment Logs in the Administrator Tool
You can view log events for a Blaze or Hive mapping from the Monitor tab of the Administrator tool.
1. In the Administrator tool, click the Monitor tab.
2. Select the Execution Statistics view.
3. In the Navigator, choose to open an ad hoc job, a deployed mapping job, or a workflow.
- To choose an ad hoc job, expand a Data Integration Service and click Ad Hoc Jobs.
- To choose a deployed mapping job, expand an application and click Deployed Mapping Jobs.
- To choose a workflow, expand an application and click Workflows.
The list of jobs appears in the contents panel.
4. Click Actions > View Logs for Selected Object to view the run-time logs for the mapping.
The log file shows the results of the Hive queries and Blaze engine queries that the Data Integration Service runs, including the location of the Hive session logs and the Hive session history file.
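If you prefer to pull the run-time logs directly from the cluster instead of the Administrator tool, the aggregated YARN container logs for a finished application can usually be retrieved with the `yarn logs` command, provided log aggregation is enabled on the cluster. The application ID below is a placeholder; copy the real ID from the Monitor tab or the YARN web user interface.

```python
import subprocess

def fetch_yarn_logs(application_id, output_path):
    """Save the aggregated container logs for a YARN application to a file.

    Requires the Hadoop client on the PATH and YARN log aggregation enabled.
    """
    with open(output_path, "w") as out:
        subprocess.run(
            ["yarn", "logs", "-applicationId", application_id],
            stdout=out,
            check=True,
        )

if __name__ == "__main__":
    # Placeholder application ID; use the ID shown for your job.
    fetch_yarn_logs("application_1500000000000_0001", "application_logs.txt")
```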
Monitoring a Mapping
You can monitor a mapping that runs in the Hadoop environment.
1. In the Administrator tool, click the Monitor tab.
2. Select the Execution Statistics view.
3. In the Navigator, choose to open an ad hoc job, a deployed mapping job, or a workflow.
- To choose an ad hoc job, expand a Data Integration Service and click Ad Hoc Jobs.
- To choose a deployed mapping job, expand an application and click Deployed Mapping Jobs.
- To choose a workflow, expand an application and click Workflows.
The list of jobs appears in the contents panel.
4. Click a job to view its properties.
The contents panel shows the default Properties view for the job. For a mapping run on the Blaze engine, the Blaze engine monitoring URL appears in the general properties in the details panel. For Spark jobs, the monitoring URL is a link to the YARN web user interface; a programmatic alternative is sketched after this procedure.
5. Choose a view in the contents panel to view more information about the job:
- - To view the execution plan for the mapping, select the Execution Plan view.
- - To view the summary statistics for a job, click the Summary Statistics view.
- - To view the detailed statistics for a job, click the Detailed Statistics view.
Note: You can view the number of rows processed in the Summary Statistics view for a Hive source or target. The remaining values do not appear for Hive sources and targets.
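As an alternative to clicking through to the YARN web user interface, the application summary behind the monitoring URL can also be read from the ResourceManager REST API. The sketch below retrieves the state, progress, and tracking URL for a single application; the ResourceManager address and the application ID are assumptions for illustration.

```python
import json
import urllib.request

# Assumptions: ResourceManager address and the application ID of the job.
RM_URL = "http://resourcemanager.example.com:8088"
APP_ID = "application_1500000000000_0001"

def get_application_summary(rm_url, app_id):
    """Fetch the summary of a single YARN application."""
    with urllib.request.urlopen(f"{rm_url}/ws/v1/cluster/apps/{app_id}") as response:
        return json.load(response)["app"]

if __name__ == "__main__":
    app = get_application_summary(RM_URL, APP_ID)
    print("State:", app["state"], "Final status:", app["finalStatus"])
    print("Progress:", app["progress"], "Tracking URL:", app["trackingUrl"])
```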