Tuning the Hadoop Run-time Engines

Tune the Blaze and Spark engines to meet the big data processing requirements of the deployment type.

Tuning the Spark Engine

Tune the Spark engine according to the deployment type, which defines the big data processing requirements for the engine. When you tune the Spark engine, the autotune command configures the Spark advanced properties in the Hadoop connection.
The following table describes the advanced properties that are tuned:
| Property | Description |
|---|---|
| spark.driver.memory | The driver process memory that the Spark engine uses to run mapping jobs. |
| spark.executor.memory | The amount of memory that each executor process uses to run tasklets on the Spark engine. |
| spark.executor.cores | The number of cores that each executor process uses to run tasklets on the Spark engine. |
| spark.sql.shuffle.partitions | The number of partitions that the Spark engine uses to shuffle data to process joins or aggregations in a mapping job. |
The following table lists the tuned value for each advanced property based on the deployment type:
| Property | Sandbox | Basic | Standard | Advanced |
|---|---|---|---|---|
| spark.driver.memory | 1 GB | 2 GB | 4 GB | 4 GB |
| spark.executor.memory | 2 GB | 4 GB | 6 GB | 6 GB |
| spark.executor.cores | 2 | 2 | 2 | 2 |
| spark.sql.shuffle.partitions | 100 | 400 | 1500 | 3000 |
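The tuned Spark values above can be captured as a small lookup, which is useful for reviewing what a given deployment type is expected to set. The following is a minimal sketch, not part of the product: the names `SPARK_TUNED_VALUES`, `spark_properties`, and `render` are hypothetical, and writing memory sizes in Spark's size notation (`4g` for 4 GB) is an assumption about how the values are expressed.

```python
# Tuned Spark values per deployment type, transcribed from the table above.
# Assumption: memory sizes are written in Spark's size notation ("4g" = 4 GB).
SPARK_TUNED_VALUES = {
    "Sandbox": {
        "spark.driver.memory": "1g",
        "spark.executor.memory": "2g",
        "spark.executor.cores": "2",
        "spark.sql.shuffle.partitions": "100",
    },
    "Basic": {
        "spark.driver.memory": "2g",
        "spark.executor.memory": "4g",
        "spark.executor.cores": "2",
        "spark.sql.shuffle.partitions": "400",
    },
    "Standard": {
        "spark.driver.memory": "4g",
        "spark.executor.memory": "6g",
        "spark.executor.cores": "2",
        "spark.sql.shuffle.partitions": "1500",
    },
    "Advanced": {
        "spark.driver.memory": "4g",
        "spark.executor.memory": "6g",
        "spark.executor.cores": "2",
        "spark.sql.shuffle.partitions": "3000",
    },
}


def spark_properties(deployment_type):
    """Return a copy of the tuned properties for a deployment type (case-insensitive)."""
    wanted = deployment_type.strip().lower()
    for name, props in SPARK_TUNED_VALUES.items():
        if name.lower() == wanted:
            return dict(props)
    raise ValueError(f"Unknown deployment type: {deployment_type!r}")


def render(props):
    """Render properties as one name=value pair per line, for review only."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    print(render(spark_properties("Standard")))
```

Only spark.sql.shuffle.partitions differs between the Standard and Advanced deployment types, and the lookup makes that easy to confirm by comparing the two dictionaries.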

Tuning the Blaze Engine

Tune the Blaze engine to meet big data processing requirements. When you tune the Blaze engine, the autotune command configures the Blaze advanced properties in the Hadoop connection.
The following table describes the Blaze properties that are tuned:
| Property | Description | Value |
|---|---|---|
| infagrid.orch.scheduler.oop.container.pref.memory | The amount of memory that the Blaze engine uses to run tasklets. | 5120 |
| infagrid.orch.scheduler.oop.container.pref.vcore | The number of DTM instances that run on the Blaze engine. | 4 |
| infagrid.tasklet.dtm.buffer.block.size | The amount of buffer memory that a DTM instance uses to move a block of data in a tasklet. | 6553600 |

The tuned Blaze properties do not depend on the deployment type.
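Because the tuned Blaze values are the same for every deployment type, a simple comparison can show whether a set of Blaze advanced properties matches them. The sketch below is illustrative only: `BLAZE_TUNED_VALUES` and `blaze_drift` are hypothetical names, and the `current` dictionary stands in for property values that you would copy by hand from the Hadoop connection.

```python
# Tuned Blaze values, transcribed from the table above.
# They are the same for every deployment type.
BLAZE_TUNED_VALUES = {
    "infagrid.orch.scheduler.oop.container.pref.memory": 5120,
    "infagrid.orch.scheduler.oop.container.pref.vcore": 4,
    "infagrid.tasklet.dtm.buffer.block.size": 6553600,
}


def blaze_drift(current):
    """Return {property: (current_value, tuned_value)} for every mismatch.

    A property that is missing from `current` is reported with None
    as its current value.
    """
    drift = {}
    for name, tuned in BLAZE_TUNED_VALUES.items():
        actual = current.get(name)
        if actual != tuned:
            drift[name] = (actual, tuned)
    return drift


if __name__ == "__main__":
    # Hypothetical values copied from a connection for comparison.
    current = {
        "infagrid.orch.scheduler.oop.container.pref.memory": 4096,
        "infagrid.orch.scheduler.oop.container.pref.vcore": 4,
    }
    for name, (actual, tuned) in blaze_drift(current).items():
        print(f"{name}: current={actual}, tuned={tuned}")
```

An empty result means that every listed property already holds its tuned value.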