Spark Executor Memory Tuning

Galaxy Glossary

How do I tune Spark executor memory settings?

Optimizing the memory allocated to Apache Spark executors to balance performance, stability, and resource utilization.


Description

Overview

Apache Spark’s in-memory processing shines when memory is tuned correctly. Executor memory settings determine how much RAM each executor process can use for cached data, shuffle buffers, user code, and internal Spark structures. If memory is undersized, you’ll see OutOfMemoryError; if oversized, cluster nodes are under-utilized and scheduling becomes inefficient. Proper tuning requires understanding Spark’s memory model, workload characteristics, and cluster hardware.

1. Spark Memory Model Refresher

1.1 Unified Memory

Since Spark 2.0, Unified Memory Management partitions executor memory into two dynamic regions:

  • Execution memory – used for shuffles, joins, sorts, aggregations.
  • Storage memory – used to cache RDDs/DataFrames.

The regions borrow from each other dynamically, but execution can evict cached blocks only down to the share protected by spark.memory.storageFraction (default 0.5). Unified memory as a whole occupies spark.memory.fraction (default 0.6) of the executor JVM heap.
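
As a rough sketch of how these fractions carve up a given heap (the fixed ~300 MB reservation and the arithmetic mirror Spark's documented unified memory manager, but treat the numbers as estimates):

# Approximate unified-memory sizing for a given executor heap (sizes in MB).
def unified_memory(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    usable = heap_mb - 300                           # ~300 MB reserved system memory
    unified = usable * memory_fraction               # execution + storage combined
    storage_protected = unified * storage_fraction   # cached blocks safe from eviction
    execution_min = unified - storage_protected      # share guaranteed to execution
    return unified, storage_protected, execution_min

print(unified_memory(13_500))   # ~13.5 GB heap -> roughly 7.9 GB of unified memory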

1.2 Components of Executor Memory

  • JVM Heap – controlled by --executor-memory (or spark.executor.memory).
  • Off-Heap – allocated with spark.memory.offHeap.size; zero unless off-heap is enabled.
  • Memory Overhead – native memory outside the heap, set by spark.executor.memoryOverhead or calculated as max(384MB, 0.10 * executorMemory).

The sum of heap, off-heap, and overhead must fit into the node’s physical RAM minus OS daemons and other services (YARN/Nomad/K8s sidecars).
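
A quick sanity check that a proposed layout fits on a node might look like this (a sketch; sizes in GB and all names illustrative):

# Verify that executors fit into a node's physical RAM (all sizes in GB).
def fits_on_node(node_ram, os_reserve, executors_per_node,
                 heap, overhead, off_heap=0.0):
    per_executor = heap + overhead + off_heap
    return executors_per_node * per_executor + os_reserve <= node_ram

print(fits_on_node(node_ram=64, os_reserve=4, executors_per_node=4,
                   heap=13.5, overhead=1.5))   # True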

2. Why Tuning Matters

Poorly chosen executor memory settings lead to:

  • OOM errors & task retries – cascading slowdowns.
  • Garbage Collection (GC) pauses – inflated job latency.
  • Low CPU utilization – too few executors per node.
  • Inefficient shuffle spills – increased disk and network I/O.

Conversely, well-tuned memory delivers predictable performance and maximizes cluster throughput.

3. Step-by-Step Tuning Process

3.1 Profile the Cluster

  1. Record vCPU and RAM per worker node.
  2. Subtract 1 CPU & 2–4 GB RAM for the OS and daemons.

3.2 Choose Executor Cores

Start with 3–5 cores per executor to balance parallelism and GC overhead. More cores → fewer executors → higher heap per executor.
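
A trivial sketch of turning the per-node core budget into an executor count (names illustrative):

# How many executors fit on a node, after reserving cores for the OS and daemons.
def executors_per_node(node_vcpus, cores_per_executor=4, os_cores=1):
    return (node_vcpus - os_cores) // cores_per_executor

print(executors_per_node(16))   # 16 vCPUs -> 3 executors of 4 cores each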

3.3 Derive Heap Size (--executor-memory)

  1. Calculate memory available per executor: (node RAM - OS reserve) / executorsPerNode.
  2. Reserve overhead: overhead = max(384MB, 0.10 * heap).
  3. Compute heap: heap = available - overhead - offHeap. Because overhead is defined as a percentage of the heap, solve the two together (divide the budget by 1.10), as in the sketch after the example below.

Example on a 64 GB node, 4 executors/node, 4 GB OS reserve:

available = (64 - 4) / 4 = 15 GB
heap ≈ 13.5 GB (reserving 1.5 GB overhead)
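
Since overhead is a function of the heap, a small helper can solve the two together; this sketch reproduces the arithmetic above (GB units, illustrative):

# Derive executor heap and overhead from the per-executor memory budget (GB).
def derive_heap(available_gb, off_heap_gb=0.0, overhead_factor=0.10,
                overhead_floor_gb=0.384):
    budget = available_gb - off_heap_gb
    heap = budget / (1 + overhead_factor)      # assume the percentage branch wins
    overhead = max(overhead_floor_gb, overhead_factor * heap)
    heap = budget - overhead                   # re-derive heap if the 384 MB floor applied
    return round(heap, 2), round(overhead, 2)

print(derive_heap(15.0))   # -> (13.64, 1.36), close to the 13.5 GB / 1.5 GB split above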

3.4 Fine-Tune Memory Fractions

  • spark.memory.fraction (default 0.6) – lower it (e.g., 0.5) if your code holds large on-heap objects like broadcast vars.
  • spark.memory.storageFraction (default 0.5) – increase if caching datasets is critical; decrease for heavy shuffles.
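
These fractions can also be set per job rather than cluster-wide; a minimal PySpark sketch with illustrative starting values:

from pyspark.sql import SparkSession

# Illustrative: shift the starting balance toward execution for a shuffle-heavy job.
spark = (SparkSession.builder
         .appName("fraction-tuning-example")
         .config("spark.memory.fraction", "0.55")
         .config("spark.memory.storageFraction", "0.4")
         .getOrCreate())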

3.5 Enable Kryo & Compression

Smaller objects → less heap.

spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired true
spark.sql.inMemoryColumnarStorage.compressed true
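
Because spark.kryo.registrationRequired is set to true above, any custom JVM classes your job serializes must be registered or tasks will fail; one option is spark.kryo.classesToRegister, shown here in a PySpark sketch (the class name is hypothetical):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("kryo-example")
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .config("spark.kryo.registrationRequired", "true")
         # Hypothetical application class; list every custom class you shuffle or cache.
         .config("spark.kryo.classesToRegister", "com.example.MyRecord")
         .config("spark.sql.inMemoryColumnarStorage.compressed", "true")
         .getOrCreate())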

3.6 Monitor & Iterate

Use Spark UI, Ganglia, CloudWatch, or Datadog to track:

  • Peak storage/execution memory.
  • GC time (target < 10% of task time).
  • Shuffle spill metrics.
  • Task failures due to OOM.

Increase heap if spill/OOM persists; decrease if GC > 20% or CPUs idle.
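
If you prefer scripted checks to dashboards, the Spark monitoring REST API exposes per-executor memory and GC counters; a sketch using the documented /api/v1 endpoints (host, port, and thresholds depend on your deployment):

import requests

# Flag executors whose GC time exceeds ~10% of total task time (both in milliseconds).
base = "http://localhost:4040/api/v1"   # driver UI; use the history server URL for finished apps
app_id = requests.get(f"{base}/applications").json()[0]["id"]
for ex in requests.get(f"{base}/applications/{app_id}/executors").json():
    gc_ratio = ex["totalGCTime"] / max(ex["totalDuration"], 1)
    flag = "  <-- investigate" if gc_ratio > 0.10 else ""
    print(ex["id"], f"memoryUsed={ex['memoryUsed']}", f"gc={gc_ratio:.1%}{flag}")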

4. Practical Example

# 8-core node, 32 GB RAM, YARN cluster
# Reserve 2 GB for OS, run 2 executors per node, 4 cores each

spark-submit \
--master yarn \
--deploy-mode cluster \
--num-executors 6 \
--executor-cores 4 \
--executor-memory 13G \
--conf spark.executor.memoryOverhead=2G \
--conf spark.memory.fraction=0.55 \
--conf spark.memory.storageFraction=0.4 \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
my_etl_job.jar

This layout fills the node exactly (two 15 GB containers plus the 2 GB OS reserve on a 32 GB node), keeps GC under control, and minimizes shuffle spills.

5. Best Practices

  • Stay below ~32 GB heap so the JVM can keep using compressed object pointers (CompressedOops); above that threshold every reference widens to 64 bits.
  • Avoid running a single huge executor per node; GC pauses skyrocket.
  • Scale out with more executors, not bigger heaps, when the dataset grows.
  • Use dynamic allocation with min/max executor bounds in multi-tenant clusters.

6. Common Misconceptions

  1. “Just max out executor memory.” Oversized heaps lead to lengthy GC pauses.
  2. “Overhead is optional.” Native libraries (Netty, Python workers, Arrow) need it; under-estimating it gets containers killed by the resource manager.
  3. “Storage vs. execution is fixed.” Unified memory re-allocates at runtime; tuning the fractions changes the starting balance, not a hard wall.

7. Integration with Galaxy

Galaxy focuses on SQL authoring. When you offload heavy-duty Spark SQL pipelines to a cluster, ensure your spark.sql workloads are memory-tuned as described. Craft and test your statements in Galaxy, then embed them in Spark jobs with the tuned executor settings for production.

8. Checklist

  • [ ] Record node RAM & cores.
  • [ ] Decide cores/executor (3–5).
  • [ ] Calculate heap & overhead.
  • [ ] Adjust spark.memory.* fractions.
  • [ ] Enable Kryo; compress storage.
  • [ ] Monitor GC, spills, OOM.
  • [ ] Iterate.

Why Spark Executor Memory Tuning is important

Spark’s promise of in-memory speed fails if executors run out of RAM or spend half their time garbage collecting. Proper tuning prevents job failures, minimizes shuffle spills, and ensures every gigabyte of cluster memory translates into useful work. In multi-tenant environments it also avoids noisy-neighbor syndrome by keeping each job within predictable resource envelopes.

Spark Executor Memory Tuning Example Usage


spark-submit --executor-memory 13G --executor-cores 4 my_job.jar


Frequently Asked Questions (FAQs)

What is the safest starting point for executor memory?

Begin with a 4 GB heap and roughly 512 MB of overhead per core, then scale based on Spark UI metrics. This usually avoids immediate OOM while leaving room for the OS.

How do off-heap settings affect executor memory?

When spark.memory.offHeap.enabled=true, Spark allocates the specified spark.memory.offHeap.size outside the JVM. Subtract this from available RAM just like overhead.
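
A minimal sketch of enabling off-heap memory (the 2 GB size is illustrative):

from pyspark.sql import SparkSession

# Off-heap memory lives outside the JVM heap; budget for it like overhead.
spark = (SparkSession.builder
         .appName("offheap-example")
         .config("spark.memory.offHeap.enabled", "true")
         .config("spark.memory.offHeap.size", "2g")
         .getOrCreate())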

Why does GC time spike when I increase executor memory?

Larger heaps prolong stop-the-world GC cycles. Instead of raising the heap, add more executors or fine-tune the garbage collector (e.g., G1, or ZGC on Java 11+).
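
For example, a sketch of pointing executors at G1 with a pause-time goal via standard HotSpot flags (values are illustrative; G1 is already the default on recent JDKs):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("gc-tuning-example")
         # Standard HotSpot flags passed to each executor JVM.
         .config("spark.executor.extraJavaOptions",
                 "-XX:+UseG1GC -XX:MaxGCPauseMillis=200")
         .getOrCreate())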

Can I rely on dynamic allocation instead of manual tuning?

Dynamic allocation resizes the number of executors, not the memory per executor. You must still set sensible baseline memory and overhead values.
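
A sketch of pairing dynamic allocation with a fixed per-executor baseline (executor counts and sizes are illustrative; shuffle tracking is one way to avoid an external shuffle service on Spark 3.x):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("dynamic-allocation-example")
         .config("spark.executor.memory", "8g")            # baseline heap is still required
         .config("spark.executor.memoryOverhead", "1g")
         .config("spark.dynamicAllocation.enabled", "true")
         .config("spark.dynamicAllocation.minExecutors", "2")
         .config("spark.dynamicAllocation.maxExecutors", "20")
         .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
         .getOrCreate())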
