Spark Troubleshooting Guide: Memory Management: How to troubleshoot out-of-memory (OOM) issues on the Spark executor

Document created by hdevanath Employee on Jun 19, 2017

The executor memory may need to be tuned for optimal performance. The following are example scenarios where a memory reconfiguration may be required or worth evaluating:
- Executor OOM errors
- Stages running slower than usual

By default, the memory allocated to a Spark executor (spark.executor.memory) is 1 GB.

If this memory is not adequate, it leads to frequent full garbage collections (GC), each of which attempts to reclaim unused heap memory.
If less than 2% of the heap is recovered across the last 5 consecutive full GC cycles, the JVM throws an OutOfMemoryError.
The stack trace looks similar to the following:

17/06/07 11:25:00 ERROR akka.ErrorMonitor: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-29] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space
Exception in thread "task-result-getter-0" java.lang.OutOfMemoryError: Java heap space
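Before increasing memory, it can help to confirm that frequent full GCs are actually the cause. One way is to enable GC logging on the executors so GC activity appears in the executor stderr logs. A minimal sketch (the flags below are standard JDK 8 HotSpot GC-logging options, passed via spark.executor.extraJavaOptions):

```shell
# Hedged example: surface GC activity in executor logs (JDK 8 HotSpot flags).
--conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```

If the logs show back-to-back full GCs that each recover little memory, increasing executor memory is the likely fix.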


This can hang the executor process so that it makes no progress. Its tasks are eventually killed, with an error similar to the one below:
 

Container killed by YARN for exceeding memory limits. 12.0 GB of 12 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead 
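The 12 GB limit in this message is the YARN container size, which is the executor memory plus the overhead memory. A rough sketch of the arithmetic, assuming the common default overhead formula of max(384 MB, 10% of executor memory) (check spark.yarn.executor.memoryOverhead for your Spark version):

```shell
# Hedged sketch: estimate the YARN container request for an executor.
# Assumes overhead = max(384 MB, 10% of executor memory); the 11g figure
# below is illustrative.
executor_mem_mb=11264                        # e.g. spark.executor.memory=11g
overhead_mb=$(( executor_mem_mb / 10 ))      # 10% of executor memory
[ "$overhead_mb" -lt 384 ] && overhead_mb=384
container_mb=$(( executor_mem_mb + overhead_mb ))
echo "container request: ${container_mb} MB"   # roughly 12 GB, as in the error above
```

If the processes inside the container (heap plus JVM overhead) exceed this request, YARN kills the container, which is why raising only spark.executor.memory without the overhead can still fail.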


To resolve this, increase the Spark executor memory. If running on YARN, it is recommended to increase the overhead memory as well to avoid OOM issues; overhead memory is used for JVM threads, internal metadata, and so on. The following settings can be passed as part of the spark-submit command or set in the spark-defaults.conf file.

 

--conf "spark.executor.memory=12g"
--conf "spark.yarn.executor.memoryOverhead=2048"
or
--executor-memory 12g
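Putting the settings together, a full submission might look like the following sketch (the application class and jar names are illustrative placeholders):

```shell
# Hedged example: spark-submit on YARN with increased executor memory
# and overhead. com.example.MyApp and app.jar are placeholders.
spark-submit \
  --master yarn \
  --class com.example.MyApp \
  --conf "spark.executor.memory=12g" \
  --conf "spark.yarn.executor.memoryOverhead=2048" \
  app.jar
```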
