Spark on YARN: configuring the scratch directory (SPARK_LOCAL_DIRS)

Question asked by dafox on Dec 11, 2015
Latest reply on Dec 17, 2015 by dafox
Hi,
We are trying to run Spark on YARN on a large dataset. We have run into a problem with the temporary directory where Spark stores data during shuffling. By default this directory is under /tmp/hadoop-mapr, and the partition hosting /tmp is not very large, so our jobs fail.

We would like Spark to store this data on a local volume on maprfs. There is documentation on how to configure this for [Spark Standalone](http://doc.mapr.com/display/MapR/Configure+Scratch+Directory+for+Spark+Standalone). With Spark on YARN, however, this configuration is taken from YARN.

How can we configure the scratch directory location for Spark on YARN?

Thanks.
