spark.local.dir is a setting that takes a comma-separated list of directories Spark uses for spilling data. I have a MapR node where one disk is for the OS and the remainder are formatted for MapRFS. What should I do to get the other disks engaged for spill activity? Do I need to reformat the disks?
What is the best/recommended practice?
- Dedicate separate disks to Spark (seems a bit odd)
- Monkey with partitioning so space is left on each disk for local temp (best?)
- Write to a MapRFS volume with replication 1?
I suspect the right answer is to partition properly up front, but I thought I would ask.
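
For concreteness, here is a rough sketch of the kind of value I am trying to end up with, assuming each non-OS disk had a local (non-MapRFS) filesystem mounted somewhere. The mount points, the `spark-tmp` subdirectory, and the helper name `spark_local_dir_line` are all made up for illustration; only the `spark.local.dir` key and the comma-separated format come from Spark itself.

```python
def spark_local_dir_line(mount_points, subdir="spark-tmp"):
    """Build a spark-defaults.conf line with one spill directory per disk.

    mount_points: local mount points, one per physical disk (hypothetical here).
    subdir: scratch directory name under each mount point (an assumption).
    """
    # Normalise trailing slashes so the joined paths stay clean.
    dirs = [f"{mp.rstrip('/')}/{subdir}" for mp in mount_points]
    # spark-defaults.conf uses "key value"; the value is comma-separated.
    return "spark.local.dir " + ",".join(dirs)


# Hypothetical mount points for the non-OS disks.
print(spark_local_dir_line(["/mnt/disk1", "/mnt/disk2", "/mnt/disk3"]))
```

As far as I understand, when a cluster manager is in play this setting can be overridden by its own environment variables (SPARK_LOCAL_DIRS in standalone mode, LOCAL_DIRS on YARN), so the same list of directories may need to be configured there instead.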