
Understanding Local Node tmp usage

Question asked by mandoskippy on Sep 26, 2012
Latest reply on Sep 26, 2012 by nabeel
I've noticed some iowait on the local OS drive of my nodes during large import jobs. I am using Hive and Hive transforms to load my data, and was wondering if some of the gurus here could help explain how the local /tmp is used by nodes during Hadoop jobs.

So I see a directory:
/tmp/mapr-hadoop/mapred/local

In the directory are a few other directories:
taskTracker
toBeDeleted
tt_log_tmp
ttprivate


When a job runs, what is written here, and how much data? How is it used in a Hive job? Should I consider finding faster storage for my /tmp? In my case, the local OS drives aren't on the fastest storage. They are shared resources, and if all nodes hit them heavily, it may make sense to spec out an alternative location for /tmp. Is there any MapR-specific documentation on how I can tune this to get the best performance?
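
For anyone who wants to put a number on the "how much data?" part, here is a rough sketch that totals the bytes under each subdirectory of the path above. Running it a few times while a job is active should show which directory grows. The path comes from this post; the function names are made up for illustration, and this only measures space used at one instant, not I/O rate.

```python
import os

# Path taken from the post above; adjust if your node uses a different location.
LOCAL_DIR = "/tmp/mapr-hadoop/mapred/local"


def dir_size_bytes(path):
    """Sum the sizes of all file entries under path, skipping any that vanish mid-scan."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Task files are short-lived and may be removed while we walk.
                pass
    return total


def usage_by_subdir(base):
    """Return {subdirectory name: bytes used} for each immediate subdirectory of base."""
    usage = {}
    try:
        entries = sorted(os.listdir(base))
    except OSError:
        # Base directory missing or unreadable: report nothing.
        return usage
    for entry in entries:
        full = os.path.join(base, entry)
        if os.path.isdir(full):
            usage[entry] = dir_size_bytes(full)
    return usage


if __name__ == "__main__":
    for name, size in usage_by_subdir(LOCAL_DIR).items():
        print(f"{name}: {size / 1024 / 1024:.1f} MiB")
```

Reading the directories usually requires the same user as the TaskTracker (or root), so unreadable subtrees will simply count as zero here.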

Thanks!
