
Out of memory error in reduce

Question asked by heathbar on Dec 1, 2012
Latest reply on Dec 2, 2012 by aaron
I have a map stage that sends about 50 GB to each of 45 reducers (using Hadoop Pipes). The reducers are being killed frequently: so far no reducer has completed, but 186 attempts have been killed.

The error for each killed reducer attempt looks like this:
<pre>
Killing one of the memory-consuming tasks - attempt_201212020243_0001_r_000041_0, as the cumulative RSS memory usage of all the tasks on the TaskTracker exceeds physical memory limit 5048893440.
</pre>
Some are killed during the copy phase, some during the sort phase, and some during the reduce phase. A few reducers have made it into the reduce phase, but none has finished before being killed.

The reducer code itself allocates 1 GB of RAM, my nodes have 7 GB, and I have configured the TaskTracker with 1 map slot and 2 reduce slots per node. I'm using native Snappy compression and 40 copy threads. (The 5048893440-byte limit in the log works out to roughly 4.7 GB of the 7 GB on each node.)
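
For reference, here is roughly what that looks like in mapred-site.xml on my nodes. This is just a sketch using the stock Hadoop 1.x property names; MapR may layer its own defaults on top, and my actual file has more entries than this:
<pre>
&lt;configuration&gt;
  &lt;!-- 1 map slot and 2 reduce slots per TaskTracker --&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.tasktracker.map.tasks.maximum&lt;/name&gt;
    &lt;value&gt;1&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.tasktracker.reduce.tasks.maximum&lt;/name&gt;
    &lt;value&gt;2&lt;/value&gt;
  &lt;/property&gt;

  &lt;!-- 40 parallel copier threads per reduce during the shuffle --&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.reduce.parallel.copies&lt;/name&gt;
    &lt;value&gt;40&lt;/value&gt;
  &lt;/property&gt;

  &lt;!-- native Snappy compression for intermediate map output --&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.compress.map.output&lt;/name&gt;
    &lt;value&gt;true&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.map.output.compression.codec&lt;/name&gt;
    &lt;value&gt;org.apache.hadoop.io.compress.SnappyCodec&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;
</pre>
Note that the Pipes reducer runs as a separate C++ process, so its 1 GB allocation sits on top of whatever the child JVMs themselves use.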

Any tips for sorting out what exactly is eating up all the RAM?

[Link to related log files...](http://graphics.stanford.edu/~heathkh/mapr/hadoop_error_logs_01/)
