
Setting mapred.tasktracker.reduce.tasks.maximum for CPU intensive workload

Question asked by mkarthikswamy on Dec 12, 2014

The MapR recommendation for "mapred.tasktracker.reduce.tasks.maximum" is as follows:
Number of reducers: no more than half to two thirds as many reducers as there are disks. Set the value of the mapred.tasktracker.reduce.tasks.maximum parameter in mapred-site.xml (the default is -1, which means calculate automatically).

My blades have 20 cores but only 10 disks.

For a highly CPU-intensive workload on the reducer side, wouldn't it make better use of the CPUs to allow (cores-2)*2 reducers per node rather than limiting the count based on the number of disks?
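
To make the comparison concrete, here is a rough sketch of the arithmetic for my hardware. The function names, the rounding down, and the default arguments are my own assumptions for illustration; they are not taken from the MapR documentation or from how MapR computes the value when it is set to -1.

```python
def disk_based_reducer_range(disks):
    """Half to two thirds of the disk count, rounded down (assumed rounding)."""
    low = max(1, disks // 2)
    high = max(low, (2 * disks) // 3)
    return low, high


def cpu_based_reducers(cores, reserved_cores=2, per_core=2):
    """The (cores - 2) * 2 formula from the question; defaults are hypothetical."""
    return max(1, (cores - reserved_cores) * per_core)


cores, disks = 20, 10
low, high = disk_based_reducer_range(disks)
cpu_slots = cpu_based_reducers(cores)

print(f"disk-based rule: {low} to {high} reducers per node")
print(f"core-based rule: {cpu_slots} reducers per node")
print(f"disks per reducer under the core-based rule: {disks / cpu_slots:.2f}")
```

For 20 cores and 10 disks this gives 5 to 6 reducers per node under the disk rule versus 36 under the core formula, so each reducer would share a disk with several others if I sized by cores.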

Can someone please point out what other aspects should be considered here?

Thanks & Regards