
Underutilized Cluster (Dynamic number of reduce slots?)

Question asked by jerdavis on May 14, 2012
Latest reply on May 24, 2012 by jerdavis

It seems there is a tradeoff when setting max mappers and max reducers, and I'm wondering if there is another setting somewhere to help tune this.

I can configure my cluster to reach 95% utilization with a good mix of mappers and reducers. However, I often end up with a final reduce that takes a long time, during which the cluster is only 50% utilized (because I have reserved those resources for maps that might come along but never will).

On the other hand, if I do allow more reduce slots, it's easy for me to OOM the nodes during normal (mixed) execution.
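To make the tradeoff concrete, here is a rough sketch of the arithmetic. All numbers (node memory, heap sizes, slot counts) are hypothetical and chosen only for illustration; they are not taken from the cluster described here. The slot counts stand in for the per-node caps, which in Hadoop MRv1 would presumably be `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`.

```python
# Hypothetical per-node numbers, chosen only to illustrate the tradeoff.
NODE_MEMORY_MB = 16 * 1024   # assumed memory available to tasks on one node
MAP_HEAP_MB = 1024           # assumed heap per map task
REDUCE_HEAP_MB = 2048        # assumed heap per reduce task


def worst_case_memory_mb(map_slots, reduce_slots):
    """Memory needed if every map and reduce slot is busy at the same time."""
    return map_slots * MAP_HEAP_MB + reduce_slots * REDUCE_HEAP_MB


def reduce_only_utilization(map_slots, reduce_slots):
    """Fraction of node memory in use when only the reducers are running."""
    return reduce_slots * REDUCE_HEAP_MB / NODE_MEMORY_MB


# Compare a conservative reduce cap with a doubled one.
for map_slots, reduce_slots in [(8, 4), (8, 8)]:
    need = worst_case_memory_mb(map_slots, reduce_slots)
    util = reduce_only_utilization(map_slots, reduce_slots)
    fits = need <= NODE_MEMORY_MB
    print(f"maps={map_slots} reduces={reduce_slots} "
          f"worst_case={need}MB fits={fits} reduce_only_util={util:.0%}")
```

With these assumed values, the first setting fits in memory under mixed load but leaves half the node idle during a reduce-only tail, while the second fills the node during that tail but oversubscribes memory whenever maps and reduces run together.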

Any suggestions on how to handle this?