
TT nodes distributed cache failure

Question asked by thealy on Jan 25, 2013
Latest reply on Jun 19, 2013 by thealy
Running V2.0.0 on a small (20-node) M3 cluster.

When running a Map/Reduce job that uses several .jars loaded into the distributed cache, several (~4) nodes have their map tasks fail with ClassNotFoundException. All the other nodes proceed through the job normally and the job completes, but this is wasting 20-25% of my TT nodes.

Can anyone explain why some nodes might fail to read all the .jars from the Distributed cache?
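
One way to narrow this down is to probe the classpath from inside a task on a failing node. The following is only a minimal sketch in plain Java, not anything MapR- or Hadoop-specific: `ClasspathProbe` and its method names are hypothetical, and whether the cached .jars show up in `java.class.path` at all depends on how they were added to the job. It checks whether a named class is loadable and lists classpath entries that do not exist on local disk.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; names are illustrative only.
public class ClasspathProbe {

    /** Returns true if the named class can be found by the context class loader. */
    public static boolean isLoadable(String className) {
        try {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            if (cl == null) {
                cl = ClasspathProbe.class.getClassLoader();
            }
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, cl);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns entries of java.class.path that do not exist on the local file system. */
    public static List<String> missingClasspathEntries() {
        List<String> missing = new ArrayList<>();
        String cp = System.getProperty("java.class.path", "");
        for (String entry : cp.split(File.pathSeparator)) {
            if (!entry.isEmpty() && !new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }
}
```

Logging the result of `isLoadable(...)` for the class named in the exception, together with `missingClasspathEntries()`, from a task on a good node and on a failing node would show whether the failing nodes are missing the localized .jar files or simply do not have them on the task classpath.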

[Apologies for cross-posting to those of you on hadoop-users; I'm not sure if this is MapR specific or not]

-Terry
