Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out (only for some users)

Question asked by vejcik on Mar 27, 2014
Latest reply on Mar 28, 2014 by yufeldman
We're getting the above message when trivial command-line MapReduce jobs fail at the reducer. Most references suggest the cause is hostname resolution, DNS, or something similar. However, we have also observed that the same command runs fine for some users and fails for others, so we do not believe the problem lies in, for instance, the /etc/hosts file. Does anyone have suggestions on what to look for?