
Failure committing: java.io.IOException: Error: Not a directory

Question asked by sorenmacbeth on Nov 2, 2013
Latest reply on Aug 27, 2015 by dannyman
Hello,

I'm running into an issue on our cluster where a task fails to commit. It typically happens when a job has more map tasks than slots, so all slots are in use for an extended period. Here is the full stack trace:

<code><pre>
2013-11-02 20:30:40,084 WARN org.apache.hadoop.mapred.Task: Failure committing: java.io.IOException: Error: Not a directory
  at com.mapr.fs.MapRFileSystem.rename(MapRFileSystem.java:523)
  at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
  at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
  at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
  at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
  at org.apache.hadoop.mapred.Task.commit(Task.java:1040)
  at org.apache.hadoop.mapred.Task.done(Task.java:901)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
  at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
  at org.apache.hadoop.mapred.Child.main(Child.java:264)
</pre></code>

The following errors are also reported in stderr from the same task:

<code><pre>
Timing out request 28.15 sent to 10.89.0.125:5660
Other ips are: 10.89.0.125:5660 208.52.187.125:5660
2013-11-02 20:28:45,1607 ERROR Client fs/client/fileclient/cc/client.cc:3468 Thread: 139716839499520 rpc err Connection timed out(110) 28.15 to 10.89.0.125:5660, fid 2565.2555.1742558, upd 1
</pre></code>

The cluster is running MapR M3 v. 3.0.1.21771.GA.

Any help is greatly appreciated.

TIA
