
error=11, Resource temporarily unavailable

Question asked by siddharth1988 on Nov 27, 2013
Hi team, when I run smaller jobs everything is fine, but as soon as I run larger jobs such as teragen and terasort, this error happens.
Below is the trace:

2013-11-27 19:01:50,678 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

2013-11-27 19:01:51,051 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead

2013-11-27 19:01:51,539 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id

2013-11-27 19:01:51,540 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=

2013-11-27 19:01:51,867 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0

2013-11-27 19:01:51,870 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin :org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a0bd13d

2013-11-27 19:01:52,217 INFO org.apache.hadoop.mapred.MapTask: Processing split:org.apache.hadoop.examples.terasort.TeraGen$RangeInputFormat$RangeInputSplit@6c30aec7

2013-11-27 19:01:52,222 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead

2013-11-27 19:01:52,226 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0

2013-11-27 19:01:52,250 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: Cannot run program "chmod": error=11, Resource temporarily unavailable

2013-11-27 19:01:52,250 WARN org.apache.hadoop.mapred.Child: Error running child

java.io.IOException: Cannot run program "chmod": error=11, Resource temporarily unavailable

        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:206)

        at org.apache.hadoop.util.Shell.run(Shell.java:188)

        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)

        at org.apache.hadoop.util.Shell.execCommand(Shell.java:467)

        at org.apache.hadoop.util.Shell.execCommand(Shell.java:450)

        at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:593)

        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:584)

        at org.apache.hadoop.io.SecureIOUtils.insecureCreateForWrite(SecureIOUtils.java:146)

        at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:168)

        at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:310)

        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:383)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

        at org.apache.hadoop.mapred.Child.main(Child.java:262)

Caused by: java.io.IOException: error=11, Resource temporarily unavailable

        at java.lang.UNIXProcess.forkAndExec(Native Method)

        at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)

        at java.lang.ProcessImpl.start(ProcessImpl.java:130)

        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)

        ... 16 more

2013-11-27 19:01:52,256 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
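From the trace, the failure happens while the task JVM is forking an external chmod through ProcessBuilder (Shell.runCommand) to set permissions on the task log index file, and it is the fork itself that fails with error=11. A minimal standalone sketch of that same call pattern (this is not the actual Hadoop Shell code, and the file path is just a placeholder) would be:

import java.io.File;
import java.io.IOException;

public class ChmodForkSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Placeholder file; Hadoop does the equivalent for its task log index files.
        File f = new File("/tmp/some-task-log.index");
        f.createNewFile();
        // fork + exec of the external chmod happens inside start(); if the kernel
        // cannot create the new process, start() throws
        // IOException: Cannot run program "chmod": error=11, Resource temporarily unavailable
        Process p = new ProcessBuilder("chmod", "644", f.getPath()).start();
        int rc = p.waitFor();
        System.out.println("chmod exited with " + rc);
    }
}

So the IOException in the trace comes from the fork failing, not from chmod running and returning an error.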



I have set map slots to 24 and reduce slots to 12, on a machine with 32 cores with hyper-threading.

ulimit is set to 64K.

What is causing this, and how can we get rid of it? It happens only for bigger jobs such as terasort.
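
Since error=11 is EAGAIN coming out of fork(2), the per-user process/thread limit (ulimit -u) is relevant here in addition to the open-files limit. A quick, Linux-specific throwaway sketch to print the limits a JVM is actually running under (the class name is just for illustration; /proc/self/limits is the standard Linux procfs file):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PrintLimits {
    public static void main(String[] args) throws IOException {
        // Dump the soft and hard resource limits in effect for this process.
        // "Max processes" corresponds to ulimit -u, "Max open files" to ulimit -n.
        for (String line : Files.readAllLines(Paths.get("/proc/self/limits"), StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}

Running something like this as the same user that launches the task JVMs would show the limits the failing tasks actually see.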
