
Problem signalling task 32039 with KILL; exit = 6

Question asked by vinod_singh on Feb 13, 2012
Latest reply on Feb 28, 2015 by shurb
At times I see some tasks fail with an OutOfMemoryError, with a stack trace like this in the syslog:

<code>2012-02-13 08:09:42,003 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at com.mapr.fs.MapRFsInStream.getCacheSize(MapRFsInStream.java:64)
at com.mapr.fs.Inode.<init>(Inode.java:112)
at com.mapr.fs.MapRFsInStream.<init>(MapRFsInStream.java:36)
at com.mapr.fs.MapRClient.open(MapRClient.java:191)
at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:307)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:460)
at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:204)
at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:444)
at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:407)
at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergePerPartition(MapTask.java:1885)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1312)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:589)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:656)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
at org.apache.hadoop.mapred.Child.main(Child.java:264)</code>
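
For reference, "unable to create new native thread" means the OS refused to give the JVM another native thread (typically the per-user process limit or native memory for thread stacks), not that the heap is exhausted. A minimal stand-alone probe like the sketch below (hypothetical class name, not part of our job), run as the same user the tasks run as on the affected node, shows roughly how many threads that user can start before hitting the same error:

<code>// Hypothetical diagnostic: keep starting idle daemon threads until the JVM
// throws the same "unable to create new native thread" error, then report
// how many threads were created before the limit was reached.
public class NativeThreadProbe {
    public static void main(String[] args) {
        int started = 0;
        try {
            while (true) {
                Thread t = new Thread(new Runnable() {
                    public void run() {
                        try {
                            Thread.sleep(Long.MAX_VALUE); // park the thread forever
                        } catch (InterruptedException ignored) {
                        }
                    }
                });
                t.setDaemon(true); // let the JVM exit once the limit is hit
                t.start();
                started++;
            }
        } catch (OutOfMemoryError e) {
            // Same error class as in the task syslog above.
            System.out.println("Hit the native thread limit after " + started
                    + " threads: " + e.getMessage());
        }
    }
}</code>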

I would expect such tasks to be killed and retried. Strangely, sometimes these tasks are not killed, and the stderr logs contain something like:

<code>Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: Problem signalling task 32039 with KILL; exit = 6
at org.apache.hadoop.mapred.LinuxTaskController.signalTask(LinuxTaskController.java:339)
at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:704)
at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvmRunner(JvmManager.java:351)
at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvm(JvmManager.java:330)
at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.taskKilled(JvmManager.java:321)
at org.apache.hadoop.mapred.JvmManager.taskKilled(JvmManager.java:170)
at org.apache.hadoop.mapred.TaskRunner.kill(TaskRunner.java:810)
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:4474)
at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:4446)
at org.apache.hadoop.mapred.TaskTracker.purgeTask(TaskTracker.java:3353)
at org.apache.hadoop.mapred.TaskTracker.fatalError(TaskTracker.java:4878)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:964)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1318)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1314)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1312)
at org.apache.hadoop.ipc.Client.call(Client.java:1071)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:275)
at $Proxy0.fatalError(Unknown Source)
at org.apache.hadoop.mapred.Child.main(Child.java:324)</code>

What could be wrong here?
