Core File and Mystery Time Traveling Job

Question asked by impermisha on May 2, 2013
Latest reply on May 2, 2013 by gera
One node last night generated a core file:
mapreduce_java_error21109.log

It seems to be complaining about this thread:
0x00007ff531a4d000 JavaThread "MapOutputCopier attempt_201304190926_12284_r_000079_0.36" [_thread_blocked, id=21229, stack(0x00007ff516f51000,0x00007ff517052000)]

The first odd thing is that the attempt ID is stamped April 19th, yet the core was generated last night, May 1st.
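For reference, here is a minimal sketch that splits the attempt ID from the thread name into its parts. It assumes the usual Hadoop 1.x layout, `attempt_<JobTracker identifier>_<job number>_<m|r>_<task number>_<attempt number>`, where the identifier is normally the time the JobTracker was started rather than the time the job was submitted. If that holds here, a 4/19 stamp on a job that ran on May 1 would not be a contradiction by itself. The function name `parse_attempt_id` and the dictionary keys are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: attempt_<yyyyMMddHHmm>_<job>_<m|r>_<task>_<attempt>
ATTEMPT_RE = re.compile(r"attempt_(\d{12})_(\d+)_([mr])_(\d+)_(\d+)")


def parse_attempt_id(text):
    """Find the first attempt ID in text and return its components."""
    match = ATTEMPT_RE.search(text)
    if match is None:
        raise ValueError("no attempt ID found in: %r" % text)
    stamp, job, kind, task, attempt = match.groups()
    return {
        # Assumption: this is the JobTracker start time, not the job start.
        "tracker_started": datetime.strptime(stamp, "%Y%m%d%H%M"),
        "job_number": int(job),
        "task_type": "reduce" if kind == "r" else "map",
        "task_number": int(task),
        "attempt_number": int(attempt),
    }


if __name__ == "__main__":
    name = "MapOutputCopier attempt_201304190926_12284_r_000079_0.36"
    for key, value in parse_attempt_id(name).items():
        print(key, "=", value)
```

The trailing `.36` in the thread name is not part of the attempt ID, so the pattern stops before it.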

Looking in the JobTracker, sure enough, that job shows up under 4/19 when you view History (it is very far back in the history).

However, when you drill down into the detail page for this job (the one that shows Job Counters, FileSystemCounts, etc.), that page says the start time was May 1. There was 1 failed reduce task, on the node in question.

task_201304190926_12284_r_000079   1/05 20:10:56   1/05 20:11:04 (7sec)
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:275)
Caused by: java.io.IOException: Task process exit with nonzero status of 134.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:262
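As a side note on the "nonzero status of 134": a small sketch of how that number can be decoded, assuming the child was launched through a shell that follows the common 128 + N convention for processes killed by signal N. Under that assumption 134 is 128 + 6, and signal 6 on Linux is SIGABRT, which is what a JVM raises when it aborts after a fatal error and which is consistent with a core file being written. The helper name `describe_exit_status` is hypothetical.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        signum = status - 128
        try:
            # Signal names are platform dependent; 6 is SIGABRT on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return "killed by signal %d (%s)" % (signum, name)
    return "exited normally with code %d" % status


if __name__ == "__main__":
    print(describe_exit_status(134))
```

This only interprets the number; it says nothing about why the JVM aborted, which is what the error log and the core file are for.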

I have no idea on this one...
