
Map and Reduce execute 100% each but streaming job fails.

Question asked by shreya on Jan 27, 2016
Latest reply on Feb 9, 2016 by adamdiaz
I am running a graph traversal algorithm using MapReduce, and it gives the desired output when tested without Hadoop. But on running the command:

hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file /home/hduser/finalmap.py -mapper 'python finalmap.py' -file /home/hduser/finalred.py -reducer 'python finalred.py' -input /Random_Walk_Input -output Random_Walk_Output1

the following happens:

16/01/27 11:03:51 INFO mapreduce.Job:  map 0% reduce 0%

16/01/27 11:03:55 INFO mapreduce.Job:  map 33% reduce 0%

16/01/27 11:04:02 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_1, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


16/01/27 11:04:03 INFO mapreduce.Job:  map 50% reduce 0%

16/01/27 11:04:14 INFO mapreduce.Job: Task Id : attempt_1453872707553_0001_m_000001_2, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/01/27 11:04:22 INFO mapreduce.Job:  map 50% reduce 17%

16/01/27 11:04:25 INFO mapreduce.Job:  map 100% reduce 100%


16/01/27 11:04:26 INFO mapreduce.Job: Job job_1453872707553_0001 failed with state FAILED due to: Task failed task_1453872707553_0001_m_000001


Job failed as tasks failed. failedMaps:1 failedReduces:0

16/01/27 11:04:27 INFO mapreduce.Job: Counters: 39
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=15725173
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=413787
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=3
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Failed map tasks=4
        Killed reduce tasks=1
        Launched map tasks=5
        Launched reduce tasks=1
        Other local map tasks=3
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=68482
        Total time spent by all reduces in occupied slots (ms)=19382
        Total time spent by all map tasks (ms)=68482
        Total time spent by all reduce tasks (ms)=19382
        Total vcore-seconds taken by all map tasks=68482
        Total vcore-seconds taken by all reduce tasks=19382
        Total megabyte-seconds taken by all map tasks=70125568
        Total megabyte-seconds taken by all reduce tasks=19847168
    Map-Reduce Framework
        Map input records=17666
        Map output records=767145
        Map output bytes=14081829
        Map output materialized bytes=15616125
        Input split bytes=91
        Combine input records=0
        Spilled Records=767145
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=229
        CPU time spent (ms)=17120
        Physical memory (bytes) snapshot=269684736
        Virtual memory (bytes) snapshot=852369408
        Total committed heap usage (bytes)=200802304
    File Input Format Counters
        Bytes Read=413696
16/01/27 11:04:27 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
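
A note on reading the log above: each FAILED line names a task attempt, not a task. The IDs follow the pattern attempt_<cluster timestamp>_<job number>_<m or r>_<task number>_<attempt number>, so the two failures shown are attempts 1 and 2 of the same map task, 000001. That fits the counters (Failed map tasks=4, Launched map tasks=5) if that one task was retried until it ran out of attempts; Hadoop's default limit is 4 attempts per map task (mapreduce.map.maxattempts). The sketch below is only an illustration of how the IDs break down; the function names are made up for this example.

```python
import re
from collections import Counter

# attempt_<cluster timestamp>_<job number>_<m|r>_<task number>_<attempt number>
ATTEMPT_RE = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


def parse_attempt(text):
    """Split the first task-attempt ID found in `text` into its parts."""
    match = ATTEMPT_RE.search(text)
    if match is None:
        raise ValueError("no task attempt ID in %r" % text)
    cluster_ts, job, kind, task, attempt = match.groups()
    return {
        "job": "job_%s_%s" % (cluster_ts, job),
        "type": "map" if kind == "m" else "reduce",
        "task": int(task),
        "attempt": int(attempt),
    }


def failed_attempts_per_task(log_lines):
    """Count FAILED attempts per (task type, task number) in console log lines."""
    counts = Counter()
    for line in log_lines:
        if "Status : FAILED" in line and ATTEMPT_RE.search(line):
            info = parse_attempt(line)
            counts[(info["type"], info["task"])] += 1
    return counts
```

Feeding the two FAILED lines from this log to failed_attempts_per_task gives a count of 2 for map task 1, which shows that the failures all come from one input split rather than from the job as a whole.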


What does this mean?
The log shows that map and reduce have each reached 100%, yet the job still reports
failedMaps:1 and failedReduces:0.
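
For context on the error itself: "subprocess failed with code 1" means the Python process started by streaming exited with status 1, which is what Python returns after an uncaught exception. The real cause is therefore usually a traceback in the failed attempt's stderr log, not in the console output. Below is a minimal sketch of a mapper loop that reports bad input lines to stderr instead of crashing. The contents of finalmap.py are not shown in this post, so process_line and its tab-separated "node, comma-separated neighbours" input format are purely hypothetical stand-ins.

```python
import traceback


def process_line(line):
    """Hypothetical stand-in for the real mapper logic in finalmap.py.

    Assumes input of the form "node<TAB>neighbour1,neighbour2,..." and
    returns one (neighbour, node) pair per edge.
    """
    node, neighbours = line.split("\t", 1)
    return [(n, node) for n in neighbours.split(",") if n]


def run_mapper(stdin, stdout, stderr):
    """Map every input line; log and skip lines that raise. Returns the skip count."""
    bad = 0
    for lineno, raw in enumerate(stdin, 1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        try:
            pairs = process_line(line)
        except Exception:
            bad += 1
            # stderr is captured in the task attempt's stderr log
            stderr.write("skipping line %d: %r\n" % (lineno, line))
            traceback.print_exc(file=stderr)
            continue
        for key, value in pairs:
            stdout.write("%s\t%s\n" % (key, value))
    return bad
```

In an actual mapper script this would be invoked as run_mapper(sys.stdin, sys.stdout, sys.stderr) after import sys. Whether skipping bad records is acceptable depends on the algorithm; the point of the sketch is only that the offending line and traceback get written somewhere visible. If log aggregation is enabled, the stderr of the failed attempts can be read with yarn logs -applicationId application_1453872707553_0001.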
