DEV 301 LAB, exercise 4

Discussion created by idengrenme on Apr 15, 2016

I was running my University MapReduce program, and the job would run the map phase but never apply my reduce logic. The final output was just the map output passed straight through, as the counters show:

  Map-Reduce Framework

  Map input records=65

  Map output records=82

  Map output bytes=738

  Map output materialized bytes=0

  Input split bytes=123

  Combine input records=0

  Combine output records=0

  Reduce input groups=2

  Reduce shuffle bytes=904

  Reduce input records=82

  Reduce output records=82

(should only be 6 records out!)

  Spilled Records=164

  Shuffled Maps =1

  Failed Shuffles=0

  Merged Map outputs=2

  GC time elapsed (ms)=37

  CPU time spent (ms)=390

  Physical memory (bytes) snapshot=454348800

  Virtual memory (bytes) snapshot=5528358912

  Total committed heap usage (bytes)=419954688

  Shuffle Errors


  File Input Format Counters

  Bytes Read=160443

  File Output Format Counters

  Bytes Written=738

(too much output!)


[user01@maprdemo UNIVERSITY_LAB]$ more OUT/part-r-00000

satm 675

satm 475

satm 500

satm 550

satm 575

satm 650

satm 780

satm 650

satm 650


I debugged for more time than I care to admit and found that the problem was this line in the default UNIVERSITY_LAB/ exercise file:

  public void reduce(org.w3c.dom.Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {


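Delete the org.w3c.dom. qualifier from the default exercise and declare the parameter as just Text key, which should resolve to org.apache.hadoop.io.Text (assuming the file imports it). With the wrong key type, the method is an overload rather than an override of Reducer.reduce, so Hadoop runs the default reduce, which writes every input record straight through. With the fix, the program runs properly and gives reduced output: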


Reduce output records=6

  Bytes Written=92

[user01@maprdemo UNIVERSITY_LAB]$ more OUT/part-r-00000

satm_min 325.0

satm_max 780.0

satm_mean 592.0

satv_min 300.0

satv_max 700.0

satv_mean 539.0
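
For anyone hitting the same thing, here is a minimal stand-alone sketch of why the wrong key type silently disables the reducer. It does not use Hadoop at all: HadoopText, DomText, BaseReducer, BrokenReducer and FixedReducer are hypothetical stand-ins I made up for org.apache.hadoop.io.Text, org.w3c.dom.Text, Reducer, and the broken and fixed exercise reducers, and the three input values are just a sample taken from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-alone illustration only: none of these are real Hadoop classes.
class OverloadDemo {
    // Stand-in for org.apache.hadoop.io.Text, the key type the framework passes in.
    static class HadoopText {
        final String value;
        HadoopText(String value) { this.value = value; }
    }

    // Stand-in for org.w3c.dom.Text: an unrelated type with the same simple name.
    static class DomText {
    }

    // Stand-in for Reducer: the default reduce() passes every value through unchanged.
    static class BaseReducer {
        void reduce(HadoopText key, List<Integer> values, List<String> out) {
            for (int v : values) {
                out.add(key.value + " " + v);
            }
        }

        // The "framework" always calls reduce(HadoopText, ...).
        List<String> run(HadoopText key, List<Integer> values) {
            List<String> out = new ArrayList<>();
            reduce(key, values, out);
            return out;
        }
    }

    // Wrong parameter type: this is an overload, so run() never calls it.
    static class BrokenReducer extends BaseReducer {
        void reduce(DomText key, List<Integer> values, List<String> out) {
            out.add("never reached");
        }
    }

    // Right parameter type: @Override makes the compiler verify the signature.
    static class FixedReducer extends BaseReducer {
        @Override
        void reduce(HadoopText key, List<Integer> values, List<String> out) {
            out.add(key.value + "_max " + Collections.max(values));
        }
    }

    public static void main(String[] args) {
        HadoopText key = new HadoopText("satm");
        List<Integer> values = Arrays.asList(675, 475, 780);
        // prints [satm 675, satm 475, satm 780]
        System.out.println(new BrokenReducer().run(key, values));
        // prints [satm_max 780]
        System.out.println(new FixedReducer().run(key, values));
    }
}
```

Adding @Override to BrokenReducer.reduce turns the silent pass-through into a compile error, so putting @Override on the real reduce method in the lab is a cheap way to catch a wrong import or a wrong fully qualified type before the job ever runs.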