
Map only job fails with .gz (gzip) input files

Question asked by thealy on Jan 14, 2013
Latest reply on Jan 15, 2013 by thealy
Running M3 version 2.0.0 on a 20-node cluster.

I'm trying to run a map-only job on gzip-compressed (.gz) input. For testing, I have one compressed log file in the input directory. If the file is uncompressed first, the same code works fine.

Watching the job with .gz input in the job tracker shows that the mapper has apparently read the correct number of records (880,000), and it reports 195,357 map output records, the same as it does when the input file is uncompressed. But the job then hangs until I finally kill it.
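
One way to rule out a problem with the .gz file itself is to count its records outside Hadoop and compare the result with the map input record counter. The following is only a minimal sketch using the standard `java.util.zip.GZIPInputStream`; the class name `GzLineCount` and the method `countLines` are hypothetical, it is not the job code, and it assumes one record per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: counts newline-delimited records in a local .gz file.
public class GzLineCount {

    // Decompresses the file as a stream and counts lines until end of stream.
    // An IOException here would suggest a truncated or corrupt archive.
    static long countLines(Path gzFile) throws IOException {
        try (InputStream raw = Files.newInputStream(gzFile);
             BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new GZIPInputStream(raw), StandardCharsets.UTF_8))) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: GzLineCount <file.gz>");
            System.exit(1);
        }
        System.out.println(countLines(Paths.get(args[0])));
    }
}
```

If the count printed for a local copy of the log file matches the 880,000 records seen in the job tracker and the stream ends cleanly, the archive is probably not the cause of the hang.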

[Sorry for the cross-post if you are on the Hadoop list, but I'm not sure whether this is MapR-specific or not.]