
distcp from Cloudera CDH3U3 to MapR failing with incorrect sizes copied

Question asked by chriscurtin on Jun 14, 2012
Latest reply on Jun 19, 2012 by chriscurtin
Hi,

We're trying to copy files from a Cloudera CDH3U3 cluster to MapR. There are 300+ GB of data in probably 3000+ files. Most are coming across fine using distcp, but every few minutes a copy fails. The task logs for all of the failures look like the one below. I noticed that all of the failing files are 300+ MB in size, up to 2 GB.

Are there issues using distcp with hftp between clusters? Note that there is nothing in the logs on the Cloudera side, and only a notice on the MapR side with the same error as the one from the task.

    2012-06-14 15:04:09,997 INFO org.apache.hadoop.mapred.Child: JVM: jvm_201206110813_0002_m_-1364867362 pid: 24740
    2012-06-14 15:04:10,574 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
    2012-06-14 15:04:11,228 INFO org.apache.hadoop.mapred.Child: Starting task attempt_201206110813_0002_m_000028_0
    2012-06-14 15:04:11,229 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
    2012-06-14 15:04:11,363 INFO org.apache.hadoop.mapreduce.util.ProcessTree: setsid exited with exit code 0
    2012-06-14 15:04:11,368 WARN org.apache.hadoop.mapreduce.util.ProcfsBasedProcessTree: /proc/<pid>/status does not have information about swap space used(VmSwap). Can not track swap usage of a task.
    2012-06-14 15:04:11,368 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.mapreduce.util.LinuxResourceCalculatorPlugin@63b0bdc8
    2012-06-14 15:18:12,030 INFO org.apache.hadoop.tools.DistCp: FAIL 2012_05/5321313_198679868_20120504170000_list.csv : java.io.IOException: File size not matched: copied 1337647104 bytes (1.2g) to tmpfile (=/offlined/3_3303/_distcp_tmp_cpau5c/2012_05/5321313_198679868_20120504170000_list.csv) but expected 3043224805 bytes (2.8g) from hftp://hadnn02.atlis1:50070/offlined/3_3303/2012_05/5321313_198679868_20120504170000_list.csv
     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:439)
     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
     at org.apache.hadoop.mapred.Child.main(Child.java:264)
    
    2012-06-14 15:23:47,440 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
    2012-06-14 15:23:47,455 WARN org.apache.hadoop.mapred.Child: Error running child
    java.io.IOException: Copied: 87 Skipped: 0 Failed: 1
     at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
     at org.apache.hadoop.mapred.Child.main(Child.java:264)
    2012-06-14 15:23:47,458 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
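
To see how far each failed copy got, the "File size not matched" lines can be pulled out of the task logs and summarized. The following is only a sketch for triage, not part of distcp: the function name `parse_size_mismatches` and the regular expression are my own assumptions, written against the single log line shown above, so other Hadoop versions may word the message differently.

```python
import re

# Assumed pattern, modelled on the one FAIL line in the task log above.
SIZE_MISMATCH = re.compile(
    r"FAIL (?P<path>\S+) : java\.io\.IOException: File size not matched: "
    r"copied (?P<copied>\d+) bytes .*? but expected (?P<expected>\d+) bytes"
)


def parse_size_mismatches(lines):
    """Return (path, copied, expected, fraction_copied) for each mismatch line."""
    results = []
    for line in lines:
        match = SIZE_MISMATCH.search(line)
        if match is None:
            continue  # not a size-mismatch line
        copied = int(match.group("copied"))
        expected = int(match.group("expected"))
        fraction = copied / expected if expected else 0.0
        results.append((match.group("path"), copied, expected, fraction))
    return results


# The FAIL line from the log above, used as sample input.
sample = (
    "2012-06-14 15:18:12,030 INFO org.apache.hadoop.tools.DistCp: FAIL "
    "2012_05/5321313_198679868_20120504170000_list.csv : java.io.IOException: "
    "File size not matched: copied 1337647104 bytes (1.2g) to tmpfile "
    "(=/offlined/3_3303/_distcp_tmp_cpau5c/2012_05/"
    "5321313_198679868_20120504170000_list.csv) but expected 3043224805 bytes "
    "(2.8g) from hftp://hadnn02.atlis1:50070/offlined/3_3303/2012_05/"
    "5321313_198679868_20120504170000_list.csv"
)

results = parse_size_mismatches([sample])
for path, copied, expected, fraction in results:
    print(f"{path}: copied {copied} of {expected} bytes ({fraction:.1%})")
```

Running this over all the failed task logs would show whether the transfers stop at a consistent byte offset or after a consistent elapsed time, which may help narrow down whether a timeout or a size limit is involved.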
