Large Hadoop Distcp fails to create staging directory

Question asked by dannyman on Apr 26, 2017
Latest reply on May 8, 2017 by maprcommunity

MapR v. 4.1.0.31175.GA

The hadoop distcp command usually works fine for us, but for sufficiently large data volumes it fails with errors like these:

WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error Exception: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Can not create directory: /var/mapr/cluster/yarn/rm/staging/prod/.staging/_distcp-93833993

ERROR tools.DistCp: Exception encountered
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for maprfs:/var/mapr/cluster/yarn/rm/staging/prod/.staging/_distcp-93833993/intermediate.1

Full output is below. My suspicion is that these directories are created at the start of the distcp command, but that something cleans them out during the time distcp takes to build its manifest (the copy listing). I haven't found this error reported online. Can anyone suggest a remedy?

Thanks,

-danny

prod@mapr-01:/mapr/mapr-prod/data$ hadoop distcp -m 8192 -skipcrccheck -update /mapr/mapr-prod/data/src/foo/ /mapr/mapr-prod/data/dest/foo/
17/04/25 14:54:38 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, maxMaps=8192, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/mapr/mapr-prod/data/src/foo], targetPath=/mapr/mapr-prod/data/dest/foo, targetPathExists=true}
17/04/25 14:54:39 INFO client.RMProxy: Connecting to ResourceManager at mapr-03.prod.qxxxxxxxxxd.com/10.10.12.135:8032
17/04/25 20:54:38 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
17/04/25 20:54:38 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
17/04/25 21:02:26 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error Exception:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Can not create directory: /var/mapr/cluster/yarn/rm/staging/prod/.staging/_distcp-93833993
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:91)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:317)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:391)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3414)
at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3208)
at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3184)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2747)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2786)
at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:330)
at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:145)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:353)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:160)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)
17/04/25 21:02:26 ERROR tools.DistCp: Exception encountered
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for maprfs:/var/mapr/cluster/yarn/rm/staging/prod/.staging/_distcp-93833993/intermediate.1
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:403)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3414)
at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3208)
at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3184)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2747)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2786)
at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:330)
at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:145)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:353)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:160)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)
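
To put a number on "the time it takes distcp to determine its manifest", the timestamps in the output above can be diffed. The following is a minimal Python sketch, not part of distcp or MapR. The helper names `parse_log_time` and `gaps` are made up for illustration, the log messages are shortened copies of the lines above, and the timestamp layout (yy/MM/dd HH:mm:ss) is assumed from how those lines look.

```python
from datetime import datetime

# Assumed layout of the client log timestamps shown above: yy/MM/dd HH:mm:ss
LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"


def parse_log_time(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        # "17/04/25 14:54:38" is exactly 17 characters long.
        return datetime.strptime(line[:17], LOG_TIME_FORMAT)
    except ValueError:
        return None


def gaps(lines):
    """Yield (gap, line) for every timestamped line after the first one."""
    previous = None
    for line in lines:
        stamp = parse_log_time(line)
        if stamp is None:
            continue  # stack-trace frames carry no timestamp
        if previous is not None:
            yield stamp - previous, line
        previous = stamp


# Timestamped lines from the output above, messages shortened.
log = [
    "17/04/25 14:54:38 INFO tools.DistCp: Input Options",
    "17/04/25 14:54:39 INFO client.RMProxy: Connecting to ResourceManager",
    "17/04/25 20:54:38 INFO Configuration.deprecation: io.sort.mb is deprecated",
    "17/04/25 20:54:38 INFO Configuration.deprecation: io.sort.factor is deprecated",
    "17/04/25 21:02:26 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error Exception",
    "org.apache.hadoop.util.DiskChecker$DiskErrorException: Can not create directory",
    "17/04/25 21:02:26 ERROR tools.DistCp: Exception encountered",
]

for gap, line in gaps(log):
    print(gap, line[18:])
```

On these lines the largest gap is 5:59:59, between the RMProxy line and the first deprecation line, followed by 0:07:48 before the first DiskErrorException. The output alone does not show whether the six-hour jump is real elapsed time or a change in the timezone used for logging, so it is worth confirming against wall-clock time before concluding how long the staging directory had to survive.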
