
Hadoop fs -copyToLocal fails when directory has a massive number of files

Question asked by jacques on Feb 11, 2012
Latest reply on Feb 12, 2012 by jacques
We generated a million files in a single directory. When we attempted to grab a few of them with the hadoop fs command, it ran for a while and then repeatedly failed with:

<code>
# hadoop fs -copyToLocal /path/to/file/filePrefix-* ~/output/
2012-02-11 17:39:18,4121 ERROR JniCommon fs/client/fileclient/cc/jni_common.cc:479 Thread: 140457850164992 createMapRFileStatus failed, could not get gidstr for gid 0
2012-02-11 17:39:22,2184 ERROR JniCommon fs/client/fileclient/cc/jni_common.cc:1402 Thread: 140457850164992 readdirplus failed, could not create MapRFileStatus object for [filename]</code>

This appears to be an out-of-memory error. We had to kill -9 the process.
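One workaround we may try is to copy the files by exact name instead of by wildcard, so the client never has to expand the glob against the million-entry directory. A minimal sketch (the file names below are placeholders for the real ones):

<code>
# Copy the needed files by exact name so the client does not have to
# build status objects for every entry in the directory
# (filePrefix-00001 etc. are hypothetical names; substitute the real ones)
for f in filePrefix-00001 filePrefix-00002 filePrefix-00003; do
    hadoop fs -copyToLocal /path/to/file/"$f" ~/output/
done</code>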

Mounting the same directory via NFS was slow, but did ultimately copy the files.
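For reference, the NFS copy was just a plain cp against the cluster mount point. A sketch, assuming the cluster is NFS-mounted at /mapr/my.cluster.com (the actual mount point will differ per setup):

<code>
# Copy through the NFS mount instead of the hadoop CLI
# (/mapr/my.cluster.com is a hypothetical mount point)
cp /mapr/my.cluster.com/path/to/file/filePrefix-* ~/output/</code>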
