Author: Mufeed Usman
Original Publication Date: April 23, 2015
Multiple part files reside in an HDFS (MapR-FS, or MFS) location that is reached through a symlink. A wildcard (*) can read multiple files from the path when the path is a physical directory, but when a symlink to that directory is used, the wildcard fails to match any paths.
How can these files be read from Spark on MapR-FS when the path is a symlink?
The approach is to resolve the symlink to its target directory first. Casting the result of getFileStatus() to MapRFileStatus instead of using FileStatus makes the getSymlink() method available, as shown below.
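// conf is the Hadoop Configuration; path is the Path of the symlink to resolve
FileSystem fs = FileSystem.get(conf);
// Cast to MapRFileStatus so that getSymlink() can be called
MapRFileStatus fst = (MapRFileStatus) fs.getFileStatus(path);
Path target = fst.getSymlink();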
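The resolve-then-wildcard pattern can be tried without a cluster. The sketch below is only a local-filesystem analogue built on java.nio.file, not the MapR API: the class SymlinkGlobSketch and its methods resolveTarget and listMatching are hypothetical names introduced for illustration. It resolves a symlink to its target directory and then applies a glob such as part-* to the target. A local filesystem follows directory symlinks on its own, so the explicit resolution step here only mirrors what getSymlink() provides on MapR-FS.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class: a local-filesystem stand-in for the MapR-FS steps.
public class SymlinkGlobSketch {

    // Return the target of a symlink, or the path itself if it is not a symlink.
    static Path resolveTarget(Path path) throws IOException {
        if (Files.isSymbolicLink(path)) {
            Path target = Files.readSymbolicLink(path);
            // A relative target is interpreted relative to the link's parent directory.
            return path.resolveSibling(target).normalize();
        }
        return path;
    }

    // List the names in the resolved directory that match a glob such as "part-*".
    static List<String> listMatching(Path dirOrLink, String glob) throws IOException {
        Path dir = resolveTarget(dirOrLink);
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

On MapR-FS, the analogous step would presumably be to take the Path returned by getSymlink() and append the wildcard to it before handing the path to Spark. That wiring is not shown in the original and is an assumption here.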