NFS & HDFS

Document created by aalvarez on May 23, 2017. Last modified by aalvarez on May 23, 2017.

What is NFS?

NFS is the Network File System. It has been part of Linux and the broader Unix ecosystem for decades, and it has long been used both in enterprise environments to share files and in specialized environments such as high-performance computing.

 

What is HDFS and how is it different from NFS?

HDFS is the file system underlying Hadoop, and it is a write-once file system. That limits developers, because you can't update a file in place once it exists: you're either reading files or writing new ones. As a result, you can't take full advantage of the storage you have available, because changing even a small part of a file means reading and rewriting much more data than the change itself requires.
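To make the cost difference concrete, here is a minimal Python sketch that runs against an ordinary local file. It does not call HDFS or any Hadoop API. The function names `update_in_place` and `rewrite_whole_file` are hypothetical, and the second function only imitates what a write-once store forces on you: read everything, write a new copy, then swap it in.

```python
import os
import tempfile


def update_in_place(path, offset, new_bytes):
    """Overwrite bytes at `offset`, leaving the rest of the file untouched.

    This is the POSIX-style update that a write-once file system does not allow.
    Returns the number of bytes of I/O performed.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
    return len(new_bytes)


def rewrite_whole_file(path, offset, new_bytes):
    """Imitate a write-once store: read the whole file, write a full new copy.

    Returns the number of bytes of I/O performed (bytes read plus bytes written).
    """
    with open(path, "rb") as f:
        data = f.read()
    patched = data[:offset] + new_bytes + data[offset + len(new_bytes):]
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(patched)
    os.replace(tmp_path, path)  # swap the new copy in for the old one
    return len(data) + len(patched)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = os.path.join(workdir, "sample.bin")
        with open(sample, "wb") as f:
            f.write(b"0123456789")
        print("in-place I/O bytes:", update_in_place(sample, 3, b"XY"))
        print("rewrite I/O bytes:", rewrite_whole_file(sample, 0, b"AB"))
```

The gap between the two byte counts grows with file size, which is the "reading and writing much more than you need to" problem in miniature.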

 

What does MapR use?

Instead of HDFS, MapR uses NFS, which has been around for decades, so the cluster behaves like an existing enterprise storage system that people who haven't used Hadoop are already familiar with. This brings advantages in both familiarity and application support: existing applications work without a customized connector or data ingestion engine, because they read and write through the standard POSIX file interface. That makes things much easier for developers porting applications and creates far fewer issues, since the formats and protocols involved are very well understood.
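As an illustration of what "no customized connector" can look like in practice, the sketch below uses only Python's built-in file calls. The helper names `append_record` and `read_records` are hypothetical, and `mount_root` stands for wherever the cluster happens to be NFS-mounted on the client, which this document does not specify. The demo uses a temporary directory in its place so that it runs anywhere.

```python
import os
import tempfile


def append_record(mount_root, relative_path, record):
    """Append one line to a file under the mount using ordinary file I/O."""
    path = os.path.join(mount_root, relative_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
    return path


def read_records(mount_root, relative_path):
    """Read the file back as a list of lines, again with plain open()."""
    path = os.path.join(mount_root, relative_path)
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the real NFS mount point.
    with tempfile.TemporaryDirectory() as mount_root:
        append_record(mount_root, "logs/events.txt", "job started")
        append_record(mount_root, "logs/events.txt", "job finished")
        print(read_records(mount_root, "logs/events.txt"))
```

Nothing in the code knows or cares that the path might live on a cluster; pointing `mount_root` at an NFS mount would be the only change.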
