Jack Norris of MapR talks to Donnie Berkholz, PhD, of RedMonk about NFS
Jack Norris: Welcome to WATF, where we explore various claims and statements surrounding Hadoop and ask, "What are the facts?" In this episode we examine NFS -- specifically with Hadoop. To help us, we've got Donnie Berkholz from RedMonk. RedMonk is the first and only developer-focused analyst firm. Donnie, welcome to the show.
Donnie Berkholz: Thanks for having me on the show, Jack. I'm happy to be here.
Jack Norris: Well, let's start with NFS. What, exactly, is it and why should we care?
Donnie Berkholz: NFS is the Network File System. It's been part of Linux and the broader Unix ecosystem for decades, and it's been used for a long time both in enterprise environments to share files and in customized environments like high-performance computing.
Jack Norris: Great. Well, with respect to Hadoop and NFS, what are the facts?
Donnie Berkholz: With Hadoop, the underlying file system, HDFS, is a write-once file system. You're very much limited in that you can't update files once they exist. You're either reading them or writing them. The problem that you run into is that there's no file close operation. You're unable to take the best kind of advantage of the file system you have available because you're writing and reading much more than you need to.
Jack Norris: Are there other approaches in the industry with respect to NFS?
Donnie Berkholz: Yes. There are absolutely other approaches. My understanding is that MapR has one of them.
Jack Norris: Well, it was a leading question, Donnie, thank you for identifying that. No, in all seriousness, talk about the difference with NFS support if that underlying layer is a fully random read/write storage layer.
Donnie Berkholz: One of the things we were talking about at the beginning of this segment is that NFS has been around for decades -- behaving like an existing enterprise storage system that people who haven't used Hadoop are familiar with. This brings a lot of advantages in terms of familiarity and in terms of applications.
Jack Norris: Basically, it'll work out of the box, it'll support those applications. It'll behave like you expect it to behave.
Donnie Berkholz: Exactly. You don't need some kind of customized connector or data ingestion engine; you can just use the standard application and read and write using a POSIX format that's been around for decades. It makes things a lot easier for the developers trying to port it, and it creates a lot fewer issues because you're able to use formats and protocols that are very well understood.
Jack Norris: How many applications support NFS?
Donnie Berkholz: All of them. I think pretty much every application ever written for Linux or for Unix understands how to write to a POSIX file system -- NFS is one of those.
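[Editor's illustration: The point about ordinary applications can be sketched in a few lines of Python. This is a minimal sketch, not anything shown in the interview: the file name `events.log`, the helper names, and the `mount_dir` parameter are all hypothetical, and a local temporary directory stands in for an NFS mount point. It shows an application creating a file and then updating it in place with nothing but standard file calls, which assumes a storage layer that allows random writes.]

```python
import os
import tempfile


def update_in_place(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes at `offset` in an existing file using ordinary file calls."""
    # "r+b" opens for reading and writing without truncating the file.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def demo(mount_dir: str) -> bytes:
    """Create a file under `mount_dir` (hypothetical mount point), then edit it in place."""
    path = os.path.join(mount_dir, "events.log")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"status=PENDING\n")
    # Replace the 7-byte value "PENDING" with a 7-byte padded "DONE   ".
    update_in_place(path, len(b"status="), b"DONE   ")
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # A temporary local directory stands in for an NFS mount here.
    with tempfile.TemporaryDirectory() as d:
        print(demo(d))
```

[The application code contains no storage-specific connector; pointing `mount_dir` at a different POSIX-style mount would be the only change. The in-place update step is the part a write-once file system would not permit.]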
Jack Norris: POSIX file systems -- you kind of mentioned that. What exactly is that?
Donnie Berkholz: When I say POSIX, what I mean is that it follows a standard that's existed for decades, one that defines the APIs and how to interact with a Unix system. It interoperates across any distribution of Linux or Unix.
Jack Norris: Excellent. I guess it's a good time now to point out, that's exactly how MapR works. Its NFS support is POSIX compliant. The next time someone says, "NFS and Hadoop," make sure you ask, "WATF?"