
MapR File Locks and Hive Queries?

Question asked by MichaelSegel on Jul 8, 2017
Latest reply on Jul 22, 2017 by MichaelSegel

Just to preface this... this is what happens when I make an 8-shot espresso iced latte and have some free time on my hands... ;-)


I was curious about MapR File System locks. Since the FS is POSIX-compliant, I was wondering what would happen if I held a file lock on a file or set of files in one app while someone else was attempting to run a Hive query against the larger set.
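
For concreteness, here is a minimal sketch of the kind of lock in question. It assumes the file is reachable through an ordinary POSIX path (for example an NFS mount of the cluster, which is an assumption), and the helper name `read_while_locked` is hypothetical. POSIX record locks taken through `fcntl` are advisory, so a reader that never asks for the lock is not blocked by it. Whether Hive behaves like that reader on MapR FS is exactly the open question.

```python
import fcntl


def read_while_locked(path):
    """Hold an exclusive advisory lock on `path`, then read the file
    through a second handle that never asks for the lock."""
    with open(path, "r+b") as holder:
        # Exclusive, non-blocking POSIX record lock over the whole file.
        fcntl.lockf(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            # A plain read() does not consult advisory locks, so it succeeds.
            # Note: closing `reader` also drops this process's POSIX locks on
            # the file, which is harmless here because the read has finished.
            with open(path, "rb") as reader:
                return reader.read()
        finally:
            fcntl.lockf(holder, fcntl.LOCK_UN)
```

If the reader on the other side behaves this way, the lock alone would neither block it nor make it fail; it would only matter to a process that also tries to take the lock.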


Or, if I created a file in the subdirectory where a Hive query was running and placed a lock on that file, would that cause the Hive query to throw an exception, or would it just ignore the file?


This is just one question in a series... Overall, I want to write a process that compacts the files in a subdirectory into a single file. The issue is that in a multi-tenant environment, someone else may be running a query against the data set. Usually that's not a problem: the files are immutable, so the isolation is effectively a dirty read. However, if I'm writing a process to compact the files and I write the output to the same directory, Hive may barf. If I write to a temp directory, then when I want to replace the files... boom. There's a small window of uncertainty while I remove the old files and copy in the new one.
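
The temp-directory variant can be sketched as follows. This is only an illustration under stated assumptions: the data directory is reachable through a POSIX path, the files can be compacted by plain concatenation (for example delimited text), and the names `compact_directory` and `compacted-00000` are hypothetical. It swaps whole directories with two renames, which shrinks the window but does not remove it.

```python
import os
import shutil
import tempfile


def compact_directory(data_dir):
    """Concatenate every file in data_dir into one file built in a sibling
    staging directory, then swap the directories with two renames."""
    data_dir = os.path.abspath(data_dir)
    parent = os.path.dirname(data_dir)
    staging = tempfile.mkdtemp(prefix=".compact-", dir=parent)
    retired = staging + ".old"

    # Build the compacted file where no query is looking.
    with open(os.path.join(staging, "compacted-00000"), "wb") as out:
        for name in sorted(os.listdir(data_dir)):
            src = os.path.join(data_dir, name)
            if os.path.isfile(src):
                with open(src, "rb") as f:
                    shutil.copyfileobj(f, out)

    # The window of uncertainty: between these two renames, data_dir
    # does not exist, so a query that lists it right then will fail.
    os.rename(data_dir, retired)
    os.rename(staging, data_dir)

    shutil.rmtree(retired)
    return os.path.join(data_dir, "compacted-00000")
```

A query that already listed the old files before the swap may still trip over them once they are removed, which is the hard-to-test case described above.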


I get the impression that Hive won't handle this well, and it's a hard thing to test...


Thoughts?
