
Spark Low Latency SQL - Thrift vs Job Server

Question asked by john.humphreys on Jun 19, 2017
Latest reply on Jun 19, 2017 by MichaelSegel

A few different people have told me that fronting Spark with a Thrift server, and using Spark SQL to load and query parquet files, may be a faster alternative to Drill (depending on how I store the parquet files).

 

Now that I'm looking into this, it seems there may be two options: the Spark Thrift Server and the Spark Job Server (GitHub - spark-jobserver/spark-jobserver: REST job server for Apache Spark).

 

The Spark Job Server seems cool, but a little hard to get going. The Thrift server sounds fairly easy to set up, but I don't really understand how it works (yet).

 

I don't have much time left for my POCs, so can someone shed some light here?  

 

If I want to quickly load and query parquet files using SQL, which of these options is better, and why? Also, are they highly available and suitable for production use?
