
Does Drill understand Parquet partitions created by Spark?

Question asked by gesgeorge on Dec 7, 2017
Latest reply on Dec 19, 2017 by Hao Zhu

Hi,

When you write partitioned Parquet files with Spark, the paths take the form:

/data/year=2017/month=02/day=25/test.parquet

In Spark, if I query for year 2017, it uses the directory structure for partition pruning.
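
To make the layout concrete, here is a minimal Python sketch of how a key=value directory path can be decoded into partition values and used to skip files before reading them. It only illustrates the idea and is not Spark's or Drill's actual implementation; the helper names `partition_values` and `prune`, and the second path in the list, are made up for the example.

```python
from pathlib import PurePosixPath


def partition_values(path):
    """Return the key=value directory components of a path as a dict."""
    values = {}
    # Only the directories carry partition information, so skip the file name.
    for part in PurePosixPath(path).parent.parts:
        key, sep, value = part.partition("=")
        if sep and key:
            values[key] = value
    return values


def prune(paths, **wanted):
    """Keep only the paths whose partition values match every filter."""
    return [
        p for p in paths
        if all(partition_values(p).get(k) == v for k, v in wanted.items())
    ]


paths = [
    "/data/year=2017/month=02/day=25/test.parquet",
    "/data/year=2016/month=11/day=03/test.parquet",  # hypothetical second file
]

# A filter on year=2017 is answered from the directory names alone,
# without opening any Parquet file.
selected = prune(paths, year="2017")
print(selected)
```

Note that the values come back as strings ("02", not 2), since they are taken straight from the directory names.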

 

But when I try the same query in Drill, it fails, because Drill does not seem to recognize the directory structure as partitions. I'm a bit surprised that Drill is unable to recognize the partition structure generated by Spark, since interoperability between Spark and Drill would seem natural. I think partitions created by Drill have a different structure. Will Drill ever recognize Spark-generated partition structures?

 

-Gesly
