Drill vs Hive on MapR-DB - advantage of the range scan's start and stop row filters

Question asked by MichaelSegel on Feb 8, 2017
Latest reply on Feb 8, 2017 by MichaelSegel

With Hive, the WHERE clause must contain the conditions key >= min_val and key <= max_val in order to take advantage of the range scan's start and stop row filters.
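
To make the distinction concrete, here is a minimal Python sketch of why the start and stop rows matter. It is only an illustration under stated assumptions: the table is modeled as a small in-memory list sorted by row key, and the names `ROWS`, `range_scan` and `full_scan` are hypothetical, not part of Hive, Drill or MapR-DB. The range here is inclusive on both ends to mirror the WHERE clause above.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for a table whose rows are stored sorted by row key.
ROWS = [
    ("row-001", "a"),
    ("row-005", "b"),
    ("row-010", "c"),
    ("row-020", "d"),
    ("row-030", "e"),
]
KEYS = [key for key, _ in ROWS]


def range_scan(min_val, max_val):
    """Return rows with min_val <= key <= max_val.

    Binary search finds the start and stop positions, so only the
    matching slice of the sorted data is read.
    """
    start = bisect_left(KEYS, min_val)
    stop = bisect_right(KEYS, max_val)
    return ROWS[start:stop]


def full_scan(min_val, max_val):
    """Same result, but every row is read and tested against the filter."""
    return [(key, value) for key, value in ROWS if min_val <= key <= max_val]


if __name__ == "__main__":
    print(range_scan("row-005", "row-020"))
```

Both functions return the same rows; the difference is how much data has to be read to get there, which is the benefit the key >= min_val and key <= max_val conditions are meant to unlock.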

 

Does this also apply to Drill?

 

Within HBase / MapR-DB there's also a time range method on the Scan class (setTimeRange). It lets you set a time range to find all of the rows/cells that were inserted or updated during that period.

 

This is valuable in use cases where we only want to process rows that were last touched during a specific period. I believe that the entire region / tablet is bypassed if it contains no rows that meet this criterion. (This implies that a min/max timestamp is tracked for each tablet.)
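
Here is a rough Python sketch of the behavior I'm assuming. It models my belief above, not confirmed MapR-DB or HBase internals: each region keeps a min/max timestamp, and a scan with a time range skips any region whose range cannot overlap the requested one. The `Region` class, `REGIONS` data and `time_range_scan` function are hypothetical names made up for illustration. The time range is half-open, [min_ts, max_ts), which is how HBase's Scan.setTimeRange treats its arguments.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical region/tablet holding (row_key, timestamp, value) cells."""
    cells: list
    min_ts: int = field(init=False)
    max_ts: int = field(init=False)

    def __post_init__(self):
        # Assumed bookkeeping: track the oldest and newest cell timestamps.
        timestamps = [ts for _, ts, _ in self.cells]
        self.min_ts = min(timestamps)
        self.max_ts = max(timestamps)


REGIONS = [
    Region([("a", 100, "x"), ("b", 200, "y")]),
    Region([("c", 500, "z"), ("d", 900, "w")]),
]


def time_range_scan(regions, min_ts, max_ts):
    """Return (matching cells, number of regions actually read).

    Cells match when min_ts <= timestamp < max_ts.
    """
    hits = []
    regions_read = 0
    for region in regions:
        # Skip the whole region if its timestamps cannot overlap the range.
        if region.max_ts < min_ts or region.min_ts >= max_ts:
            continue
        regions_read += 1
        hits.extend(cell for cell in region.cells if min_ts <= cell[1] < max_ts)
    return hits, regions_read


if __name__ == "__main__":
    print(time_range_scan(REGIONS, 400, 600))
```

In this toy data set, a scan for timestamps 400 to 600 only has to read the second region, which is the kind of saving I'd hope for on a large table.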

 

Hive apparently can't take advantage of this. Is Drill capable of using this feature?
