Limit 0 queries are very commonly used by BI tools (mainly Tableau) for metadata operations on the database they are querying. Because Drill does not manage a centralized schema repository, Limit 0 is an important operation to optimize, so that BI tools perform as well with Drill as they do with other data sources. Over the past few Drill releases, a lot of work has gone into optimizing the Limit 0 path.
The best way to determine whether a given query uses the optimized path is to inspect the query plan output from Drill:
- Run EXPLAIN PLAN for the Limit 0 query.
- Look for a direct scan operator such as ‘DrillDirectScanRel’ in the plan (the exact keyword depends on the Drill version, as described below). This indicates that Drill is not doing a full scan of the table to answer the Limit 0 query.
This optimization was introduced in Drill 1.4.
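As a rough illustration of the first step, the sketch below builds the EXPLAIN statement for a Limit 0 version of a query. The helper name `explain_limit0`, the alias `q`, and the sample file path are hypothetical and used only for illustration. Wrapping the original query in a subquery is one possible approach, not necessarily how every BI tool issues it.

```python
def explain_limit0(query: str) -> str:
    """Return an EXPLAIN PLAN FOR statement for the Limit 0 form of `query`."""
    # Drop surrounding whitespace and a trailing semicolon, if any.
    base = query.strip().rstrip(";").strip()
    # Wrap the query as a subquery so that any ORDER BY or LIMIT it
    # already contains is left untouched.
    return f"EXPLAIN PLAN FOR SELECT * FROM ({base}) AS q LIMIT 0"


if __name__ == "__main__":
    # Hypothetical table path, for illustration only.
    print(explain_limit0("SELECT * FROM dfs.`/tmp/sample.parquet`;"))
```

The resulting statement can be run from any Drill client, and the returned plan text can then be inspected for the keywords discussed in this article.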
Different versions of Drill may produce different physical plans.
Drill 1.4 shows the following visualized plan:
The keyword in the physical plan is "DrillDirectScanRel".
Drill 1.6 shows the following visualized plan:
The keyword in the physical plan is "org.apache.drill.exec.planner.sql.handlers.FindLimit0Visitor$RelDataTypeReader".
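The keyword check can be automated with a simple substring search over the plan text. The following is a minimal sketch, assuming the plan has already been retrieved as a string. The names `LIMIT0_MARKERS` and `find_limit0_markers` are hypothetical, and the marker list covers only the two keywords mentioned in this article, so other Drill versions may need additional entries.

```python
from typing import List

# Keywords named in this article, keyed by the Drill version they were seen in.
LIMIT0_MARKERS = {
    "1.4": "DrillDirectScanRel",
    "1.6": "org.apache.drill.exec.planner.sql.handlers."
           "FindLimit0Visitor$RelDataTypeReader",
}


def find_limit0_markers(plan_text: str) -> List[str]:
    """Return the version labels whose marker keyword appears in the plan."""
    return [
        version
        for version, marker in LIMIT0_MARKERS.items()
        if marker in plan_text
    ]


def uses_limit0_fast_path(plan_text: str) -> bool:
    """True if any known Limit 0 marker is present in the plan text."""
    return bool(find_limit0_markers(plan_text))
```

An empty result only means that none of the listed keywords were found; it does not prove that a full scan took place, since a different Drill version may use another keyword.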
Details are in the Open Knowledge Base article "How does Drill 1.4 improve the performance of 'limit 0' queries".