I am using Apache Drill from MapR 5.1 to query a view defined over CSV files. The view is built from about 6 joins.
When I query the view via ODBC from a visualization tool, a WHERE clause that matches approximately 30,000 rows returns in about 5 minutes. However, with a different WHERE clause that matches approximately 1,000,000 rows, the query takes more than 10 hours to complete. The view contains just under 5,000,000 rows in total.
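To put the two timings side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the rounded figures above, and the 10 hours is a lower bound, so the real throughput of the large query is at best this high. The function name `rows_per_second` is just something I made up for illustration.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput for a query that returned `rows` in `seconds`."""
    return rows / seconds


# Rounded figures from the description above (approximate, not measured precisely).
small = rows_per_second(30_000, 5 * 60)          # ~30,000 rows in ~5 minutes
large = rows_per_second(1_000_000, 10 * 3600)    # ~1,000,000 rows in 10+ hours

print(f"small result set: {small:.1f} rows/s")   # 100.0 rows/s
print(f"large result set: {large:.1f} rows/s")   # 27.8 rows/s (upper bound)
print(f"slowdown factor:  {small / large:.1f}x") # 3.6x
```

So the result set is roughly 33 times larger but takes at least 120 times longer, which means the time does not scale linearly with the number of rows returned.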
Changing the view to a table and storing it as Parquet speeds things up tremendously, but I would like to know how to diagnose the ODBC performance, since my preference is to query against the view.