
Drill - Group by with varying Parquet schemas?

Question asked by john.humphreys on Jan 24, 2018
Latest reply on Jan 25, 2018 by john.humphreys

Drill is regularly touted as being very capable of querying data from Parquet files with evolving schemas.

 

This simple query seems to be giving me an error though (Drill 1.10).

 

SELECT entity, epoch_hour, MAX(`1_max`), MAX(`2_max`), MAX(`1_max` + `2_max`) AS combo
FROM dfs.`/nmr/eis/sysm/pmp/work/dev/maas/parquet-aggregation/drill-hour/2018/01/24/*`
WHERE epoch_hour = 1516770000
GROUP BY `entity`, `epoch_hour`

 

UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes

 

Can Drill really not do a GROUP BY with evolving schemas, or am I missing something? This seems like a pretty basic feature to be missing.
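
One variant that might be worth trying is casting the aggregated columns to a single explicit type, so that every file presents the same type to the aggregation even if a column is missing or typed differently in some files. This is only a sketch: DOUBLE is an assumed type (the real column types are not stated here), the aliases max_1 and max_2 are made up for illustration, and it is untested and may not avoid the error.

```sql
-- Sketch only: DOUBLE is an assumed common type; substitute the type the columns really hold.
-- max_1 and max_2 are hypothetical aliases added for readability.
SELECT entity,
       epoch_hour,
       MAX(CAST(`1_max` AS DOUBLE)) AS max_1,
       MAX(CAST(`2_max` AS DOUBLE)) AS max_2,
       MAX(CAST(`1_max` AS DOUBLE) + CAST(`2_max` AS DOUBLE)) AS combo
FROM dfs.`/nmr/eis/sysm/pmp/work/dev/maas/parquet-aggregation/drill-hour/2018/01/24/*`
WHERE epoch_hour = 1516770000
GROUP BY entity, epoch_hour
```

If the grouping columns themselves differ in type between files, they would presumably need the same treatment.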
