I'm putting MicroStrategy on top of ZooKeeper, which points to a Drill cluster. The queries MicroStrategy generates are pretty long, but I've been able to execute them successfully. However, a query that used to execute no longer does. It gets assigned a foreman, goes into planning, and then just disappears, leaving the foreman unusable and dropped from the ZooKeeper quorum. (The quorum runs on a small machine. Not sure if that's a problem.)
When I run the same query on my laptop (embedded mode), pointing to the same data source, which happens to be S3, it takes 5 minutes, but I get a query plan and it starts executing.
This is what I've tried so far, along with some of the settings, on a machine with 36 cores and 72 GB of memory:
1. I've set DRILL_HEAP (13 GB) and DRILL_MAX_DIRECT_MEMORY (51 GB) using the sizing formulas based on the hardware of the drillbit.
2. I've set the S3 max connection limit to 10,000 to outrun the connection pooling error.
3. planner.memory.max_query_memory_per_node (31 GB) is set to specification for low concurrency.
4. planner.width.max_per_node (20) is set to specification for low concurrency.
5. All files in S3 are Parquet with the metadata file, and the FACT table is partitioned by date (5 billion+ rows).
6. I set planner.in_subquery_threshold = 100 so I get partition elimination on larger IN clauses.
7. Because of point 6, I increased planner.memory_limit to 512 MB and then 1 GB, without success.
8. I attached jconsole to the foreman being used and watched heap usage climb very close to the maximum available. (Running out of memory in planning?)
9. I do not have access to the log because the foreman drops out of the quorum after about 7 minutes.
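For reference, the planner options from points 3, 4, 6, and 7 correspond roughly to the statements below. This is a sketch under assumptions: the byte values are conversions of the sizes quoted above (using 1 GB = 1024^3 bytes), and system-wide scope (ALTER SYSTEM rather than ALTER SESSION) is assumed, since the post does not say which was used. DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY from point 1 are environment settings in drill-env.sh and are not set through SQL.

```sql
-- Point 3: per-node query memory, 31 GB expressed in bytes (assumed conversion)
ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 33285996544;

-- Point 4: maximum degree of parallelism per node
ALTER SYSTEM SET `planner.width.max_per_node` = 20;

-- Point 6: IN lists longer than this threshold are planned as a join
ALTER SYSTEM SET `planner.in_subquery_threshold` = 100;

-- Point 7: planning memory limit, 1 GB in bytes (512 MB would be 536870912)
ALTER SYSTEM SET `planner.memory_limit` = 1073741824;

-- List the planner options actually in effect; running this on both the
-- cluster and the embedded laptop instance allows a side-by-side comparison
SELECT * FROM sys.options WHERE name LIKE 'planner.%';
```

Comparing the output of the last query between the laptop and the cluster would show whether the two environments are really planning under the same options.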
I think it's odd that my laptop running in embedded mode was able to obtain a plan, but a 6-machine cluster with 36 cores and 72 GB of memory per machine was not able to. Any guidance would be awesome!