Drill-on-Yarn keeps throwing memory errors

Question asked by Engel on Nov 16, 2017
Latest reply on Jan 18, 2018 by agirish

On our five-node MapR cluster we are trying to run a query through Drill-on-YARN. However, the query causes the drillbits to crash with the following error:

Container [pid=38406,containerID=container_e51_1510839882429_0001_01_000006] is running beyond physical memory limits. Current usage: 30.1 GB of 30 GB physical memory used; 57.5 GB of 63.0 GB virtual memory used. Killing container.


In the drill-on-yarn.conf file we've configured the following memory settings for drillbits:

    heap: "8G"
    max-direct-memory: "20G"
    code-cache: "1G"
    memory-mb: 30720
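
As a rough sanity check of these numbers, here is a small sketch of the arithmetic. The helper name `container_headroom_gb` is made up for illustration and is not part of Drill. It assumes that whatever the JVM allocates outside heap, direct memory and code cache (thread stacks, metaspace and so on) has to fit in the remainder of the container.

```python
# Values copied from the drill-on-yarn.conf settings above.
heap_gb = 8
max_direct_memory_gb = 20
code_cache_gb = 1
memory_mb = 30720


def container_headroom_gb(heap_gb, direct_gb, code_cache_gb, container_mb):
    """Return the container size minus the three configured JVM areas, in GB.

    Hypothetical helper for illustration only; not a Drill or YARN API.
    """
    configured_gb = heap_gb + direct_gb + code_cache_gb
    container_gb = container_mb / 1024  # memory-mb is given in megabytes
    return container_gb - configured_gb


headroom = container_headroom_gb(
    heap_gb, max_direct_memory_gb, code_cache_gb, memory_mb
)
print(f"Headroom: {headroom:.1f} GB")
```

With the settings above, the three areas add up to 29 GB, so the 30 GB container leaves about 1 GB for everything else the drillbit process uses.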


We have tried changing several of these memory settings, but none of the changes solved the problem.


The query which we are running is:

    SELECT *
    FROM dfs.drillviews.abc AS abc
    INNER JOIN dfs.drillviews.def AS def ON def.VBELN = abc.VBELN
    INNER JOIN dfs.drillviews.ghi AS ghi ON ghi.MATNR = def.MATNR
    WHERE abc.AUDAT BETWEEN '20140101' AND '20160101'
      AND abc.VBELN IN (
        SELECT abc.VBELN
        FROM dfs.drillviews.abc AS abc
        INNER JOIN dfs.drillviews.def AS def ON def.VBELN = abc.VBELN
        INNER JOIN dfs.drillviews.ghi AS ghi ON ghi.MATNR = def.MATNR
        WHERE ghi.MATNR = '19304781' OR ghi.MATNR = '19302781'
      );


It runs over several Parquet files that total about 10 GB in size.


When we run the query, it takes some time, and then the drillbits start to fail with the error above. Drill then returns a system error or a connection error as the result, and the error above appears in the log files.


Does anyone have an idea what could be wrong and how we can solve it? We are using Drill version 1.10.0.
