
Using Drill To Query JSON tables

Discussion created by riteshlath on Jun 27, 2018

In our project, we store mixed-schema JSON on a MapR cluster and query it with Apache Drill. The JSON table is queried through the default schema, as dfs.`path`, and not through a separate storage plugin.

 

A SELECT query limited to a few columns or column families does not throw any error.

 

However, when the query is submitted as "select * from table", it throws the error below:

/**********************************************************************************************************************/

org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 131072 (expected: range(0, 65536)) Fragment 1:3 [Error Id: 44be5854-ac79-48a2-b3a9-3a947fde5200 on nodename:31010]

/**********************************************************************************************************************/
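
For reference, the two query shapes described above look roughly like the sketch below. The table path and the column names are placeholders used only for illustration, not the actual ones from our project.

```sql
-- Placeholder path and column names (assumptions, not the real table).

-- Shape 1: limited to a few columns / a column family; runs without error.
SELECT t.col_a, t.family_b
FROM dfs.`/path/to/json_table` t;

-- Shape 2: selects everything; fails with the IndexOutOfBoundsException above.
SELECT *
FROM dfs.`/path/to/json_table`;
```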

 

Kindly suggest whether there is any workaround, or any environment setting that can be put in place, so that this error can be handled.

 

Also, please advise whether Drill can be used to query JSON effectively.

 

Environment details:

MapR Version: 5.2.2.44680.GA
Drill Version: 1.10.0
