
DATA_READ ERROR while creating a table as Parquet with Drill

Question asked by pkhnkn on Feb 9, 2017
Latest reply on Feb 10, 2017 by pkhnkn

Hello,

I'm trying to convert files from CSV to Parquet with Hive, Spark, and Drill. Hive and Spark both worked fine, but with Drill I get the following error message:

Error: DATA_READ ERROR: Error processing input: , line=1064193, char=195035136. Content parsed: [ ]

Failure while reading file maprfs:///mapr/demo.mapr.com/input/komma/stichprobe_20mio/Stichprobe_AB_20Mio.csv. Happened at or shortly before byte position 2343567360.
Fragment 1:0

[Error Id: f049d95c-34d8-4c77-bee2-3eb94dbef68a on maprdemo:31010] (state=,code=0)
Aborting command set because "force" is false and command failed: "create table komma.root.parquet_20mio AS (select * from komma.root.stichprobe_20mio);"

The same statement worked well with smaller files; only this one (3.5 GB) fails, and it fails every time I run it.

I'm using the MapR Sandbox with Drill on a virtual machine that has 6 GB of RAM and two 1.3 GHz processors.

I already searched for similar errors, but I didn't find anything that helped with my problem.
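
For anyone who wants to look at the input near the reported position, here is a minimal Python sketch. It assumes the failure is caused by a malformed row (for example an unbalanced double quote), which the error message does not confirm. The helper names `read_window` and `lines_with_odd_quotes` are made up for illustration, and the local path assumes the cluster is NFS-mounted under /mapr, which may not match your setup.

```python
import os


def read_window(path, offset, before=2048, after=512):
    """Return the raw bytes surrounding `offset` in the file at `path`."""
    start = max(0, offset - before)
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(offset - start + after)


def lines_with_odd_quotes(path, quote=b'"', limit=10):
    """Return up to `limit` (line_number, line) pairs with an odd quote count.

    Line numbers are 1-based. An odd count is only a hint: a quoted field
    that legitimately spans several lines is flagged as well.
    """
    hits = []
    with open(path, "rb") as f:
        for number, line in enumerate(f, start=1):
            if line.count(quote) % 2 == 1:
                hits.append((number, line))
                if len(hits) >= limit:
                    break
    return hits


if __name__ == "__main__":
    # Assumed local mount of the maprfs path from the error message.
    csv_path = "/mapr/demo.mapr.com/input/komma/stichprobe_20mio/Stichprobe_AB_20Mio.csv"
    if os.path.exists(csv_path):
        # Byte position taken from the error message above.
        print(read_window(csv_path, 2343567360))
        for number, line in lines_with_odd_quotes(csv_path):
            print(number, line[:200])
```

If the window shows an obviously broken row, cleaning that row in the source file and rerunning the CTAS statement is a cheap first check.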

 

How can I solve this?

Thank you in advance!
