I'm trying to convert files from CSV to Parquet with Hive, Spark and Drill. Both Hive and Spark worked perfectly fine, but with Drill I get the following error message:
Error: DATA_READ ERROR: Error processing input: , line=1064193, char=195035136. Content parsed: [ ]
Failure while reading file maprfs:///mapr/demo.mapr.com/input/komma/stichprobe_20mio/Stichprobe_AB_20Mio.csv. Happened at or shortly before byte position 2343567360.
[Error Id: f049d95c-34d8-4c77-bee2-3eb94dbef68a on maprdemo:31010] (state=,code=0)
Aborting command set because "force" is false and command failed: "create table komma.root.parquet_20mio AS (select * from komma.root.stichprobe_20mio);"
The same statement worked well with smaller files; only this one (3.5 GB) fails, and it has failed on multiple attempts.
I'm using the MapR Sandbox with Drill on a virtual machine that has 6 GB RAM and 2 x 1.3 GHz processors.
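One way to narrow this down might be to look at the record the error points to (line=1064193) in a local copy of the file. The following is only a minimal sketch: the function name inspect_lines and the local file path are hypothetical, and it assumes the file is UTF-8 text that uses double quotes for quoting. An odd number of quote characters on a line is just a hint that a quoted field may not be closed, not proof of the cause.

```python
import itertools


def inspect_lines(path, line_no, context=1, quote='"', encoding="utf-8"):
    """Return (line_number, has_odd_quote_count, first_200_chars) for the
    lines from line_no - context to line_no + context (1-based numbering)."""
    start = max(1, line_no - context)
    stop = line_no + context
    report = []
    with open(path, "r", encoding=encoding, errors="replace") as f:
        # islice skips the first start - 1 lines without keeping them in memory
        window = itertools.islice(f, start - 1, stop)
        for number, text in enumerate(window, start=start):
            text = text.rstrip("\n")
            odd_quotes = text.count(quote) % 2 == 1
            report.append((number, odd_quotes, text[:200]))
    return report


if __name__ == "__main__":
    # Hypothetical local copy of the CSV; adjust the path as needed.
    for number, odd_quotes, preview in inspect_lines("Stichprobe_AB_20Mio.csv", 1064193):
        print(number, "odd quote count" if odd_quotes else "ok", repr(preview))
```

The script reads the file line by line, so it should not need much memory even for the 3.5 GB file.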
I already looked for similar errors, but I didn't find anything that helped with my problem.
How can I solve this?
Thank you in advance!