
Data Dropping in Parquet File Conversion

Question asked by futuredriller on Nov 8, 2017
Latest reply on Nov 21, 2017 by cathy

Hello All,

 

I've been trying to create a parquet table from a pipe-delimited, gzipped file using the Apache Drill CTAS method, and noticed that rows were missing from the table once the conversion completed. I've tried escaping the data when extracting it from Redshift and tried processing the uncompressed version of the file, and it still drops rows.

 

I located one of the missing rows in the source file, moved it to the beginning of the file, and reprocessed. That row then made it into the parquet file.
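
Since a row survives when it is moved to the top, one thing worth checking is whether something earlier in the file changes how later lines are parsed. The post does not identify the cause, so the following is only a diagnostic sketch under that assumption: it scans a pipe-delimited file (plain or gzipped) and reports lines whose field count differs from the first line's, or that contain an odd number of double-quote characters. The function names `open_text` and `find_suspect_lines`, the `|` delimiter, and the `"` quote character are illustrative choices, not anything taken from Drill.

```python
import gzip
import sys


def open_text(path):
    """Open a plain or gzip-compressed text file for line-by-line reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def find_suspect_lines(lines, delimiter="|", quote='"'):
    """Return (line_number, reasons) for lines that look malformed.

    The first line defines the expected field count. This is a naive
    check: it counts delimiters and does not honor quoting rules.
    """
    expected = None
    suspects = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        fields = line.count(delimiter) + 1
        if expected is None:
            expected = fields
        reasons = []
        if fields != expected:
            reasons.append(f"{fields} fields, expected {expected}")
        # An odd number of quote characters may cause a parser to treat
        # the following lines as part of one quoted field.
        if line.count(quote) % 2 == 1:
            reasons.append("unbalanced quote character")
        if reasons:
            suspects.append((number, reasons))
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_suspects.py data.txt.gz
    with open_text(sys.argv[1]) as handle:
        for number, reasons in find_suspect_lines(handle):
            print(number, "; ".join(reasons))
```

If this reports lines just before the rows that go missing, those lines are a reasonable place to start looking. An empty report does not rule out a parsing problem, because the check is deliberately simple.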

 

I'm not sure what is causing the drop and was wondering if anyone else has come across this problem.

 

Thanks.
