
"File too large" on MapR-DB scan from Pig

Question asked by imichaeldotorg on Nov 14, 2014
Latest reply on Nov 14, 2014 by nabeel
I'm running a Pig script that queries data from a MapR-DB table, and I get a "File too large" error when scanning the table.  When I run the same Pig script against a traditional HBase table, the scan works fine.  We're using MapR-DB on

The Pig code that fails is pretty straightforward:
A = LOAD '/user/mapr/table_name_here' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('my_cf:*', '-loadKey true')  AS (the_id:chararray, my_cf:map[]);
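For comparison, the variant that works is the same statement pointed at a regular HBase table via Pig's hbase:// table URI (the table name below is a placeholder, not our real table):

-- Equivalent load against a traditional HBase table; this scan completes fine.
A = LOAD 'hbase://table_name_here'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('my_cf:*', '-loadKey true')
    AS (the_id:chararray, my_cf:map[]);

The only difference between the failing and working cases is whether the table reference resolves to a MapR-DB path or an HBase table.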

The error that is thrown (both in YARN and MR1) is below:

Error: Scan Error: File too large(27)
	at com.mapr.fs.Inode.scanNext(...)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(...)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(...)
	at org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat$HBaseTableRecordReader.nextKeyValue(...)
	at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getNext(...)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(...)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(...)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(...)
	at ...
	at org.apache.hadoop.mapred.MapTask.runNewMapper(...)
	at ...
	at org.apache.hadoop.mapred.YarnChild$
	at ...
Container killed by the ApplicationMaster.
Container killed on request.
Exit code is 143. Container exited with a non-zero exit code 143.

Other Pig scripts work on other MapR-DB tables in the cluster.  There are approximately 1.6M rows in the table.