MapR-DB Bulk Upload fails at the reducer end.

Question asked by lovehasija88 on Sep 6, 2016
Latest reply on Nov 8, 2016 by aalvarez

Hi,

I have written a custom MapReduce job to bulk load data into MapR-DB, following the approach suggested in the documentation. The job runs and loads most of the data, but it fails on some records with an error stating that the data is not sorted. Since the sorting is done by the reducer, which is outside our control, this should not happen. Here is the stack trace:

Error: java.io.IOException: Received unsorted KeyValues : prev \x1F<key1>/d:/1473185830616/Put/vlen=0/seqid=0 cur \x00<key2>/d:/1473185830616/Put/vlen=0/seqid=0
        at com.mapr.fs.hbase.BulkLoadRecordWriter.write(BulkLoadRecordWriter.java:84)
        at com.mapr.fs.hbase.BulkLoadRecordWriter.write(BulkLoadRecordWriter.java:26)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558)
        at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
        at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
        at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce(PutSortReducer.java:78)
        at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce(PutSortReducer.java:43)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
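
For reference, here is a minimal sketch of the ordering check that the error message seems to describe. It assumes row keys are compared byte by byte as unsigned values in lexicographic order, which is the ordering HBase uses for row keys. The class KeyOrderCheck and its methods are hypothetical helpers written for illustration, not MapR or HBase APIs, and the bytes "key1" and "key2" only stand in for the elided keys in the trace. Under that assumption, a key starting with \x00 sorts before one starting with \x1F, so a writer that receives them in the order shown would report them as unsorted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of MapR or HBase.
class KeyOrderCheck {

    // Compares two byte arrays lexicographically, treating each byte as
    // an unsigned value (0..255). Assumption: this mirrors row key ordering.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }

    // Returns the index of the first key that sorts before its predecessor,
    // or -1 if the whole list is in non-decreasing order.
    static int firstOutOfOrder(List<byte[]> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (compareUnsigned(keys.get(i - 1), keys.get(i)) > 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Placeholder keys: only the leading bytes 0x1F and 0x00 come from the trace.
        byte[] prev = {0x1F, 'k', 'e', 'y', '1'};
        byte[] cur = {0x00, 'k', 'e', 'y', '2'};

        // A positive result means prev sorts after cur.
        System.out.println(compareUnsigned(prev, cur) > 0);
        // Index of the first offending key in the sequence prev, cur.
        System.out.println(firstOutOfOrder(Arrays.asList(prev, cur)));
    }
}
```

Running a check like this over the keys as they are emitted (for example in a small local test) may help narrow down which records arrive out of order.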

Please suggest a resolution.