
How to do Pig-HBase integration in MapR v2.0.0

Question asked by harish on Sep 27, 2012
Latest reply on Oct 28, 2013 by sanjay_bhosale
Pig version 0.9.2 (also tried 0.10.0, same error)
HBase version 0.92.1
MapR v2.0.0

I am trying to load the output of a Pig script into an HBase table, but I get the error below:
<pre>
ERROR 2244: Job failed, hadoop does not return any error message
</pre>
The TaskTracker logs show this error:
<pre>
java.lang.RuntimeException: java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@64df83e5 closed
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:200)
</pre>

I have run export PIG_CLASSPATH="`$HBASE_HOME/bin/hbase classpath`:$PIG_CLASSPATH".
I have also copied the ZooKeeper, Guava, and HBase jars to $PIG_HOME/lib.

Is this a known issue, or did I miss something on my end? It would be great if someone could suggest a solution.

My Pig script is below:
<pre>
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/zookeeper-3.3.2.jar
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/hbase-0.92.1.jar
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/guava-r09.jar
 M = load '/user/hive/warehouse/clean_sdp_exportit' using PigStorage('\t');
 A = foreach M generate $0,$1,$4,$11,$16,$21;
 N = load '/user/hive/warehouse/clean_clickstream_exportit_d0' using PigStorage('\t');
 B = foreach N generate $0,$1,$2,$8;
 C = join A by ($0,$1), B by ($0,$1);
 D = foreach C generate FilterTimeStamp($0,$1,$2,$3,$4,$5,$6,$7,$8,$9);
 E = foreach D generate $0.$0,$0.$1,$0.$3,$0.$4,$0.$5,$0.$8,$0.$9;
 F = filter E by $0 is not null;
 G = distinct F;
 I = foreach G generate $5,$0,$1,$2,$3,$4,$6;
 STORE I INTO 'Table1' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:acctid_char data:custid data:vr_app_start data:talk_tm data:call_disposition data:page_url');

</pre>
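To check whether the failure comes from the HBase store step itself rather than from the join or the FilterTimeStamp UDF, a stripped-down script along the following lines could be run first. This is only a sketch under assumptions: the input path /tmp/hbase_store_test.tsv and its two-column layout are hypothetical, and the 'hbase://' prefix on the table name follows the form shown in the Pig documentation for HBaseStorage. It is not confirmed to resolve the error above.
<pre>
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/zookeeper-3.3.2.jar
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/hbase-0.92.1.jar
 REGISTER /opt/mapr/pig/pig-0.9.2/lib/guava-r09.jar
 -- Hypothetical two-column, tab-separated test input (assumption, not from the original script).
 -- HBaseStorage uses the first field as the row key; the remaining fields map to the listed columns.
 T = load '/tmp/hbase_store_test.tsv' using PigStorage('\t') as (rowkey:chararray, custid:chararray);
 -- Same table and column family as the full script, written with the documented 'hbase://' prefix.
 STORE T INTO 'hbase://Table1' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:custid');
</pre>
If this smaller script fails with the same TableOutputFormat.setConf error, the problem is more likely in the classpath or HBase connection setup than in the rest of the pipeline.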
Thanks in advance for your help
