
How to use the PigServer API to submit a Pig job to a remote MapR cluster

Question asked by tushars007 on Oct 21, 2014
Latest reply on Dec 11, 2014 by Ted Dunning
I am trying to use the Java PigServer API to submit a Pig job to a remote MapR cluster. Machine A (Windows) hosts a web application that is used to upload a Pig script. Machine B has MapR Hadoop and Pig installed.
 
My questions are:  
 
1) Can the PigServer API on Machine A communicate with Pig on Machine B to submit jobs, even if Pig is not installed on Machine A?
2) Would I need a MapR client on Windows (Machine A) for this communication?

I have included the MapR jars for Pig and Hadoop in my web app. I also tried creating core-site.xml and hadoop-site.xml with the required cluster address in WEB-INF/lib, where the jars are located, but no luck. I am still getting the following error:

<pre>
org.apache.pig.backend.executionengine.ExecException: ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath). If you plan to use local mode, please put -x local option in command line.
</pre>
I have seen several posts about similar errors, and most of them suggest putting the XML files on the classpath or pointing to a Hadoop conf directory. I have tried keeping just the XML files on the classpath, but that didn't work. How should I point to a Hadoop conf directory when Hadoop is not installed locally on Machine A?
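
Since the error message says that neither file was found on the classpath, one way to narrow this down is to ask the class loader directly whether it can see them. The following is only a diagnostic sketch using the standard Java class-loader API; the class name `HadoopConfCheck` and the method `locate` are hypothetical names made up for illustration, not part of Pig or MapR. Servlet containers generally put WEB-INF/classes and the JAR files under WEB-INF/lib on the web app's classpath, so loose XML files placed directly in WEB-INF/lib may not be visible; that is an assumption worth verifying against the container in use.

```java
import java.net.URL;

// Hypothetical helper (not part of Pig or MapR): reports whether the
// Hadoop configuration files are visible to the current class loader.
public class HadoopConfCheck {

    // Returns the URL of the named resource if the class loader can see it,
    // or null if the resource is not on the classpath.
    static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the system class loader if no context loader is set.
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        // These are the two file names mentioned in the ERROR 4010 message.
        String[] names = {"core-site.xml", "hadoop-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> "
                    + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

Inside the web application, calling `HadoopConfCheck.locate("core-site.xml")` from a servlet just before creating the PigServer would show whether the file is visible from that context. If it returns null, moving the file to WEB-INF/classes is one thing to try.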

I am using the following code with the PigServer API:

<code>
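// Connection settings for the remote cluster ("ipaddress" is a placeholder)
Properties props = new Properties();
props.setProperty("fs.default.name", "maprfs://ipaddress:7222");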
props.setProperty("mapred.job.tracker", "ipaddress:9001");
PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
pig.registerScript(filecontent);
</code>
