Accessing data between clusters

Question asked by kusako on Feb 13, 2013
Latest reply on Feb 18, 2013 by kusako
I am trying to access data in one MapR cluster from another MapR cluster (both running the same version of MapR). The requirement is that the map/reduce job runs on the second cluster.
I tried running a simple wordcount like this on a node in cluster2:

hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar wordcount maprfs:// maprfs://

Unfortunately, this tries to run the job on cluster1.
I also tried setting the jobtracker and the default file system like this:

hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar wordcount maprfs:// maprfs://

but this leads to a "file not found" exception:
13/02/14 11:42:46 ERROR ipc.RPC: FailoverProxy: Failing this Call: submitJob for error(RemoteException): org.apache.hadoop.ipc.RemoteException: Requested file maprfs:/var/mapr/cluster/mapred/jobTracker/staging/test/.staging/job_201302131405_0038/job.xml does not exist.

I'm wondering whether this could work at all, or whether I'm completely off track.

Thanks for any suggestions,