
Can't submit job remotely

Question asked by dave_kincaid on May 6, 2013
Latest reply on May 8, 2013 by gera
We are trying to submit a Hadoop MapReduce job remotely and can't quite get it to work. The job gets submitted, but as soon as the task trackers try to run it they throw NullPointerExceptions. I have the MapR M3 VM running with the hostname mapr-vm. The full Java class is in a Gist at https://gist.github.com/dkincaid/5529308. It's just the stock word count that you can find in any MR tutorial. The goal here is to figure out how to submit a job remotely from a Java program. I think the relevant part is:

    conf.set("mapred.job.tracker", "mapr-vm:9001");
    conf.set("fs.default.name", "maprfs://mapr-vm:7222/");
    conf.set("fs.maprfs.impl", "com.mapr.fs.MapRFileSystem");

When I run the code, it creates the job on the VM, but when the tasks try to run they throw an NPE:

    13/05/06 19:06:04 INFO mapred.JobClient: Task Id : attempt_201305050908_0010_m_000002_0, Status : FAILED
    Error initializing attempt_201305050908_0010_m_000002_0:
    java.lang.NullPointerException
     at org.apache.hadoop.mapred.TaskTracker$6.run(TaskTracker.java:3771)

Any ideas what I have wrong? Am I setting the properties correctly? I saw some posts suggesting that fs.default.name should be maprfs:///, but trying that gives me an error that it can't find a CLDB node for my.cluster.com.
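
For reference, the two fs.default.name values differ in whether they name a host at all. The sketch below uses only java.net.URI (no Hadoop or MapR classes) to show this: maprfs://mapr-vm:7222/ carries an explicit host and port, while maprfs:/// carries neither, so the client has to learn the cluster from somewhere else. My assumption, not confirmed here, is that the MapR client then falls back to its local cluster configuration, which would explain the reference to my.cluster.com. The class and method names (FsUriCheck, describe) are made up for illustration.

```java
import java.net.URI;

public class FsUriCheck {
    // Report whether a filesystem URI names an explicit host and port.
    public static String describe(String fsUri) {
        URI uri = URI.create(fsUri);
        String host = uri.getHost();
        if (host == null) {
            // Nothing in the URI says which cluster node to contact.
            return "no host in URI";
        }
        return "host=" + host + " port=" + uri.getPort();
    }

    public static void main(String[] args) {
        // The value used in the code above: explicit host and port.
        System.out.println(describe("maprfs://mapr-vm:7222/"));
        // The value suggested in other posts: no host at all.
        System.out.println(describe("maprfs:///"));
    }
}
```

If that assumption holds, the maprfs:/// form would only work on a machine whose MapR client already knows about the cluster, which is not the case on the machine I'm submitting from.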

We've been working on this for a few days now. We are closer than ever to getting this working, but can't seem to make the final connection. Any help would be most appreciated.
