
HCatalog MapReduce program integration in Oozie

Question asked by arjun_hareendran on Jul 10, 2015
I have written a MapReduce program that reads data from a Hive table using HCatalog and writes it into HBase. It is a map-only job with no reducers. I have run the program from the command line and it works as expected (I built a fat jar to avoid jar issues). I want to integrate it into Oozie (with the help of Hue). I have two options to run it:

1. Use the MapReduce action
2. Use the Java action

Since my MapReduce program has a driver class (Hbasevalidateinsertdriver), how do I specify the driver in Oozie? All I can see are options to specify the mapper and reducer classes. Can someone guide me on how to set the properties?

Using the Java action, I can specify my driver class as the main class and get it executed, but I face errors such as table not found, HCatalog jars not found, etc. I have included hive-site.xml in the workflow (using Hue), but I feel the system is not picking up the properties. Can someone advise me on what I need to take care of? Are there any other configuration properties that I need to include?

Also, the sample program I referred to on the Cloudera website uses

    HCatInputFormat.setInput(job, InputJobInfo.create(dbName,
                    inputTableName, null));

whereas I use the call below (I don't see a method that accepts the above input):

    HCatInputFormat.setInput(job, dbName, tableName, null);

I have attached the driver class and the mapper class.

Note: I use a MapR (M3) cluster with Hue as the interface for Oozie. Hive version: 1-0, HCatalog version: 1-0
  
[link text][1]


  [1]: /storage/temp/269-hbasevaldiateinsertdriver.txt
