Author: Venkata Sowriraja, last modified by Hassan Shaik
Original Publication Date: December 9, 2014
In many Hadoop applications, class loading seems to be one of the critical problem areas. Generally, in a MapReduce job, each Map and Reduce task runs in its own JVM (i.e. as a separate process). So if we need to run the Map or Reduce tasks with a particular set of JVM parameters, we need to pass those Java opts specifically to the task JVM processes. Java parameters can be set for the map tasks and the reduce tasks separately.
The following is one such example.
Under "/opt/mapr/hadoop/hadoop-0.20.2/conf/mapred-site.xml", set the child Java opts properties for the map and reduce tasks to include "-verbose:class" to get verbose class-loading output for each task. This logging helps isolate the root cause of issues such as ClassNotFoundException or NoClassDefFoundError.
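As a sketch of what such properties might look like: the property names below, mapred.map.child.java.opts and mapred.reduce.child.java.opts, are assumptions based on Hadoop 1.x-style configuration and are not taken from this article, so verify them against the documentation for your cluster version before applying.

```xml
<!-- Sketch only: the property names are assumed, not confirmed by this article. -->
<!-- Merge the <property> elements into the existing <configuration> element
     of mapred-site.xml rather than replacing the whole file. -->
<configuration>
  <!-- JVM options applied to each map task's child JVM -->
  <property>
    <name>mapred.map.child.java.opts</name>
    <value>-verbose:class</value>
  </property>
  <!-- JVM options applied to each reduce task's child JVM -->
  <property>
    <name>mapred.reduce.child.java.opts</name>
    <value>-verbose:class</value>
  </property>
</configuration>
```

If either property is already defined with other options (for example a heap size), append the flag to the existing value rather than replacing it.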
From the task logs you can see which classes were loaded for each task (Map or Reduce). Similarly, we can add any other JVM flag (for example, -verbose:gc to get verbose GC logging) to the Map/Reduce task parameters in the respective "child java opts" property in "mapred-site.xml". This is helpful for Hive and Pig jobs as well, since they also run on the MapReduce framework.
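Several JVM flags can be combined in one property value, separated by spaces. A minimal sketch for the map side, again assuming the property name mapred.map.child.java.opts (an assumption; check it against your cluster's documentation):

```xml
<!-- Sketch only: the property name is assumed, not confirmed by this article. -->
<configuration>
  <property>
    <name>mapred.map.child.java.opts</name>
    <!-- Multiple JVM flags are separated by spaces within a single value -->
    <value>-verbose:class -verbose:gc</value>
  </property>
</configuration>
```

The same value format would apply to the corresponding reduce-side property.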