HIVE UDF ClassNotFoundException Error

Question asked by chriswirt on Feb 6, 2013
Latest reply on Feb 14, 2013 by chriswirt
We have migrated our 20-node Hadoop cluster to MapR. The majority of our jobs run as Hive queries and use a large number of UDFs. We moved from hadoop-0.20./hive-0.10 to MapR-2.1/hive-0.10.

Since the initial migration we have been experiencing intermittent failures in which a Hive job fails with a ClassNotFoundException for a required UDF. I say intermittent because the issue does not happen every time on any particular job or job stage. The failure also doesn't appear to be tied to any particular cluster node.

The only noticeable trends are that the larger jobs fail most often, and that if one map/reduce task of a particular job stage fails with ClassNotFoundException, then all map/reduce tasks fail and the entire job has to be rerun.

If a job does fail, a couple of retries will generally result in a successful run.

If you look at the job.xml, our jars appear in hive.aux.jars.path, hive.added.jars.path, and mapred.job.classpath.files. You can also clearly see in the task log of a failed task that the jar is in the classpath.
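
Seeing a jar path in the classpath does not by itself show that the file at that path is readable and contains the class. As a minimal sketch of one way to check this on a node, the Python below opens a jar as a zip archive and looks for the class entry. The helper names `class_entry` and `jar_contains_class` are hypothetical, and the jar path to pass in is whatever local path the failed task's classpath lists.

```python
import zipfile


def class_entry(class_name):
    """Map a fully qualified class name to its entry name inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar_path, class_name):
    """Return True only if jar_path is a readable jar holding class_name.

    A missing, unreadable, or corrupt (for example truncated) jar
    yields False instead of raising.
    """
    try:
        with zipfile.ZipFile(jar_path) as jar:
            return class_entry(class_name) in jar.namelist()
    except (OSError, zipfile.BadZipFile):
        return False
```

For the failure in this post, the class name to test would be `com.struq.bme.udf.UDFBMEGenerateFeatureRocArea`, checked against each local copy of udf.jar and bme.jar on the node where the task ran.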

Any help is very much appreciated. I’m really hoping that someone else has experienced similar issues as I’ve been pulling my hair out for over a week now trying to fix this.

Here is the syslog output for a failed reduce task:


    2013-02-07 09:02:36,262 INFO org.apache.hadoop.mapred.Child: JVM: jvm_201302051218_0964_r_862716539 pid: 22686
    2013-02-07 09:02:36,436 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/distcache/-653392944033370040_1219476872_870520019/maprfs/tmp/hive-root/hive_2013-02-07_08-59-27_338_7470363753800856636/-mr-10009/5c632170-d077-46f1-b387-d2a959df82f7 <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/HIVE_PLAN5c632170-d077-46f1-b387-d2a959df82f7
    2013-02-07 09:02:36,437 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/7623848587582835789_1565874548_870520335/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/files/bme.jar <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/bme.jar
    2013-02-07 09:02:36,438 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/-6033273689890822233_1358270708_870520357/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/files/pylatencyandeventcount.py <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/pylatencyandeventcount.py
    2013-02-07 09:02:36,438 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/3422536012653977660_1103443956_870520439/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/files/udf.jar <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/udf.jar
    2013-02-07 09:02:36,442 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/.job.jar.crc <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/.job.jar.crc
    2013-02-07 09:02:36,442 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/job.jar <- /tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/job.jar
    2013-02-07 09:02:36,455 INFO org.apache.hadoop.mapred.Child: Starting task attempt_201302051218_0964_r_000011_1
    2013-02-07 09:02:36,456 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
    2013-02-07 09:02:36,469 INFO org.apache.hadoop.mapreduce.util.ProcessTree: setsid exited with exit code 0
    2013-02-07 09:02:36,471 WARN org.apache.hadoop.mapreduce.util.ProcfsBasedProcessTree: /proc/<pid>/status does not have information about swap space used(VmSwap). Can not track swap usage of a task.
    2013-02-07 09:02:36,471 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.mapreduce.util.LinuxResourceCalculatorPlugin@35595365
    2013-02-07 09:02:36,542 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
    2013-02-07 09:02:36,542 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,543 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,544 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,548 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,550 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,550 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,559 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,575 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,575 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,579 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,579 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:36,580 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    2013-02-07 09:02:37,049 INFO org.apache.hadoop.mapred.Merger: Merging 17 sorted segments
    2013-02-07 09:02:37,049 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 16 segments left of total size: 32875527 bytes
    2013-02-07 09:02:37,052 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
    2013-02-07 09:02:39,367 INFO org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
    2013-02-07 09:02:39,368 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 4484496 bytes
    2013-02-07 09:02:39,373 INFO ExecReducer: maximum memory = 1398145024
    2013-02-07 09:02:39,373 INFO ExecReducer: conf classpath = [file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/classes, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/]
    2013-02-07 09:02:39,373 INFO ExecReducer: thread classpath = [file:/opt/mapr/hadoop/hadoop-0.20.2/conf/, file:/usr/java/latest/lib/tools.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/, file:/opt/mapr/hadoop/hadoop-0.20.2/hadoop*core*.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/amazon-s3.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/asm-3.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/aspectjrt-1.6.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/aspectjtools-1.6.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/aws-java-sdk-1.3.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-cli-1.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-codec-1.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-configuration-1.8.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-daemon-1.0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-el-1.0.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-httpclient-3.0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-httpclient-3.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-lang-2.6.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-logging-1.0.4.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-logging-1.1.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-logging-api-1.0.4.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-math-2.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-net-1.4.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/commons-net-3.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/core-3.1.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/emr-metrics-1.0.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/eval-0.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/gson-1.4.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/guava-13.0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/hadoop-0.20.2-dev-capacity-scheduler.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/hadoop-0.20.2-dev-core.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/hadoop-0.20.2-dev-fairscheduler.jar, 
file:/opt/mapr/hadoop/hadoop-0.20.2/lib/hsqldb-1.8.0.10.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/httpclient-4.1.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/httpcore-4.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jackson-core-asl-1.5.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jackson-mapper-asl-1.5.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jasper-compiler-5.5.12.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jasper-runtime-5.5.12.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jersey-core-1.8.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jersey-json-1.8.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jersey-server-1.8.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jets3t-0.6.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jetty-6.1.14.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jetty-servlet-tester-6.1.14.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jetty-util-6.1.14.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/junit-4.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/kfs-0.2.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/log4j-1.2.15.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/logging-0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/maprfs-0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/maprfs-test-0.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/mockito-all-1.8.2.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/mockito-all-1.8.5.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/mysql-connector-java-5.0.8-bin.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/oro-2.0.8.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/protobuf-java-2.4.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/servlet-api-2.5-6.1.14.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/slf4j-api-1.4.3.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/slf4j-log4j12-1.4.3.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/xmlenc-0.52.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/zookeeper-3.3.6.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jsp-2.1/jsp-2.1.jar, file:/opt/mapr/hadoop/hadoop-0.20.2/lib/jsp-2.1/jsp-api-2.1.jar, 
file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/classes, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/jars/job.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/-6468543773086522035_1124684670_870520892/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/bme.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/2247779240621316091_673955763_870520559/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/UserAgentUtils-1.6.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/7395240957318548105_504908146_870520637/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/commons-math3-3.0.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/2691031807485332722_550646587_870520833/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/udf.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/7615220508236847536_1908660950_870520762/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/hive-builtins-0.10.0-SNAPSHOT.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/2691031807485332722_550646587_870520833/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/udf.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/distcache/-6468543773086522035_1124684670_870520892/maprfs/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201302051218_0964/libjars/bme.jar, file:/tmp/mapr-hadoop/mapred/local/taskTracker/root/jobcache/job_201302051218_0964/attempt_201302051218_0964_r_000011_1/work/]
    2013-02-07 09:02:39,390 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
    2013-02-07 09:02:39,618 INFO ExecReducer:
    <GBY>Id =8
      <Children>
        <SEL>Id =7
          <Children>
            <SEL>Id =6
              <Children>
                <FS>Id =5
                  <Parent>Id = 6 null<\Parent>
                <\FS>
              <\Children>
              <Parent>Id = 7 null<\Parent>
            <\SEL>
          <\Children>
          <Parent>Id = 8 null<\Parent>
        <\SEL>
      <\Children>
    <\GBY>
    2013-02-07 09:02:39,618 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initializing Self 8 GBY
    2013-02-07 09:02:39,626 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Operator 8 GBY initialized
    2013-02-07 09:02:39,626 INFO org.apache.hadoop.hive.ql.exec.GroupByOperator: Initializing children of 8 GBY
    2013-02-07 09:02:39,626 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 7 SEL
    2013-02-07 09:02:39,626 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 7 SEL
    2013-02-07 09:02:39,626 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<_col0:string,_col1:int,_col2:string,_col3:string,_col4:string,_col5:double,_col6:double,_col7:double,_col8:double>
    2013-02-07 09:02:39,633 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
    2013-02-07 09:02:39,638 WARN org.apache.hadoop.mapred.Child: Error running child
    java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:462)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
     at org.apache.hadoop.mapred.Child.main(Child.java:264)
    Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:601)
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
     ... 9 more
    Caused by: java.lang.RuntimeException: Reduce operator initialization failed
     at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:157)
     ... 14 more
    Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'com.struq.bme.udf.UDFBMEGenerateFeatureRocArea' is not present in the class path
     at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:141)
     at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:98)
     at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:137)
     at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:942)
     at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:968)
     at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:436)
     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:392)
     at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:388)
     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
     at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
     ... 14 more
    2013-02-07 09:02:39,641 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
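
The "thread classpath" line in the log above is long enough that it is hard to inspect by eye. In this log, udf.jar and bme.jar each appear twice in it under the same distcache path. As a rough sketch for comparing failed and successful attempts, the Python below pulls the classpath entries out of a saved syslog and reports duplicates and the locations of a given jar. The function names are hypothetical, and the parsing assumes the bracketed, comma-separated `file:` list format shown in this log.

```python
import re
from collections import Counter


def parse_classpath(log_text, label="thread classpath"):
    """Extract the entries of '<label> = [ ... ]' from task syslog text."""
    pattern = re.escape(label) + r"\s*=\s*\[(.*?)\]"
    match = re.search(pattern, log_text, re.DOTALL)
    if match is None:
        return []
    entries = [e.strip() for e in match.group(1).split(",")]
    # Drop the 'file:' scheme so entries are plain local paths.
    return [e[len("file:"):] if e.startswith("file:") else e
            for e in entries if e]


def duplicate_entries(entries):
    """Return the paths that occur more than once, sorted."""
    counts = Counter(entries)
    return sorted(path for path, n in counts.items() if n > 1)


def jars_named(entries, jar_name):
    """Return every classpath entry whose file name equals jar_name."""
    return [p for p in entries if p.rsplit("/", 1)[-1] == jar_name]
```

Running `jars_named(parse_classpath(text), "udf.jar")` on the syslog of a failed attempt and of a successful one gives the local paths to compare, and each of those paths can then be checked on the node for existence and size.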
    
     


