
Table join issue after upgrade to Hive 0.9.0

Question asked by mlauer on Oct 4, 2012
Latest reply on Oct 18, 2012 by mlauer
Hello,

After upgrading our test cluster to MapR V2.0 (which includes Hive 0.9.0), we are no longer able to join two tables in Hive. Hive output the following error when we tried to join the two simple tables below.

PersonCount(Name STRING, Count INT)

PersonState(Name STRING, State STRING)
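For anyone trying to reproduce this, DDL along these lines should create equivalent tables (the delimited-text storage format shown here is an assumption for illustration, not necessarily our exact setup):

    -- Hypothetical DDL matching the schemas above;
    -- delimiter and file format are assumptions.
    CREATE TABLE PersonCount (
      Name  STRING,
      Count INT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE;

    CREATE TABLE PersonState (
      Name  STRING,
      State STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE;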

    hive> SELECT ps.Name, ps.State, pc.Count
        > FROM PersonState ps
        > JOIN PersonCount pc
        > ON ps.Name = pc.Name;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=<number>
    Starting Job = job_201209141257_0114, Tracking URL = http://mapr1:50030/jobdetails.jsp?jobid=job_201209141257_0114
    Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201209141257_0114
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
    2012-10-02 14:43:47,768 Stage-1 map = 0%,  reduce = 0%
    2012-10-02 14:44:14,883 Stage-1 map = 100%,  reduce = 100%
    Ended Job = job_201209141257_0114 with errors
    Error during job, obtaining debugging information...
    Examining task ID: task_201209141257_0114_m_000000 (and more) from job job_201209141257_0114
    Exception in thread "Thread-13" java.lang.IllegalArgumentException: port out of range:-1
            at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
            at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:166)
            at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:124)
            at org.apache.hadoop.hive.ql.exec.JobTrackerURLResolver.getURL(JobTrackerURLResolver.java:42)
            at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:209)
            at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
            at java.lang.Thread.run(Thread.java:662)
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
    MapReduce Jobs Launched:
    Job 0: Map: 1  Reduce: 1   MAPRFS Read: 0 MAPRFS Write: 0 FAIL
    Total MapReduce CPU Time Spent: 0 msec

(Note that the IllegalArgumentException above is thrown from Hive's own debugging path — JobDebugger calling JobTrackerURLResolver — while it tries to fetch the failed task's logs, so it appears to mask the underlying task failure.) In addition, the Job Tracker logs show the following exception on every node.

    java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"joe","state":"OH"}
            at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
            at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
            at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:405)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
            at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:396)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1126)
            at org.apache.hadoop.mapred.Child.main(Child.java:264)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"joe","state":"OH"}
            at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
            at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
            ... 8 more
    Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
            at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyIntObjectInspector.get(LazyIntObjectInspector.java:38)
            at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:317)
            at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:255)
            at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:202)
            at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:236)
            at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
            at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
            at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
            at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
            at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
            at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
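Since the ClassCastException shows a Text value being serialized where an IntWritable is expected (presumably the Count column), one sanity check — a suggestion, not something we have confirmed — is to compare the column types the metastore reports and the join plan's reduce-sink schemas between the two Hive versions:

    -- Verify the column types Hive 0.9.0 actually sees for each table
    DESCRIBE PersonCount;
    DESCRIBE PersonState;

    -- Compare the generated join plan against the one from 0.7.1
    EXPLAIN
    SELECT ps.Name, ps.State, pc.Count
    FROM PersonState ps
    JOIN PersonCount pc
    ON ps.Name = pc.Name;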

We rolled back to Hive 0.7.1 and were once again able to join tables. Has anyone experienced similar issues after upgrading to Hive 0.9.0?

Thanks,

Matt
