Fixing java.lang.ClassNotFoundException: org.apache.derby.jdbc.EmbeddedDriver when user has setup metastore using mysql

Document created by rsingh on Feb 13, 2016
Version 1Show Document
  • View in full screen mode

Author: Rajkumar Singh

 

Original Publication Date: April 27, 2015

 

Env:

Hive 0.12

 

Symptom:

User can see the following derby related logs in-spite of Hive-Metastore configured with the MySQL.

2013-05-19 16:16:02,940 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-19_12-31-42_449_4781220154914130577/_tmp.-ext-10000/000088_1

 

2013-05-19 16:16:02,941 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-19_12-31-42_449_4781220154914130577/_task_tmp.-ext-10000/_tmp.000088_1

2013-05-19 16:16:02,950 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-19_12-31-42_449_4781220154914130577/_tmp.-ext-10000/000088_1

2013-05-19 16:16:02,985 INFO ExecReducer: ExecReducer: processed 3046471231 rows: used memory = 410427976

2013-05-19 16:16:02,985 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 finished. closing...

2013-05-19 16:16:02,985 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarded 1 rows

2013-05-19 16:16:24,311 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0

2013-05-19 16:16:24,311 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 finished. closing...

2013-05-19 16:16:24,311 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarded 1 rows

2013-05-19 16:16:24,311 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 finished. closing...

2013-05-19 16:16:24,311 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 forwarded 0 rows

 

2013-05-19 16:16:24,326 ERROR org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during instantiating JDBC driver org.apache.derby.jdbc.EmbeddedDriver.

java.lang.ClassNotFoundException: org.apache.derby.jdbc.EmbeddedDriver
  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)

  at java.security.AccessController.doPrivileged(Native Method)

  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)

  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

  at java.lang.Class.forName0(Native Method)

  at java.lang.Class.forName(Class.java:169)

  at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:68)

  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:843)

  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:752)

  at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558)

  at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)

  at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)

  at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:305)

  at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:509)

  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)

  at org.apache.hadoop.mapred.Child$4.run(Child.java:270)

  at java.security.AccessController.doPrivileged(Native Method)

  at javax.security.auth.Subject.doAs(Subject.java:396)

  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)

  at org.apache.hadoop.mapred.Child.main(Child.java:264)

 

2013-05-19 16:16:24,328 ERROR org.apache.hadoop.hive.ql.exec.FileSinkOperator: StatsPublishing error: cannot connect to database

2013-05-19 16:16:24,328 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:1

2013-05-19 16:16:24,328 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 Close done

2013-05-19 16:16:24,328 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 Close done

2013-05-19 16:16:24,332 INFO org.apache.hadoop.mapred.Task: Task:attempt_201305071944_0930_r_000088_1 is done. And is in the process of commiting

2013-05-19 16:16:24,411 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201305071944_0930_r_000088_1' done.

 

While in the regular one it looks like this :

 

2013-05-20 15:30:13,870 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1 rows

2013-05-20 15:30:13,870 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 1 rows

2013-05-20 15:30:13,870 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-20_15-21-43_925_8527022279162962096/_tmp.-ext-10000/000001_1

2013-05-20 15:30:13,870 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-20_15-21-43_925_8527022279162962096/_task_tmp.-ext-10000/_tmp.000001_1

2013-05-20 15:30:13,875 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS maprfs:/user/hadoop/tmp/hive/hive_2013-05-20_15-21-43_925_8527022279162962096/_tmp.-ext-10000/000001_1

2013-05-20 15:30:13,899 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 10 rows

2013-05-20 15:30:13,899 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 10 rows

2013-05-20 15:30:13,903 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 100 rows

2013-05-20 15:30:13,903 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 100 rows

 

Root Cause:

while if you set the Metastore to MySQL, it will still create a local derby instance for jobstats with the following configuration:

 

HIVESTATSDBCLASS("hive.stats.dbclass",

  "jdbc:derby"), // other options are jdbc:mysql and hbase as defined in StatsSetupConst.java

  HIVESTATSJDBCDRIVER("hive.stats.jdbcdriver",

  "org.apache.derby.jdbc.EmbeddedDriver"), // JDBC driver specific to the dbclass

  HIVESTATSDBCONNECTIONSTRING("hive.stats.dbconnectionstring",

  "jdbc:derby:;databaseName=TempStatsStore;create=true"), // automatically create database

Solution:

Update following configurations in hive-site.xml to create stats db in the MySQL.

<property>

<name>hive.stats.dbclass</name>

<value>jdbc:mysql</value>

</property>

 

<property>

<name>hive.stats.jdbcdriver</name>

<value>com.mysql.jdbc.Driver</value>

</property>

 

<property>

<name>hive.stats.dbconnectionstring</name>

<value>jdbc:mysql://localhost/stats_db?createDatabaseIfNotExist=true&amp;user=user;password=password</value>

</property>

 

Attachments

    Outcomes