Hive CLI fails with error "Metastore contains multiple versions"

Document created by Hao Zhu Employee on Feb 7, 2016
Version 1Show Document
  • View in full screen mode

Author: Hao Zhu

 

Original Publication Date: September 2, 2015

 

Environment :

Hive 0.13

Symptom:

Hive CLI fails with error "Metastore contains multiple versions".

 

Below is a sample stacktrace:

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient 
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2511)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2523)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
... 7 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
... 12 more
Caused by: MetaException(message:Metastore contains multiple versions)
at org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6373)
at org.apache.hadoop.hive.metastore.ObjectStore.getMetaStoreSchemaVersion(ObjectStore.java:6335)
at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6294)
at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6282)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
at com.sun.proxy.$Proxy9.verifySchema(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:476)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)

Root Cause

There are 2 scenarios where Hive metastore will modify "VERSION" table.

1. hive.metastore.schema.verification=false AND get "NULL" when fetching the schema version from backend database due to some reason.

Hive will insert a new row in VERSION table in this case:

INSERT INTO `VERSION` (`VER_ID`,`VERSION_COMMENT`,`SCHEMA_VERSION`) VALUES (16,'Set by MetaStore','0.13.0')

2. hive.metastore.schema.verification=true AND metastore schema version is the same as Hive distribution.

Hive will update the existing row in VERSION table in this case:

UPDATE `VERSION` SET `SCHEMA_VERSION`='0.13.0',`VERSION_COMMENT`='Set by MetaStore' WHERE `VER_ID`=16

Actually only scenario 1 will add more rows into VERSION table. Setting hive.metastore.schema.verification=true can avoid it;

 

Scenario 2 is actually a different bug HIVE-9749, the "if-else" logic is wrong in Hive 0.13 code and it is fixed in Hive 1.2.

 

The correct behavior of scenario 2 should be:  Hive should update VERSION table only when hive.metastore.schema.verification=false AND metastore schema version is different than Hive distribution.

 

But anyway, scenario 2 will never insert duplicate rows into VERSION table.

 

See the code of metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java

 

private synchronized void checkSchema() throws MetaException {

  // recheck if it got verified by another thread while we were waiting

  if (isSchemaVerified.get()) {

  return;

  }

 

  boolean strictValidation =

  HiveConf.getBoolVar(getConf(), HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION);

  // read the schema version stored in metastore db

  String schemaVer = getMetaStoreSchemaVersion();

  if (schemaVer == null) {

  // metastore has no schema version information

  if (strictValidation) {

  throw new MetaException("Version information not found in metastore. ");

  } else {

  LOG.warn("Version information not found in metastore. "

  + HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION.toString() +

  " is not enabled so recording the schema version " +

  MetaStoreSchemaInfo.getHiveSchemaVersion());

  setMetaStoreSchemaVersion(MetaStoreSchemaInfo.getHiveSchemaVersion(),

  "Set by MetaStore");

  }

  } else {

  // metastore schema version is different than Hive distribution needs

  if (strictValidation) {
  if (!schemaVer.equalsIgnoreCase(MetaStoreSchemaInfo.getHiveSchemaVersion())) {
  throw new MetaException("Hive Schema version "

  + MetaStoreSchemaInfo.getHiveSchemaVersion() +

  " does not match metastore's schema version " + schemaVer +

  " Metastore is not upgraded or corrupt");

  } else {

  LOG.warn("Metastore version was " + schemaVer + " " +

  HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION.toString() +

  " is not enabled so recording the new schema version " +

  MetaStoreSchemaInfo.getHiveSchemaVersion());

  setMetaStoreSchemaVersion(MetaStoreSchemaInfo.getHiveSchemaVersion(),

  "Set by MetaStore");

  }

  }

  }

  isSchemaVerified.set(true);

  return;

  }

See the Hive 1.2 code of metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:

private synchronized void checkSchema() throws MetaException {

  // recheck if it got verified by another thread while we were waiting

  if (isSchemaVerified.get()) {

  return;

  }

 

  boolean strictValidation =

  HiveConf.getBoolVar(getConf(), HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION);

  // read the schema version stored in metastore db

  String schemaVer = getMetaStoreSchemaVersion();

  if (schemaVer == null) {

  if (strictValidation) {

  throw new MetaException("Version information not found in metastore. ");

  } else {

  LOG.warn("Version information not found in metastore. "

  + HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION.toString() +

  " is not enabled so recording the schema version " +

  MetaStoreSchemaInfo.getHiveSchemaVersion());

  setMetaStoreSchemaVersion(MetaStoreSchemaInfo.getHiveSchemaVersion(),

  "Set by MetaStore " + USER + "@" + HOSTNAME);

  }

  } else {

  // metastore schema version is different than Hive distribution needs

  if (schemaVer.equalsIgnoreCase(MetaStoreSchemaInfo.getHiveSchemaVersion())) {
  LOG.debug("Found expected HMS version of " + schemaVer);

  } else {

  if (strictValidation) {
  throw new MetaException("Hive Schema version "

  + MetaStoreSchemaInfo.getHiveSchemaVersion() +

  " does not match metastore's schema version " + schemaVer +

  " Metastore is not upgraded or corrupt");

  } else {

  LOG.error("Version information found in metastore differs " + schemaVer +

  " from expected schema version " + MetaStoreSchemaInfo.getHiveSchemaVersion() +

  ". Schema verififcation is disabled " +

  HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION + " so setting version.");

  setMetaStoreSchemaVersion(MetaStoreSchemaInfo.getHiveSchemaVersion(),

  "Set by MetaStore " + USER + "@" + HOSTNAME);

  }

  }

  }

  isSchemaVerified.set(true);

  return;

  }

 

So from above codes, Hive 1.2 just switched the two if-clauses for scenario 2.

Although scenario 2 is also buggy, but it is not impacting this multiple versions issue.

 

Solution:

1. To fix this issue and bring Hive back online, we need to delete the bad rows from "VERSION" table in backend database.

 

For example, if current "VERSION" looks as below:

 

mysql> select * from version;

+--------+----------------+------------------+

| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |

+--------+----------------+------------------+

| 1 | 0.13.0 | Set by MetaStore |

| 6 | 0.13.0 | Set by MetaStore |

| 11 | 0.13.0 | Set by MetaStore |

| 16 | 0.13.0 | Set by MetaStore |

| 21 | 0.13.0 | Set by MetaStore |

| 26 | 0.13.0 | Set by MetaStore |

| 31 | 0.13.0 | Set by MetaStore |

+--------+----------------+------------------+

7 rows in set (0.01 sec)

 

 

The fix is :

delete from version where ver_id<>1;

2. To avoid Hive metastore implicitly inserting into VERSION table, please set hive.metastore.schema.verification to true in hive-site.xml on all the nodes which will connect to this metastore database.

1 person found this helpful

Attachments

    Outcomes