
Issue with JDBC connection and Hiveserver

Question asked by dimamah on Jul 30, 2013
I'm running tens of Hiveservers on different ports in the cluster, and users connect to them over JDBC, either from Java or from DBVisualizer. 
Every once in a while, an attempt to connect to a specific Hiveserver throws an error: 
`FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask`  
 
Looking in the Hiveserver logs, two types of errors seem to occur: 
  
`FAILED: Error in metadata: javax.jdo.JDOObjectNotFoundException: No such database row
FailedObject:70747[OID]org.apache.hadoop.hive.metastore.model.MSerDeInfo
NestedThrowables: org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
`

OR

`FAILED: Error in metadata: javax.jdo.JDOObjectNotFoundException: No such database row
FailedObject:178940[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor
NestedThrowables: org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask`

In addition, I have also seen these errors when issuing a `DROP TABLE` command. 
I suspected the local metastore (running on MySQL), so to investigate further
I put a sniffer on the Hive->Metastore connection. While those errors occur, the following SQL statements are executed:   
`SELECT A0.NAME,A0.SLIB FROM SERDES A0 WHERE A0.SERDE_ID = 70747` 
AND 
`SELECT A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.LOCATION,A0.NUM_BUCKETS,A0.OUTPUT_FORMAT FROM SDS A0 WHERE A0.SD_ID = 178940` 

These correspond to the errors above. 
The connection seems obvious: in the first case, Hive tries to get information on the SERDE with ID 70747, fails (because that SERDE really doesn't exist in the table), and so returns the `No such database row` error. 
The same thing happens with the StorageDescriptor. 
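
One way to narrow this down is to check whether the metastore database itself holds dangling references, or whether the stale IDs exist only inside the Hiveserver process. The following is a minimal sketch under stated assumptions, not a definitive procedure: it uses an in-memory SQLite database as a stand-in for the MySQL metastore, models only a few columns, and fills it with made-up rows. `SERDES.SERDE_ID` and `SDS.SD_ID` appear in the sniffed queries above; the foreign-key columns `SDS.SERDE_ID` and `TBLS.SD_ID` are assumptions that should be verified against the actual metastore schema version, and `find_dangling` is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL metastore (assumption: only the
# columns needed for the check are modelled; the real tables have more).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SERDES (SERDE_ID INTEGER PRIMARY KEY, NAME TEXT, SLIB TEXT);
CREATE TABLE SDS    (SD_ID INTEGER PRIMARY KEY, SERDE_ID INTEGER, LOCATION TEXT);
CREATE TABLE TBLS   (TBL_ID INTEGER PRIMARY KEY, SD_ID INTEGER, TBL_NAME TEXT);

-- Made-up rows for illustration only.
INSERT INTO SERDES VALUES (1, NULL, 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe');
INSERT INTO SDS    VALUES (10, 1, 'hdfs:///a');   -- consistent
INSERT INTO SDS    VALUES (11, 2, 'hdfs:///b');   -- SERDE 2 does not exist
INSERT INTO TBLS   VALUES (100, 10, 't_ok');
INSERT INTO TBLS   VALUES (101, 12, 't_dangling'); -- SD 12 does not exist
""")

# Storage descriptors that reference a SERDE row that is missing.
ORPHAN_SDS_SQL = """
SELECT s.SD_ID, s.SERDE_ID
FROM SDS s LEFT JOIN SERDES d ON d.SERDE_ID = s.SERDE_ID
WHERE s.SERDE_ID IS NOT NULL AND d.SERDE_ID IS NULL
ORDER BY s.SD_ID
"""

# Tables that reference a storage descriptor row that is missing.
ORPHAN_TBLS_SQL = """
SELECT t.TBL_ID, t.TBL_NAME, t.SD_ID
FROM TBLS t LEFT JOIN SDS s ON s.SD_ID = t.SD_ID
WHERE t.SD_ID IS NOT NULL AND s.SD_ID IS NULL
ORDER BY t.TBL_ID
"""

def find_dangling(connection, sql):
    """Run one of the orphan checks and return the offending rows."""
    return connection.execute(sql).fetchall()

orphan_sds = find_dangling(conn, ORPHAN_SDS_SQL)
orphan_tbls = find_dangling(conn, ORPHAN_TBLS_SQL)
print("SDS rows with missing SERDE:", orphan_sds)
print("TBLS rows with missing SD:  ", orphan_tbls)
```

If both queries came back empty on the real metastore, that would suggest the database is internally consistent and the IDs are held somewhere in the Hiveserver process; rows in either result would instead point at references to deleted rows in the metastore itself.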

My questions are:

 1. Why does the Hiveserver try to run those queries against the metastore?
 2. Where does it get those IDs from? (This is the first query against the metastore!)
 3. How can I investigate the situation further?
 4. Could it be connected to some caching in the Hiveserver, perhaps leftovers from previous sessions?

Currently, to work around this problem I have to kill the specific Hiveserver process and restart it. 
I also haven't managed to reproduce the problem (so I don't know what's causing it), but it seems to happen every few days.
