
Node Manager getting shut down by non-existent fuse client?

Question asked by jgschmitz on Jul 3, 2018
Latest reply on Jul 3, 2018 by jgschmitz

One of our NodeManagers keeps crashing. This is on a licensed 6.0.1 cluster. It is complaining about "Corruption: 3 missing files; e.g.: /tmp/hadoop-mapr/yarn-nm-recovery/yarn-nm-state/000015.sst", but these files don't exist on any other node in the cluster either, and NodeManager is running fine on all of those. Has anyone seen this issue? There is a mention of fuse in the logs, but the fuse client is not installed here. Thanks!

Tags: mapr6.0.1, nodemanager

 

Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 3 missing files; e.g.: /tmp/hadoop-mapr/yarn-nm-recovery/yarn-nm-state/000015.sst
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:158)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:196)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:476)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 3 missing files; e.g.: /tmp/hadoop-mapr/yarn-nm-recovery/yarn-nm-state/000015.sst
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2018-07-03 15:45:16,968 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
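
One way to check what the trace above is actually reporting is to pull the file count and example path out of the exception message, then look at which .sst files are really present in that directory on the affected node. The following is only a rough sketch in Python, not something from the cluster or from MapR tooling; the function names parse_corruption and inspect_state_store are made up for illustration. Note that org.fusesource.leveldbjni in the trace is the Java package name of the LevelDB JNI binding, which would explain why "fuse" shows up in the log.

```python
import os
import re

# Matches the LevelDB message format seen in the trace above.
CORRUPTION_RE = re.compile(r"Corruption: (\d+) missing files; e\.g\.: (\S+)")


def parse_corruption(log_line):
    """Return (missing_count, example_path) from a corruption message, or None."""
    match = CORRUPTION_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def inspect_state_store(example_path):
    """Report whether the example file exists and which .sst files are present."""
    store_dir = os.path.dirname(example_path)
    present = []
    if os.path.isdir(store_dir):
        present = sorted(
            name for name in os.listdir(store_dir) if name.endswith(".sst")
        )
    return {
        "store_dir": store_dir,
        "example_exists": os.path.isfile(example_path),
        "sst_files_present": present,
    }


if __name__ == "__main__":
    # Message copied from the stack trace in this post.
    line = (
        "org.fusesource.leveldbjni.internal.NativeDB$DBException: "
        "Corruption: 3 missing files; e.g.: "
        "/tmp/hadoop-mapr/yarn-nm-recovery/yarn-nm-state/000015.sst"
    )
    parsed = parse_corruption(line)
    if parsed is not None:
        count, path = parsed
        print("missing files reported:", count)
        print(inspect_state_store(path))
```

Run on the failing node, this only reads the directory listing; it does not open or modify the state store. Comparing its output with a healthy node would show whether the other nodes simply have a different set of .sst files rather than the same ones missing.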
