
Problem reading HFile Trailer from file

Question asked by cleonn on May 2, 2013
Latest reply on May 2, 2013 by cleonn
After some upgrading and reconfiguration of the M3 cluster we're testing on, I get these errors in /opt/mapr/hbase/hbase-0.94.5/logs/hbase-root-regionserver-<node>.log:

<pre>
2013-05-02 17:25:28,985 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of region {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,} failed, marking as FAILED_OPEN in ZK
2013-05-02 17:25:29,004 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of region {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,} failed, marking as FAILED_OPEN in ZK
2013-05-02 17:25:29,004 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23e659ac1a00010 Attempt to transition the unassigned node for 1028785192 from RS_ZK_REGION_OPENING to RS_ZK_REGION_FAILED_OPEN failed, the node existed but was version 3017297 not the expected version 3017296
2013-05-02 17:25:29,005 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Unable to mark region {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,} as FAILED_OPEN. It's likely that the master already timed out this open attempt, and thus another RS already has the region.
2013-05-02 17:25:29,202 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: .META.,,1.1028785192
2013-05-02 17:25:29,251 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
2013-05-02 17:25:29,256 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store info
2013-05-02 17:25:29,271 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=.META.,,1.1028785192, starting to roll back the global memstore size.
java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file maprfs:/hbase/.META./1028785192/info/7affd224dec0493a970d3894f0e90279
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:597)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:510)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4177)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4125)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:339)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:110)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file maprfs:/hbase/.META./1028785192/info/7affd224dec0493a970d3894f0e90279
at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:433)
at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:240)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3141)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:572)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:570)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
... 3 more
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file maprfs:/hbase/.META./1028785192/info/7affd224dec0493a970d3894f0e90279
at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:545)
at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1261)
at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:512)
at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:603)
at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:409)
at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:404)
... 8 more
Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 0 (expected to be between 1 and 2)
at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:732)
at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:323)
at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:543)
... 14 more
</pre>
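
The innermost exception in the trace comes from the version check on the HFile trailer. As a minimal sketch, not HBase's own code, the following shows how that field could be inspected on a local copy of the file. It assumes the trailer ends with a 4-byte big-endian version field, with the major version in the low three bytes and the minor version in the top byte; the function names `read_trailer_version` and `is_supported` are hypothetical.

```python
import struct


def read_trailer_version(path):
    """Decode (major, minor) from the last 4 bytes of a local file.

    Assumption: the file ends with a big-endian 32-bit version field,
    major version in the low 3 bytes, minor version in the top byte.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)               # jump to the end to learn the size
        size = f.tell()
        if size < 4:
            raise ValueError("file too short to hold a version field")
        f.seek(size - 4)           # position on the final 4 bytes
        (raw,) = struct.unpack(">I", f.read(4))
    return raw & 0x00FFFFFF, raw >> 24


def is_supported(major, min_version=1, max_version=2):
    """Mirror the range named in the error message (between 1 and 2)."""
    return min_version <= major <= max_version
```

Under these assumptions, a file whose final bytes are all zero decodes to major version 0, which would be consistent with the "Invalid HFile version: 0" message and would point at a truncated or zero-filled file rather than a format mismatch.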

How do I fix the HFile? I've tried running hbck with -fix, -repair, and an assortment of other flags on the region, but to no avail:

<pre>sudo -u mapr ./bin/hbase hbck -repair .META.</pre>

Gives:

<pre>
13/05/02 17:31:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/05/02 17:31:02 INFO security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
Allow checking/fixes for table: .META.
13/05/02 17:31:02 INFO mapr.TableMappingRulesFactory: Could not instantiate TableMappingRules class, assuming HBase only cluster.
13/05/02 17:31:02 INFO mapr.TableMappingRulesFactory: 'mapr-hbase-dbclient' package is required to access MapRDB tables.
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_21
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/opt/mapr/hadoop/hadoop-0.20.2/bin/../lib/native/Linux-amd64-64::/opt/mapr/hbase/hbase-0.94.5/bin/../lib/native/Linux-amd64-64
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-41-generic
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:user.name=mapr
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/mapr
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/mapr/hbase/hbase-0.94.5
13/05/02 17:31:02 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=mapr005.data:5181,mapr003.data:5181,mapr002.data:5181,mapr001.data:5181 sessionTimeout=180000 watcher=hconnection
13/05/02 17:31:02 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 32556@mapr002
13/05/02 17:31:02 INFO zookeeper.ClientCnxn: Opening socket connection to server mapr002.data/10.21.0.22:5181. Will not attempt to authenticate using SASL (unknown error)
13/05/02 17:31:02 INFO zookeeper.ClientCnxn: Socket connection established to mapr002.data/10.21.0.22:5181, initiating session
13/05/02 17:31:02 WARN zookeeper.ClientCnxnSocket: Connected to an old server; r-o mode will be unavailable
13/05/02 17:31:02 INFO zookeeper.ClientCnxn: Session establishment complete on server mapr002.data/10.21.0.22:5181, sessionid = 0x13e659ac1970021, negotiated timeout = 40000
Version: 0.94.5-mapr
13/05/02 17:31:03 INFO util.HBaseFsck: Loading regioninfos HDFS
13/05/02 17:31:03 INFO util.HBaseFsck: Loading HBase regioninfo from HDFS...
13/05/02 17:31:03 INFO util.HBaseFsck: Checking HBase region split map from HDFS data...
13/05/02 17:31:03 INFO util.HBaseFsck: No integrity errors.  We are done with this phase. Glorious.
13/05/02 17:31:03 INFO util.HBaseFsck: Loading regionsinfo from the .META. table
Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Thu May 02 17:31:03 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:04 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:05 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:06 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:08 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:10 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:14 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:18 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:26 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
Thu May 02 17:31:42 CEST 2013, org.apache.hadoop.hbase.client.ScannerCallable@39f5329f, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1

at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:206)
at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:54)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:133)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:383)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:130)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:105)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:83)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:67)
at org.apache.hadoop.hbase.util.HBaseFsck.loadMetaEntries(HBaseFsck.java:2584)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:384)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:437)
at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3633)
at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3452)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3446)

</pre>