Why do HBase regions not recover from the in-transition state?

Question asked by mufeed on Aug 10, 2016
Latest reply on Aug 10, 2016 by mufeed

Environment:

MapR 5.0

HBase 0.98

 

Symptom:

At one point during a data load using the HBase bulk load method (LoadIncrementalHFiles), the following messages appear in the client log:

2016-07-28 07:26:05,265 - INFO [LoadIncrementalHFiles-41:RpcRetryingCaller] - Call exception, tries=30, retries=35, retryTime=470265ms, msg=row '849771775062013-14675201' on table 'abc' at region=esr,849771775062013-14675201,1469138118402.ebd358a7405ac676f9eb533d6bd29abe., hostname=hostname.domain,60020,1467496913740, seqNum=7964

2016-07-28 07:26:05,965 - INFO [LoadIncrementalHFiles-42:RpcRetryingCaller] - Call exception, tries=30, retries=35, retryTime=470509ms, msg=row '570176031062013-1467671' on table 'abc' at region=esr,570176031062013-1467671,1469058049924.4a0818fce2ec0f199935ac8bdf5ddf98., hostname=hostname.domain,60020,1467496913740, seqNum=8199
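To make these retry lines easier to correlate with the region status further down, the fields of interest can be pulled out with a small script. This is only a sketch: the regular expression is written against the two lines quoted above and is an assumption about the log layout, not an official format, and `parse_retry_line` is a hypothetical helper name.

```python
import re

# Pattern fitted to the RpcRetryingCaller lines quoted in this post (assumed layout).
RETRY_LINE = re.compile(
    r"tries=(?P<tries>\d+), retries=(?P<retries>\d+), retryTime=(?P<retry_ms>\d+)ms"
    r".*? at region=(?P<region>.+?)\.(?P<encoded>[0-9a-f]{32})\."
    r", hostname=(?P<server>[^,]+,\d+,\d+)"
)


def parse_retry_line(line):
    """Return the retry counters, encoded region name and cached server, or None."""
    m = RETRY_LINE.search(line)
    if m is None:
        return None
    return {
        "tries": int(m["tries"]),
        "retries": int(m["retries"]),
        "retry_ms": int(m["retry_ms"]),
        "encoded_region": m["encoded"],  # 32-hex suffix of the region name
        "server": m["server"],           # host,port,startcode the client is calling
    }


# First log line from the post, copied verbatim.
SAMPLE_RETRY_LINE = (
    "2016-07-28 07:26:05,265 - INFO [LoadIncrementalHFiles-41:RpcRetryingCaller] - "
    "Call exception, tries=30, retries=35, retryTime=470265ms, "
    "msg=row '849771775062013-14675201' on table 'abc' at "
    "region=esr,849771775062013-14675201,1469138118402.ebd358a7405ac676f9eb533d6bd29abe., "
    "hostname=hostname.domain,60020,1467496913740, seqNum=7964"
)

parsed_retry = parse_retry_line(SAMPLE_RETRY_LINE)
print(parsed_retry["encoded_region"], parsed_retry["server"])
```

The encoded region names extracted this way (ebd358a7405ac676f9eb533d6bd29abe and 4a0818fce2ec0f199935ac8bdf5ddf98) are the same ones reported as FAILED_OPEN in the status output later in this post.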

 

After these retries, processing is interrupted with the following exception:

Thu Jul 28 07:18:15 UTC 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@3f72c8f1, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region abc,296226516062013-14679575,1469152929191.9a6ef74e6db04f84b7e441327acb4c29. is not online on hostname.domain,60020,1469530054000
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2820) 
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4397) 
at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3418) 
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30948) 
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2095) 
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) 
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) 
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) 
at java.lang.Thread.run(Thread.java:745)

 

At that time, the cluster status was as follows:

HBase shell (status 'abc'):

45 servers, 4 dead, 33.8222 average load

MapR cluster status can be viewed using the 'maprcli dashboard info' command or the UI.

 

From status 'detailed':

4 regionsInTransition

{4a0818fce2ec0f199935ac8bdf5ddf98 state=FAILED_OPEN, ts=1469530405937, server=null}

{9a6ef74e6db04f84b7e441327acb4c29 state=FAILED_OPEN, ts=1469530382583, server=null}

{ebd358a7405ac676f9eb533d6bd29abe state=FAILED_OPEN, ts=1469530211058, server=null}

{fe5810cd3b3e19bf7b7e466a0cd7e26a state=FAILED_OPEN, ts=1469530406553, server=null}

….
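The regionsInTransition entries above can be turned into readable timestamps to see when each region entered FAILED_OPEN. The sketch below assumes that `ts` is an epoch value in milliseconds, which is how it appears here; `parse_rit` and the regular expression are hypothetical helpers fitted to the four lines quoted in this post.

```python
import re
from datetime import datetime, timezone

# One entry looks like: {<encoded name> state=<STATE>, ts=<millis>, server=<name or null>}
RIT_ENTRY = re.compile(
    r"\{(?P<encoded>[0-9a-f]{32}) state=(?P<state>\w+), "
    r"ts=(?P<ts>\d+), server=(?P<server>[^}]+)\}"
)


def parse_rit(text):
    """Return (encoded_region, state, utc_datetime) for each entry found in text."""
    entries = []
    for m in RIT_ENTRY.finditer(text):
        # Assumption: ts is milliseconds since the Unix epoch; drop the millisecond part.
        since = datetime.fromtimestamp(int(m["ts"]) // 1000, tz=timezone.utc)
        entries.append((m["encoded"], m["state"], since))
    return entries


# The four entries from the status 'detailed' output in this post.
RIT_OUTPUT = """
{4a0818fce2ec0f199935ac8bdf5ddf98 state=FAILED_OPEN, ts=1469530405937, server=null}
{9a6ef74e6db04f84b7e441327acb4c29 state=FAILED_OPEN, ts=1469530382583, server=null}
{ebd358a7405ac676f9eb533d6bd29abe state=FAILED_OPEN, ts=1469530211058, server=null}
{fe5810cd3b3e19bf7b7e466a0cd7e26a state=FAILED_OPEN, ts=1469530406553, server=null}
"""

rit_entries = parse_rit(RIT_OUTPUT)
for encoded, state, since in rit_entries:
    print(encoded, state, since.strftime("%Y-%m-%d %H:%M:%S UTC"))
```

Under that assumption, all four transitions fall on 2016-07-26 between 10:50 and 10:54 UTC, roughly two days before the client errors logged on 2016-07-28.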

4 dead servers

hostname.domain,60020,1469526502913

hostname1.domain,60020,1469650152815

hostname2.domain,60020,1469526503958

hostname3.domain,60020,1469526502932
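Each server entry above has the form host,port,startcode, where the start code is the time the region server process started, in epoch milliseconds. Comparing start codes shows whether a client is still addressing an older incarnation of a server. The sketch below illustrates this with the two server names from the retry log and the exception; `ServerName`, `parse_server_name` and `is_older_instance` are hypothetical Python helpers, not HBase APIs, and because the host names in this post are anonymized the comparison is illustrative only.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    start_code: int  # region server start time, epoch milliseconds

    @property
    def started_at(self):
        """Start code as a UTC datetime (millisecond part dropped)."""
        return datetime.fromtimestamp(self.start_code // 1000, tz=timezone.utc)


def parse_server_name(text):
    """Split 'host,port,startcode' into its three fields."""
    host, port, start_code = text.strip().rsplit(",", 2)
    return ServerName(host, int(port), int(start_code))


def is_older_instance(cached, current):
    """True if both names share host and port but cached has an earlier start code."""
    same_endpoint = (cached.host, cached.port) == (current.host, current.port)
    return same_endpoint and cached.start_code < current.start_code


# Server the client was calling, taken from the retry log lines.
cached_server = parse_server_name("hostname.domain,60020,1467496913740")
# Server that raised NotServingRegionException, taken from the exception text.
current_server = parse_server_name("hostname.domain,60020,1469530054000")

print(is_older_instance(cached_server, current_server))
print(current_server.started_at.strftime("%Y-%m-%d %H:%M:%S UTC"))
```

With these two values the function returns True, meaning the location in the retry log carries an earlier start code than the server instance named in the exception.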
