Node Alarm Heartbeat Processing Slow

Question asked by thatguy on Apr 30, 2014
Latest reply on Dec 7, 2016 by mufeed
Greetings.  I have a 7-node, single-user cluster that keeps tripping "Node Alarm Heartbeat Processing Slow" alarms during queries.

I found this doc page:
http://doc.mapr.com/display/MapR/Alarms+Reference#AlarmsReference-HBRegionHeartbeatProcessingSlow

That page says, in part:
If this alarm occurs frequently, investigate what might be causing the relevant node or nodes to be busy, or whether the CLDB nodes have enough resources to handle their load.
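
For anyone else reading: one rough way to check whether a node is "busy" in the sense the doc describes is to compare its load average against its core count. This is only a minimal sketch, assuming a Linux node with Python available. The function names and the 1.0 threshold are my own illustration, not anything from the MapR docs, and a high load average does not by itself explain the alarm.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average normalized by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def is_busy(load_avg: float, cores: int, threshold: float = 1.0) -> bool:
    """Hypothetical rule of thumb: busy if load per core exceeds threshold."""
    return load_per_core(load_avg, cores) > threshold


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"1-min load {one_min:.2f} on {cores} cores, busy={is_busy(one_min, cores)}")
```

Run on each node while a query is in flight, this would at least show whether the nodes raising the alarm are CPU-saturated; it says nothing about disk or network contention.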

Each node has 128GB RAM and 24 cores, with 10 spinning disks for data.  I originally noticed the alarms while inserting data, but now I'm seeing them during queries as well.  The latest instance was a simple Hive query: select * from table where field = 'y' or field = 'z';

I'm using M3 version 3.1.0.23703.GA if that matters.

I'm the only one running queries on this cluster, and I only run one query at a time.

Oddly, the only node that doesn't show this alarm is the one running all of these services:
cldb
fileserver
metrics
tasktracker
zookeeper

I'm at a loss for where to look to diagnose this.  Any ideas?

Thanks,
thatguy
