Author: Najmuddin Chirammal
Original Publication Date: June 26, 2015
The MapR UI (MCS) shows the NodeManager, ResourceManager, or HistoryServer as down even though the services are running on their respective nodes.
- MapR Cluster 4.0.1, 4.0.2, and 4.1.0
- Using YARN services (NodeManager, ResourceManager, HistoryServer)
- Copy the current NodeManager, HistoryServer, and ResourceManager PID files to the /opt/mapr/pid directory.
# cp -p /tmp/yarn-mapr-nodemanager.pid /tmp/yarn-mapr-resourcemanager.pid /tmp/mapred-mapr-historyserver.pid /opt/mapr/pid/
If the -p option is not used, make sure each destination file has the same permissions and ownership as its source (it should be owned by MAPR_USER).
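A node usually runs only some of the three services, so a single cp naming all three files reports errors for the ones that are absent. The following is a minimal sketch of a per-file copy; `copy_yarn_pids` is a hypothetical helper name, and its two optional arguments simply default to the source and destination directories used in the command above.

```shell
# Hypothetical helper: copy whichever YARN PID files exist on this node.
# $1 = source directory (default /tmp), $2 = destination (default /opt/mapr/pid)
copy_yarn_pids() {
    src_dir="${1:-/tmp}"
    dest_dir="${2:-/opt/mapr/pid}"
    for f in yarn-mapr-nodemanager.pid \
             yarn-mapr-resourcemanager.pid \
             mapred-mapr-historyserver.pid; do
        if [ -f "$src_dir/$f" ]; then
            # -p preserves mode, ownership (when run as root) and timestamps
            cp -p "$src_dir/$f" "$dest_dir/" && echo "copied $f"
        else
            echo "skipped $f (not present on this node)"
        fi
    done
}
```

Calling `copy_yarn_pids` with no arguments as root applies it to the default paths; ownership is preserved only when the copy runs as root or as the file owner.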
- Update the PID file location
Add or modify the following environment variables in /opt/mapr/conf/env.sh.
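For reference, the stock Hadoop 2.x daemon scripts take their PID directory from YARN_PID_DIR (yarn-daemon.sh) and HADOOP_MAPRED_PID_DIR (mr-jobhistory-daemon.sh), which matches the yarn-* and mapred-* file names copied in the previous step. Assuming the MapR-packaged scripts honor the same variables, the entries would look roughly like this sketch:

```shell
# Assumption: variable names taken from stock Hadoop 2.x daemon scripts.
# PID directory for NodeManager and ResourceManager (read by yarn-daemon.sh)
export YARN_PID_DIR=/opt/mapr/pid
# PID directory for the HistoryServer (read by mr-jobhistory-daemon.sh)
export HADOOP_MAPRED_PID_DIR=/opt/mapr/pid
```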
- Run the following commands as MAPR_USER (mapr by default) on the respective nodes to make sure the status is reported correctly.
su - mapr -c '/opt/mapr/hadoop/hadoop-2.4.1/sbin/mr-jobhistory-daemon.sh status historyserver'
su - mapr -c '/opt/mapr/hadoop/hadoop-2.4.1/sbin/yarn-daemon.sh status nodemanager'
su - mapr -c '/opt/mapr/hadoop/hadoop-2.4.1/sbin/yarn-daemon.sh status resourcemanager'
The status of each YARN service is determined by checking the PID read from the PID file that the service creates. If a PID file is removed, the check raises a false alarm and the service is listed as down. Because the YARN services store their PID files under '/tmp' by default, the issue is most often triggered by 'tmpwatch', which cleans /tmp according to the options passed to it; many Linux distributions enable a 'tmpwatch' cron job by default. Moving the PID files to the /opt/mapr/pid directory resolves the issue.
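The PID-file check described above can be illustrated with a short sketch. This is not the code MapR or Hadoop actually runs; `pid_file_status` is a hypothetical helper that only shows why a deleted PID file makes a healthy service look down.

```shell
# Hypothetical helper illustrating a PID-file based status check.
# Prints "running", "stale", or "missing" for the PID file given as $1.
pid_file_status() {
    pid_file="$1"
    if [ ! -f "$pid_file" ]; then
        # Typical false alarm: the file was cleaned out of /tmp
        echo "missing"
        return 1
    fi
    pid=$(cat "$pid_file")
    # kill -0 delivers no signal; it only tests whether the process exists
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
        return 0
    fi
    echo "stale"
    return 1
}
```

For example, `pid_file_status /opt/mapr/pid/yarn-mapr-nodemanager.pid` could be run as MAPR_USER or root. Run as a different unprivileged user, `kill -0` is refused and the sketch would report "stale" for a live process.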