
NodeManager Localizer RPC cannot bind to port 8040

Question asked by reza on Mar 14, 2016
Latest reply on May 23, 2016 by zeeshan

I have a simple cluster with one control node and one data node. The web control interface shows the following alarm:

 

Can not determine if service: nodemanager is running. Check logs at: /opt/mapr/hadoop/hadoop-2.4.1/logs/

The NodeManager log shows:

# tail -50 /opt/mapr/hadoop/hadoop-2.4.1/logs/yarn-mapr-nodemanager-maprdata-virtual-machine.log
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
Caused by: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:719)
        at org.apache.hadoop.ipc.Server.bind(Server.java:421)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:563)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:2153)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:897)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:505)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:480)
        at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:742)
        at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:169)
        at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
        ... 13 more
2016-03-15 16:40:15,963 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2016-03-15 16:40:15,964 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2016-03-15 16:40:15,964 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2016-03-15 16:40:15,964 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
        at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
        at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.createServer(ResourceLocalizationService.java:284)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceStart(ResourceLocalizationService.java:264)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:300)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
Caused by: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:719)
        at org.apache.hadoop.ipc.Server.bind(Server.java:421)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:563)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:2153)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:897)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:505)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:480)
        at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:742)
        at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:169)
        at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
        ... 13 more
2016-03-15 16:40:15,967 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:

 

I checked which process was listening on port 8040, killed it, and restarted warden:

 

# lsof -i tcp:8040
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    2206 mapr  221u  IPv6  17527      0t0  TCP *:8040 (LISTEN)
# kill -9 2206
# service mapr-warden restart

 

However, I still get the same problem. Any suggestions? I have tried rebooting the machine as well.
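
One way to narrow this down is to test the bind directly, right after the kill and before warden restarts anything. The sketch below is only an illustration: `can_bind` is a hypothetical helper, not part of Hadoop or MapR. It attempts the same kind of bind on 0.0.0.0:8040 that the log reports as failing, and assumes a Linux host with Python 3.

```python
import errno
import socket


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR keeps leftover TIME_WAIT sockets from being reported
    # as "in use"; a live listener on the port still makes bind() fail.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # e.g. permission problems are a different issue
    finally:
        sock.close()


if __name__ == "__main__":
    port = 8040  # the localizer port from the BindException above
    state = "free" if can_bind(port) else "already in use"
    print(f"port {port}: {state}")
```

If the port reports as free after the kill but is in use again after the warden restart, some other service that warden starts is probably taking 8040 before the NodeManager localizer does, and running `lsof -i tcp:8040` again at that point should show which one.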
