
Failed to create and mount local mapreduce volume

Question asked by seamus on Jul 11, 2017
Latest reply on Jul 20, 2017 by mufeed

Hi all,

   I added a new node, d3, to my cluster. The MapR version is 5.2.0.

   I set up the disk by running /opt/mapr/server/disksetup -F /opt/mapr/conf/disks.txt, which succeeded. The disksetup command created a disktab file containing: /dev/sdb1 BB06B2A4-34DB-74A2-6186-028FE0645900.

   The node's disk:


[root@d3 conf]# fdisk -l

Disk /dev/sdb: 13837.8 GB, 13837848084480 bytes
255 heads, 63 sectors/track, 1682355 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x561cae7b

Device Boot Start End Blocks Id System
/dev/sdb1 1 32636 262148638+ 83 Linux
/dev/sdb2 32637 65272 262148670 83 Linux
/dev/sdb3 65273 97908 262148670 5 Extended
/dev/sdb4 97909 267349 1361034832+ 83 Linux
/dev/sdb5 65273 97908 262148638+ 83 Linux

Disk /dev/sda: 161.1 GB, 161060945920 bytes
255 heads, 63 sectors/track, 19581 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x092c156c

Device Boot Start End Blocks Id System
/dev/sda1 * 1 64 512000 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 64 19582 156772352 8e Linux LVM

Disk /dev/mapper/VolGroup-lv_root: 53.7 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/VolGroup-lv_swap: 16.1 GB, 16101933056 bytes
255 heads, 63 sectors/track, 1957 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/VolGroup-lv_home: 90.7 GB, 90743767040 bytes
255 heads, 63 sectors/track, 11032 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
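
As a sanity check on the listing above, the partition sizes can be worked out from the "Blocks" column. This is only a sketch: it assumes that column is in 1 KiB units (as this older fdisk output format normally reports), and `blocks_to_gib` is a made-up helper name. Under that assumption /dev/sdb1, the only partition in the disktab, is about 250 GiB, and the data partitions together cover about 2 TiB of the 13.8 TB device.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an fdisk "Blocks" value to whole GiB,
# assuming the column is in 1 KiB units.
blocks_to_gib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Block counts copied from the /dev/sdb listing (trailing "+" dropped).
# sdb3 is the extended container that holds sdb5, so it is not counted.
total=0
for blocks in 262148638 262148670 1361034832 262148638; do
  total=$(( total + blocks ))
done

echo "sdb1: $(blocks_to_gib 262148638) GiB"
echo "sdb1+sdb2+sdb4+sdb5: $(blocks_to_gib "$total") GiB"
```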

 

[root@d3 conf]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root 50G 2.9G 44G 7% /
tmpfs 32G 0 32G 0% /dev/shm
/dev/sda1 485M 32M 428M 7% /boot
/dev/mapper/VolGroup-lv_home 84G 184M 79G 1% /home
[root@d3 conf]#

 

 

Then I executed: /opt/mapr/server/configure.sh -C d0 -Z d0,d1,d2 -RM d0,d1 -HS d2 -N mapr.earlydata.com

After that, mapr-warden started.

I get this error in createNMVolume.5000.log:


...

2017-07-11 22:30:12 INFO A new NodeManager volume will be created.
2017-07-11 22:30:12 DEBUG Will launch command "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300" with a command attempt timeout of 60 seconds a maximum of 3 attempts and a sleep time of 1 seconds between failed command attempts
2017-07-11 22:30:12 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:30:16 DEBUG Command attempt 1 failed with return code 1 after 4 seconds, sleeping for 1 seconds
2017-07-11 22:30:17 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:30:20 DEBUG Command attempt 2 failed with return code 1 after 3 seconds, sleeping for 1 seconds
2017-07-11 22:30:21 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:30:25 DEBUG Command attempt 3 failed with return code 1 after 4 seconds, sleeping for 1 seconds
2017-07-11 22:30:26 FATAL Command did not complete successfully after 3 attempts and after 14 seconds.
2017-07-11 22:30:26 INFO The command run was:
maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300

2017-07-11 22:30:26 INFO The output of the last failed command attempt:
ERROR (5) - Volume Creation Failed: Could not create root container.
2017-07-11 22:31:06 INFO This script was called with the arguments: d3.infopower.com /var/mapr/local/d3.infopower.com/mapred /var/mapr/local/d3.infopower.com/mapred/nodeManager yarn
2017-07-11 22:31:06 INFO Checking if MapRFS is online
2017-07-11 22:31:06 DEBUG Will launch command "hadoop fs -stat /" with a command attempt timeout of 60 seconds a maximum of 1000 attempts and a sleep time of 1 seconds between failed command attempts
2017-07-11 22:31:06 DEBUG Launching "hadoop fs -stat /"
2017-07-11 22:31:08 DEBUG Command attempt 1 completed successfully in 2 seconds
2017-07-11 22:31:08 DEBUG Command completed successfully after 1 attempts and after 2 seconds

...

2017-07-11 22:32:38 INFO A new NodeManager volume will be created.
2017-07-11 22:32:38 DEBUG Will launch command "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300" with a command attempt timeout of 60 seconds a maximum of 3 attempts and a sleep time of 1 seconds between failed command attempts
2017-07-11 22:32:38 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:32:41 DEBUG Command attempt 1 failed with return code 1 after 3 seconds, sleeping for 1 seconds
2017-07-11 22:32:42 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:32:47 DEBUG Command attempt 2 failed with return code 1 after 5 seconds, sleeping for 1 seconds
2017-07-11 22:32:48 DEBUG Launching "maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300"
2017-07-11 22:33:02 DEBUG Command attempt 3 failed with return code 1 after 14 seconds, sleeping for 1 seconds
2017-07-11 22:33:03 FATAL Command did not complete successfully after 3 attempts and after 25 seconds.
2017-07-11 22:33:03 INFO The command run was:
maprcli volume create -name mapr.d3.infopower.com.local.mapred -path /var/mapr/local/d3.infopower.com/mapred -replication 1 -localvolumehost d3.infopower.com -localvolumeport 5660 -shufflevolume true -rereplicationtimeoutsec 300

2017-07-11 22:33:03 INFO The output of the last failed command attempt:
ERROR (5) - Volume Creation Failed: Could not create root container.
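
For anyone reading the log, the retry pattern it describes (a maximum of 3 attempts with a 1 second sleep after each failed attempt) can be sketched roughly like this. This is not the actual MapR script: `retry_cmd` is a hypothetical name, and the 60 second per-attempt timeout mentioned in the log is left out to keep the sketch short.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from createTTVolume.sh.
# Usage: retry_cmd MAX_ATTEMPTS SLEEP_SECS COMMAND [ARGS...]
retry_cmd() {
  local max_attempts=$1 sleep_secs=$2
  shift 2
  local attempt rc=0
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
      echo "Command attempt ${attempt} completed successfully"
      return 0
    fi
    # Like the log above, sleep after every failed attempt, including the last.
    echo "Command attempt ${attempt} failed with return code ${rc}" >&2
    sleep "${sleep_secs}"
  done
  echo "Command did not complete successfully after ${max_attempts} attempts" >&2
  return "${rc}"
}
```

With this sketch, `retry_cmd 3 1 false` would try three times and return 1, which matches the shape of the three failed `maprcli volume create` attempts above.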

 

 

I also get errors in yarn-mapr-nodemanager-d3.infopower.com.log:


...

2017-07-11 22:32:20,275 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
2017-07-11 22:32:20,314 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: The Auxilurary Service named 'mapreduce_shuffle' in the configuration is for class org.apache.hadoop.mapred.ShuffleHandler which has a name of 'httpshuffle'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
2017-07-11 22:32:20,315 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Adding auxiliary service httpshuffle, "mapreduce_shuffle"
2017-07-11 22:32:20,427 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: The Auxilurary Service named 'mapr_direct_shuffle' in the configuration is for class com.mapr.hadoop.mapred.LocalVolumeAuxService which has a name of 'direct_shuffle'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
2017-07-11 22:32:20,427 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Adding auxiliary service direct_shuffle, "mapr_direct_shuffle"
2017-07-11 22:32:20,612 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@363a52f
2017-07-11 22:32:20,612 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorProcessTree : null
2017-07-11 22:32:20,612 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Physical memory check enabled: true
2017-07-11 22:32:20,612 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Virtual memory check enabled: false
2017-07-11 22:32:20,626 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager for null: physical-memory=31429 virtual-memory=66001 virtual-cores=10 disks=0.5
2017-07-11 22:32:20,687 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-07-11 22:32:20,717 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 45389
2017-07-11 22:32:20,750 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
2017-07-11 22:32:20,750 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
2017-07-11 22:32:20,751 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2017-07-11 22:32:20,751 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 45389: starting
2017-07-11 22:32:20,763 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : d3.infopower.com:45389
2017-07-11 22:32:20,902 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-07-11 22:32:20,904 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
2017-07-11 22:32:20,908 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
2017-07-11 22:32:20,909 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2017-07-11 22:32:20,909 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting

...

2017-07-11 22:32:20,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
2017-07-11 22:32:20,912 INFO com.mapr.hadoop.mapred.LocalVolumeAuxService: Checking for local volume. If volume is not present command will create and mount it. Command invoked is : /opt/mapr/server/createTTVolume.sh d3.infopower.com /var/mapr/local/d3.infopower.com/mapred /var/mapr/local/d3.infopower.com/mapred/nodeManager yarn
2017-07-11 22:33:03,415 ERROR com.mapr.hadoop.mapred.LocalVolumeAuxService: Failed to create and mount local mapreduce volume at /var/mapr/local/d3.infopower.com/mapred. Please see logs at /opt/mapr/logs/createNMVolume.log
2017-07-11 22:33:03,416 ERROR com.mapr.hadoop.mapred.LocalVolumeAuxService: Command ran /opt/mapr/server/createTTVolume.sh d3.infopower.com /var/mapr/local/d3.infopower.com/mapred /var/mapr/local/d3.infopower.com/mapred/nodeManager yarn
2017-07-11 22:33:03,416 ERROR com.mapr.hadoop.mapred.LocalVolumeAuxService: Command output
2017-07-11 22:33:03,417 INFO org.apache.hadoop.service.AbstractService: Service direct_shuffle failed in state STARTED; cause: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at com.mapr.hadoop.mapred.LocalVolumeAuxService.initVolume(LocalVolumeAuxService.java:260)
at com.mapr.hadoop.mapred.LocalVolumeAuxService.serviceStart(LocalVolumeAuxService.java:229)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:173)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:468)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:267)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:477)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
2017-07-11 22:33:03,422 INFO com.mapr.hadoop.mapred.LocalVolumeAuxService: Shutting down Volume Checker service
2017-07-11 22:33:03,422 INFO com.mapr.hadoop.mapred.LocalVolumeAuxService: Cleanup service shutdown
2017-07-11 22:33:03,422 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state STARTED; cause: org.apache.hadoop.service.ServiceStateException: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

...

org.apache.hadoop.service.ServiceStateException: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:173)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:468)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:267)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:477)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
Caused by: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at com.mapr.hadoop.mapred.LocalVolumeAuxService.initVolume(LocalVolumeAuxService.java:260)
at com.mapr.hadoop.mapred.LocalVolumeAuxService.serviceStart(LocalVolumeAuxService.java:229)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
... 10 more
2017-07-11 22:33:03,424 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state STARTED; cause: org.apache.hadoop.service.ServiceStateException: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

org.apache.hadoop.service.ServiceStateException: ExitCodeException exitCode=203: /opt/mapr/server/createTTVolume.sh: line 155: [: true: integer expression expected

at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:173)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:468)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:267)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
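
The "[: true: integer expression expected" text in the trace is the message bash prints when a numeric test such as -eq is given a non-numeric string. The snippet below is only a minimal reproduction of that message: the variable name `flag` is hypothetical, since line 155 of createTTVolume.sh is not shown here, and it does not explain why the root container could not be created.

```shell
#!/usr/bin/env bash
# Hypothetical variable; the real contents of createTTVolume.sh line 155
# are not known from the logs.
flag="true"

# A numeric test on a non-numeric string fails with
# "[: true: integer expression expected" (stderr is hidden here).
[ "$flag" -eq 1 ] 2>/dev/null
numeric_status=$?

# A string comparison handles the same value without an error.
if [ "$flag" = "true" ]; then
  string_match="yes"
else
  string_match="no"
fi

echo "numeric_status=${numeric_status} string_match=${string_match}"
```

The numeric test exits with a non-zero status rather than a clean true or false, so a script that relies on it can take an unintended branch once an earlier command has already failed.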

 

Any ideas what's wrong?

Thanks.
