
Getting Started with the MapR Command Line

Blog Post created by maprcommunity Employee on Nov 24, 2016

Getting Started with the MapR Command Line (Part I)

by Nelson Estrada

Introduction

The Apache Hadoop community has given us a great set of tools for interacting with the Hadoop Distributed File System (HDFS). These tools hide the complexity of the many machines working in the background behind one simple, easy-to-understand interface.

A great tool to get started with Hadoop is hadoop fs. The hadoop fs toolset runs a generic filesystem user client that interacts with the distributed file system much as we interact with a Unix file system, but with a much more limited set of commands. You can list, change permissions on, move, copy, and perform other operations on files and directories in the distributed file system. A comprehensive list of hadoop fs commands can be found at hadoop.apache.org.
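As a rough illustration of the operations just listed, the sketch below chains a few standard hadoop fs subcommands (-mkdir, -put, -ls, -chmod, -mv, -cat). The function name hadoop_fs_tour, the target name renamed.txt, and the directory and file passed in are placeholders chosen for this example, not anything from the original post.

```shell
# Walk through a few common hadoop fs operations on a scratch directory.
# $1 is a directory in the distributed file system, $2 is a local file to upload.
# Both are placeholders supplied by the caller.
hadoop_fs_tour() {
  dir="$1"
  file="$2"
  name="$(basename "$file")"

  hadoop fs -mkdir -p "$dir" &&                        # create the directory (and parents)
  hadoop fs -put "$file" "$dir/" &&                    # copy the local file into it
  hadoop fs -ls "$dir" &&                              # list the directory
  hadoop fs -chmod 640 "$dir/$name" &&                 # change permissions
  hadoop fs -mv "$dir/$name" "$dir/renamed.txt" &&     # rename (move) the file
  hadoop fs -cat "$dir/renamed.txt"                    # print its contents
}
```

On a machine with a configured Hadoop client, it could be called as hadoop_fs_tour /tmp/demo ./notes.txt, where both paths are hypothetical.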

However, because HDFS is an append-only file system and the hadoop fs toolset was built for HDFS, it offers no way to edit files in place with tools such as vim or nano. To edit files the way you would on Linux, MapR provides NFS access to the MapR File System, so that all Unix commands can be used on Hadoop [1]. If you are interested in understanding the importance of, and the differences between, the MapR read/write filesystem and the HDFS append-only filesystem, please read this blog post.

[1] Another great blog post explains MapR NFS.
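To make the NFS point concrete: once the file system is mounted over NFS, files appear as ordinary paths, so ordinary tools can rewrite them. The helper below is only a sketch. The function name replace_in_file is invented for this example, and it works on any regular file, which is all an NFS-mounted file looks like to the shell.

```shell
# Replace every match of a sed pattern in a file and write the result back
# to the same path, using only standard Unix tools.
replace_in_file() {
  file="$1"
  old="$2"
  new="$3"

  tmp="$(mktemp)" || return 1
  # Write the edited copy to a temp file, then overwrite the original in place.
  sed "s|$old|$new|g" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Assuming the cluster is mounted under a path such as /mapr/my.cluster.com (a hypothetical mount point; check your own NFS configuration), a call might look like replace_in_file /mapr/my.cluster.com/user/mapr/app.conf INFO DEBUG.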

Intro to MapRCLI

In addition to all the Hadoop shell commands, MapR provides a fully complementary toolset that builds on Hadoop to give you a lot more power and insight into the MapR File System. These tools are incredibly useful for administrators who operate the Hadoop cluster, as well as for developers trying to debug Hadoop applications. In this blog post (Intro to the MapR Command Line Interface), I will introduce the maprcli node list command and discuss how to use it to learn more about your cluster. In future blog posts, I will cover how to use the maprcli to work with MapR Volumes and Storage Pools, MapR Access Control Lists and much more!

MapR CLI Node List

The first thing to do in an existing cluster is to see how many nodes you are working with, which services are running in the cluster, and where those services are located. The maprcli lets you see this and a LOT more information about every node in the cluster. Try it yourself by running:

$ maprcli node list 

 

This is what the output would look like:

[mapr@ip-10-0-10-183 ~]$ maprcli node list 
IncorrectTopologyAlarm NodeNoHeartbeatAlarm numGetsInLastFifteenMinutes
bytesSent dreads MemoryAllocationAlarm ResourceManagerDown configuredservice
ServiceOozieDownNotRunningAlarm numScansInLastFifteenMinutes DiskFailureAlarm
mtotal HomeMapRFullAlarm racktopo numGetsInLastTenSeconds
numPutsInLastFifteenMinutes dtotal jt-heartbeat ServiceHttpfsDownNotRunningAlarm dwriteK
ServiceTTDownNotRunningAlarm ServiceBeeswaxDownNotRunningAlarm NodeManagerDown hostname
numGetsInLastFiveMinutes JobHistoryServerDown HighMfsMemoryAlarm health disks numGetsInLastMinute
CorePresentAlarm fs-heartbeat dreadK NodeDuplicateHostIdAlarm PamMisconfiguredAlarm LogLevelAlarm
numPutsInLastFiveMinutes MapRfs disks VersionMismatchAlarm ServiceHoststatsDownNotRunningAlarm
NodeMaprUserMismatchAlarm NodeMetricsWriteProblemAlarm davail TimeSkewAlarm numScansInLastFiveMinutes
ServiceHBMasterDownNotRunningAlarm ServiceNFSDownNotRunningAlarm numScansInLastMinute ttmapUsed id
mused cpus utilization numPutsInLastMinute rpcout ttReduceSlots ServiceFileserverDownNotRunningAlarm
ServiceCLDBDownNotRunningAlarm ttReduceUsed blockMovesOut ServiceJTDownNotRunningAlarm numPutsInLastTenSeconds
ttmapSlots dused uptime ServiceHiveDownNotRunningAlarm numResyncSlots maxContainersThreshold
MemorySwapping faileddisks HbProcessingSlow rpcin ip dwrites NodeTooManyContainersAlarm
ServiceWebserverDownNotRunningAlarm rpcs numScansInLastTenSeconds ServiceHueDownNotRunningAlarm TTLocaldirFullAlarm
blockMovesIn ServiceHs2DownNotRunningAlarm DRILLDOWNALARM RootPartitionFullAlarm ServiceHBRegionDownNotRunningAlarm
service bytesReceived healthDesc
0 0 0 2298 0 0 0
fileserver,oozie,historyserver,webserver,nodemanager,drill-bits,nfs,resourcemanager,hoststats 0
0 0 7466 0
/data/default-rack/ip-10-0-10-183.us-west-1.compute.internal 0 0 138
2 0 0 0 0
0 ip-10-0-10-183.us-west-1.compute.internal 0 0 0
0 3 0 0 0 0 0 0
0 0 3 0 0
0 0 137 0 0 0
0 0 0 5002047782745794519 7202
2 64 0 422 0 0 0
0 false 0 0 0 0 Sat
Jan 17 12:41:09 EST 1970 0 16 50000 0 0
0 192 10.0.10.183 0 0 0 0
0 0 0 false 0
0 0 0
fileserver,oozie,historyserver,webserver,nodemanager,drill-bits,nfs,resourcemanager,hoststats 3935 Healthy 0
0 0 1118 0 0 0
nodemanager,cldb,drill-bits,fileserver,nfs,hoststats 0
0 0 7466 0
/data/default-rack/ip-10-0-10-184.us-west-1.compute.internal 0 0 139
2 0 0 0 0
0 ip-10-0-10-184.us-west-1.compute.internal 0 0 0
0 3 0 0 0 0 0 0
0 0 3 0 0
0 0 138 0 0 0
0 0 0 2239863513150400624 4467
2 1 0 308 0 0 0
0 false 0 0 0 0 Sat
Jan 17 12:41:09 EST 1970 0 16 50000 0 0
0 128 10.0.10.184 0 0 0 0
0 0 0 false 0
0 0 0
nodemanager,cldb,drill-bits,fileserver,nfs,hoststats 1332 Healthy 0
0 0 1267 0 0 0
fileserver,nodemanager,drill-bits,hoststats 0
0 0 7466 0
/data/default-rack/ip-10-0-10-185.us-west-1.compute.internal 0 0 139
2 0 0 0 0
0 ip-10-0-10-185.us-west-1.compute.internal 0 0 0
2 3 0 0 0 0 0 0
0 0 3 0 0
0 0 138 1447317627228 0 0
0 0 0 6343417936365485436 2987
2 1 0 307 0 0 0
0 false 0 0 0 0 Sat
Jan 17 12:41:09 EST 1970 0 16 50000 0 0
0 128 10.0.10.185 0 0 0 0
0 0 0 false 0
0 0 0
fileserver,nodemanager,drill-bits,hoststats 673 One or
more alarms raised

 

If you tried the command, you will have seen that it prints far too much output to digest. A different approach is to have the data returned in JSON format by running:

$ maprcli node list -json 

This is what the output would look like:

[mapr@ip-10-0-10-183 ~]$ maprcli node list -json
{
"timestamp":1447349930438,
"timeofday":"2015-11-12 12:38:50.438 GMT-0500",
"status":"OK",
"total":3,
"data":[
{
"id":"5002047782745794519",
"ip":"10.0.10.183",
"hostname":"ip-10-0-10-183.us-west-1.compute.internal",
"racktopo":"/data/default-rack/ip-10-0-10-183.us-west-1.compute.internal",
"health":0,
"healthDesc":"Healthy",
"service":"fileserver,oozie,historyserver,webserver,nodemanager,drill-bits,nfs,resourcemanager,hoststats",
"configuredservice":"fileserver,oozie,historyserver,webserver,nodemanager,drill-bits,nfs,resourcemanager,hoststats",
"fs-heartbeat":0,
"jt-heartbeat":2,
"dtotal":138,
"dused":0,
"davail":137,
"rpcs":0,
"rpcin":197,
"rpcout":497,
"disks":3,
"MapRfs disks":3,
"faileddisks":0,
"dreads":0,
"dreadK":0,
"dwrites":13,
"dwriteK":136,
"cpus":2,
"utilization":74,
"uptime":"Sat Jan 17 12:41:09 EST 1970",
"mtotal":7466,
"mused":7202,
"ttmapSlots":0,
"ttmapUsed":0,
"ttReduceSlots":0,
"ttReduceUsed":0,
"bytesReceived":3868,
"bytesSent":2229,
"numResyncSlots":16,
"blockMovesOut":false,
"blockMovesIn":false,
"maxContainersThreshold":50000,
"numPutsInLastTenSeconds":0,
"numPutsInLastMinute":0,
"numPutsInLastFiveMinutes":0,
"numPutsInLastFifteenMinutes":0,
"numGetsInLastTenSeconds":0,
"numGetsInLastMinute":0,
"numGetsInLastFiveMinutes":0,
"numGetsInLastFifteenMinutes":0,
"numScansInLastTenSeconds":0,
"numScansInLastMinute":0,
"numScansInLastFiveMinutes":0,
"numScansInLastFifteenMinutes":0,
"LogLevelAlarm":0,
"ServiceCLDBDownNotRunningAlarm":0,
"ServiceFileserverDownNotRunningAlarm":0,
"ServiceJTDownNotRunningAlarm":0,
"ServiceTTDownNotRunningAlarm":0,
"ServiceHBMasterDownNotRunningAlarm":0,
"ServiceHBRegionDownNotRunningAlarm":0,
"ServiceNFSDownNotRunningAlarm":0,
"ServiceWebserverDownNotRunningAlarm":0,
"ServiceHoststatsDownNotRunningAlarm":0,
"DiskFailureAlarm":0,
"VersionMismatchAlarm":0,
"TimeSkewAlarm":0,
"HbProcessingSlow":0,
"RootPartitionFullAlarm":0,
"HomeMapRFullAlarm":0,
"CorePresentAlarm":0,
"HighMfsMemoryAlarm":0,
"PamMisconfiguredAlarm":0,
"TTLocaldirFullAlarm":0,
"NodeNoHeartbeatAlarm":0,
"NodeMaprUserMismatchAlarm":0,
"NodeDuplicateHostIdAlarm":0,
"NodeMetricsWriteProblemAlarm":0,
"NodeTooManyContainersAlarm":0,
"IncorrectTopologyAlarm":0,
"ServiceHueDownNotRunningAlarm":0,
"ServiceHttpfsDownNotRunningAlarm":0,
"ServiceBeeswaxDownNotRunningAlarm":0,
"ServiceHiveDownNotRunningAlarm":0,
"ServiceHs2DownNotRunningAlarm":0,
"ServiceOozieDownNotRunningAlarm":0,
"NodeManagerDown":0,
"MemorySwapping":0,
"DRILLDOWNALARM":0,
"MemoryAllocationAlarm":0,
"ResourceManagerDown":0,
"JobHistoryServerDown":0
},
{
"id":"2239863513150400624",
"ip":"10.0.10.184",
"hostname":"ip-10-0-10-184.us-west-1.compute.internal",
"racktopo":"/data/default-rack/ip-10-0-10-184.us-west-1.compute.internal",
"health":0,
"healthDesc":"Healthy",
"service":"nodemanager,cldb,drill-bits,fileserver,nfs,hoststats",
"configuredservice":"nodemanager,cldb,drill-bits,fileserver,nfs,hoststats",
"fs-heartbeat":0,
"jt-heartbeat":2,
"dtotal":139,
"dused":0,
"davail":138,
"rpcs":0,
"rpcin":198,
"rpcout":496,
"disks":3,
"MapRfs disks":3,
"faileddisks":0,
"dreads":0,
"dreadK":0,
"dwrites":0,
"dwriteK":0,
"cpus":2,
"utilization":1,
"uptime":"Sat Jan 17 12:41:09 EST 1970",
"mtotal":7466,
"mused":4467,
"ttmapSlots":0,
"ttmapUsed":0,
"ttReduceSlots":0,
"ttReduceUsed":0,
"bytesReceived":2510,
"bytesSent":3769,
"numResyncSlots":16,
"blockMovesOut":false,
"blockMovesIn":false,
"maxContainersThreshold":50000,
"numPutsInLastTenSeconds":0,
"numPutsInLastMinute":0,
"numPutsInLastFiveMinutes":0,
"numPutsInLastFifteenMinutes":0,
"numGetsInLastTenSeconds":0,
"numGetsInLastMinute":0,
"numGetsInLastFiveMinutes":0,
"numGetsInLastFifteenMinutes":0,
"numScansInLastTenSeconds":0,
"numScansInLastMinute":0,
"numScansInLastFiveMinutes":0,
"numScansInLastFifteenMinutes":0,
"LogLevelAlarm":0,
"ServiceCLDBDownNotRunningAlarm":0,
"ServiceFileserverDownNotRunningAlarm":0,
"ServiceJTDownNotRunningAlarm":0,
"ServiceTTDownNotRunningAlarm":0,
"ServiceHBMasterDownNotRunningAlarm":0,
"ServiceHBRegionDownNotRunningAlarm":0,
"ServiceNFSDownNotRunningAlarm":0,
"ServiceWebserverDownNotRunningAlarm":0,
"ServiceHoststatsDownNotRunningAlarm":0,
"DiskFailureAlarm":0,
"VersionMismatchAlarm":0,
"TimeSkewAlarm":0,
"HbProcessingSlow":0,
"RootPartitionFullAlarm":0,
"HomeMapRFullAlarm":0,
"CorePresentAlarm":0,
"HighMfsMemoryAlarm":0,
"PamMisconfiguredAlarm":0,
"TTLocaldirFullAlarm":0,
"NodeNoHeartbeatAlarm":0,
"NodeMaprUserMismatchAlarm":0,
"NodeDuplicateHostIdAlarm":0,
"NodeMetricsWriteProblemAlarm":0,
"NodeTooManyContainersAlarm":0,
"IncorrectTopologyAlarm":0,
"ServiceHueDownNotRunningAlarm":0,
"ServiceHttpfsDownNotRunningAlarm":0,
"ServiceBeeswaxDownNotRunningAlarm":0,
"ServiceHiveDownNotRunningAlarm":0,
"ServiceHs2DownNotRunningAlarm":0,
"ServiceOozieDownNotRunningAlarm":0,
"NodeManagerDown":0,
"MemorySwapping":0,
"DRILLDOWNALARM":0,
"MemoryAllocationAlarm":0,
"ResourceManagerDown":0,
"JobHistoryServerDown":0
},
{
"id":"6343417936365485436",
"ip":"10.0.10.185",
"hostname":"ip-10-0-10-185.us-west-1.compute.internal",
"racktopo":"/data/default-rack/ip-10-0-10-185.us-west-1.compute.internal",
"health":2,
"healthDesc":"One or more alarms raised",
"service":"fileserver,nodemanager,drill-bits,hoststats",
"configuredservice":"fileserver,nodemanager,drill-bits,hoststats",
"fs-heartbeat":0,
"jt-heartbeat":2,
"dtotal":139,
"dused":0,
"davail":138,
"rpcs":0,
"rpcin":192,
"rpcout":482,
"disks":3,
"MapRfs disks":3,
"faileddisks":0,
"dreads":0,
"dreadK":0,
"dwrites":0,
"dwriteK":0,
"cpus":2,
"utilization":11,
"uptime":"Sat Jan 17 12:41:09 EST 1970",
"mtotal":7466,
"mused":2987,
"ttmapSlots":0,
"ttmapUsed":0,
"ttReduceSlots":0,
"ttReduceUsed":0,
"bytesReceived":555,
"bytesSent":1402,
"numResyncSlots":16,
"blockMovesOut":false,
"blockMovesIn":false,
"maxContainersThreshold":50000,
"numPutsInLastTenSeconds":0,
"numPutsInLastMinute":0,
"numPutsInLastFiveMinutes":0,
"numPutsInLastFifteenMinutes":0,
"numGetsInLastTenSeconds":0,
"numGetsInLastMinute":0,
"numGetsInLastFiveMinutes":0,
"numGetsInLastFifteenMinutes":0,
"numScansInLastTenSeconds":0,
"numScansInLastMinute":0,
"numScansInLastFiveMinutes":0,
"numScansInLastFifteenMinutes":0,
"LogLevelAlarm":0,
"ServiceCLDBDownNotRunningAlarm":0,
"ServiceFileserverDownNotRunningAlarm":0,
"ServiceJTDownNotRunningAlarm":0,
"ServiceTTDownNotRunningAlarm":0,
"ServiceHBMasterDownNotRunningAlarm":0,
"ServiceHBRegionDownNotRunningAlarm":0,
"ServiceNFSDownNotRunningAlarm":0,
"ServiceWebserverDownNotRunningAlarm":0,
"ServiceHoststatsDownNotRunningAlarm":0,
"DiskFailureAlarm":0,
"VersionMismatchAlarm":0,
"TimeSkewAlarm":1447317627228,
"HbProcessingSlow":0,
"RootPartitionFullAlarm":0,
"HomeMapRFullAlarm":0,
"CorePresentAlarm":0,
"HighMfsMemoryAlarm":0,
"PamMisconfiguredAlarm":0,
"TTLocaldirFullAlarm":0,
"NodeNoHeartbeatAlarm":0,
"NodeMaprUserMismatchAlarm":0,
"NodeDuplicateHostIdAlarm":0,
"NodeMetricsWriteProblemAlarm":0,
"NodeTooManyContainersAlarm":0,
"IncorrectTopologyAlarm":0,
"ServiceHueDownNotRunningAlarm":0,
"ServiceHttpfsDownNotRunningAlarm":0,
"ServiceBeeswaxDownNotRunningAlarm":0,
"ServiceHiveDownNotRunningAlarm":0,
"ServiceHs2DownNotRunningAlarm":0,
"ServiceOozieDownNotRunningAlarm":0,
"NodeManagerDown":0,
"MemorySwapping":0,
"DRILLDOWNALARM":0,
"MemoryAllocationAlarm":0,
"ResourceManagerDown":0,
"JobHistoryServerDown":0
}
]
}
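If you only want a quick health summary out of that JSON, a small filter can help. The sketch below assumes the layout shown above, with one "key":value pair per line and "hostname" appearing before "healthDesc" within each node. The function name node_health is made up for this example. A real JSON parser such as jq would be more robust if it is installed.

```shell
# Read `maprcli node list -json` output on stdin and print
# "hostname<TAB>healthDesc" for each node.
# Assumes one "key":value pair per line, as in the sample output.
node_health() {
  awk -F'"' '
    $2 == "hostname"   { host = $4 }             # remember the current node
    $2 == "healthDesc" { print host "\t" $4 }    # emit it with its health text
  '
}
```

It would be used as maprcli node list -json | node_health.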

That looks a lot better, but unless you are doing a full audit, there is still too much information, and it is hard to find the specific things you are looking for. To home in on specific information, you can pass the "-columns" argument and specify the keys (column names) of the data that you want. The hostname and IP of each node are returned by default, but if you also want to see which services are in the cluster, try this:

$ maprcli node list -columns service

This is what the output would look like:

[mapr@ip-10-0-10-183 ~]$ maprcli node list -columns service
service hostname ip
fileserver,oozie,historyserver,webserver,nodemanager,drill-bits,nfs,resourcemanager,hoststats ip-10-0-10-183.us-west-1.compute.internal 10.0.10.183
nodemanager,cldb,drill-bits,fileserver,nfs,hoststats ip-10-0-10-184.us-west-1.compute.internal 10.0.10.184
fileserver,nodemanager,drill-bits,hoststats ip-10-0-10-185.us-west-1.compute.internal 10.0.10.185
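The table above can also be turned around to answer "which nodes run service X?". The following sketch assumes the three-column layout shown (a comma-separated service list, then hostname, then IP, separated by whitespace) and that every node reports at least one service. The function name nodes_running is invented for this example.

```shell
# Read `maprcli node list -columns service` output on stdin and print the
# hostname of every node whose service list contains the service named in $1.
nodes_running() {
  awk -v svc="$1" '
    NR > 1 {                             # skip the header row
      n = split($1, services, ",")       # first column: comma-separated services
      for (i = 1; i <= n; i++)
        if (services[i] == svc) print $2 # second column: hostname
    }
  '
}
```

For example, maprcli node list -columns service | nodes_running resourcemanager would print the hosts that need attention when the Resource Manager has to be restarted.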

 

This is incredibly useful, as we now know what we are working with in our cluster. The command told us exactly how many nodes are running, which services are running on each node, and the hostname and IP associated with each node. But what about the rest of the information, including fields you may not even know exist? To find out, list the column names of everything this command can output for each node by running:

# Print each column name from the header row on its own line, sorted
for f in $(maprcli node list | head -n 1); do echo "$f"; done | sort
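One caveat with splitting the header row on whitespace is that a column name containing a space, such as "MapRfs disks" in the output shown earlier, is broken into two words. As an alternative sketch, the keys can be read from the JSON output instead. This assumes one "key":value pair per line and a "data":[ line opening the node list, as in the sample. The function name json_column_names is made up for this example.

```shell
# Read `maprcli node list -json` output on stdin and print the distinct
# per-node key names, sorted, keeping names that contain spaces intact.
json_column_names() {
  awk -F'"' '
    $2 == "data"                   { in_data = 1; next }  # node records start here
    in_data && NF >= 3 && $3 ~ /^:/ { print $2 }          # text between the first two quotes
  ' | LC_ALL=C sort -u
}
```

It would be used as maprcli node list -json | json_column_names. LC_ALL=C is set only to make the sort order predictable across machines.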

Now that you know where each service is, you can manage each of them with the same tool and different arguments. One reason to stop and start a service is so that configuration changes take effect. For example, if you are running Spark jobs that require more memory than the default allocation, you might want to change the "yarn.scheduler.maximum-allocation-mb" property in yarn-site.xml. For YARN to pick up the new configuration, you need to restart the Resource Manager as follows:

maprcli node services -name resourcemanager -action stop -nodes <space separated RM hostnames>

Verify that the services are no longer running:

maprcli node list -columns service

And start the Resource Manager again:

maprcli node services -name resourcemanager -action start -nodes <space separated RM hostnames>

Alternatively, you can replace "stop" in the first command with "restart" to do both in a single step.
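The stop, verify, and start steps above can be wrapped in one small function. This is only a sketch: the function name restart_resourcemanager is invented here, and the hostnames are whatever you pass in. It runs exactly the three maprcli commands shown in this post, passing the hostnames after -nodes the same way they would be typed by hand.

```shell
# Stop the Resource Manager on the given nodes, show the service list so you
# can confirm it is gone, then start it again.
# Arguments: one or more Resource Manager hostnames.
restart_resourcemanager() {
  if [ "$#" -eq 0 ]; then
    echo "usage: restart_resourcemanager <hostname> [<hostname> ...]" >&2
    return 1
  fi

  maprcli node services -name resourcemanager -action stop -nodes "$@" &&
  maprcli node list -columns service &&
  maprcli node services -name resourcemanager -action start -nodes "$@"
}
```

Each step runs only if the previous one succeeded, so a failed stop does not trigger a start. The verification step just prints the table; checking it is still up to you.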

Play around with this tool. There are countless other options and arguments you can provide to this command to get lots of new information you didn’t know you could get.

In the next blog post, we will cover how to create MapR volumes, how to set volume-specific characteristics such as replication factors, quotas, and permissions, and how they can be easily used for HA and Disaster Recovery.

If you have any questions about using the MapR Command Line, please ask them in the comments section below.

Related Content

- Learn how to Install a MapR Cluster | MapR Academy

- MapR installation

- Sandbox installation

- Using the MapR Installer

 

Content Originally posted in MapR Converge Blog post, visit here
