
How to Create Instant MapR Clusters with Docker

Blog Post created by maprcommunity Employee on Dec 19, 2016


By Mitra Kaseebhotla

 

Here at MapR, developer productivity is critical to us. To keep our pace of innovation high and give customers more choice and flexibility in Apache Hadoop and the other open source projects we ship with the MapR Distribution for Hadoop, we apply DevOps methodologies as widely as we can. One critical piece of this is rapidly testing our builds to ensure quality in the codebase. Automation is key here: it is what allows us to integrate the latest community innovations across multiple releases in our Hadoop distribution. For example, we test and support Hadoop 2.7 with Drill 1.1 and Hive 1.0, Hadoop 2.6 with Drill 1.2 and Spark 1.3.1, and so on. For customers supporting 50 or more applications on a single MapR cluster, the many combinations possible within the MapR Distribution let them upgrade applications incrementally, saving considerable time and money.

To deliver this fast pace of innovation, we’ve been using Docker extensively. Rather than using physical servers or VMs to provision this multitude of test clusters, we build and maintain Docker images of MapR that can be provisioned on demand. This has reduced the deployment time of a test cluster from hours to seconds!

In this post, we will share the tools and methodology we use to create these Dockerized MapR clusters. We expect that you’ll find these useful as well, both to learn MapR and to test out new applications.

Goals:

  • Create a multi-node MapR cluster.
  • The cluster nodes need to be accessible outside the host running the containers.
  • Launch clusters of different sizes.
  • Use real disks to achieve realistic performance.

 

Requirements:

  • Server running CentOS/RHEL 7.x with 16GB+ RAM
  • Docker 1.6.0+
  • sshpass installed
  • Free, unmounted physical disks to be attached to the MapR node containers
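
The software requirements above can be checked before going further. The following is a minimal sketch, not part of the original tooling: the function names `version_ge` and `check_requirements` are hypothetical, and the version comparison assumes GNU `sort -V` is available, as it is on CentOS/RHEL 7.

```shell
# version_ge A B: succeed when version A >= version B (assumes GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_requirements: report missing tools or a Docker older than 1.6.0.
# The body runs in a subshell so its variables do not leak into the caller.
check_requirements() (
    status=0
    for cmd in docker sshpass; do
        command -v "$cmd" >/dev/null 2>&1 || { echo "missing: $cmd" >&2; status=1; }
    done
    if command -v docker >/dev/null 2>&1; then
        # `docker --version` prints a line such as "Docker version 1.6.0, build ..."
        ver=$(docker --version | sed -n 's/^Docker version \([0-9.]*\).*/\1/p')
        version_ge "$ver" 1.6.0 || { echo "Docker $ver is older than 1.6.0" >&2; status=1; }
    fi
    exit "$status"
)

# Usage: check_requirements && echo "host looks ready"
```

The RAM and free-disk requirements are left out of this sketch; they are easier to confirm by hand with `free` and `lsblk`.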

 

Network Set-up:

While working toward these goals, networking turned out to be one of the critical pieces. The containers (cluster nodes) need to be routable, that is, reachable from outside the host, and we did not want a complex network setup.

Step 1: Set up a routable bridge interface (e.g., br0).

Here is an example configuration on a CentOS 7.0 server:

    # cat /etc/sysconfig/network-scripts/ifcfg-br0
    DEVICE="br0"
    ONBOOT=yes
    IPV6INIT=no
    BOOTPROTO=static
    TYPE=Bridge
    NAME="br0"
    IPADDR=10.10.101.135
    NETMASK=255.255.255.0
    GATEWAY=10.10.101.1
    #

    # cat /etc/sysconfig/network-scripts/ifcfg-enp4s0
    DEVICE="enp4s0"
    ONBOOT=yes
    IPV6INIT=no
    BOOTPROTO=none
    HWADDR="0c:c4:7a:58:7d:19"
    TYPE=Ethernet
    NAME="enp4s0"
    BRIDGE=br0
    #

Step 2: Get a free range of routable IP addresses from your network admin for the containers, in the same VLAN as the bridge IP address.
E.g., we got 10.10.101.16/29, which gives the usable IPs 10.10.101.17 to 10.10.101.22 for the containers.
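
To see where the 10.10.101.17 to 10.10.101.22 range comes from: a /29 block holds 8 addresses, and the first (network) and last (broadcast) are not usable by hosts. The sketch below works this out for any CIDR block. The helper names `ip_to_int`, `int_to_ip` and `usable_range` are made up for this illustration; the code sticks to POSIX shell arithmetic.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first and last usable host address of a CIDR block,
# excluding the network and broadcast addresses.
usable_range() (
    all_ones=4294967295
    prefix=${1#*/}
    mask=$(( (all_ones << (32 - prefix)) & all_ones ))
    network=$(( $(ip_to_int "${1%/*}") & mask ))
    broadcast=$(( network | (~mask & all_ones) ))
    echo "$(int_to_ip $(( network + 1 ))) $(int_to_ip $(( broadcast - 1 )))"
)

usable_range 10.10.101.16/29   # prints: 10.10.101.17 10.10.101.22
```

Six usable addresses means this block can back a cluster of at most six containers.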

Docker configuration:
Configure Docker with the following options:

     -b=bridge-inf --fixed-cidr=x.x.x.x/mask

E.g.:

     -b=br0 --fixed-cidr=10.10.101.16/29

This gives the containers routable IP addresses from the range obtained in Step 2.
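
Docker expects the `--fixed-cidr` range to be a subset of the bridge's own subnet, so it is worth checking the range against the bridge address before restarting the daemon. The following is a rough sketch of such a check using the example values from this post; `ip_to_int` and `cidr_in_subnet` are hypothetical helper names, not part of Docker or of the launch script.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# cidr_in_subnet CIDR BRIDGE_IP NETMASK
# Succeeds when the first and last address of CIDR both fall inside
# the subnet defined by BRIDGE_IP and NETMASK.
cidr_in_subnet() (
    cidr=$1 bridge_ip=$2 netmask=$3
    all_ones=4294967295
    prefix=${cidr#*/}
    cidr_mask=$(( (all_ones << (32 - prefix)) & all_ones ))
    first=$(( $(ip_to_int "${cidr%/*}") & cidr_mask ))
    last=$(( first | (~cidr_mask & all_ones) ))
    mask=$(ip_to_int "$netmask")
    subnet=$(( $(ip_to_int "$bridge_ip") & mask ))
    [ $(( first & mask )) -eq "$subnet" ] && [ $(( last & mask )) -eq "$subnet" ]
)

# Values from the br0 example and Step 2 above.
if cidr_in_subnet 10.10.101.16/29 10.10.101.135 255.255.255.0; then
    echo "-b=br0 --fixed-cidr=10.10.101.16/29"
else
    echo "fixed-cidr range is outside the bridge subnet" >&2
fi
```

Where the options are set depends on how Docker was installed on the host, so that part is left out here.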

Disks for the Containers:
Each container requires one disk drive or partition to be used for MapR.
Generate a list of disks and put them in a text file, one per line.

E.g.:

    # cat /tmp/disklist.txt
    /dev/sdb
    /dev/sdc
    /dev/sdd
    /dev/sde
    /dev/sdf

If the text file lists more disks than the number of containers requested, the remaining disks are added to the first container.
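
The disk allocation rule can be previewed before launching anything. This sketch only illustrates the rule as described above and is not taken from launch-cluster.sh: the function name `assign_disks`, the `nodeN` labels, and the assumption that disks are handed out in file order are all hypothetical.

```shell
# assign_disks DISKLIST_FILE NODES
# Print one line per container showing which disks it would receive:
# one disk each in file order, with any extra disks going to the first.
assign_disks() {
    awk -v nodes="$2" '
        NF { disks[++n] = $1 }              # collect non-empty lines
        END {
            if (n < nodes) {
                print "need " nodes " disks, found " n > "/dev/stderr"
                exit 1
            }
            line = disks[1]
            for (i = nodes + 1; i <= n; i++) line = line " " disks[i]
            print "node1: " line
            for (i = 2; i <= nodes; i++) print "node" i ": " disks[i]
        }' "$1"
}

# Usage: assign_disks /tmp/disklist.txt 4
```

With the five-disk example file and four containers, this would show the first container holding /dev/sdb and /dev/sdf, and the others holding one disk each.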

Download and run the launch-cluster.sh script for your MapR version (4.0.2, 4.1.0, or 5.0.0).

Usage: ./launch-cluster.sh ClusterName NumberOfNodes MemSize-in-kB Path-to-DisklistFile

E.g.:

    # ./launch-cluster.sh demo 4 16384000 /tmp/disklist.txt
    Control Node IP : 10.10.101.21
    Starting the cluster: https://10.10.101.21:8443/    login:mapr   password:mapr
    Data Nodes : 10.10.101.22,10.10.101.17,10.10.101.18
    #

Launch the MapR management console using the control node IP from the output of the example above: https://10.10.101.21:8443
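
If the cluster launch is part of a larger automated test run, the node addresses can be pulled out of the script output instead of being copied by hand. The sketch below assumes the output contains the "Control Node IP :" and "Data Nodes :" labels exactly as in the example above; the function names are hypothetical, and `grep -o` is assumed to be available (it is in GNU grep).

```shell
# Read launch-cluster.sh output on stdin and print the control node IP.
control_node_ip() {
    grep -o 'Control Node IP *: *[0-9.]*' | grep -o '[0-9.][0-9.]*$'
}

# Read launch-cluster.sh output on stdin and print the data node IPs,
# one per line.
data_node_ips() {
    grep -o 'Data Nodes *: *[0-9.,]*' | grep -o '[0-9.,][0-9.,]*$' | tr ',' '\n'
}

# Usage (saving the output first so it can be parsed more than once):
#   ./launch-cluster.sh demo 4 16384000 /tmp/disklist.txt > /tmp/launch.out
#   echo "MCS: https://$(control_node_ip < /tmp/launch.out):8443"
```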

In this blog post, you’ve learned how to create instant MapR clusters with Docker. If you have any further questions, please ask them in the comments section below.

Are you interested in reading more about working with Docker and MapR? Read the blog post My Experience with Running Docker Containers on Mesos.

 

 


Content originally posted on the MapR Converge Blog.
