How to enable High Availability on Spark with Zookeeper

Document created by sumesh_kurup on Feb 8, 2016

Author: Sumesh Kurup

 

Original Publication Date: December 1, 2014

 

A Spark Standalone cluster uses a master-slave architecture, so a single standalone master is a single point of failure. To remedy this, we will use one of the available HA options: standby Masters with ZooKeeper.

 

ZooKeeper provides the leader-election mechanism: you can configure multiple masters in the cluster for HA purposes, all connected to the same ZooKeeper instance. One master instance takes the role of the active master and the others remain in standby mode. If the current master dies, ZooKeeper elects one of the standby instances as the new Master, which recovers the old Master's state and then resumes scheduling. The process usually takes around 1-2 minutes. Because the recovery state, including Worker, Driver and Application information, has been persisted in ZooKeeper, this delay only affects the scheduling of new jobs; jobs that are already running are not impacted.

 

Configuration

First we need an established ZooKeeper cluster. Start the Spark Master on multiple nodes and ensure that these nodes share the same ZooKeeper configuration (ZooKeeper URL and directory). Masters can be added or removed at any time. If a failover of the active master occurs, the newly elected Master contacts all previously registered Applications and Workers to inform them that a new master has been elected, and those Applications and Workers then register with the newly elected Master.
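Because any of the configured masters may be the active one at a given moment, workers and applications are usually given the full list of masters rather than a single address. In stock Spark standalone mode this takes the form spark://host1:port1,host2:port2. The following is a minimal sketch of building such a URL. The helper name build_master_url and the MASTER_PORT variable are hypothetical, and port 7077 is the stock Spark standalone default, assumed here to also apply to this install.

```shell
# Hypothetical helper: join master hostnames into one multi-master URL.
# Assumes every master listens on the same port (7077 is the stock default).
# The body runs in a subshell so its variables do not leak to the caller.
build_master_url() (
  port="${MASTER_PORT:-7077}"
  url=""
  for host in "$@"; do
    # Append "host:port", comma-separated after the first entry.
    url="${url:+${url},}${host}:${port}"
  done
  printf 'spark://%s\n' "$url"
)

# The two masters used in the test scenario of this article.
build_master_url n1a n2a
```

The resulting string could then be passed as the --master value to spark-shell or spark-submit, so that the application can locate whichever master is currently active.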

 

In order to enable this recovery mode, set SPARK_DAEMON_JAVA_OPTS in /opt/mapr/spark/spark-<version>/conf/spark-env.sh using this configuration:

| System property | Meaning |
| spark.deploy.recoveryMode | Set to ZOOKEEPER to enable standby Master recovery mode (default: NONE). |
| spark.deploy.zookeeper.url | The ZooKeeper cluster URL (e.g., n1a:5181,n2a:5181,n3a:5181). |
| spark.deploy.zookeeper.dir | The directory in ZooKeeper to store recovery state (default: /spark). This setting is optional. |

For example:

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=n1a:5181,n2a:5181,n3a:5181 -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.sasl.client=false"
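The same setting can be assembled piece by piece, which makes it easier to see which option is which and to add the optional recovery directory only when it is needed. This is a sketch only: the variable names ZK_URL, ZK_DIR and OPTS are hypothetical, and the values are the ones from the example above.

```shell
# Hypothetical variables; the values come from the example in this article.
ZK_URL="n1a:5181,n2a:5181,n3a:5181"
ZK_DIR=""   # leave empty to keep the default recovery directory (/spark)

OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER"
OPTS="$OPTS -Dspark.deploy.zookeeper.url=$ZK_URL"

# Add the optional recovery directory only when one has been set.
if [ -n "$ZK_DIR" ]; then
  OPTS="$OPTS -Dspark.deploy.zookeeper.dir=$ZK_DIR"
fi

OPTS="$OPTS -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf"
OPTS="$OPTS -Dzookeeper.sasl.client=false"

export SPARK_DAEMON_JAVA_OPTS="$OPTS"
echo "$SPARK_DAEMON_JAVA_OPTS"
```

With ZK_DIR left empty, the exported value is the same set of options as the single-line form shown above.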

 

Test Scenario

In our test environment we have configured two masters - n1a is the current master and n2a is the standby master.

  1. Ensure that the ZooKeeper cluster is up and running. This can be verified by running the following command on every ZooKeeper node to confirm that all nodes are active in the quorum:
$ service mapr-zookeeper qstatus

  2. Start the Spark Master service on all the master nodes as follows, if Warden has not already started it:

$ maprcli node services -name spark-master -action start -nodes `hostname -f`
This command must be run on every node where the Spark Master is configured. The same action can be performed from the MCS: click "SparkMaster" under "Services" on the right-hand side of the MCS page to load the list of nodes where "SparkMaster" is configured. Click each node name in turn; the "Manage Node Services" section on the subsequent tab offers options to stop, start or restart the services.

 

  3. Once the services are started, the active Spark Master is elected via ZooKeeper. In our setup the host "n1a" was elected as Master.

  4. Proceed to start the Spark workers. The workers can be started by running the following on the active master:

$ /opt/mapr/spark/spark-1.1.0/sbin/start-slaves.sh 
10.10.70.181: starting org.apache.spark.deploy.worker.Worker, logging to /opt/mapr/spark/spark-1.1.0/logs/spark-mapr-org.apache.spark.deploy.worker.Worker-1-n4a.mycluster2.com.out
10.10.70.180: starting org.apache.spark.deploy.worker.Worker, logging to /opt/mapr/spark/spark-1.1.0/logs/spark-mapr-org.apache.spark.deploy.worker.Worker-1-n3a.mycluster2.com.out

The master and standby configuration in the test setup is shown below:

Current Master on n1a

doc-1237 image.png

 

Current Standby Master on n2a

doc-1237 image 2.png

  5. To verify that failover works as expected, shut down the active master by running the following command on the master node:

$ maprcli node services -name spark-master -action stop -nodes `hostname -f`

  6. The standby master becomes the active master, and after a couple of minutes the workers re-register with the new master. For example:

 

Current Active Master on n2a

 

doc-1237 image 3.png
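The failover check in the last two steps can also be done from the command line instead of the web UI. The sketch below assumes that the master web UI exposes its state as JSON at /json/ with a "status" field (ALIVE or STANDBY), as stock Spark standalone masters do, and that the UI listens on port 8080. Both are assumptions that may not hold on a given MapR install. The function names master_status and wait_for_alive, the UI_PORT variable and the sample JSON are hypothetical.

```shell
# Hypothetical helper: read master JSON on stdin and print its "status" value.
master_status() {
  # Strip whitespace so the pattern does not depend on JSON formatting.
  tr -d ' \n\t' | sed -n 's/.*"status":"\([A-Z_]*\)".*/\1/p'
}

# Hypothetical helper: poll a master until it reports ALIVE.
# Usage: wait_for_alive <host> [attempts]   (10 seconds between attempts)
wait_for_alive() {
  host="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # Assumed endpoint: http://<host>:8080/json/ on the master web UI.
    status=$(curl -s "http://${host}:${UI_PORT:-8080}/json/" | master_status)
    if [ "$status" = "ALIVE" ]; then
      echo "$host is now the active master"
      return 0
    fi
    tries=$((tries - 1))
    sleep 10
  done
  echo "$host did not become active in time" >&2
  return 1
}

# Offline demonstration with made-up sample JSON (not real cluster output).
sample_json='{ "url" : "spark://n2a:7077", "status" : "ALIVE" }'
printf '%s\n' "$sample_json" | master_status
```

After stopping the active master on n1a, running wait_for_alive n2a on any node that can reach the standby's web UI would block until n2a reports itself as active, or give up after the configured number of attempts.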

 

Reference: https://spark.apache.org/docs/0.9.0/spark-standalone.html#standby-masters-with-zookeeper
