Quickstart for connecting Kafka to Spark Streaming on a MapR cluster

Document created by Hao Zhu on Aug 30, 2016. Last modified by Hao Zhu on Aug 30, 2016.
Version 3

Goal:

This is a quick start for using Kafka as the source for Spark Streaming on a MapR cluster.

Note: This only illustrates the scenario in which Spark Streaming reads its input from Kafka. Kafka itself is not officially supported by MapR.

Env:

MapR 5.1 with the following packages installed:

mapr-kafka-0.9.0.201602181842-1.noarch

mapr-spark-1.5.2.201602261506-1.noarch

Solution:

1. Modify the ZooKeeper port in /opt/mapr/kafka/kafka-0.9.0/config/zookeeper.properties

clientPort=5181

Note: 5181 is the default ZooKeeper port on a MapR cluster.

2. Modify the ZooKeeper connection string in /opt/mapr/kafka/kafka-0.9.0/config/server.properties

zookeeper.connect=localhost:5181
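
Steps 1 and 2 can also be scripted. The following is a minimal sketch, not part of Kafka or MapR: "set_prop" is a hypothetical helper, and it is exercised here on scratch copies that assume the stock Kafka defaults (port 2181) rather than on the real files under /opt/mapr/kafka/kafka-0.9.0/config.

```shell
#!/bin/sh
# set_prop FILE KEY VALUE
# Hypothetical helper: rewrite an existing "KEY=..." line, or append one if missing.
set_prop() {
    file=$1; key=$2; value=$3
    if grep -q "^${key}=" "$file"; then
        # Write to a temporary file and move it back, which avoids
        # relying on the non-portable "sed -i".
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Scratch copies standing in for the real config files (assumed stock defaults).
workdir=$(mktemp -d)
printf 'dataDir=/tmp/zookeeper\nclientPort=2181\n' > "$workdir/zookeeper.properties"
printf 'broker.id=0\nzookeeper.connect=localhost:2181\n' > "$workdir/server.properties"

set_prop "$workdir/zookeeper.properties" clientPort 5181
set_prop "$workdir/server.properties" zookeeper.connect localhost:5181

# Show the lines that now point at the MapR ZooKeeper port.
grep '5181' "$workdir/zookeeper.properties" "$workdir/server.properties"
```

To apply it to the real files, back them up first and pass their paths instead of the scratch copies.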

3. Start the Kafka server

cd /opt/mapr/kafka/kafka-0.9.0

./bin/kafka-server-start.sh ./config/server.properties

Note: This command runs in the foreground and holds the terminal. You can also run it inside "screen" or in the background.

4. Create a new topic named "mytopic" in Kafka

cd /opt/mapr/kafka/kafka-0.9.0

./bin/kafka-topics.sh --create --zookeeper localhost:5181 --replication-factor 1 --partitions 1 --topic mytopic
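
To confirm that the topic exists before moving on, kafka-topics.sh can list and describe it. A small sketch, assuming the same install path and ZooKeeper address used in this document; the KAFKA_HOME and ZK variables and the guard around the calls are illustrative, so on a machine without Kafka the snippet only prints a message.

```shell
#!/bin/sh
# Illustrative variables; adjust to your installation.
KAFKA_HOME=${KAFKA_HOME:-/opt/mapr/kafka/kafka-0.9.0}
ZK=localhost:5181

if [ -x "$KAFKA_HOME/bin/kafka-topics.sh" ]; then
    # List all topics known to this ZooKeeper, then show details for "mytopic".
    "$KAFKA_HOME/bin/kafka-topics.sh" --list --zookeeper "$ZK"
    "$KAFKA_HOME/bin/kafka-topics.sh" --describe --zookeeper "$ZK" --topic mytopic
    status=checked
else
    echo "kafka-topics.sh not found under $KAFKA_HOME; skipping check" >&2
    status=skipped
fi
```

The describe output should report one partition and a replication factor of 1, matching the create command.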

5. Start the Kafka console producer in one screen

cd /opt/mapr/kafka/kafka-0.9.0

./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic mytopic

6. Start the Spark Streaming sample job "KafkaWordCount" in a new screen

cd /opt/mapr/spark/spark-1.5.2

MASTER=yarn-client ./bin/run-example org.apache.spark.examples.streaming.KafkaWordCount localhost:5181 mygroup mytopic 4
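
The four trailing arguments follow the example's usage line, KafkaWordCount <zkQuorum> <group> <topics> <numThreads>: the ZooKeeper quorum, the consumer group name, a comma-separated list of topics, and the number of consumer threads per topic. The sketch below only names them and prints the resulting command as a dry run; the variable names are illustrative.

```shell
#!/bin/sh
ZK_QUORUM=localhost:5181   # ZooKeeper quorum (host:port, comma-separated if several)
GROUP=mygroup              # Kafka consumer group name
TOPICS=mytopic             # comma-separated topic list
NUM_THREADS=4              # consumer threads per topic

cmd="MASTER=yarn-client ./bin/run-example org.apache.spark.examples.streaming.KafkaWordCount $ZK_QUORUM $GROUP $TOPICS $NUM_THREADS"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Run the printed line from /opt/mapr/spark/spark-1.5.2 to launch the job.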

7. Type some words into the Kafka producer screen

test

abc

123

done

8. Watch the output in the screen running the Spark Streaming sample job

-------------------------------------------
Time: 1472588988000 ms
-------------------------------------------
(test,1)

-------------------------------------------
Time: 1472588990000 ms
-------------------------------------------
(abc,1)
(test,1)

-------------------------------------------
Time: 1472588992000 ms
-------------------------------------------
(abc,1)
(test,1)

-------------------------------------------
Time: 1472588994000 ms
-------------------------------------------
(123,1)
(abc,1)
(test,1)

-------------------------------------------
Time: 1472588996000 ms
-------------------------------------------
(123,1)
(abc,1)
(test,1)

-------------------------------------------
Time: 1472588998000 ms
-------------------------------------------
(done,1)
(123,1)
(abc,1)
(test,1)
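
Each word keeps reappearing in later batches because, assuming the stock KafkaWordCount example is unchanged, it counts over a sliding window (10 minutes long, updated every 2 seconds) rather than per batch; the 2000 ms gap between the timestamps above matches that interval. As a rough local cross-check of the final batch, the same four words can be counted with standard tools. This approximates only the end result, not Spark's streaming logic, and the ordering differs from Spark's.

```shell
#!/bin/sh
# Count each distinct word, then print it in Spark's (word,count) form.
# LC_ALL=C pins the sort order so the output is reproducible.
counts=$(printf 'test\nabc\n123\ndone\n' |
    LC_ALL=C sort | uniq -c | awk '{ print "(" $2 "," $1 ")" }')
echo "$counts"
```

Every word appears once, which agrees with the last batch shown above.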
