What you'll need:
- A MapR sandbox or cluster running v5.1.0 or later
- A functional yum/apt repo for installing MapR ecosystem packages
First, a little background. MapR Streams is a publish/subscribe messaging system that works much like Apache Kafka. It exposes the same API, which means that programs written to the Kafka 0.9.0 API will work with MapR Streams after only a small configuration change. MapR also ships some of the same command-line tools that come with Apache Kafka, and we can use those tools to publish and consume our first messages with very little effort.
So, if you've got a sandbox or cluster ready, the first thing you'll need to do is install the mapr-kafka package, which provides a number of Kafka utilities. The two we'll be using are kafka-console-producer and kafka-console-consumer. Together they let you produce and consume messages from a shell prompt with no coding needed.
Install mapr-kafka on a RHEL/CentOS machine:
yum install mapr-kafka
Or, on an Ubuntu machine:
apt-get install mapr-kafka
A stream in MapR is a container for topics. Streams live in the file system namespace right alongside files and tables, so they are created with path names, just like files. Let's create a stream now. On a cluster node, run:
maprcli stream create -path /tmp/myfirststream
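If you'd like to confirm the stream is there before moving on, something like the following may help. It's a minimal sketch: the helper name stream_exists is made up for illustration, and it assumes maprcli is on your PATH and exits non-zero when the lookup fails. It relies on maprcli stream info, which prints the properties of the stream at a given path.

```shell
#!/bin/sh
# Hypothetical helper: report whether a stream exists at the given path.
# Assumes maprcli is on PATH and exits non-zero when the lookup fails.
STREAM_PATH="/tmp/myfirststream"

stream_exists() {
  maprcli stream info -path "$1" >/dev/null 2>&1
}

if ! command -v maprcli >/dev/null 2>&1; then
  STATUS="maprcli not found; run this on a cluster node"
elif stream_exists "$STREAM_PATH"; then
  STATUS="stream $STREAM_PATH is ready"
else
  STATUS="stream $STREAM_PATH not found"
fi
echo "$STATUS"
```

Running maprcli stream info -path /tmp/myfirststream by hand and reading the output works just as well.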
Now, open up two terminal windows. You can connect to the same node of the cluster, or different nodes; the following will work either way.
In the first window we'll run our producer, which will send messages into a topic on our just-created stream. The topic will be created automatically. The full topic name is the stream path, then a ':', then the topic name:
/opt/mapr/kafka/kafka-0.9.0/bin/kafka-console-producer.sh --topic /tmp/myfirststream:topic1 --broker-list this.will.be.ignored:9092
In the second window, start the consumer:
/opt/mapr/kafka/kafka-0.9.0/bin/kafka-console-consumer.sh --topic /tmp/myfirststream:topic1 --new-consumer --bootstrap-server this.will.be.ignored:9092
Note the --broker-list and --bootstrap-server arguments. Because these programs come from the Apache Kafka project, and because Apache Kafka has a notion of a broker listening on a TCP port, they still require us to specify something for those arguments. In MapR Streams, the MapR client library takes care of figuring out how to connect to the topic, so you don't have to provide this information. Since the programs require the options to be set, we set them to something (anything with a hostname and port number) knowing that the value will be ignored by the MapR Streams client implementation.
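If you'd rather not retype those long command lines, a small sketch like this builds both of them from a few variables. The variable names are my own, and the install path is simply the one used in this post, so adjust it if your Kafka tools live elsewhere. The script only prints the commands; copy each one into its own terminal to run it.

```shell
#!/bin/sh
# Sketch: assemble the producer and consumer command lines from variables.
KAFKA_BIN="/opt/mapr/kafka/kafka-0.9.0/bin"   # install path used in this post
STREAM="/tmp/myfirststream"
TOPIC="topic1"
# Placeholder only: required by the tools, ignored by the MapR Streams client.
PLACEHOLDER_BROKER="this.will.be.ignored:9092"

# The full topic name is "<stream path>:<topic name>".
FULL_TOPIC="${STREAM}:${TOPIC}"

PRODUCER_CMD="${KAFKA_BIN}/kafka-console-producer.sh --topic ${FULL_TOPIC} --broker-list ${PLACEHOLDER_BROKER}"
CONSUMER_CMD="${KAFKA_BIN}/kafka-console-consumer.sh --topic ${FULL_TOPIC} --new-consumer --bootstrap-server ${PLACEHOLDER_BROKER}"

echo "$PRODUCER_CMD"
echo "$CONSUMER_CMD"
```

Keeping the stream path and topic name in separate variables makes it easy to point both tools at a second topic on the same stream.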
The producer will start running and will not exit; it's waiting for input. Type whatever you like and press Enter. Shortly afterward, you should see your text appear in the window where you ran the console consumer.
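Because the console producer reads its messages from standard input, one line per message, you can also pipe text into it instead of typing. Here is a rough sketch of that idea; make_messages is a made-up helper, and the script falls back to just printing the lines if the producer isn't installed at the path used in this post.

```shell
#!/bin/sh
# Sketch: feed a few lines to the console producer without typing them.
PRODUCER="/opt/mapr/kafka/kafka-0.9.0/bin/kafka-console-producer.sh"
TOPIC="/tmp/myfirststream:topic1"

# Hypothetical helper: print N numbered test messages, one per line.
make_messages() {
  i=1
  while [ "$i" -le "$1" ]; do
    echo "test message $i"
    i=$((i + 1))
  done
}

if [ -x "$PRODUCER" ]; then
  # Each line on stdin becomes one message on the topic.
  make_messages 3 | "$PRODUCER" --topic "$TOPIC" --broker-list this.will.be.ignored:9092
else
  echo "producer script not found at $PRODUCER; printing messages instead"
  make_messages 3
fi
```

With a consumer already running in the other window, the three test messages should show up there once the producer has sent them.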
You just produced and consumed your first message using MapR Streams! And you didn't have to write any code.
From here, you can try some of the following:
- If you run the producer and consumer on the same node, try starting one or the other on a different node.
- Try running another consumer on a second node, then produce some more messages. What happens?