
Problems with Spark 1.6.1-1607

Question asked by jamesrgrinter on Aug 5, 2016
Latest reply on Sep 14, 2016 by asukhenko

I tried upgrading mapr-spark in a copy of the 5.1.0 GA sandbox to the 2016/07 release. I also updated mapr-kafka to the corresponding 2016/07 update, per the Spark 1.6.1-1607 Release Notes.

 

[root@maprdemo ~]# rpm -qa | egrep 'spark|kafka'

mapr-spark-1.6.1.201607242143-1.noarch

mapr-kafka-0.9.0.201607141833-1.noarch

mapr-spark-historyserver-1.6.1.201607242143-1.noarch

 

After this update, my previously working copy of Carol McDonald's mapr-streams-sparkstreaming-hbase example no longer works.

 

These updates made extensive changes to the Spark KafkaRDD class and its methods. It now throws an exception (a failed Scala assert) instead of waiting until more data comes in:

 

[user01@maprdemo ~]$ spark-submit --class solution.SensorStreamConsumer --master local[2] ms-sparkstreaming-1.0.jar

16/08/05 10:36:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

start streaming

[Stage 1:>                                                          (0 + 2) / 3]16/08/05 10:36:23 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)

java.lang.AssertionError: assertion failed: Failed to get records for /user/user01/pump:sensor)

|0 1012443 after polling for 1000

  at scala.Predef$.assert(Predef.scala:179)

  at org.apache.spark.streaming.kafka.v09.KafkaRDD$KafkaRDDIterator.fetchBatch(KafkaRDD.scala:203)

  at org.apache.spark.streaming.kafka.v09.KafkaRDD$KafkaRDDIterator.hasNext(KafkaRDD.scala:173)

...
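
To illustrate the difference in behaviour I mean, here is a minimal sketch in plain Scala. It is not the MapR or Spark code: `PollSketch`, `Poll`, `fetchStrict` and `fetchTolerant` are hypothetical names invented for illustration. The only thing taken from the trace above is that the failure comes from `scala.Predef.assert` after a poll with a timeout.

```scala
// Hypothetical sketch: none of these names come from the MapR or Spark sources.
object PollSketch {
  // Stand-in for a consumer poll: returns whatever records arrived within the timeout.
  type Poll = Long => Seq[String]

  // Strict variant: an empty poll trips scala.Predef.assert, which raises
  // java.lang.AssertionError with an "assertion failed: ..." message.
  def fetchStrict(poll: Poll, timeoutMs: Long): Seq[String] = {
    val records = poll(timeoutMs)
    assert(records.nonEmpty, s"Failed to get records after polling for $timeoutMs")
    records
  }

  // Tolerant variant: polls up to maxAttempts times and returns an empty
  // batch instead of failing when nothing has arrived.
  def fetchTolerant(poll: Poll, timeoutMs: Long, maxAttempts: Int): Seq[String] = {
    var attempt = 0
    var records = Seq.empty[String]
    while (records.isEmpty && attempt < maxAttempts) {
      records = poll(timeoutMs)
      attempt += 1
    }
    records
  }

  def main(args: Array[String]): Unit = {
    // A poll that never returns data, as when the topic is idle.
    val emptyPoll: Poll = _ => Seq.empty

    println(fetchTolerant(emptyPoll, 1000L, 3))

    try {
      fetchStrict(emptyPoll, 1000L)
      ()
    } catch {
      case e: AssertionError => println(e.getMessage)
    }
  }
}
```

What I saw before the update looked like the tolerant variant, and what I see now looks like the strict one, although I have not confirmed how the updated KafkaRDD is actually implemented.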

 

Has anyone else applied the 2016/07 updates? If so, is Spark Streaming still working for you, or do you see the same problem?
