
What is the point of Spark Streaming back-pressure?

Question asked by john.humphreys on Sep 6, 2017
Latest reply on Sep 20, 2017 by john.humphreys

This follows up on another question of mine, "Spark Streaming - Takes more than max records (MapR Streams)", where Spark Streaming with MapR sometimes seems to breach or ignore spark.streaming.kafka.maxRatePerPartition:

What is the point of Spark back-pressure compared to spark.streaming.kafka.maxRatePerPartition? If we have a maximum rate per partition, we can calculate the maximum load per batch and tune our job to handle it. The existence of back-pressure makes me wonder whether spark.streaming.kafka.maxRatePerPartition is unreliable.
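To make the "maximum load per batch" reasoning concrete, here is a minimal sketch of the arithmetic. It assumes spark.streaming.kafka.maxRatePerPartition is read as records per second per partition. The function name and the example numbers (rate, partition count, batch interval) are hypothetical and not taken from my actual job.

```python
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_seconds):
    """Upper bound on records one batch should contain if the rate cap is honored.

    max_rate_per_partition: records per second per partition (assumed meaning of
        spark.streaming.kafka.maxRatePerPartition)
    num_partitions: number of topic partitions being consumed
    batch_interval_seconds: streaming batch interval in seconds
    """
    return max_rate_per_partition * num_partitions * batch_interval_seconds


# Hypothetical values for illustration only.
cap = max_records_per_batch(
    max_rate_per_partition=1000,
    num_partitions=8,
    batch_interval_seconds=10,
)
print(cap)  # 80000
```

If a batch ever contains more records than this number, the cap is apparently not being enforced, which is the behavior described in the linked question.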

 

Can anyone explain why back-pressure exists and how it relates to spark.streaming.kafka.maxRatePerPartition?
