
Is it possible to read most recent 'n' messages from a MapR Stream Topic

Question asked by rajdevireddy on Aug 6, 2017
Latest reply on Aug 8, 2017 by rajdevireddy

Hi all,

 

I have been trying to write a consumer that reads and returns the most recent 'n' messages from a stream topic. I believe I had it working a couple of days ago, but it no longer works. Here is what I did (roughly as in the sketch after the list):

 

1. Create a consumer and assign it all partitions from my topic (I have 4 partitions) 

2. Then for each partition call consumer.seekToEnd and then call consumer.position and note that offset.

3. Then compute (end-offset - 25) (this is where I have questions) and set that number as the new offset using consumer.seek on that partition.

4. Do this on all 4 partitions.

5. Then call poll. 
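
For reference, here is a trimmed-down sketch of what that code looks like. The stream/topic path and group id are placeholders, and the exact seekToEnd signature depends on the client version (varargs TopicPartition in older 0.9-era clients, a Collection in newer Kafka clients):

import java.util.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;

public class LastNReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        // bootstrap.servers is only needed for plain Kafka; with MapR Streams the
        // stream path in the topic name identifies the cluster.
        props.put("group.id", "last-n-reader");                       // placeholder
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // 1. Assign all 4 partitions of the topic explicitly (no subscribe/rebalance).
        String topic = "/my-stream:my-topic";                         // placeholder
        List<TopicPartition> partitions = new ArrayList<>();
        for (int p = 0; p < 4; p++) {
            partitions.add(new TopicPartition(topic, p));
        }
        consumer.assign(partitions);

        // 2-4. For each partition: seek to the end, note the end offset,
        //      then seek back 25 messages.
        for (TopicPartition tp : partitions) {
            consumer.seekToEnd(tp);                // varargs form of the older API
            long endOffset = consumer.position(tp);
            long target = Math.max(0, endOffset - 25);
            consumer.seek(tp, target);
        }

        // 5. Poll once for the records between the rewound position and the end.
        ConsumerRecords<String, String> records = consumer.poll(1000);
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("partition=%d offset=%d value=%s%n",
                              record.partition(), record.offset(), record.value());
        }
        consumer.close();
    }
}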

 

This worked a couple of days back, but now the poll returns an empty ConsumerRecords result set. Data is being continuously written to the stream topic I am interested in (and to other topics/streams), so the end-offset has probably changed compared to the earlier test. My suspicion now is that offsets are not as simple a concept as I assumed and I am probably doing this wrong. I read this blog post

High Speed Kafka API Publish Subscribe Streaming Architecture: How it works at the message level | MapR 

and it says this about offsets: "Each message is given an offset, which is a sequentially numbered ID". Given this, is there any way for me to know that if the end-offset for a stream-topic partition at this moment is "2114042", what the offset of, say, the 10th message going backwards from that position will be? This "2114042" is definitely not the message count on the partition, as there are not that many messages on the topic. My idea of just subtracting 25 from each partition's last offset and then calling poll is returning empty ConsumerRecords. Anything else I can try to make this work?
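
One thing I am planning to sanity-check is whether (end-offset - 25) could land before the earliest offset that is still retained on the partition (e.g. because of the stream's TTL), since seeking to an offset that no longer exists might explain the empty result. Something along these lines, reusing the consumer and assignment from the sketch above and using seekToBeginning/position since I am not sure my client version has beginningOffsets():

// Find the earliest and latest offsets still available on one partition,
// then clamp the rewind target so it never falls before the earliest one.
static long clampedRewind(KafkaConsumer<String, String> consumer,
                          TopicPartition tp, long n) {
    consumer.seekToBeginning(tp);          // varargs form of the older API
    long earliest = consumer.position(tp); // first offset still retained

    consumer.seekToEnd(tp);
    long latest = consumer.position(tp);   // offset of the next message to arrive

    long target = Math.max(earliest, latest - n);
    consumer.seek(tp, target);
    System.out.printf("partition %d: earliest=%d latest=%d seeking to %d%n",
                      tp.partition(), earliest, latest, target);
    return target;
}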

 

Thanks in advance...
