I have a pretty standard Java app for MapR Streams.
- Multiple copies of the app can run.
- Each copy spawns a number of Kafka consumers (roughly the loop sketched after this list).
- The only topic being targeted has 16 partitions.
- It has been running for ~6 months without issue.
- I restart the app every day via cron (just to keep things clean).
- A couple of days ago, when it restarted, MapR still thought the dead copies of the application were assigned to the consumer groups.
- I used the /stream/assign/list endpoint in the REST API to verify which hosts/processes were "using" the partitions. The processes were definitely dead.
- I left it for a day hoping it would time out, and it didn't.
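For reference, each consumer is the bog-standard Kafka-consumer-API loop, roughly the sketch below. The stream path, topic, and group name are placeholders, and the wakeup/close plumbing is just the usual clean-shutdown pattern so the daily cron restart leaves the group tidily instead of dying mid-poll:

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class StreamWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        // MapR Streams resolves topics by path, so no bootstrap.servers here;
        // group.id is what the cluster tracks partition assignments against.
        props.put("group.id", "my-app-group");
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        final Thread mainThread = Thread.currentThread();
        // On SIGTERM (the cron restart), break out of poll() so close() runs.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            consumer.wakeup(); // the only thread-safe KafkaConsumer method
            try { mainThread.join(); } catch (InterruptedException ignored) { }
        }));

        try {
            // MapR topic names are "<stream path>:<topic name>".
            consumer.subscribe(Collections.singletonList("/apps/mystream:events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // ... process the record ...
                }
                consumer.commitSync();
            }
        } catch (WakeupException expected) {
            // thrown by poll() after wakeup(); this is the normal shutdown path
        } finally {
            consumer.close(); // releases this member's partition assignment
        }
    }
}
```

(I call out the close() because my understanding is that a consumer that dies without closing leaves it to the server side to expire its assignment, and that expiry is exactly what doesn't seem to be happening here.)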
What I tried
After waiting a day, I tried:
- Making new Kafka consumers and doing various operations to trigger a rebalance (no effect; see the first sketch after this list).
- Manually deleting the assignments via the internal MapR-DB table for the stream (no effect). Yes, I know: I wouldn't normally poke at the internals through the JSON table. I suspect this didn't work because a cluster-side process probably already holds the same data in memory.
- Deleting and recreating the consumer group with the same offsets (the group can't actually be recreated: any consumer that tries to use that name just hangs).
- Finally, I created a new consumer group with the same offsets and moved my app over to that, which worked fine (see the second sketch below). This is clearly not a happy solution, though.
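For concreteness, the rebalance attempts in the first bullet were along these lines: a throwaway consumer joins the stuck group and then leaves again, which should normally make the coordinator recompute the assignments (topic and group names are placeholders again):

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RebalanceProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("group.id", "my-app-group"); // the stuck group
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Joining (the first poll()) and the clean close() at the end of the
        // try-with-resources should each force a reassignment of partitions.
        try (KafkaConsumer<byte[], byte[]> probe = new KafkaConsumer<>(props)) {
            probe.subscribe(Collections.singletonList("/apps/mystream:events"));
            probe.poll(5000); // the first poll is what triggers the group join
        }
    }
}
```

In my case the dead members stayed pinned to their partitions regardless.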
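And the move to the new group was essentially this offset copy: read the committed offset for each of the 16 partitions under the old group id, then commit the same offsets under the new group id (this assumes the stuck group still lets you read its committed offsets, which it did for me):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

public class CopyGroupOffsets {
    private static final String TOPIC = "/apps/mystream:events"; // placeholder

    public static void main(String[] args) {
        // 1. Read the committed offset for every partition under the old group id.
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        try (KafkaConsumer<byte[], byte[]> old = new KafkaConsumer<>(props("old-group"))) {
            for (PartitionInfo p : old.partitionsFor(TOPIC)) {
                TopicPartition tp = new TopicPartition(TOPIC, p.partition());
                OffsetAndMetadata committed = old.committed(tp);
                if (committed != null) {
                    offsets.put(tp, committed);
                }
            }
        }

        // 2. Commit the same offsets under the new group id.
        try (KafkaConsumer<byte[], byte[]> fresh = new KafkaConsumer<>(props("new-group"))) {
            fresh.assign(new ArrayList<>(offsets.keySet()));
            fresh.commitSync(offsets);
        }
    }

    private static Properties props(String groupId) {
        Properties p = new Properties();
        p.put("group.id", groupId);
        p.put("enable.auto.commit", "false");
        p.put("key.deserializer",
              "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        p.put("value.deserializer",
              "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return p;
    }
}
```

I used assign() rather than subscribe() for the commit step so the new group needs no membership (and therefore no rebalance) before the offsets land.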
My questions:
- Has anyone seen this issue before?
- Is it possibly fixed in a newer version?
- Is there a workaround that avoids changing the consumer group?