
MapR Streams disk space utilization

Question asked by ani.desh1512 on Jul 18, 2017
Latest reply on Jul 20, 2017 by maprcommunity

Hi,
Here's the scenario:

  1. The MapR version is 5.2.1.
  2. Initially, each of the 5 nodes in our MapR cluster had 3 disks of 1 TB each.
  3. We created a topic called /apps/channel_data/stream:content (18 partitions, lz4 compression, ttl 0).
  4. This topic is inside a specific volume called “channel_data”, which does NOT have a disk quota and has a replication factor of three.
  5. We then started producing to this topic at almost 3 million posts per minute.
  6. We then realized that we would in fact need extra space, and hence added a 3 TB disk to ALL the nodes of our cluster.
    1. This was done by: maprcli disk add -disks /dev/xvdg -host <ip_address>

  7. After some time, this is what we noticed on one of the nodes:
    1. The data in our volume was as follows:
    2. The cluster utilization is as follows:
  8. This is when we started getting the error “No space left on device (28) null” while posting to that topic.
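
For reference, here is a rough back-of-the-envelope estimate of the logical capacity implied by the numbers above. It is only a sketch: the helper `usable_tb` is a hypothetical name, and it assumes no gain from lz4 compression, no filesystem or metadata overhead, and a perfectly even spread of data across all disks, none of which is guaranteed on a real cluster.

```python
# Rough capacity estimate from the scenario above.
# Assumptions (illustrative only): no compression gain, no filesystem
# overhead, and data spread perfectly evenly across every disk.

NODES = 5          # nodes in the cluster (item 2)
REPLICATION = 3    # replication factor of the channel_data volume (item 4)


def usable_tb(disks_tb_per_node, nodes=NODES, replication=REPLICATION):
    """Raw capacity across the cluster divided by the replication factor."""
    raw_tb = nodes * sum(disks_tb_per_node)
    return raw_tb / replication


before = usable_tb([1, 1, 1])       # three 1 TB disks per node
after = usable_tb([1, 1, 1, 3])     # plus the added 3 TB disk per node

print(f"usable before adding disks: {before:.1f} TB")
print(f"usable after adding disks:  {after:.1f} TB")
```

Under these assumptions the added disks should roughly double the logical capacity, which is why the "No space left on device" error after adding them was surprising.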

 

So, my question is: is this the expected behavior? Is there a way to repartition the stream so that the newly added disks get used as much as possible and data is distributed uniformly across all disks? Or did we add the disks incorrectly?
