You can think about overhead in two categories: per message and per row.
Each MapR Streams message is stored as a document in MapR-DB JSON format. This introduces about 20 bytes of overhead per message for the timestamp, field names, and so on. For large messages, the overhead is insignificant. For smaller messages (such as 200-byte messages), the overhead per message is about 10 percent.
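The per-message arithmetic can be checked with a short sketch. This is an illustration only: the 20-byte figure is the approximate value quoted above, and the function name is hypothetical, not part of any MapR API.

```python
# Approximate per-message overhead quoted in the text (timestamp, field names, etc.).
PER_MESSAGE_OVERHEAD_BYTES = 20


def per_message_overhead_ratio(message_size_bytes: int) -> float:
    """Return the per-message overhead as a fraction of the message size."""
    if message_size_bytes <= 0:
        raise ValueError("message size must be positive")
    return PER_MESSAGE_OVERHEAD_BYTES / message_size_bytes


if __name__ == "__main__":
    # Compare a small message with progressively larger ones.
    for size in (200, 2_000, 20_000):
        ratio = per_message_overhead_ratio(size)
        print(f"{size:>6} bytes: {ratio:.1%} overhead")
```

For a 200-byte message the ratio is 20 / 200, or 10 percent, which matches the figure above. The ratio shrinks in proportion as messages grow.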
Internally, MapR Streams stores messages together in a single row as long as the messages do not exceed 16K. That row is identified with a row key. The row key consists of:
Therefore, the overhead per row is 14 bytes plus the length of the topic name. For a reasonable topic name of 18 characters or less, that’s at most 32 bytes of overhead per row. Spread across a row of 64 messages, for example, it’s 0.5 bytes per message, which is not relevant unless the messages are minuscule. If the messages exceed 16K, then each message receives its own row. At that size, the 32 bytes of overhead is also insignificant.
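The per-row arithmetic can be sketched the same way. The function names here are hypothetical, and the sketch assumes the topic name costs one byte per character (true for ASCII names encoded as UTF-8). The 14-byte fixed portion and the 64-messages-per-row figure are taken from the example above.

```python
# Fixed portion of the row key, in bytes, as stated in the text.
ROW_KEY_FIXED_BYTES = 14


def row_overhead_bytes(topic_name: str) -> int:
    """Row-key overhead: fixed bytes plus the encoded topic name length.

    Assumption: the topic name is stored as UTF-8, so ASCII names cost
    one byte per character.
    """
    return ROW_KEY_FIXED_BYTES + len(topic_name.encode("utf-8"))


def row_overhead_per_message(topic_name: str, messages_per_row: int) -> float:
    """Amortize the row-key overhead over the messages sharing the row."""
    if messages_per_row < 1:
        raise ValueError("a row holds at least one message")
    return row_overhead_bytes(topic_name) / messages_per_row


if __name__ == "__main__":
    topic = "x" * 18  # placeholder 18-character topic name
    print(row_overhead_bytes(topic))            # 14 + 18 = 32
    print(row_overhead_per_message(topic, 64))  # 32 / 64 = 0.5
    print(row_overhead_per_message(topic, 1))   # large message in its own row: 32.0
```

With one message per row, the full 32 bytes applies to that message, but a message larger than 16K makes that amount negligible.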
Retrieving data ...