How Does MapR-DB Compare to Cassandra, MongoDB, and other NoSQL Databases?
MapR-DB has some clear advantages over other NoSQL databases on the market, including tight analytics integration, native JSON support, granular access controls, and performance and scalability.
First, MapR-DB is integrated with a full stack of analytics (Hadoop and Spark), so you can store operational data in MapR-DB and run analytics on that data in real time, since all data resides in the same location. There’s no need to copy the data to a separate analytics cluster, as is required with other NoSQL databases. That copying process introduces significant latency before the data can be analyzed, and it precludes real-time analytics. Some NoSQL databases claim tight Hadoop and Spark integration, but in most production environments they still require separate database and analytics clusters.
Second, MapR-DB has native JSON document support. This means that MapR-DB reads and writes JSON data at the element level, so if you request a single element within a JSON document, only that element is read from disk and transported over the wire to the application. Some NoSQL databases handle this properly, while others are actually key-value stores with complex JSON processing built into the API. In that kind of architecture the whole document has to be fetched and written back even when only one element changes, which is problematic for performance as well as for concurrency, since there’s a risk of accidentally overwriting another client’s updates. To resolve the concurrency issue, complex application code must be written to ensure there are no accidental overwrites, which puts the burden on the application developer. Cassandra claims JSON support, but what’s offered is a layer/schema that must be predefined and cannot be updated on the fly, thus negating the flexible, schemaless advantages of having a JSON document database.
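To make the overwrite risk concrete, here is a minimal sketch in Python. It does not use the MapR-DB API or any real database client; `WholeDocumentStore`, `ElementLevelStore`, and the sample document are hypothetical, in-memory stand-ins invented purely for illustration. The first demo shows how two clients doing read-modify-write on a whole document can lose an update, and the second shows why element-level writes avoid that particular problem.

```python
import copy


class WholeDocumentStore:
    """Toy key-value store (hypothetical): every read and write moves the whole document."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        # Return a copy, as a client would receive over the wire.
        return copy.deepcopy(self._docs[key])

    def put(self, key, doc):
        # Replace the stored document wholesale.
        self._docs[key] = copy.deepcopy(doc)


class ElementLevelStore(WholeDocumentStore):
    """Toy store (hypothetical) that can also update a single element in place."""

    def set_field(self, key, field, value):
        self._docs[key][field] = value


def lost_update_demo():
    store = WholeDocumentStore()
    store.put("user1", {"email": "old@example.com", "city": "Oslo"})

    # Two clients read the same version of the document.
    client_a = store.get("user1")
    client_b = store.get("user1")

    # Each client changes a different element.
    client_a["email"] = "new@example.com"
    client_b["city"] = "Bergen"

    # Both write the whole document back; the second write wins.
    store.put("user1", client_a)
    store.put("user1", client_b)  # silently discards client A's email change
    return store.get("user1")


def element_update_demo():
    store = ElementLevelStore()
    store.put("user1", {"email": "old@example.com", "city": "Oslo"})

    # Each client touches only the element it cares about.
    store.set_field("user1", "email", "new@example.com")
    store.set_field("user1", "city", "Bergen")
    return store.get("user1")


if __name__ == "__main__":
    print(lost_update_demo())
    print(element_update_demo())
```

In the first demo the email change is lost unless the application adds its own version checks or locking, which is the extra application code described above. In the second, both changes survive because neither write touches the other element.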
Third, MapR-DB provides access controls at the element level. Other NoSQL databases (with the exception of MarkLogic) only allow users to set permissions at the document level. This means that if you want to store “public” information with “private” information (like a social security number), you need to create two separate records for a given person. For example, you would create one record to store a person’s address and have looser permissions on that record, but then have a separate record for that person that includes their social security number. This pattern adds operational and application development complexity.
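As a rough illustration of the difference, the sketch below keeps public and private elements in one record and filters them per reader. It is not MapR-DB's permission syntax or API; the `visible_fields` helper, the role names, and the sample record are all assumptions made up for this example.

```python
def visible_fields(document, field_permissions, user_roles):
    """Return only the elements the user may read (hypothetical helper).

    An element with no permission entry is treated as unreadable.
    """
    return {
        field: value
        for field, value in document.items()
        if field_permissions.get(field, set()) & user_roles
    }


# One record holds both the "public" and the "private" information.
person = {"name": "Ada", "address": "1 Main St", "ssn": "000-00-0000"}

# Illustrative per-element read permissions: which roles may read each element.
permissions = {
    "name": {"public", "hr"},
    "address": {"public", "hr"},
    "ssn": {"hr"},
}

if __name__ == "__main__":
    print(visible_fields(person, permissions, {"public"}))
    print(visible_fields(person, permissions, {"hr"}))
```

With document-level permissions only, the same separation would need the two-record pattern described above, with the application joining the records for readers who are allowed to see both.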
Finally, MapR-DB is faster and more scalable. Internal benchmarks using the YCSB benchmark tests have shown MapR-DB to be faster than Cassandra, HBase, and MongoDB. At the same time, MapR-DB provides strong consistency so that applications will always get the correct answer, unlike the “eventual consistency” databases, which provide speed benefits at the cost of data accuracy.
For a recent independent comparison between HBase, Cassandra, and MapR-DB, see here:
Analyzing the Performance of MapR-DB, a NoSQL Database in the MapR Converged Data Platform | MapR
The net outcome was that MapR-DB outperformed both HBase and Cassandra by a substantial margin, in terms of both throughput and 90th-percentile latency.
Here, for instance, is the latency result (smaller is better):
Interesting article about MapR DB and its superior performance
Thank you for sharing. In the community, we are holding an "Ask Us Anything MapR-DB" from 10/2 to 10/6/17. Please see