
What level of IOPS can MapR provide given basic assumptions?

Question asked by jacques on May 4, 2012
Latest reply on May 4, 2012 by srivas
Lots of PR and benchmarks are available showing how well MapR performs for streaming reads. The presentations from last year's Hadoop Summit put MapR at somewhere around 75-85% of "bare" performance for streaming reads. I'm curious how well MapR performs for random read workloads. Does MapR keep overhead at this same level for random reads?

While YCSB testing and workloads are a useful indicator of this, the results are substantially affected by the particular application being used. Have you published any benchmarks of pure random read performance at the file system level? I could see two key numbers for this: one that includes metadata lookup, and a steady-state number that covers only the seeks after block locations have been acquired.
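To make the two numbers concrete, here is a minimal sketch of how I would measure them against any mounted file system. The function name `random_read_iops`, the 4 KiB block size, and the read count are my own assumptions, not anything from MapR. Reopening the file on every read is only a rough stand-in for paying the metadata lookup each time, and the page cache is not bypassed, so the test file needs to be much larger than RAM for the results to mean anything.

```python
import os
import random
import time


def random_read_iops(path, block_size=4096, num_reads=1000, seed=0):
    """Return (iops_with_open, iops_steady_state) for random block reads.

    Hypothetical helper: block_size and num_reads are illustrative defaults.
    """
    rng = random.Random(seed)
    num_blocks = os.path.getsize(path) // block_size
    if num_blocks == 0:
        raise ValueError("file is smaller than one block")
    # Block-aligned random offsets, reused for both measurements.
    offsets = [rng.randrange(num_blocks) * block_size for _ in range(num_reads)]

    # Number 1: open + read + close per request, approximating a
    # metadata lookup on every read.
    start = time.perf_counter()
    for off in offsets:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.pread(fd, block_size, off)
        finally:
            os.close(fd)
    with_open = num_reads / max(time.perf_counter() - start, 1e-9)

    # Number 2: steady state, the file is opened once and only the
    # positioned reads are timed.
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for off in offsets:
            os.pread(fd, block_size, off)
        steady = num_reads / max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)

    return with_open, steady
```

Calling `random_read_iops("/path/to/large/file")` returns both rates as reads per second. `os.pread` is only available on Unix-like systems.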

I'm sure the short answer would be that MapR performs better than HDFS. What is the longer answer when comparing to traditional IOPS benchmarks? Can a 1000-drive MapR cluster beat a 1000-drive EMC VNX 7500?

Of course certain reasonable assumptions would need to be made.  Among those would probably be: the data being read has locality, using standard 7200rpm drives, compression and striping disabled, etc.
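For reference, the "bare" ceiling under these assumptions can be estimated with back-of-envelope arithmetic. The sketch below assumes an average seek time of 8.5 ms, which is my own ballpark for a 7200rpm drive and not a measured figure. It ignores transfer time, caching, and command queuing, and the function names are hypothetical.

```python
def hdd_random_iops(rpm=7200, avg_seek_ms=8.5):
    """Rough random-read IOPS for one spinning drive.

    avg_seek_ms is an assumed figure, not a vendor specification.
    """
    # On average the platter must turn half a revolution before the
    # target sector passes under the head.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


def cluster_random_iops(num_drives, rpm=7200, avg_seek_ms=8.5):
    """Aggregate ceiling if every drive serves reads independently."""
    return num_drives * hdd_random_iops(rpm, avg_seek_ms)


per_drive = hdd_random_iops()          # about 79 IOPS per drive
cluster = cluster_random_iops(1000)    # about 79,000 IOPS for 1000 drives
```

Any file system overhead for random reads would then show up as the fraction of this ceiling that is actually delivered.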

An extension of this question would be whether or not the overhead scales well (e.g. using a cluster of SSDs).