MapR cluster instances on Amazon EMR

Question asked by elleg on Jul 3, 2014
Latest reply on Jul 7, 2014 by elleg

We are trying to estimate (as a very rough ballpark) the cluster costs if we were to run MapR on Amazon EMR. I see the following cluster recommendation in your docs:

    Standard 100TB Rack Configuration

    20 standard nodes
    (20 x 12 x 2 TB storage; 3x replication, 25% margin)
    48-port 1 Gb/s rack-top switch with 4 x 10Gb/s uplink
    Add second switch if each node uses 4 network interfaces

    To grow the cluster, just add more nodes and racks, adding additional service instances as needed. MapR rebalances the cluster automatically.
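
For reference, here is how I read the storage line in that recommendation, as a quick sketch. The interpretation of "25% margin" (a quarter of the post-replication capacity held back as headroom) is my assumption, not something the docs spell out, and the function name is just for illustration.

```python
def cluster_storage_tb(nodes, disks_per_node, tb_per_disk, replication, margin):
    """Return (raw_tb, usable_tb) for a cluster.

    Assumption: 'margin' is the fraction of post-replication capacity
    reserved as headroom. The docs may define it differently.
    """
    raw_tb = nodes * disks_per_node * tb_per_disk
    after_replication_tb = raw_tb / replication
    usable_tb = after_replication_tb * (1 - margin)
    return raw_tb, usable_tb


# Figures from the quoted recommendation: 20 nodes x 12 disks x 2 TB,
# 3x replication, 25% margin.
raw_tb, usable_tb = cluster_storage_tb(20, 12, 2, 3, 0.25)
print(f"raw: {raw_tb} TB, usable: {usable_tb:.0f} TB")
```

Under that reading, the raw capacity is 480 TB and the usable capacity is about 120 TB, which is in the same range as the "100TB" label. This is the number we would try to match with EC2 instance storage.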

However, I am unable to map this to the instance types Amazon lists on [this][1] page.

1) They list "Standard On-Demand Instances", and your docs mention "standard" nodes. Are these the same? In terms of actual EC2 instance types, is that an m1, an m3, or something else? There is also a "High-Storage On-Demand Instances" category in that list. Should we be using something like those for storage instead? The storage specs of the standard and high-storage instances seem quite different, so we are not sure which one is recommended.

2) When Amazon talks about the MapR + EMR cost, does the EMR cost apply only to things like the instances running MapReduce jobs?

Thanks for your help!