How do we recover data if the whole cluster, with all of its nodes, has stopped working?
I'm not sure I understand the question....
Data is replicated across the cluster. By default it's 3X replication. (This is configurable, BTW.)
With 3 copies, if there is a missing or bad block, it compares the 3 and, if 2 match, a new third copy is replicated from them and the bad block is recycled (gone). If none of the copies match, then the block is bad. (I'm not sure what MapR will do then...)
(Note: It's not normal for two or all three of the copies to go bad at the same time. It can happen...)
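To make the 2-out-of-3 comparison concrete, here is a minimal sketch in Python. It is only an illustration of the majority idea described above, not how MapR actually implements it; the function names and the use of SHA-256 checksums are my own assumptions.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    """Fingerprint one copy of a block (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def pick_good_copy(replicas):
    """Compare all copies of a block and look for a strict majority.

    Returns (good_bytes, bad_indexes). If no strict majority of copies
    agree, returns (None, all_indexes), meaning the block cannot be trusted.
    """
    if not replicas:
        return None, []
    sums = [checksum(r) for r in replicas]
    winner, votes = Counter(sums).most_common(1)[0]
    if votes <= len(replicas) // 2:
        # No majority: every copy is suspect.
        return None, list(range(len(replicas)))
    good = replicas[sums.index(winner)]
    bad = [i for i, s in enumerate(sums) if s != winner]
    return good, bad


# Two copies agree, so copy 2 is the bad one and can be re-created from a good copy.
good, bad = pick_good_copy([b"abc", b"abc", b"abX"])
print(good, bad)
```

If all three copies differ, the function reports every copy as bad, which is the "block is bad" case above.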
Data can then be replicated to another cluster. Again you have the same scenario.
If your cluster crashes, and when you recover the cluster... and the data is missing... I'm not sure what you can do to recover the data.
So we are not sure how to recover the data if all 3 copies were on the same cluster and that cluster crashed?
Unfortunately, yes. Distributed file systems like MapR-FS work on the assumption that it is rare for two or three copies of the data to crash at the same time. To make this even more unlikely, you can use mirrors or other forms of backup. But in the rare event that all your data was on one cluster, and the entire cluster crashed, and you did not mirror the cluster remotely, then the data might be lost. There are ways to recover data from dead hard drives, but without more information about your specific situation I cannot begin to advise what your next steps might be.
Please mention a few ways to recover data from dead hard drives.
You can't, they are Dead.
I think that there are some companies that specialize in data recovery, but they are expensive, offer no guarantees, and may not accept your drives since the data is in MapR-FS format.
Can you describe your cluster? What happened when the drives died?
In general, if you have high-value data, you can increase the replication factor to 5X (always use an odd number of copies), and these copies will be spread across the cluster (assuming you have more than 5 data nodes).
You don't want to do this for all of your data because it gets expensive to keep a lot of copies of the data around.
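A rough way to see the cost: raw disk usage is roughly the logical data size multiplied by the replication factor. The sketch below works that out in Python; the function names and the example sizes are made up for illustration, and it ignores compression and file system overhead.

```python
def raw_storage_tb(logical_tb, replication):
    """Raw disk needed for a data set, ignoring compression and overhead."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return logical_tb * replication


def blended_raw_tb(datasets):
    """Total raw disk for a list of (logical_tb, replication) pairs."""
    return sum(raw_storage_tb(size, rep) for size, rep in datasets)


# Hypothetical cluster: 100 TB of data, of which 10 TB is high value.
everything_at_5x = raw_storage_tb(100, 5)            # 500 TB raw
only_hot_at_5x = blended_raw_tb([(90, 3), (10, 5)])  # 270 + 50 = 320 TB raw
print(everything_at_5x, only_hot_at_5x)
```

In this made-up example, keeping 5 copies of only the high-value 10 TB costs 320 TB of raw disk instead of 500 TB.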
You can also back these data sets up to other systems (HA cluster) or a SAN, or even tape. It depends on what you have on hand, the size of the data, and its value.
Adding to what Michael Segel said, if your cluster is configured with the default settings, all your data should have been replicated three times. With proper distribution of data across nodes and racks using volumes and topologies, this should help prevent data loss. For example, if your data is distributed across Rack 1, Rack 2, and Rack 3, but Rack 2 goes down (let's say someone accidentally unplugged it), you will still have copies of your data on Rack 1 and Rack 3.
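The rack example can be checked with a small Python sketch. It is a toy model only: the block names, rack names, and the placement dictionary are hypothetical, and it does not query a real MapR cluster.

```python
def surviving_copies(placement, failed_racks):
    """Map each block to the racks that still hold a copy after the failure."""
    failed = set(failed_racks)
    return {
        block: [rack for rack in racks if rack not in failed]
        for block, racks in placement.items()
    }


def lost_blocks(placement, failed_racks):
    """Blocks with no surviving copy at all."""
    survivors = surviving_copies(placement, failed_racks)
    return sorted(block for block, racks in survivors.items() if not racks)


# Hypothetical placement: blk1 is spread across racks, blk2 is not.
placement = {
    "blk1": ["rack1", "rack2", "rack3"],
    "blk2": ["rack2", "rack2", "rack2"],
}

print(lost_blocks(placement, ["rack2"]))                    # only blk2 is lost
print(lost_blocks(placement, ["rack1", "rack2", "rack3"]))  # everything is lost
```

The first call shows why spreading copies across racks matters. The second is the case where every rack fails at once, where only a copy kept somewhere else would help.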
Mirrors can provide another level of replication and high availability. If Rack 1, 2, and 3 are all in the same data center, and that entire data center goes down (let's say because of a natural disaster), then yes you would lose all of your data. A remote mirror to another data center in another location could prevent this.
You can learn more about volumes and topologies in ADM 201 - Configure a MapR Cluster, about mirrors in ADM 202 - Data Access and Protection, and read about a fun example of why you might use remote mirrors in Oceans’ Data Part 2: Building the Greco Player Tracker with MapR.
Hope this helps!
One thing to mention...
It's possible to lose a large enough portion of your cluster that you cannot function, even without the cluster going down or losing data. While the OP talked about losing data, it's also important to remember that you need a certain percentage of your storage free so you can work with the data.
Great point, Michael! I think these considerations are discussed in ESS 102 – MapR Converged Data Platform Essentials and ADM 200 - Install a MapR Cluster! Hopefully the OP can use these MapR Academy resources to help diagnose his particular problem.