I would like to know about the redundancy of historical data in Hadoop. I assume the 3-node replication factor comes into the picture only when a MapReduce process is initiated. That is what I was told about storage pools, containers, volumes, etc.
If I have 500 TB of historical data, where is it stored? What is the redundancy of this historical data?