As I'm a beginner, this question may not make sense at all.
If so, please pardon me.
* From what I understand:
The main pieces of the Hadoop ecosystem are HDFS, for distributed storage, and MapReduce, for parallel processing.
MongoDB and similar systems are NoSQL DBMSs that support sharding, which distributes the data across nodes and lets work run on the shards in parallel.
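To make that mental model concrete, here is a toy Python sketch of what I mean by "distributed data" and "parallel processing". It is not real MongoDB or Hadoop code: `NUM_NODES`, `shard_for`, `distribute`, and the sample records are all made up for illustration, and the "nodes" are just lists in one process.

```python
from collections import Counter
import zlib

NUM_NODES = 4  # hypothetical cluster size, chosen only for the example


def shard_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Pick a node from a stable hash of the key (hash-based partitioning)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(records, num_nodes: int = NUM_NODES):
    """Place each (key, text) record on exactly one node."""
    nodes = [[] for _ in range(num_nodes)]
    for key, text in records:
        nodes[shard_for(key, num_nodes)].append((key, text))
    return nodes


def map_phase(node_records):
    """Would run independently on each node: count words in local records only."""
    counts = Counter()
    for _key, text in node_records:
        counts.update(text.split())
    return counts


def reduce_phase(partials):
    """Merge the per-node partial counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


records = [("doc1", "big data"), ("doc2", "big cluster"), ("doc3", "data node")]
nodes = distribute(records)
result = reduce_phase(map_phase(n) for n in nodes)
print(result["big"], result["data"])  # 2 2
```

In this picture, the partitioning step is roughly what I think of as sharding (or HDFS block placement), and the per-node counting plus the final merge is roughly what I think of as MapReduce.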
* My question:
If I run MongoDB on 100 nodes, its own sharding feature already handles data distribution and parallel processing, so there's no need for HDFS + MapReduce, right?
Or do people still run a NoSQL DBMS with sharding support, like MongoDB, on top of a multi-node Hadoop cluster?
If so, why?
Aren't they redundant?
(HDFS + MapReduce handle distribution, etc., while a NoSQL DBMS by itself can also take care of distributed data and parallel processing, right?)
Thanks in advance for clarifying this!