AnsweredAssumed Answered

HDFS (OS) vs. Sharding (DB)

Question asked by lifelearner on May 21, 2016


As I'm a beginner, this question may not make sense at all.

If so, please pardon me.


* From what I understand:

Hadoop ecosystem's main pieces are HDFS and MapReduce for distributed data and parallel processing.

MongoDB, etc. are NoSQL DBMS that have the sharding feature for distributed data and parallel processing.



* My question:

If I use MongoDB on 100 nodes, it can already do the distributed data and parallel processing by its own sharding feature and there's no need to use HDFS + MapReduce, right?


Or, do people still want to use a sharding-supported NoSQL DBMS like MongoDB on top of the multi-node Hadoop cluster?

If so, why?

Aren't they redundant?

(HDFS + MapReduce do distribution, etc.... NoSQL by itself can take care of distributed data and parallel processing, right?)



Thanks in advance for clarifying this!



- Young