Nutch + Hadoop crawler overhead on a small cluster with a small amount of data

Question asked by sergiu on Sep 25, 2012

I am going to use Nutch+Hadoop to create a distributed crawler.
Is this setup suitable for processing a small amount of data (a few GB to tens of GB) on a small cluster of 5-10 nodes? Do Nutch and Hadoop introduce significant overhead for a cluster and data set this small?

