
Write to a specific machine in a cluster?

Question asked by anthony.kalinde on Jun 22, 2015
Latest reply on Jun 24, 2015 by anthony.kalinde
Hi,

I have a small, low-budget 5-node cluster on EC2 that will run several proofs of concept requiring clickstream data. The initial plan was to shut down 4 of the 5 nodes, leaving one node to collect clicks (I expect a small volume), and to bring the other nodes up only when the cluster has to run jobs, most likely once a day to start with.

I am wondering whether some combination of topology and volume settings would allow me to achieve this.

The clickstream server is Divolte, from GoDataDriven. It runs fine when the other 4 nodes are on. When I try it with those 4 nodes off, I get an error from Divolte saying that it can't find an output file on MapR-FS.

I gather the answer is probably obvious with the right information. I have been reading the topology and volume docs and experimenting with the topology and volume options in MCS, hoping to find a way to limit the click data to the first node, but I don't quite see how volumes and topologies go beyond a logical setting.

I suppose MapR, and Hadoop in general, aren't intended to run this way, but it would be quite a saving if I could shut down the nodes I am not using until I need them.

I have a trial M7 license. Could I drop my current topology and, say, take the 4 nodes offline, create a new /data where the data should live, and then bring up the other nodes for data /processing?
 
Any help is much appreciated.

The cluster setup
![alt text][2]
  [2]: /storage/temp/256-screen-shot-2015-06-23-at-011814.png
