I was working with a 10-machine cluster with a replication factor of 5. Machines 4 and 5 were the ZooKeeper, CLDB, and webserver nodes. The rest were regular nodes.
Machines 4 and 5 have since died. I am now trying to bring those services back. What I have done so far:
1) Installed ZooKeeper on machines 6 and 7. Both are online and running.
2) Installed CLDB on machine 1.
This is as far as I have gotten. When I restart machine 1, the CLDB service runs for a second and then goes away. I notice this when I run jps.
I looked at the log with cat /opt/mapr/logs/mfs, and it has these lines at the end:
2016-07-13 23:19:01,9012 INFO fs/server/container/containerreport.h:71 x.x.0.0:0 ID : 228012258
2016-07-13 23:19:01,9012 INFO fs/server/container/containerreport.h:71 x.x.0.0:0 ID : 29322979
2016-07-13 23:19:01,9012 INFO fs/server/container/containerreport.h:71 x.x.0.0:0 ID : 124810100
2016-07-13 23:25:43,8437 ERROR cldbha.cc:965 x.x.0.0:0 Failed to reach CLDB node due to error Read-only file system (30) for operation 2345.33 at 10.252.101.70:7222. Will retry after finding CLDB master.
2016-07-13 23:25:43,8443 ERROR cldbha.cc:698 x.x.0.0:0 Got error Read-only file system (30) while trying to register with CLDB 10.252.101.70:7222
2016-07-13 23:25:46,8457 ERROR cldbha.cc:698 x.x.0.0:0 Got error Connection reset by peer (104) while trying to register with CLDB 10.252.101.70:7222
2016-07-13 23:26:55,9686 INFO fileserver.cc:9508 x.x.0.0:0 CLDB asked me to accept StoragePool 61b80d536dda768b005605e8c9005b40
2016-07-13 23:26:55,9686 INFO fileserver.cc:9518 x.x.0.0:0 SP with id 61b80d536dda768b005605e8c9005b40, already accepted.
2016-07-13 23:26:55,9686 INFO cldbha.cc:732 x.x.0.0:0 Re-established communication link with CLDB master at 10.252.101.70:7222.
2016-07-13 23:26:55,9901 ERROR fileserver.cc:8090 x.x.0.0:0 heartbeat thread didn't get response for 72147 msec
2016-07-13 23:26:55,9901 INFO fileserver.cc:9048 x.x.0.0:0 recieved updated no-compress list from cldb: bz2,gz,tgz,tbz2,zip,z,Z,mp3,jpg,jpeg,mpg,mpeg,avi,gif,png,lzo,j
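To separate the ERROR entries from the surrounding INFO noise in a log like this, a small filter along the following lines may help. It is only a sketch: the function name error_lines is made up for illustration, and the line layout (timestamp, level, source file and line, address, message) is assumed from the excerpt above, not taken from any MapR documentation.

```python
import re
from typing import Iterable, List, Tuple

# Layout assumed from the excerpt above (not from MapR documentation):
#   <date> <time>,<fraction> <LEVEL> <source file>:<line> <address> <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+) "
    r"(?P<level>[A-Z]+) "
    r"(?P<src>\S+:\d+) "
    r"(?P<addr>\S+) "
    r"(?P<msg>.*)$"
)


def error_lines(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, source, message) for every line whose level is ERROR.

    Lines that do not match the assumed layout are skipped silently.
    """
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            found.append((match.group("ts"), match.group("src"), match.group("msg")))
    return found


# Hypothetical usage against the log file mentioned above:
#   with open("/opt/mapr/logs/mfs", errors="replace") as fh:
#       for ts, src, msg in error_lines(fh):
#           print(ts, src, msg)
```

Run against the excerpt above, this would keep only the four cldbha.cc and fileserver.cc ERROR entries, which makes the sequence easier to see: two "Read-only file system (30)" failures, a connection reset, and then a heartbeat timeout.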
Is there something I can do to make this work?