
ConnectionLost.. Reconnecting.. when running simple test (newly configured system)

Question asked by marek on Feb 29, 2012
Latest reply on Mar 2, 2012 by marek
Hello,

I am having a problem running Hadoop test jobs on a newly configured system.
I am new to Hadoop and this is my first setup using MapR, so it is very likely that I am missing something basic here.

My setup:

1. I am using the "MapR Virtual Machine" (M3 Demo VM) as the "test cluster". I can access the cluster via https://mapr-desktop:8443 just fine, and all services seem to be running ok.

2. I created another VMware machine to be used as a "client", running CentOS-6.2 & Java 1.7.0. I installed "mapr-client" from the repo and ran "configure.sh" (/opt/mapr/server/configure.sh -N MyCluster -c -C mapr-desktop:7222).

3. Running really basic commands on the client (such as "hadoop fs -ls") seems to be ok.  To prepare for the tests I created "/myvolume" on MapR-FS and copied some data files there ("hadoop fs -copyFromLocal bigfile.txt /myvolume/in"). So far everything was ok.

HOWEVER, when I try to run some of the examples from "hadoop-*-test.jar", the job fails with "ConnectionLost.. Reconnecting.." errors (see below).

Any ideas what I am missing here? I am not even sure of the nature of the problem in the first place. Is this related to an incorrect or incomplete configuration of the client?
Or is it related to networking (although the cluster and the client can ping each other)? Is there some other missing service (ZooKeeper?) that needs to be running?
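
To narrow down the networking question, a check along these lines could be run on the client. It is only a sketch under assumptions: `port_open` and `zk_server_from_log` are hypothetical helper names, the ports 7222 (CLDB) and 5181 (ZooKeeper) are the ones that appear in the configure.sh command and in the log below, and the TCP test assumes bash and the coreutils `timeout` command are available on the client.

```shell
# Hypothetical helper: succeeds if a TCP connection to host ($1), port ($2)
# can be opened within 3 seconds. Uses bash's /dev/tcp pseudo-device.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Hypothetical helper: reads job output on stdin and prints each distinct
# address reported after "Current ZooKeeper Server:".
zk_server_from_log() {
    sed -n 's/.*Current ZooKeeper Server: \([^ ]*\).*/\1/p' | sort -u
}

# Example usage (job.log is an assumed file holding the captured job output):
#   port_open mapr-desktop 7222 && echo "CLDB reachable" || echo "CLDB not reachable"
#   port_open mapr-desktop 5181 && echo "ZooKeeper reachable" || echo "ZooKeeper not reachable"
#   zk_server_from_log < job.log
```

If the address printed by `zk_server_from_log` is localhost:5181, the client is looking for ZooKeeper on itself rather than on mapr-desktop, which would be worth comparing against the result of the port check for mapr-desktop:5181.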

Is there some other info (log files, config files) I should provide you?
Any ideas are appreciated.. need help :)

Thanks,
Marek


--------
[marek@mapr-client ~]$ hadoop jar $HADOOP_INSTALL/hadoop-*-test.jar TestDFSIO -write -nrFiles 10
TestDFSIO.0.0.4
12/02/29 17:29:18 INFO fs.TestDFSIO: nrFiles = 10
12/02/29 17:29:18 INFO fs.TestDFSIO: fileSize (MB) = 1.0
12/02/29 17:29:18 INFO fs.TestDFSIO: bufferSize = 1000000
12/02/29 17:29:19 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFS
12/02/29 17:29:19 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 10 files
12/02/29 17:29:19 INFO fs.TestDFSIO: created control files for: 10 files
12/02/29 17:29:19 INFO fs.JobTrackerWatcher: findJobTrackerAddr: ConnectionLost, Reconnecting... Current ZooKeeper Server: localhost:5181
12/02/29 17:29:20 ERROR fs.MapRClient: Retrying...Fetching new Zookeeper locations from CLDB.  Attempt #1
12/02/29 17:29:22 INFO fs.JobTrackerWatcher: findJobTrackerAddr: ConnectionLost, Reconnecting... Current ZooKeeper Server: localhost:5181
12/02/29 17:29:22 ERROR fs.MapRClient: Retrying...Fetching new Zookeeper locations from CLDB.  Attempt #2
...

