Author: Jitendra Yadav, last modified by Sreedhar Alagonda on May 18, 2015
Original Publication Date: May 1, 2015
MapR 3.x, 4.x
After configuring Spark on MapR, we tried to run a sample Spark program, but the program threw the error below.
MASTER=spark://node1:7077 /opt/mapr/spark/spark-1.2.0/bin/run-example org.apache.spark.examples.SparkPi
15/02/03 22:47:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-02-03 22:47:18,6530 ERROR Client fs/client/fileclient/cc/client.cc:394 Thread: 7270 Failed to initialize client for cluster my.cluster.com, error Connection reset by peer(104)
Exception in thread "main" java.io.IOException: Could not create FileClient
    at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:502)
Note: All other MapReduce jobs work fine, and hadoop fs -ls / also works fine from the same node.
This problem generally occurs when the client cannot connect to the CLDB node. In this case, the CLDB was running on a node with multiple NICs, and the Spark client did not pick up the MAPR_SUBNET variable, which caused the failure. Make sure that MAPR_SUBNET is set in /opt/mapr/conf/env.sh and that the same setting is also visible to Spark. There was also an issue where the Spark shell did not include MAPR_HOME at startup.
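To confirm which of the two files actually exports these variables, a small check like the one below may help. It is only a sketch: the helper name has_export is hypothetical, and the spark-env.sh location in the usage comments is an assumption inferred from the run-example path above. MapR documentation generally spells the subnet variable MAPR_SUBNETS, so check for whichever name your env.sh uses.

```shell
#!/bin/sh
# has_export FILE VAR
# Succeeds if FILE contains an uncommented "export VAR=" line.
# (Hypothetical helper for illustration; not part of MapR or Spark.)
has_export() {
    grep -Eq "^[[:space:]]*export[[:space:]]+$2=" "$1"
}

# Usage (paths are assumptions; adjust to your installation):
#   has_export /opt/mapr/conf/env.sh MAPR_SUBNET \
#       && echo "MAPR_SUBNET is set in env.sh"
#   has_export /opt/mapr/spark/spark-1.2.0/conf/spark-env.sh MAPR_HOME \
#       || echo "MAPR_HOME is missing from spark-env.sh"
```

Commented-out lines such as "#export MAPR_SUBNET=" are deliberately not counted as set.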
Set the variable below in the spark-env.sh file and run the same example again.
export MAPR_HOME=/opt/mapr
Also export MAPR_SUBNET directly in the spark-env.sh file.
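Put together, the additions to spark-env.sh might look like the sketch below. The subnet value is a placeholder, not taken from the original case; copy the real value from /opt/mapr/conf/env.sh, and use the variable name that appears there (MapR documentation generally spells it MAPR_SUBNETS).

```shell
# Additions to spark-env.sh (a sketch under the assumptions stated above).

# From the article: make MAPR_HOME visible to the Spark client.
export MAPR_HOME=/opt/mapr

# Placeholder value (assumption): replace with the subnet configured
# in /opt/mapr/conf/env.sh so the client reaches the CLDB on the right NIC.
export MAPR_SUBNET=10.10.15.0/24
```

After saving the file, re-run the same run-example command shown at the top of this article to check that the FileClient error no longer appears.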