Spark: Could not create FileClient -connection reset by peer

Document created by wade on Feb 27, 2016

Author: Jitendra Yadav, last modified by Sreedhar Alagonda on May 18, 2015

 

Original Publication Date: May 1, 2015

 

Environment
MapR 3.x, 4.x
Spark 1.2.0

Symptom

After configuring Spark on MapR, we tried to run a sample Spark program, but it threw the error below.

 

MASTER=spark://node1:7077 /opt/mapr/spark/spark-1.2.0/bin/run-example org.apache.spark.examples.SparkPi

15/02/03 22:47:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-02-03 22:47:18,6530 ERROR Client fs/client/fileclient/cc/client.cc:394 Thread: 7270 Failed to initialize client for cluster my.cluster.com, error Connection reset by peer(104)
Exception in thread "main" java.io.IOException: Could not create FileClient
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:502)
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:563)
at com.mapr.fs.MapRFileSystem.getMapRFileStatus(MapRFileSystem.java:1201)
at com.mapr.fs.MapRFileSystem.getFileStatus(MapRFileSystem.java:847)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1434)
at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:99)
at org.apache.spark.util.FileLogger.start(FileLogger.scala:91)

Note: All other MapReduce jobs run fine, and even hadoop fs -ls / works from the same node.

Root Cause

This problem generally occurs when the client cannot connect to the CLDB node. In this case the CLDB node had multiple NICs, and the Spark client did not pick up the MAPR_SUBNET variable, which caused the failure. Make sure that MAPR_SUBNET is set in /opt/mapr/conf/env.sh and that the same value is visible to Spark. There was also an issue where the Spark shell did not include MAPR_HOME when starting.
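A quick way to check both conditions is sketched below. It assumes the default /opt/mapr install path from this article; adjust CONF_FILE if your layout differs.

```shell
# Hypothetical diagnostic, assuming a default /opt/mapr install.
CONF_FILE=/opt/mapr/conf/env.sh

# 1) Is MAPR_SUBNET set cluster-wide in env.sh?
if [ -f "$CONF_FILE" ] && grep -q 'MAPR_SUBNET' "$CONF_FILE"; then
    echo "MAPR_SUBNET is set in $CONF_FILE"
else
    echo "MAPR_SUBNET is NOT set in $CONF_FILE"
fi

# 2) Is it visible in the environment Spark actually starts from?
# If this prints 'unset', the Spark client may bind to the wrong NIC.
echo "current shell sees MAPR_SUBNET='${MAPR_SUBNET:-unset}'"
```

If the first check passes but the second prints 'unset', the variable is defined in env.sh but not sourced by the shell that launches Spark, which matches the behavior described above.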

Solution

Set the variable below in the spark-env.sh file and run the same example:

export MAPR_HOME=/opt/mapr

OR

export MAPR_SUBNET directly in the spark-env.sh file.
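Concretely, the spark-env.sh entries would look like the fragment below. The subnet value is a placeholder, not a value from this article; use the subnet on which your CLDB node is actually reachable.

```shell
# Fragment for /opt/mapr/spark/spark-1.2.0/conf/spark-env.sh
# Either line resolves the FileClient error described above.
export MAPR_HOME=/opt/mapr
# Placeholder subnet -- replace with your cluster's CLDB-facing subnet.
export MAPR_SUBNET=10.10.1.0/24
```

After editing the file, rerun the SparkPi example from the Symptom section to confirm the client now reaches the CLDB.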
