
Spark 1.4.0 Compiled for MapR

Question asked by mandoskippy on Jun 16, 2015
Latest reply on Jun 16, 2015 by mandoskippy
Hey all, I know Spark 1.4.0 isn't officially supported, but I'm hoping for some insight into this problem, as it appears to be a filesystem compatibility issue (that is, something in how Spark interacts with MapRFS, not in how Spark itself works).

Basically, I compiled Spark 1.4.0 for MapR 4.1 using the profile shown below, and it compiled fine. I have had issues with Spark on Mesos before (1.3.1); however, Spark on YARN and Spark local worked fine.

Now with Spark 1.4.0, I am getting errors on Mesos, on YARN, and locally. The error seems specific to the MapR native library: the JVM complains that it has already been loaded by another classloader. Once again, I know Spark 1.4.0 is not supported, but understanding the interaction with MapRFS would be really helpful to me and likely to others.


What I am trying to run in bin/pyspark:

# Run inside bin/pyspark, which already provides the SparkContext as `sc`.
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext, Row, HiveContext
sparkhc = HiveContext(sc)
test = sparkhc.sql("show tables")
for r in test.collect():
    print(r)





Error:

java.lang.ExceptionInInitializerError
at com.mapr.fs.ShimLoader.load(ShimLoader.java:227)
at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:59)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1857)
at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2072)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2282)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2234)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2151)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1002)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:974)
at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:518)
at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:536)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:430)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:1366)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:1332)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:99)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:170)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:166)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:212)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:370)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:370)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:369)
at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:382)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:382)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:381)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:920)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:131)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:744)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.mapr.fs.ShimLoader.loadNativeLibrary(ShimLoader.java:335)
at com.mapr.fs.ShimLoader.load(ShimLoader.java:210)
... 45 more
Caused by: java.lang.UnsatisfiedLinkError: Native Library /tmp/mapr-darkness-libMapRClient.1.4.0.so already loaded in another classloader
at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1931)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1851)
at java.lang.Runtime.load0(Runtime.java:795)
at java.lang.System.load(System.java:1062)
at com.mapr.fs.shim.LibraryLoader.load(LibraryLoader.java:29)
... 51 more
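In a trace this long, the informative part is the last "Caused by" entry. A minimal Python sketch for pulling it out of a pasted trace follows; the helper name root_cause and the shortened sample are illustrative only and not part of Spark or MapR.

```python
def root_cause(trace):
    """Return the exception text of the last 'Caused by:' entry.

    Falls back to the first non-empty line when the trace has no
    'Caused by:' entries, and to None when the trace is empty.
    """
    marker = "Caused by: "
    lines = [ln.strip() for ln in trace.splitlines() if ln.strip()]
    if not lines:
        return None
    causes = [ln[len(marker):] for ln in lines if ln.startswith(marker)]
    return causes[-1] if causes else lines[0]


# Shortened copy of the trace above, kept only to exercise the helper.
sample = """java.lang.ExceptionInInitializerError
at com.mapr.fs.ShimLoader.load(ShimLoader.java:227)
Caused by: java.lang.reflect.InvocationTargetException
at com.mapr.fs.ShimLoader.loadNativeLibrary(ShimLoader.java:335)
Caused by: java.lang.UnsatisfiedLinkError: Native Library /tmp/mapr-darkness-libMapRClient.1.4.0.so already loaded in another classloader
at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1931)
"""

print(root_cause(sample))
```

Here the root cause is the UnsatisfiedLinkError: the MapR native client library was already loaded through one classloader when a second classloader tried to load it again.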





Profile Added to pom.xml

<profile>
  <id>mapr4.1</id>
  <properties>
    <hadoop.version>2.5.1-mapr-1503</hadoop.version>
    <yarn.version>2.5.1-mapr-1503</yarn.version>
    <hbase.version>0.98.9-mapr-1503</hbase.version>
    <zookeeper.version>3.4.5-mapr-1503</zookeeper.version>
    <protobuf.version>2.5.0</protobuf.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.curator</groupId>
      <artifactId>curator-recipes</artifactId>
      <version>2.4.0</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.4.5-mapr-1503</version>
    </dependency>
  </dependencies>
</profile>
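A note on the error itself: the trace passes through IsolatedClientLoader, which Spark 1.4.0 uses to load the Hive metastore client in its own classloader. A JVM allows a given native library to be loaded by only one classloader, so a second load of libMapRClient from the isolated loader would produce this kind of UnsatisfiedLinkError. One thing that may be worth trying is to have Spark share the MapR loader classes across both classloaders through the spark.sql.hive.metastore.sharedPrefixes setting. The sketch below only builds the option string. The MapR class prefixes in it are an assumption and have not been verified against this build; the first four prefixes are the defaults documented for Spark 1.4.x, repeated because setting the option replaces the default list.

```python
# Prefixes Spark 1.4.x documents as shared by default (JDBC drivers).
DEFAULT_SHARED_PREFIXES = [
    "com.mysql.jdbc",
    "org.postgresql",
    "com.microsoft.sqlserver",
    "oracle.jdbc",
]

# ASSUMPTION: MapR classes that trigger loading of the native client library.
# These names are a guess for illustration, not confirmed for MapR 4.1.
MAPR_PREFIXES = [
    "com.mapr.fs.shim.LibraryLoader",
    "com.mapr.security.JNISecurity",
    "com.mapr.fs.jni",
]


def shared_prefixes_conf(extra=MAPR_PREFIXES):
    """Build a key=value string for spark.sql.hive.metastore.sharedPrefixes."""
    value = ",".join(DEFAULT_SHARED_PREFIXES + list(extra))
    return "spark.sql.hive.metastore.sharedPrefixes=" + value


# Print a command line that passes the setting to the pyspark shell.
print("bin/pyspark --conf " + shared_prefixes_conf())
```

The same key and value could instead go into conf/spark-defaults.conf. Whether this resolves the error on a custom 1.4.0 build is untested here.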
