Spark on MapR - Using Hadoop Provided

Question asked by mandoskippy on Nov 10, 2015
Latest reply on Dec 7, 2015 by kingmesal
Based on some JIRAs and GitHub pull requests, it looks like recent versions of Spark have removed the MapR profiles from the Spark POM. The recommended method is to build with the -Phadoop-provided profile.

When I do that, Spark can't access MapR-FS; it fails with "No FileSystem for scheme: maprfs".

I also found the post below, where pwendell states that we need to add the MapR hadoop bindings to the classpath at runtime. I'm lost on how to do this: add them to the classpath? Somewhere else? How do I set up Spark so it can see MapR when I run it?

Thanks!

Post:
we are actually recommending that MapR users use the "hadoop provided" builds that became available in Spark 1.4. You just add the MapR hadoop bindings to the class[path] at runtime. Is there any reason you can't do that? I think MapR's own Spark distribution is using those as well.
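If I'm reading that right, the idea is to launch Spark with the MapR client jars already on the classpath (e.g. export SPARK_DIST_CLASSPATH=$(hadoop classpath) in conf/spark-env.sh, per the "Hadoop Free" build docs) and then maprfs:// should resolve. Below is a minimal sketch of what I'd run to check that. The explicit fs.maprfs.impl binding and the com.mapr.fs.MapRFileSystem class name are my own guesses about the MapR client, not something from the post:

// Sketch of what I think "adding the MapR hadoop bindings at runtime" means.
// Assumption: Spark was launched with the MapR client jars on the classpath,
// e.g. via conf/spark-env.sh:
//   export SPARK_DIST_CLASSPATH=$(hadoop classpath)
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object MapRFsCheck {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // If MapR's core-site.xml isn't on the classpath, bind the scheme to its
    // FileSystem implementation explicitly; otherwise this is a no-op.
    // (fs.maprfs.impl / com.mapr.fs.MapRFileSystem are my assumptions.)
    conf.set("fs.maprfs.impl", "com.mapr.fs.MapRFileSystem")
    val fs = FileSystem.get(new URI("maprfs:///"), conf)
    // If the bindings are visible, this lists the cluster root instead of
    // throwing "No FileSystem for scheme: maprfs".
    fs.listStatus(new Path("/")).foreach(status => println(status.getPath))
  }
}

If this still throws, I'd take it to mean the MapR jars never actually made it onto the driver classpath.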
Links:

https://github.com/apache/spark/pull/7047
https://github.com/apache/spark/pull/8338
https://issues.apache.org/jira/browse/SPARK-6196
