
Unable to connect to MapR cluster following MapR article on MapR + sparklyr integration

Question asked by reedv on Jul 2, 2018
Latest reply on Jul 10, 2018 by reedv

I am following a MapR article about integrating sparklyr and RStudio with our MapR cluster (https://community.mapr.com/community/products/mapr-converged-platform/data-refinery/blog/2018/03/23/how-to-using-r-studio-with-the-mapr-data-science-refinery), but I cannot connect to the cluster from R using:

library(sparklyr)
options("sparklyr.verbose" = TRUE)
Sys.setenv(SPARK_HOME="/opt/mapr/spark/spark-2.1.0")
sc <- spark_connect(master = "http://localhost:8998",method = "livy")

This fails with the error:

Error in curl::curl_fetch_memory(url, handle = handle) :
Failed connect to localhost:8998; Connection refused
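
One way to narrow down a "Connection refused" error is to check whether anything is listening on that host and port at all, independently of sparklyr. The following is a minimal sketch using only base R; `port_is_open` is a hypothetical helper name, not a sparklyr function, and the 5-second timeout is an arbitrary choice.

```r
# Hypothetical helper (not part of sparklyr): returns TRUE if a TCP
# connection to host:port can be opened within `timeout` seconds,
# FALSE otherwise.
port_is_open <- function(host, port, timeout = 5) {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    )
    close(con)  # connection succeeded; release it again
    TRUE
  }, error = function(e) FALSE)
}

# 8998 is the port used in the spark_connect() call above.
port_is_open("localhost", 8998)
```

If this returns FALSE, nothing is accepting connections on that host and port, which would be consistent with the curl error. It does not by itself show which node the Livy server runs on, or whether it is running at all.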

I tried several variations (mapr004 is the node that the MCS lists as hosting the Spark history server and the Thrift server):

sc <- spark_connect(master = "http://mapr004:8998",method = "livy")

and also a local connection:

sc <- spark_connect(master = "local")

The local connection fails with a different error:

Error in spark_version_from_home(spark_home, default = spark_version) : Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION. Try passing the spark version explicitly.

This happens even though Sys.getenv() shows SPARK_HOME=/opt/mapr/spark/spark-2.1.0.
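
The error message itself suggests passing the Spark version explicitly. A minimal sketch of that is below, using the `spark_home` and `version` arguments of `spark_connect()`. `connect_local` is a hypothetical wrapper name, and the version string "2.1.0" is an assumption taken only from the directory name; whether this resolves the detection failure on a MapR install is untested.

```r
# Hypothetical wrapper: fail early if the SPARK_HOME directory does not
# exist, then pass the Spark version explicitly instead of relying on
# sparklyr's auto-detection.
connect_local <- function(spark_home, version) {
  if (!dir.exists(spark_home)) {
    stop("SPARK_HOME directory not found: ", spark_home)
  }
  sparklyr::spark_connect(master = "local",
                          spark_home = spark_home,
                          version = version)
}

# Assumption: "2.1.0" is inferred from the directory name only.
# sc <- connect_local("/opt/mapr/spark/spark-2.1.0", "2.1.0")
```

The directory check only rules out a mistyped or missing path on the machine where R is running; it says nothing about whether the Spark installation under that path is complete.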

Since the article notes that its steps currently only work for sparklyr <= 0.7.0, I tried reinstalling sparklyr version 0.7.0 via

devtools::install_github("rstudio/sparklyr@v0.7.0", force = TRUE)

and retried the above, but the same problems occur.
