How to connect to Hive from R using JDBC driver?

Document created by prasoon Employee on Feb 8, 2016
Version 1Show Document
  • View in full screen mode

Author: Prasoon Keloth

 

Original Publication Date: January 19, 2015

Goal:

How to connect to Hive from R using JDBC driver.

Solution:

1. Export required environment variables.

Assume Hive version is 0.13, and Hadoop version is 0.20.2:

export HADOOP_CMD=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop 
export LD_LIBRARY_PATH=/opt/mapr/lib:$LD_LIBRARY_PATH
export HADOOP_CONF_DIR=/opt/mapr/hadoop/hadoop-0.20.2/conf
export HIVE_HOME=/opt/mapr/hive/hive-0.13/

2. From "R" shell, run below commands to access Hive table.

For example:

library("rJava") 
library("RJDBC")
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "/opt/mapr/hive/hive-0.13/lib/hive-jdbc-0.13.0-mapr-1410.jar")
for(jar in list.files('/opt/mapr/hive/hive-0.13/lib/')){ .jaddClassPath(paste("/opt/mapr/hive/hive-0.13/lib/",jar,sep=""))}
for(jar in list.files('/opt/mapr/lib')){ .jaddClassPath(paste("/opt/mapr/lib/",jar,sep=""))}
conn <- dbConnect(drv, "jdbc:hive2://<hs2 node IP>:10000/default", "<user>", "<pass>")
dbGetQuery(conn, "select * from testTable limit 5")

Note for above sample commands:

 

a. Make sure the Hive JDBC driver path is correct:

/opt/mapr/hive/hive-0.13/lib/hive-jdbc-0.13.0-mapr-1410.jar

 

b. Input correct HiveServer2 node IP, hive username and password.

<hs2 node IP>,<user>,<pass>

Attachments

    Outcomes