How to Use Kylin on MapR 5.2

Document created by Rachel Silver on Dec 18, 2016. Last modified by aalvarez on Jan 25, 2017.


Note: These steps have been updated to work around a bug found in Kylin versions 1.5.4 through 1.6.0.


Apache Kylin™ is an open source distributed analytics engine, originally contributed by eBay Inc., that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop for extremely large datasets.


Apache Kylin™ lets you query big Hive tables at sub-second latency in three simple steps:


  1. Identify a set of Hive tables in star schema.
  2. Build a cube from the Hive tables in an offline batch process.
  3. Query the Hive tables using SQL and get results in sub-seconds, via REST API, ODBC, or JDBC. (From the Kylin docs)


After hearing significant interest from our customers, we worked with the Kylin support team to find a successful integration path.


Note: This article describes how to run Kylin on HBase. It does not cover using the HBase APIs to connect to MapR-DB.


Sample Environment and Version Information


The relevant software versions that we will be working with are as follows:



Kylin Install


To begin, retrieve the Kylin 1.6.0 for HBase 1.x binary archive and extract it. The directory must be owned by, and accessible to, a user with permission to run MapReduce jobs (e.g. 'mapr'). The following commands create a directory called /opt/kylin, extract the Kylin binary into it, and give ownership to the default cluster user:


mkdir -p /opt/kylin

wget -P /tmp/

tar -xzf /tmp/apache-kylin-1.6.0-hbase1.x-bin.tar.gz -C /opt/kylin

chown -R mapr:mapr /opt/kylin


Change to your MapR Cluster user:


      su mapr

Next, set the KYLIN_HOME variable to point to this location:

      export KYLIN_HOME=/opt/kylin/apache-kylin-1.6.0-hbase1.x-bin
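
Before continuing, it can help to confirm that the variable points at an extracted Kylin directory. The following is a minimal sketch, not part of Kylin: check_kylin_home is a hypothetical helper name, and it only checks for the bin and lib subdirectories that the later steps in this article rely on.

```shell
# Hypothetical helper (not shipped with Kylin): sanity-check a Kylin directory.
# Uses the first argument if given, otherwise the KYLIN_HOME variable.
check_kylin_home() {
    kh_dir="${1:-${KYLIN_HOME:-}}"
    if [ -z "$kh_dir" ]; then
        echo "KYLIN_HOME is not set" >&2
        return 1
    fi
    # Later steps in this article use $KYLIN_HOME/bin and $KYLIN_HOME/lib.
    for kh_sub in bin lib; do
        if [ ! -d "$kh_dir/$kh_sub" ]; then
            echo "missing directory: $kh_dir/$kh_sub" >&2
            return 1
        fi
    done
    echo "KYLIN_HOME looks OK: $kh_dir"
}
```

Run check_kylin_home with no arguments to test the current KYLIN_HOME, or pass a path explicitly.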


To work around an issue related to the Apache Calcite version (KYLIN-2094), delete all of the Kylin JDBC JAR files in the $KYLIN_HOME/lib directory before starting Kylin for the first time:


   rm -r /opt/kylin/apache-kylin-1.6.0-hbase1.x-bin/lib/kylin-jdbc-*.jar
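
If you would like to see which files the glob matches before deleting anything, a dry run along these lines may help. This is only a sketch: list_kylin_jdbc_jars is a hypothetical helper name, and it simply prints the matching paths without removing them.

```shell
# Hypothetical helper: print the Kylin JDBC JARs that the rm command would match.
# Uses the first argument as the lib directory, otherwise $KYLIN_HOME/lib.
list_kylin_jdbc_jars() {
    jl_dir="${1:-${KYLIN_HOME:-}/lib}"
    for jl_jar in "$jl_dir"/kylin-jdbc-*.jar; do
        # When nothing matches, the pattern is left unexpanded; skip it.
        if [ -e "$jl_jar" ]; then
            echo "$jl_jar"
        fi
    done
    return 0
}
```

An empty result means there is nothing left to delete in that directory.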




Starting Kylin for the First Time


To start Kylin, run the following as the cluster user:

$KYLIN_HOME/bin/ start

On the first start, it may take a few minutes to create the initial Hive and HBase tables. When it's done, visit the Kylin Web UI by replacing <host> in this web address with the hostname of the server on which you installed Kylin:




Log in with username ADMIN and password KYLIN, as shown:


Screen Shot 2016-05-10 at 7.26.59 PM.png
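
Because the first start can take a few minutes, you may prefer to poll the Web UI from the command line rather than refresh the browser. The sketch below assumes curl is installed; wait_for_url is a hypothetical helper name, and you pass it the Web UI address yourself. It treats any HTTP response below 400 as "up", which is an assumption about how the login page responds.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL to poll, $2 = number of attempts (default 30),
# $3 = seconds to sleep between attempts (default 10)
wait_for_url() {
    wu_url="$1"
    wu_attempts="${2:-30}"
    wu_delay="${3:-10}"
    wu_i=1
    while [ "$wu_i" -le "$wu_attempts" ]; do
        # -f makes curl fail on HTTP errors (status 400 and above).
        if curl -s -f -o /dev/null --max-time 5 "$wu_url"; then
            echo "up after $wu_i attempt(s): $wu_url"
            return 0
        fi
        if [ "$wu_i" -lt "$wu_attempts" ]; then
            sleep "$wu_delay"
        fi
        wu_i=$((wu_i + 1))
    done
    echo "gave up after $wu_attempts attempt(s): $wu_url" >&2
    return 1
}
```

With the defaults, the helper waits up to roughly five minutes before giving up.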


Building a Sample Cube


Once you've confirmed that you can reach the Kylin Web UI, load the provided sample data by running the following (taken from the Kylin docs):



[mapr@ip-172-31-15-151 root]$ $KYLIN_HOME/bin/
KYLIN_HOME is set to /opt/kylin/apache-kylin-1.6.0-hbase1.x-bin
Going to create sample tables in hive


Sample cube is created successfully in project 'learn_kylin'.
Restart Kylin server or reload the metadata from web UI to see the change.

To restart Kylin, run the following and then log in to the Web UI again to continue:


$KYLIN_HOME/bin/ stop

$KYLIN_HOME/bin/ start


In the Web UI, select "learn_kylin" from the project drop-down list:


Screen Shot 2016-05-11 at 3.00.55 PM.png


Select "build" from the Action/s menu for the kylin_sales_cube and then set the end date to today to load the entire data set (10,000 records):


Screen Shot 2016-05-11 at 3.02.39 PM.png


You can follow the progress of the build in the Monitor tab. When it reaches 100%, you can move on to running a sample query.


Screen Shot 2016-05-11 at 4.24.00 PM.png



Queries are run from the Insight tab. Below is a test query with expected results that you can run:


select part_dt, sum(price) as total_selled, count(distinct seller_id) as sellers from kylin_sales group by part_dt order by part_dt


Screen Shot 2016-05-11 at 4.25.33 PM.png


Screen Shot 2016-05-16 at 1.26.30 PM.png
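
As noted at the top of this article, queries can also be sent through the REST API. The following is a rough sketch rather than an official client. It assumes curl and base64 are installed, that the query endpoint lives at /api/query under the Web UI address, and that the request body takes "sql" and "project" fields, as described in the Kylin REST documentation; please verify these against your Kylin version. The function names and the KYLIN_USER and KYLIN_PASSWORD variables are hypothetical.

```shell
# Build the HTTP Basic auth token for a user and password.
kylin_basic_auth() {
    printf '%s:%s' "$1" "$2" | base64
}

# Build the JSON request body. Assumes the SQL and project name contain no
# characters that need JSON escaping (double quotes, backslashes, newlines).
kylin_query_payload() {
    printf '{"sql":"%s","project":"%s"}' "$1" "$2"
}

# Hypothetical wrapper: POST a query to Kylin.
# $1 = Web UI address (supply your own), $2 = SQL text, $3 = project name
# Credentials default to the ADMIN/KYLIN login used in this article.
kylin_query() {
    curl -s -X POST \
        -H "Authorization: Basic $(kylin_basic_auth "${KYLIN_USER:-ADMIN}" "${KYLIN_PASSWORD:-KYLIN}")" \
        -H "Content-Type: application/json" \
        -d "$(kylin_query_payload "$2" "$3")" \
        "$1/api/query"
}
```

To run the test query, call kylin_query with your Web UI address, the SQL above, and learn_kylin as the project. The response is JSON, so piping it through a JSON formatter makes it easier to read.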



Links to Further Information
