[MapR Talk] Spark SQL & Machine Learning - A Practical Demonstration

Document created by aalvarez on Jul 28, 2016. Last modified by maprcommunity on May 31, 2017.
Version 3


In an effort to help grow organic communities interested in new technologies, MapR speakers around the world provide technical talks on numerous topics. Please browse our MapR Talks Directory to learn how to search for and request a talk in seconds.



This presentation explores how SQL developers can deliver powerful machine learning applications by leveraging Spark's SQL and MLlib libraries.  A brief overview covering Spark components and architecture kicks things off, and then we dive right in with a live demonstration of loading and querying data using Spark SQL.  Next, we'll examine the basics of machine learning algorithms and workflows before getting under the hood of a Spark MLlib-based recommendation engine.  Our final demonstration looks at how familiar tools can be used to query our recommendation data before we wrap up with a survey of real-world use cases.



• Spark Background/Overview - Brief Spark background, the Spark+Hadoop team, Spark's five main components, How to download a ready-to-use sandbox VM

• Spark SQL Architecture - Features, Languages, How DataFrames work, The SQLContext, Data sources

• Demo #1: Loading And Querying a Dataset with Spark SQL - Live demonstration of setting up a SQLContext, loading it with data, and running queries against it

• Machine Learning with Spark MLlib - Collaborative filtering basics, Alternating Least Squares (ALS) algorithm, General machine learning workflow

• Demo #2: Under The Hood With A Spark MLlib Recommendation Engine - Recommender model code review and live demonstration of training-test loop iterations

• Demo #3: Putting It All Together - Live demonstration of how to leverage Spark SQL ODBC/JDBC connectivity to query recommendation data using familiar tools

• Some Real-World Use Cases - Answers to the questions "What's it good for?" and "Who's using this?"
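
For readers who have not met the Alternating Least Squares (ALS) idea listed in the outline above, here is a minimal sketch of how it works. It is a toy, pure-Python, rank-1 version written for illustration only. It is not Spark MLlib's implementation and not the code shown in the talk. The function names (`als_rank1`, `predict`, `rmse`), the regularization value, and the tiny ratings dictionary are all hypothetical. The approach is to hold the item factors fixed and solve a small regularized least-squares problem for each user, then hold the user factors fixed and do the same for each item, and repeat.

```python
# Toy rank-1 ALS for collaborative filtering (illustrative assumption only;
# Spark MLlib's ALS uses higher ranks and a distributed solver).

def als_rank1(ratings, n_users, n_items, reg=0.05, iterations=20):
    """Approximate each observed rating r[(u, i)] by user_f[u] * item_f[i].

    ratings: dict mapping (user, item) -> rating, observed entries only.
    reg: ridge penalty; it also keeps the denominators strictly positive.
    """
    user_f = [1.0] * n_users
    item_f = [1.0] * n_items
    for _ in range(iterations):
        # Step 1: fix item factors, solve the closed-form ridge problem per user.
        for u in range(n_users):
            num = sum(r * item_f[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(item_f[i] ** 2 for (uu, i) in ratings if uu == u)
            user_f[u] = num / den
        # Step 2: fix user factors, solve the same kind of problem per item.
        for i in range(n_items):
            num = sum(r * user_f[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(user_f[u] ** 2 for (u, ii) in ratings if ii == i)
            item_f[i] = num / den
    return user_f, item_f


def predict(user_f, item_f, user, item):
    """Predicted rating is the product of the two learned factors."""
    return user_f[user] * item_f[item]


def rmse(ratings, user_f, item_f):
    """Root-mean-square error over the observed ratings."""
    errors = [(r - predict(user_f, item_f, u, i)) ** 2
              for (u, i), r in ratings.items()]
    return (sum(errors) / len(errors)) ** 0.5


if __name__ == "__main__":
    # Hypothetical data: 3 users, 2 items; user 2 has not rated item 1.
    ratings = {(0, 0): 4.0, (0, 1): 2.0,
               (1, 0): 2.0, (1, 1): 1.0,
               (2, 0): 4.0}
    user_f, item_f = als_rank1(ratings, n_users=3, n_items=2)
    print("training RMSE:", rmse(ratings, user_f, item_f))
    print("predicted rating for user 2, item 1:", predict(user_f, item_f, 2, 1))
```

Because user 2 rated item 0 the same way user 0 did, the sketch predicts a rating for the unseen item 1 close to user 0's rating for it. That is the basic intuition behind a collaborative-filtering recommender.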


Prerequisite Knowledge

• Basic knowledge of typical object-oriented programming languages and concepts is helpful

• Basic understanding of databases, filesystems, and SQL


Audience Type



Location Availability & Request Link

North America. Please refer to the MapR Talks Directory for specific countries.

You can request this talk here: Speaker Request


Related Resources

Find all Meetups and Events resources.

Find all MapR Talks: mapr talk

Learn more about MapR Talks and how to book a speaker: Meetup and Event Organizers Resources