Richmond Apache Spark Meetup Group - July 27, 2016 - VA

Document created by aalvarez on Jul 11, 2016. Last modified by aalvarez on Sep 1, 2016.


Date: July 27, 2016

Ippon USA

2700 East Cary Street
The Power Plant at Lucky Strike
Richmond, VA

Registration Link:

Ticket Price: Free



The Richmond Big Data Group is a community for users, enthusiasts, and explorers of Apache Spark in the area of Richmond, Virginia. Apache Spark is a fast and general open source engine for large-scale data processing, supporting SQL, streaming, and complex analytics.


During this event Craig Warman, Solutions Engineer/Architect at MapR, will be delivering his talk "Spark SQL & Machine Learning - A Practical Demonstration". Join this event here:



  • Spark SQL & Machine Learning - A Practical Demonstration by Craig Warman - Solutions Engineer/Architect at MapR


This presentation explores how SQL developers can deliver powerful machine learning applications by leveraging Spark's SQL and MLlib libraries.  A brief overview covering Spark components and architecture kicks things off, and then we dive right in with a live demonstration of loading and querying data using Spark SQL.  Next, we'll examine the basics of machine learning algorithms and workflows before getting under the hood of a Spark MLlib-based recommendation engine.  Our final demonstration looks at how familiar tools can be used to query our recommendation data before we wrap up with a survey of real-world use cases.



• Spark Background/Overview - Brief Spark background, the Spark+Hadoop team, Spark's five main components, How to download a ready-to-use sandbox VM

• Spark SQL Architecture - Features, Languages, How DataFrames work, The SQLContext, Data sources

• Demo #1: Loading And Querying a Dataset with Spark SQL - Live demonstration of setting up a SQLContext, loading it with data, and running queries against it

• Machine Learning with Spark MLlib - Collaborative filtering basics, Alternating Least Squares (ALS) algorithm, General machine learning workflow

• Demo #2: Under The Hood With A Spark MLlib Recommendation Engine - Recommender model code review and live demonstration of training-test loop iterations

• Demo #3: Putting It All Together - Live demonstration of how to leverage Spark SQL ODBC/JDBC connectivity to query recommendation data using familiar tools

• Some Real-World Use Cases - Answers the questions "What's it good for?" and "Who's using this?"


Prerequisite Knowledge

• Basic knowledge of typical object-oriented programming languages and concepts is helpful

• Basic understanding of databases, filesystems, and SQL


Audience Type



Parking and Directions


Please park in lots marked Odell (one by the office, two on top), or on Pear Street.









