Date: July 27, 2016
2700 East Cary Street
ABOUT THE EVENT
The Richmond Big Data Group is a community for users, enthusiasts, and explorers of Apache Spark (http://spark.apache.org/) in the Richmond, Virginia area. Apache Spark is a fast, general-purpose open source engine for large-scale data processing, supporting SQL, streaming, and complex analytics.
During this event, Craig Warman, Solutions Engineer/Architect at MapR, will deliver his talk, "Spark SQL & Machine Learning - A Practical Demonstration". Join the event here: http://www.meetup.com/Richmond-Apache-Spark-Meetup/events/232385539/
- Spark SQL & Machine Learning - A Practical Demonstration by Craig Warman - Solutions Engineer/Architect at MapR
This presentation explores how SQL developers can deliver powerful machine learning applications by leveraging Spark's SQL and MLlib libraries. A brief overview covering Spark components and architecture kicks things off, and then we dive right in with a live demonstration of loading and querying data using Spark SQL. Next, we'll examine the basics of machine learning algorithms and workflows before getting under the hood of a Spark MLlib-based recommendation engine. Our final demonstration looks at how familiar tools can be used to query our recommendation data before we wrap up with a survey of real-world use cases.
• Spark Background/Overview - Brief Spark background, the Spark+Hadoop team, Spark's five main components, How to download a ready-to-use sandbox VM
• Spark SQL Architecture - Features, Languages, How DataFrames work, The SQLContext, Data sources
• Demo #1: Loading And Querying a Dataset with Spark SQL - Live demonstration of setting up a SQLContext, loading it with data, and running queries against it
• Machine Learning with Spark MLlib - Collaborative filtering basics, Alternating Least Squares (ALS) algorithm, General machine learning workflow
• Demo #2: Under The Hood With A Spark MLlib Recommendation Engine - Recommender model code review and live demonstration of training-test loop iterations
• Demo #3: Putting It All Together - Live demonstration of how to leverage Spark SQL ODBC/JDBC connectivity to query recommendation data using familiar tools
• Some Real-World Use Cases - Answers to the questions "What's it good for?" and "Who's using this?"
• Basic knowledge of typical object-oriented programming languages and concepts is helpful
• Basic understanding of databases, filesystems, and SQL
Parking and Directions
Please park in lots marked Odell (one by the office, two on top), or on Pear Street.
- Craig Warman
OTHER RELATED MATERIALS
On Apache Spark:
- Free Hadoop Training: Spark Essentials - Apache Spark Essentials
- Getting Started with Apache Spark - ebook
- Apache Spark Use Case for Better Drug Discovery - Whiteboard Walkthrough - YouTube
- Apache Spark vs. Apache Flink - Whiteboard Walkthrough - YouTube
- Live Demo: Apache Spark on MapR with MLlib - YouTube
- Free Code Friday - Machine Learning with Apache Spark - YouTube
- MapR Sandbox VM download page: MapR Sandbox for Hadoop | MapR
On Craig Warman: