Big Data, Advanced Analytics, Machine Learning Meetup - May 17, 2016 - Green Bay, WI

Document created by aalvarez on May 13, 2016. Last modified by aalvarez on May 17, 2016.
Version 5

SUMMARY

Date: May 16, 2016
Location: Titletown Tap Room, 320 N. Broadway, Green Bay, WI
Time: 18:00 - 20:00
Registration Link: http://www.meetup.com/BAMDataScience/events/229518056/#

                       

ABOUT THE EVENT

BAM (Big Data, Advanced Analytics, and Machine Learning) is a community for anyone interested in data science and machine learning. All levels of interest and skill are welcome. BAM events feature speakers on topics like Hadoop, machine learning algorithms, Python, R, and the many related tools and languages that keep appearing. At our May event, we'll have speakers from IBM and MapR presenting on Apache Drill and Apache Spark.

 

AGENDA

  • Self-Service Data Exploration and Nested Data Analytics on Hadoop - Introduction to Apache Drill by Andrew Goade, System Engineer at MapR

SQL is one of the most widely used languages for accessing, analyzing, and manipulating structured data. As Hadoop gains traction within enterprise data architectures across industries, the need for SQL over both structured and loosely structured data on Hadoop is growing rapidly. Apache Drill started with the audacious goal of delivering consistent, millisecond ANSI SQL query capability across a wide range of data formats. At a high level, this translates to two key requirements: schema flexibility and performance. Drill lets users interact with big data on Hadoop faster and more easily using familiar SQL. Users no longer depend on central IT teams and DBAs to produce schemas and then maintain them when the structure changes for a few records. Drill removes the need to impose a structure on unstructured data before gaining any insight from it, by providing a simple mechanism to query any dataset on Hadoop, whether flat files, Parquet or JSON files, or HBase tables. This session gives an overview of several use cases that enterprises are testing Drill for.
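
For readers who want a feel for schema-free querying before the talk, here is a minimal Python sketch. It is not taken from the session. It assumes a Drill instance running locally with its REST endpoint at `http://localhost:8047/query.json`, and it queries the `employee.json` sample file that ships on Drill's classpath (`cp`). The helper names `build_drill_request` and `run_drill_query` are made up for this illustration.

```python
import json
import urllib.request

# Assumption: Drill is running locally and serving its REST API on port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# SQL against a raw JSON file; no table or schema is defined beforehand.
# cp.`employee.json` is the sample dataset bundled with Drill.
SAMPLE_SQL = "SELECT full_name, position_title FROM cp.`employee.json` LIMIT 3"


def build_drill_request(sql, url=DRILL_URL):
    """Build (without sending) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the decoded JSON response."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a Drill instance running, `run_drill_query(SAMPLE_SQL)` should return a dictionary whose `rows` entry holds the matching records. Swapping the `FROM` clause for a path under a configured file system or HBase storage plugin would follow the same pattern, though plugin names depend on the local setup.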

 

  • Apache Spark Overview by Bruce Fischer, IBM Spark Technologist

• What is Apache Spark

• Spark Resilient Distributed Datasets (RDDs)

• Spark SQL

• Spark DataFrames

• Spark Machine Learning

 

MATERIALS

 

  • Drill in 10 Minutes
  • MapR Sandbox with Apache Drill
  • Explore Drill tutorials
  • Explore Apache Spark Resources & Product Information
  • Get started with Spark
  • Spark on MapR Sandbox
  • Getting Started with Apache Spark | Free ebook
  • Free Training on Apache Spark Essentials

 

