Seattle Data Science Meetup - Apr 21, 2016

Document created by aalvarez on Mar 28, 2016. Last modified by aalvarez on Mar 28, 2016.
Version 2


Date: April 21, 2016


111 S. Jackson St., Seattle, WA

Registration Link:



Data scientists in Seattle are doing incredible work: making graph models of symptoms and human disease, extracting insight from huge amounts of real-time data, and building tools to make the whole process easier. Seattle Data Science Meetup was founded to give them a continuing opportunity to promote and contribute to this expanding field.



  • Putting Apache Drill to Use: Best Practices for Production Deployments - Neeraja Rentachintala - Director of Product Management

Apache Drill is the industry's first schema-free SQL query engine for big data. It can explore both structured and complex datasets on the fly from a variety of data sources, and its distributed query processing provides low-latency performance at petabyte scale. As a result, organizations are rapidly adopting Drill to open up their Hadoop deployments to a wide variety of users in a self-service fashion. This session provides a deep dive into the use cases of Apache Drill and best practices for deploying it in production, using real customer examples. We will start with an introduction to Apache Drill and how it fits into the Hadoop ecosystem, then quickly delve into the topics that matter for a production rollout: data ingestion methodologies, data model trade-offs, storage format selection, picking the data layout for optimal performance, and query design tips and tricks. Finally, we will wrap up with a preview of the road ahead for the project in 2016.