VIDEO: Optimizing Performance with Distributed TensorFlow

Video created by slimbaltagi on Dec 18, 2017

    This meetup, held December 12, 2017 at Mesosphere HQ in San Francisco, features talks from Mesosphere, PipelineAI, Kiwi, and MapR Technologies.

     

    Agenda below:

     

    Talk 1: Running Distributed TensorFlow with GPUs on Mesos with DC/OS (by Kevin Klues, Engineering Manager @ Mesosphere)
    Running distributed TensorFlow is challenging, especially if you want to train large models on your own infrastructure. In this talk, I will present an open source TensorFlow framework for distributed training on DC/OS. This framework takes the pain out of deploying distributed TensorFlow, so you can spend less time worrying about your deployment strategy and more time building out your model. I will begin with a quick introduction to distributed TensorFlow on DC/OS, followed by a live demo.
    Speaker Bio: Kevin Klues is an Engineering Manager at Mesosphere, where he leads the DC/OS Cluster Operations team. Prior to joining Mesosphere, Kevin worked at Google on an experimental operating system for data centers called Akaros. He and a few others founded the Akaros project while working on their Ph.D.s at UC Berkeley. In a past life, Kevin was a lead developer of the TinyOS project, working at Stanford University, the Technical University of Berlin, and the CSIRO in Australia. When not working, Kevin can usually be found on a snowboard or up in the mountains in some capacity or another.

     

    Talk 2: Using the TensorFlow Estimator and Experiment APIs for End-to-End, "Train-to-Serve" Model Training, Optimizing, and Serving GPU-based TensorFlow AI Models from Research to Production using PipelineAI (by Chris Fregly, Founder and Research Engineer @ PipelineAI)
    (More details to come...)
    Speaker Bio: Chris Fregly is Founder and Research Engineer at PipelineAI, a streaming machine learning and artificial intelligence startup based in San Francisco. He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O’Reilly training and video series "High Performance TensorFlow in Production." Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.

     

    Talk 3: Using TensorFlow for Deep Learning on Autonomous Vehicles at Kiwi Campus (by Christian Garcia, Deep Learning Engineer @ Kiwi)
    Summary: In this talk we will cover the goals, challenges, and general approach taken at Kiwi Campus to create a fleet of autonomous delivery robots. We will explore some of the basic models and architectures used to solve various aspects of this problem with deep learning.
    Speaker Bio: Christian Garcia is an expert data scientist and developer with a background in math and physics, and is extremely passionate about programming, deep learning, and deep reinforcement learning.

     

    Talk 4: Kubernetes + GPUs + Distributed TensorFlow + Streaming Data (by Dong Meng, Data Scientist and Engineer @ MapR)
    Based on this blog post: https://mengdong.github.io/2017/07/15...
    Speaker Bio: Dong is a Data Scientist and Systems Engineer with MapR Technologies. He specializes in Kubernetes, GPUs, TensorFlow, and streaming data.