Intro to Building a Distributed Pipeline for Real Time Analysis of Uber's Data: Feb. 22, 2018 - Atlanta Meetup

Document created by maprcommunity Employee on Jan 23, 2018

SUMMARY

Date:

Feb. 22, 2018

6:30PM to 9:30PM Eastern 

Topic: Intro to Building a Distributed Pipeline for Real Time Analysis of Uber's Data
Registration:

Intro to Building a Distributed Pipeline for Real Time Analysis of Uber's Data | Atlanta Apache Spark User Group (Atlant…  

ABOUT THE EVENT

In this talk we will look at a solution that combines real-time data streams with machine learning to analyze and visualize popular Uber trip locations in New York City. You will see the end-to-end process required to build this application using Apache APIs for Kafka, Spark, and HBase.

 

According to Gartner, by 2020 smart cities will be using about 1.39 billion connected cars, IoT sensors, and devices. Analyzing behavior patterns within cities will allow optimization of traffic, better planning decisions, and smarter advertising. You may be excited about exploiting data streams to gain actionable insights from continuously produced data in real time, yet find it difficult to conceptualize how to implement such a solution. We will walk you through an architecture that combines data streaming with machine learning to enrich Uber trip data, then analyze and visualize the most popular pick-up and drop-off locations by date and time, so that drivers' locations can be optimized and trips priced according to demand. The presentation will consist of four sections:

 

• Introduction to Spark machine learning for developers

 

• Kafka and Spark Streaming

 

• Real-time dashboard using a microservice framework

 

• Using the Spark HBase connector for parallel writes and reads

 

SPEAKER

Carol McDonald, MapR Solutions Architect, specializing in Apache Kafka, Apache HBase, Apache Drill, Apache Spark, and machine learning in the healthcare, finance, and telecom sectors

 

RELATED

kafka

spark

hbase
