Seattle Apache Kafka Meetup Group - October 24, 2016 - WA

Document created by aalvarez on Oct 24, 2016. Last modified by maprcommunity on May 31, 2017.


Scroll down to the "MATERIALS" section of this page to see some of the speakers' slides and other related content.



Date: October 24, 2016

Microsoft City Center Plaza

555 110th Ave NE, Bellevue, WA

Registration Link: 

Ticket Price: Free



The Seattle Kafka Meetup Group is a group for Seattle/Eastside users interested in sharing knowledge about Apache Kafka, a high-throughput distributed pub/sub messaging system. Apache Kafka is an extremely successful message queue for stream processing systems, and its adoption for near-real-time data processing has been increasing tremendously. The goal of this group is to bring together people with similar interests to discuss features, best practices for operations and deployment, and case studies of applications built using Kafka.


During this event, William Ochandarena, Senior Director of Product Management at MapR, will deliver his talk "Building a Streaming System of Record with MapR Streams".



  • Apache Kafka at LinkedIn - "Multi-Tier, Multi-Tenant, Multi-Problem Kafka" - Todd Palino, LinkedIn.

 At LinkedIn, the Kafka infrastructure is run as a service: the Streaming team develops and deploys Kafka, but is not the producer or consumer of the data that flows through it. With multiple datacenters, and numerous applications sharing these clusters, we have developed an architecture with multiple pipelines and multiple tiers. Most days, this works out well, but it has led to many interesting problems. Over the years we have worked to develop a number of solutions, most of them open source, to make it possible for us to reliably handle over a trillion messages a day.


  • Apache Kafka on Azure HDInsight - Raghav Mohan, Program Manager for Azure Big Data.

At Microsoft, we have run Kafka workloads at scale via on-premises solutions. Recently, we have onboarded certain Kafka workloads to HDInsight, a fully managed cloud service powered by Azure.

We will detail the challenges of creating a managed cloud Kafka service and the obstacles to moving Kafka workloads from on-premises hosts to cloud services.

  • Automating partition management in Kafka - Som Sahu, Microsoft.

If you run a Kafka cluster in a production environment, you may already be familiar with the imbalance in partition distribution among Kafka brokers that develops over time as disks, machines, and new topics are added to the cluster. Som will talk about an automatic way to detect and fix the imbalance by distributing Kafka partitions evenly across Kafka brokers. This approach is proven in Microsoft's Kafka cluster and can bring down operational overhead significantly.

  • "Building a Streaming System of Record with MapR Streams" - William Ochandarena, MapR.

This presentation explores real-time event streaming with Kafka and MapR Streams. We’ll start with the basics, look at a few real-world use cases, and then dive deep into how these concepts can be extended to build a next-generation system of record. In doing so, we’ll talk about the relationship of streams to databases, which are historically thought of as the system of record, and about how this approach solves some really hard data management problems, such as synchronization of multi-model databases, data versioning, and data lineage auditing.



Todd Palino is a Staff Site Reliability Engineer at LinkedIn, tasked with keeping ZooKeeper, Kafka, and Samza deployments fed and watered. He is responsible for architecture, day-to-day operations, and tools development, including the creation of an advanced monitoring and notification system. Previously, Todd was a Systems Engineer at Verisign, developing service management automation for DNS, networking, and hardware management, as well as managing hardware and software standards across the company. In his spare time, Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and can be found sharing his experience with Apache Kafka at industry conferences and tech talks. He is also in the middle of co-authoring Kafka: The Definitive Guide, soon to be available from O’Reilly Media. When that’s not keeping him busy, you’ll find him out on the trails, training for his next marathon.


William Ochandarena is Senior Director of Product Management at MapR, where he is responsible for streams and cross-platform services like containers, clouds, security, and user experience. Before entering the big data space he spent several years at Cisco managing data center switching products. He has an engineering degree from Rensselaer Polytechnic Institute and an MBA from Santa Clara University.






MATERIALS

On streaming:

Free eBook - Streaming Architect
MapR ebook - Introduction to Apache Flink


On mapr-streams:







