Scroll down to the "MATERIALS" section of this page to see speaker slides and other related content.
Date: October 24, 2016
555 110th Ave NE, Bellevue, WA
ABOUT THE EVENT
Seattle Kafka Meetup Group is a group for Seattle/Eastside users interested in sharing knowledge about Apache Kafka, a high-throughput distributed pub/sub messaging system. Apache Kafka is an extremely successful message queue for stream processing systems, and its adoption for near-real-time data processing has been increasing tremendously. The goal of this group is to bring together people of similar interests to discuss features, best practices for operations and deployment, and case studies of applications built using Kafka.
During this event, William Ochandarena, Senior Director of Product Management at MapR, will deliver his talk "Building a Streaming System of Record with MapR Streams".
- Apache Kafka at LinkedIn - "Multi-Tier, Multi-Tenant, Multi-Problem Kafka" - Todd Palino - LinkedIn
At LinkedIn, the Kafka infrastructure is run as a service: the Streaming team develops and deploys Kafka, but is not the producer or consumer of the data that flows through it. With multiple datacenters, and numerous applications sharing these clusters, we have developed an architecture with multiple pipelines and multiple tiers. Most days, this works out well, but it has led to many interesting problems. Over the years we have worked to develop a number of solutions, most of them open source, to make it possible for us to reliably handle over a trillion messages a day.
- Apache Kafka on Azure HDInsight - Raghav Mohan, Program Manager for Azure Big Data.
At Microsoft, we have run Kafka workloads at scale via on-premises solutions. Recently, we have onboarded certain Kafka workloads to HDInsight, a fully managed cloud service powered by Azure.
We will detail the challenges faced in creating a managed cloud Kafka service, and the obstacles faced in moving Kafka workloads from on-prem hosts to cloud services.
- Automating partition management in Kafka - Som Sahu, Microsoft.
If you run a Kafka cluster in a production environment, you may already be familiar with the imbalance in partition distribution among Kafka brokers that develops over time as disks, machines, and new topics are added to the cluster. Som will talk about an automatic way to detect and fix the imbalance by distributing Kafka partitions evenly across Kafka brokers. This approach has been proven in Microsoft's Kafka cluster and could bring down operational overhead significantly.
- MapR Streams and Kafka - William Ochandarena - MapR
This presentation explores real-time event streaming with Kafka and MapR Streams. We’ll start with the basics, look at a few real-world use cases, and then deep-dive on how these concepts can be extended to build a next-generation system of record. In doing so, we’ll talk about the relationship of streams to databases, which are historically thought of as the system of record, and talk about how some really hard data management problems are solved using this approach, such as synchronization of multi-model databases, data versioning, and data lineage auditing.
Todd Palino is a Staff Site Reliability Engineer at LinkedIn, tasked with keeping Zookeeper, Kafka, and Samza deployments fed and watered. He is responsible for architecture, day-to-day operations, and tools development, including the creation of an advanced monitoring and notification system. Previously, Todd was a Systems Engineer at Verisign, developing service management automation for DNS, networking, and hardware management, as well as managing hardware and software standards across the company. In his spare time, Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and can be found sharing his experience on Apache Kafka at industry conferences and tech talks. He is also in the middle of co-authoring Kafka: The Definitive Guide, soon to be available from O’Reilly Media. When that’s not keeping him busy, you’ll find him out on the trails, training for his next marathon.
William Ochandarena is Senior Director of Product Management at MapR, where he is responsible for streams and cross-platform services like containers, clouds, security, and user experience. Before entering the big data space he spent several years at Cisco managing data center switching products. He has an engineering degree from Rensselaer Polytechnic Institute and an MBA from Santa Clara University.
MATERIALS - SPEAKER SLIDES
- Will Ochandarena
- Slides: Seattle Streams Meetup
MATERIALS - OTHER RELATED CONTENT
- Real Time Credit Card Fraud Detection with Apache Spark and Event Streaming | MapR
- Real-Time Streaming Data Pipelines with Apache APIs: Kafka, Spark Streaming, and HBase | MapR
- High Speed Kafka API Publish Subscribe Streaming Architecture: How it works at the message level | MapR
- Getting Started with Sample Programs for Apache Kafka 0.9 | MapR
- What Will You Do in 2016? Apache Spark, Kafka, Drill and More | MapR
- Streaming Architecture Ebook | MapR
On mapr-streams:
- Getting Started with MapR Streams | MapR
- Kudu, Kafka and MapR - A Perspective on Trends Toward Data Convergence | MapR
- Streaming with MapR | MapR Whiteboard Walkthrough
- Latest Videos | MapR Whiteboard Walkthrough: https://www.mapr.com/resources/videos/mapr-streams-under-hood
- Free On-Demand Training - MapR Streams Essentials
- MapR Streams DataSheet
- MapR Streams vs. Apache Kafka – Whiteboard Walkthrough | MapR
- Apache Spark vs. Apache Flink - Whiteboard Walkthrough - YouTube
- MapR Sandbox VM download page: MapR Sandbox for Hadoop | MapR
- How To Set up Racing/Streams Demo on a Stand-Alone Mac