Real-Time Hadoop

File uploaded by slimbaltagi on Feb 3, 2018. Last modified by slimbaltagi on Feb 3, 2018.

Slides from a talk given by Ted Dunning at the Strata San Jose conference on March 30th, 2016.


"I migrated my topic a bit, but here is the original abstract:

Application developers and architects today are interested in making their applications as real-time as possible. To make an application respond to events as they happen, developers need a reliable way to move data as it is generated across different systems, one event at a time. In other words, these applications need messaging.

Messaging solutions have existed for a long time. However, when compared to legacy systems, newer solutions like Apache Kafka offer higher performance, more scalability, and better integration with the Hadoop ecosystem. Kafka and similar systems are based on drastically different assumptions than legacy systems and have vastly different architectures. But do these benefits outweigh any tradeoffs in functionality? Ted Dunning dives into the architectural details and tradeoffs of both legacy and new messaging solutions to find the ideal messaging system for Hadoop.

Topics include:

* Queues versus logs

* Security issues like authentication, authorization, and encryption

* Scalability and performance

* Handling applications that span multiple data centers

* Multitenancy considerations

* APIs, integration points, and more"
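
To make the first topic, queues versus logs, a little more concrete, here is a minimal sketch in Python. It is not taken from the slides and does not use Kafka or any real messaging API; the class names `ToyQueue` and `ToyLog` and their methods are hypothetical, chosen only to illustrate the commonly described difference: a queue deletes a message once it is consumed, while a log retains messages and lets each consumer track its own read position.

```python
from collections import deque


class ToyQueue:
    """Hypothetical queue: a message is removed when it is consumed."""

    def __init__(self):
        self._items = deque()

    def publish(self, message):
        self._items.append(message)

    def consume(self):
        # Return the oldest message and delete it; None when empty.
        return self._items.popleft() if self._items else None


class ToyLog:
    """Hypothetical log: messages are retained; each consumer has an offset."""

    def __init__(self):
        self._entries = []
        self._offsets = {}  # consumer name -> index of next entry to read

    def publish(self, message):
        self._entries.append(message)

    def consume(self, consumer):
        # Read the next entry for this consumer without deleting it.
        offset = self._offsets.get(consumer, 0)
        if offset >= len(self._entries):
            return None
        self._offsets[consumer] = offset + 1
        return self._entries[offset]

    def rewind(self, consumer, offset=0):
        # Replay works because reading never removes entries.
        self._offsets[consumer] = offset
```

In this sketch two consumers of a `ToyLog` each see every message and can rewind to replay them, whereas two consumers of a `ToyQueue` would split the messages between them. Real systems add persistence, partitioning, and retention policies that this toy version leaves out.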