Streaming architecture defines how large volumes of data make their way through an organization. Data is created on a user’s smartphone or by a sensor inside a conveyor belt at a factory. That data is sent to a set of backend services that aggregate and organize it, making it available to business analysts, application developers, and machine learning algorithms.
The velocity at which data is created has led to widespread use of the “stream” abstraction: a never-ending, append-only sequence of data. To deal with this volume, streams need to be buffered, batched, cached, mapreduced, machine learned, and munged until they are in a state where they can provide value to the end user.
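To make the abstraction concrete, here is a minimal Python sketch of one of those steps, batching. It is an illustration only, not something taken from the episode: the `sensor_events` source, the `temp_c` field, and the helper names are all hypothetical, and a real pipeline would read from a streaming system rather than an in-process generator.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(stream: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group a (possibly unbounded) stream into lists of at most `size` events."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))  # pull up to `size` events, in arrival order
        if not batch:                   # source exhausted; a live stream never gets here
            return
        yield batch


def sensor_events(n: int) -> Iterator[dict]:
    """Hypothetical source: yields n made-up sensor readings in arrival order."""
    for i in range(n):
        yield {"seq": i, "temp_c": 20.0 + i * 0.5}


def average_temps(n_events: int, batch_size: int) -> List[float]:
    """Reduce each batch to a single value a downstream consumer could use."""
    return [
        sum(event["temp_c"] for event in batch) / len(batch)
        for batch in batched(sensor_events(n_events), batch_size)
    ]


if __name__ == "__main__":
    print(average_temps(5, 2))
```

Because `batched` only ever reads forward and holds one batch at a time, it works the same way whether the source has five events or never ends.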
There are numerous ways that data can travel this path, and in today’s episode we discuss the streaming systems, data lakes, and data warehouses that can be used to build an architecture that makes use of streaming data. Ted Dunning is a chief application architect at MapR, and he joins the show to discuss the patterns that engineering teams are using to build modern streaming architectures.