Adaptive Data Cleansing with StreamSets and Cassandra

Video created by slimbaltagi on Jun 22, 2017

    This is the video recording of a talk given by Pat Patterson at the 2016 Cassandra Summit. "Cassandra is a perfect fit for consuming high volumes of time-series data directly from users, devices, and sensors. Sometimes, though, when we consume data from the real world, systematic and random errors creep in. In this session, we'll see how to use open source tools like RabbitMQ and StreamSets Data Collector with Cassandra features such as User Defined Aggregates to collect, cleanse and ingest variable quality data at scale. Discover how to combine the power of Cassandra with the flexibility of StreamSets to implement adaptive data cleansing."