Deep dive into deduplication using Apache Apex and RTS - December 19, 2016 - Remote attendees welcome!

Document created by Patrick Moran on Dec 13, 2016

SUMMARY

Date: Monday, December 19, 2016
Location: Webcast
Time: 6:00am - 7:00am PST
Registration Link: https://www.meetup.com/Big-Data-Hadoop-Ingest-Transform-Apex-Bay-Area-Online/events/235935556/

ABOUT THE EVENT

This webinar introduces the De-duplication functionality in Malhar. De-duplication is an important part of the processing pipeline in ETL workflows. We will present the use cases and walk through the implementation details, then look at how to configure the Dedup operator for different scenarios (time-based expiry as well as batch de-duplication). We will close with a demonstration of an application that uses the De-duplication (Dedup) operator.

 

Presenter: Bhupesh Chawda is a Software Engineer at DataTorrent and an Apache Apex committer.

 

LOGISTICAL DETAILS

Please RSVP for the event via meetup.com and follow the instructions in the event description.

 
