|Date:|Monday, December 19, 2016|
|Time:|6:00am - 7:00am PST|
ABOUT THE EVENT
This webinar introduces the de-duplication (Dedup) functionality in Apache Apex Malhar. De-duplication is an important stage of the processing pipeline in ETL workflows. We will introduce the use cases and walk through the implementation details. Next, we will look at how to configure the Dedup operator for various use cases, including time-based expiry and batch de-duplication. We will close with a demonstration of an application that uses the Dedup operator.
Presenter: Bhupesh Chawda is a Software Engineer at DataTorrent and a committer for Apache Apex.
Please RSVP for the event via meetup.com and follow the instructions in the event description.
- For deeper engagement with Apache Apex, download it and view past meetup webinars, slides, and docs.
- To reduce time to market, look at operable app-templates that you can quickly import and launch.
- Examples: HDFS-Sync, Kafka-HDFS, HDFS-Line-Copy, S3-HDFS and HDFS-Kafka.
- Free DataTorrent Enterprise Edition for qualifying startups. Check it out!