[MapR Talk] Document Classification on Apache Spark

Document created by aalvarez on Nov 30, 2015. Last modified by aalvarez on Jul 28, 2016.
Version 3


In an effort to help grow organic communities interested in new technologies, MapR speakers around the world provide technical talks on numerous topics. Please browse our MapR Talks Directory to learn how to search for and request a talk in seconds.



There are copious tutorials, demos and walk-throughs that illustrate how to apply machine learning algorithms to perfectly manicured data sets. But these don't reflect real-life situations for those who have big opportunities to find big value. What happens when your dataset is massive and unformatted, such as the internet search history for…everyone? Maybe you have built some very good models. Now what?

This session takes place at the busy intersection of Big Data, Machine Learning and Business Problems. Through exposure to Apache Spark and its quickly maturing set of machine learning tools, you'll see how to 1) generate powerful modeling features, 2) apply the appropriate ML algorithms and 3) turn the results into business value. You'll also leave with the code to reproduce the results and start creating your own value.
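The talk's own code is not reproduced on this page. As a rough illustration of steps 1 and 2 only, here is a minimal plain-Python sketch of a bag-of-words document classifier using multinomial naive Bayes. It runs on a single machine, not on Spark. The documents, labels, and function names (`tokenize`, `train_naive_bayes`, `predict`) are hypothetical and chosen for illustration. In Spark, the `spark.ml` classes `HashingTF`, `IDF`, and `NaiveBayes` cover comparable ground, but nothing here should be read as the approach the speaker actually uses.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text cleaning."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count terms per class to fit a multinomial naive Bayes model."""
    class_docs = Counter(labels)          # number of documents per class
    term_counts = defaultdict(Counter)    # term frequencies per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": class_docs,
        "term_counts": term_counts,
        "vocab": vocab,
        "n_docs": len(docs),
    }


def predict(model, text):
    """Return the class with the highest log-posterior for the given text."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class_docs in model["class_docs"].items():
        counts = model["term_counts"][label]
        total_terms = sum(counts.values())
        # Log prior for the class.
        score = math.log(n_class_docs / model["n_docs"])
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore terms never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((counts[token] + 1) / (total_terms + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
docs = [
    "cheap flights to paris",
    "hotel deals in rome",
    "python spark tutorial",
    "machine learning with spark",
]
labels = ["travel", "travel", "tech", "tech"]

model = train_naive_bayes(docs, labels)
print(predict(model, "spark machine tutorial"))
print(predict(model, "flights to rome"))
```

On a massive, unformatted dataset of the kind the abstract describes, the same two stages (turning raw text into term features, then fitting a classifier) would be distributed across a cluster instead of held in Python dictionaries.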


Related Materials to This Talk

Document Classification with Apache Spark - Free Code Fridays


Location Availability & Request Link

North America. Please refer to the MapR Talks Directory for specific countries.

You can request this talk here: Speaker Request


Related Resources

Find all Meetups and Events resources.

Find all MapR Talks: mapr talk

Learn more about MapR Talks and how to book a speaker: Meetup and Event Organizers Resources