"Big data presents both enormous challenges and incredible opportunities for companies in today’s competitive environment. To deal with the rapid growth of global data, companies have turned to Hadoop to help them with performing real-time search, obtaining fast and efficient analytics, and predicting behaviors and trends. In this session, we’ll demonstrate how we successfully leveraged Hadoop and its ecosystem components to build a converged data infrastructure to meet these needs.
We will discuss:
- Tips for building an effective data pipeline
- How we created a benchmarking product using Apache Spark and other components
- How to apply data science techniques in memory for real-time web applications using Spark
- The challenges of running Spark in production
- How we use Hadoop, Kafka, and Solr to provide real-time search"