Long before I joined MapR, I became a little obsessed with big data use cases. At first, it was born out of intellectual curiosity. The Hadoop ecosystem had been anointed as "the next big thing", but early on it was hard to see how big data technology was going to cross the chasm to become an enterprise mainstream technology. Then, as I began working in big data as my day job, it became more of a practical matter, because use cases are the answer to the second most common question people have about big data:
1. What is big data?
2. Why should I care?
Use cases are the reason people care about big data. After documenting more than 150 separate use cases (so far) by MapR customers, I think I can shed a little light on the subject.
What has become clear after months of research on our customers is that there is a progression of increasing sophistication and creativity that follows an early success with Hadoop and the MapR Converged Data Platform. Those early successes are typically the result of experimenting with more basic use cases like a data warehouse offload, or a data lake, or perhaps analyzing data from application logs or system logs. These are akin to the "hello world" applications that novice programmers start with, in that they establish a basic understanding of ingesting, persisting and querying data in a Hadoop system.
From this basis, many customers go on to expand the current state of their BI and analytics capabilities with the introduction of the many new and exciting tools that are now being used in mature big data shops. Roughly speaking, in the graphic above, the early stages are about knowing: updating the state of knowing by using larger data sets.
As our customers get more sophisticated with their use cases, some interesting things happen:
- Scientific Rigor: They invest in data engineering and data science (either adding headcount or engaging MapR data scientists)
- Operational Changes: They begin to develop use cases that change internal processes
- Industry Focus: Their use cases get closer to their core differentiation (and are thus more focused on their industry)
In short, they are evolving their perspective on their data from reporting on past events (i.e., rearview mirror analytics) to changing their business tactics (predictive analytics) and achieving operational agility. In other words, they are changing the role of data from one of knowing to one of doing.