Everyone is talking about data lakes. A data lake is meant to serve as a central storage facility for analytics. But, Jim Scott asks, why maintain a separate data lake when all (or most) of your infrastructure can run directly on top of your storage, minimizing or eliminating the need for data movement, separate processes and clusters, and ETL?
Note: This will be a combined meetup with the Atlanta Apache Spark User Group.
Have a burning question? Ask Craig Warman.