Executive Summary
Ventana Research has conducted quantitative research on the performance of analytics and data organizations for two decades. That research has been conducted against the backdrop of the evolution of data platforms used to store, process and analyze data, driven by innovation at the infrastructure, data processing and interface layers. Data lakes began to emerge 10 years ago in response to demand for platforms that could economically store and process large volumes of raw data, drawn from multiple operational applications in a variety of formats, and make that data available for multiple business departments to query for a variety of analytic workloads.
Data lakes are fulfilling that promise. More than one-half of organizations use their data lake to store data from three or more operational data sources, and more than one-half store data using two or more file formats. More than two-thirds are running two or more analytics workloads on their data lakes, and almost nine in 10 expect multiple business departments and functions to benefit from their data lake environments. Benefits enjoyed by those already in production with data lakes include improving communication and knowledge sharing, gaining competitive advantage, and addressing digital transformation priorities.