Big data is valuable. Our benchmark research shows that using big data analytics results in better communication, better alignment of the business, a competitive advantage, better responsiveness and decreased time to market. But big data comes in many forms, from many sources, and is not always easy to work with. These complexities have led many organizations to become too reliant on IT for big data analytics. Nearly two-thirds (61%) of the organizations participating in our benchmark research told us they either rely on IT or require the assistance of IT to create big data analytics. Only one-quarter (24%) make direct, self-service access by line-of-business employees the primary way they provide big data analytics. Those organizations that do provide self-service access report the highest rates of satisfaction: 72 percent, compared with 54 percent when IT resources are required.
Too much of the industry’s big data focus has been on managing and storing the data. Organizations have reduced their reliance on relational databases for big data storage, adopting alternatives such as Apache Hadoop, NoSQL databases and object stores such as Amazon S3. These technologies enable organizations to store larger volumes of data and an array of data types such as log files, JSON documents, text and multimedia files. Using these technologies to store a variety of detailed datasets is generally referred to as creating a data lake. Our research shows that data lakes contain not only traditional sources such as transaction data but also external data, documents, event data, machine data, social media, weather data and more. Creating data lakes with more detailed data, more history and more types of data has the potential to enable analyses that were previously impossible or impractical. Entire industries, such as ride sharing and personalized music services, have been enabled by big data.