Apache Hadoop: Background and History
In the spring of 2006, the Apache Software Foundation released Hadoop, a distributed computing framework for managing and analyzing very large amounts of data in a scalable and reliable way. The open-source software was designed to run on clusters of servers ranging from a few nodes to thousands of nodes, allowing users to pool computing power to enable the processing of workloads far more cost-effectively than had been possible earlier.
It’s been more than 10 years since the project began, and in that time an active open-source community has helped mature Apache Hadoop significantly. The community has contributed many enhancements including high availability, governance and analytical processing improvements.