Hadoop Turns 10

by Angela Guess

Andrew Brust reports on ZDNet, “It’s hard to believe, but it’s true. The Apache Hadoop project, the open source implementation of Google’s File System (GFS) and MapReduce execution engine, turned 10 this week. The technology, originally part of Apache Nutch, an even older open source project for Web crawling, was separated out into its own project in 2006, when a team at Yahoo was dispatched to accelerate its development. Doug Cutting, founder of both projects (as well as Apache Lucene), formerly of Yahoo, and presently Chief Architect at Cloudera, wrote a blog post commemorating the birthday of the project, named after his son’s stuffed elephant toy.”

Brust goes on, “In his post, Cutting correctly points out that ‘Traditional enterprise RDBMS software now has competition: open source, big data software.’ The database industry had been in real stasis for well over a decade. Hadoop and NoSQL changed that, and got the incumbent vendors off their duffs and back in the business of refreshing their products with major new features. Microsoft SQL Server now supports columnstore indexes in order to handle analytic queries on large volumes of data and its upcoming 2016 version adds PolyBase functionality for integrated query of data in Hadoop. Meanwhile, Oracle and IBM have added their own Hadoop bridges, along with better handling of semi-structured data.”
