Has Hadoop Failed? That’s the Wrong Question

Recently, some of my colleagues in the big data space have been discussing the success or failure of Hadoop as an enterprise-ready platform. Nothing makes technologists like us happier than debating a legitimately debatable point like whether or not a particular technology has lived up to its promise.

I remember when Bluetooth first hit the scene. It was predicted to be a magical universal connection protocol that would eliminate the need for, say, printer cables. We would just walk into a room and our Bluetooth-enabled laptops would automagically connect with a Bluetooth-enabled printer.

It didn’t quite work out that way. In fact, Bluetooth sort of wandered in the wilderness for a while, a solution in search of a problem.

But Bluetooth eventually found its mojo. Smartphones, cars, toasters, the entire Internet of Things: Bluetooth is a big component of the glue holding them all together.

These types of stories have been retold countless times since the advent of computers. Really, since the dawn of human invention.

So that brings us back to Hadoop. The early days for Hadoop were heady ones. It even made the New York Times. This was a technology that was going to change how we collect, store and analyze the massive new streams of data that were being created each year.

And so it did for a few large tech firms with the expertise, manpower and time to make Hadoop work in their environments for their specific use cases.

The big question was, however, could Hadoop be adopted more broadly by large enterprises?

And that’s where Hadoop seemingly ran into some practical limits, as discussed recently in a Datanami interview. That entire discussion centers on using Hadoop as a cheaper stand-in for traditional products like relational databases. And of course, Hadoop is terrible as a relational database. That's why commercial products remain the database of choice when security, reliability, performance, and integration are critical. It's not hard to outperform Hive, or to be more reliable and robust than the open source technologies that promised to replace data warehousing.

Hadoop failed only in the sense that the inflated expectations set for it could never be met, especially when measured against mature commercial offerings.

At Podium Data, we took a different path: we used Hadoop’s low cost and scalable performance to create an entirely new approach to data management and self-service data in the enterprise.

Trying to build a data marketplace by integrating Hadoop tools is like trying to build an ERP system from a Linux box and an Eclipse installation. Sure, the infrastructure is there, but too much development and integration is required to succeed.

Our customers, however, are all successfully running and scaling on Hadoop infrastructure, because they have Podium managing and automating the entire process. These customers don't know or care that Podium uses Hadoop, S3, or Spark; they care that their data is secure, trusted, and easily accessible at low cost.

Hadoop didn’t fail. Rather, it provided a critical stepping stone: inexpensive, scalable, parallel infrastructure on which we could reengineer the entire data management process. Our customers measure their success on business impact, not technical scorecards, and they have increased analytics productivity 30x, cut data costs 40%, and accelerated time-to-answer 25x.

That’s the brass ring, and Hadoop helped us reach it.