Podium & MapR Executive Interview: Part I
The democratization of information continues to put great pressure on data management organizations and big data delivery teams to support a range of unstructured or multi-structured information. Market forces like this mean prosperity for some and failure for others, as evidenced by Forrester’s prediction that "insights-driven businesses will steal $1.2 billion annually in revenue from their competition by 2020."
These forces are why Podium exists: a company that has built a stand-out big data management platform that’s not just better, but different, one that accelerates data delivery teams’ ability to serve up trusted, business-ready data with an Amazon-like shop-for-data experience.
Recently, Podium teamed up with MapR, a partnership that is one of our best examples of a perfect match. Why? Both companies share a similar mission backed by innovation: to deliver an exemplary end-to-end platform wherein businesses can treat enterprise data as an asset, and volumes of data are ingested once and accessible as a single source on premises, in the cloud, or in mixed environments.
I had the pleasure of sitting down with both our CEO, Paul Barth, and MapR’s V.P. of Technology, Crystal Valentine. This is Part One of our exchange.
ROI & Business Impact -- What are some examples of organizations that are succeeding in transitioning from legacy to modern big data enterprise platforms and architectures? How are they simplifying business processes, reducing cost, and increasing agility along this journey?
Valentine: The major theme that we're seeing with our customers who are successful is an emphasis on business agility. This notion of agility has come up in a number of different threads. And typically, when people talk about agility, they're talking about technical agility, and being able to deploy applications, or to iterate upon and improve applications, in a continuous fashion. If there is a new opportunity that a business stakeholder is able to identify, and the technologists, architects, and developers are able to deliver on building a solution to that problem quickly, that leads to all sorts of efficiencies: giving the business an opportunity to gain a first-mover advantage, or being able to intercede to improve efficiency where a real-time business process is potentially suboptimal.
Barth: I think the themes of agility and speed are really interesting because I don't think they're ends in themselves. They're not business value, but they enable so much more. There's a measurement we've used for years called “Time To Answer”. TTA is when a business person or business process needs information or analysis, and we ask, how long does it take for that information to be delivered? Is it real time? Is it self-service, and already provisioned? Is it a programming project? And the interesting thing is that when you change the delivery time, not by a few tens of percent, but by orders of magnitude, you actually invent or enable new business processes. For example, one unexpected ROI for a customer of ours is their ability to incorporate very detailed global analytics into their M&A process, which has to come up with a price and an offer for a company, usually within two weeks. And this kind of capability was not available before big data platforms that could integrate all the data. Here, Podium reduced their analytics cycle time from 90 days down to 2 days, with advanced data and analytics now informing users of dozens of business decisions every month.
Complexity vs. Change -- Most organizations with major digital initiatives cite the overwhelming complexity of existing systems among the biggest barriers to adoption of modern big data solutions. How can businesses get beyond these barriers and move forward?
Valentine: I think that for the last decade the prevailing wisdom on this topic has been that in order to modernize, one must rip out existing legacy applications and replace them with next-generation technology that will enable the business to be more agile going forward. And the reason that was the main way of thinking about modernization was that the emerging technology we saw in the mid-aughts, big data technology based on open source software, often precluded integration with existing legacy systems.
So, in effect what you would have are two distinct organizations within IT. You'd have the folks who are maintaining the mission-critical legacy systems, and then the folks who are tasked with setting up a "big data platform" in order to try and get incremental value from advanced analytics. However, the goal of having those two systems fully integrated was technically difficult to achieve. Here, we find our customers recognizing that the better alternative is to leverage a platform that can accommodate not just emerging net new applications, but one that can also help to modernize existing legacy applications. So, we discuss with our customers that it's more of a journey that can be incremental in nature, from where they sit today, which is typically dominated by legacy technologies, to the vision they have of being on a more modern platform with agility. It doesn't need to be a lift-and-shift type process, but there can be that sort of seamless bridge from the legacy to the next gen, if you leverage a platform that can accommodate that.
Barth: I like that description, Crystal. That's one of the things I value about the MapR-Podium partnership. Podium, as you know, is very focused on building a comprehensive data catalog of all data assets, no matter where they live. And that includes legacy mainframe files and VSAM and ancient databases, and XML flat files that are undocumented. And building that catalog with an eye towards modernization, and having a platform like MapR, means that we can modernize the data without having to migrate and rebuild the application. And yet, at the same time, one of the unique things Podium has done is to embrace legacy data as a first-class citizen. For example, we can take a COBOL copybook in a mainframe file, and expose that into an analytics environment through our catalog in a 100% automated way.
We're doing the same with XML, and with semi-structured data. And the intention is to have a robust, automatically created and managed data catalog. And what we want it to be on is a modern data platform, like MapR, so that we can move from transaction logs into streams. We can move from model building into real-time analytics, and we can apply data governance along the way. As long as you have that kind of ability to see all your data assets in a common place, and pick and choose in a natural way how to modernize them, you can address the transformation versus complexity challenge.
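To make the COBOL copybook example above more concrete, here is a minimal sketch of how a copybook can drive automated schema discovery for a fixed-width record. It is an illustration only, not Podium's implementation: the copybook, the field names, and the helper functions `parse_copybook` and `decode_record` are all hypothetical, and the sketch assumes plain-text records with simple `PIC X(n)` and `PIC 9(n)` fields (no COMP-3, OCCURS, REDEFINES, or EBCDIC handling).

```python
import re

# Hypothetical, simplified copybook: only elementary fields declared as
# PIC X(n) or PIC 9(n). Real copybooks are considerably richer.
COPYBOOK = """
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID      PIC 9(6).
           05  CUSTOMER-NAME    PIC X(20).
           05  BALANCE-CENTS    PIC 9(9).
"""

# Matches lines such as "05  CUSTOMER-ID  PIC 9(6)."
FIELD_RE = re.compile(r"^\s*\d{2}\s+([A-Z0-9-]+)\s+PIC\s+([X9])\((\d+)\)\.\s*$")


def parse_copybook(text):
    """Return a list of (name, kind, offset, length) tuples."""
    fields = []
    offset = 0
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue  # skip group items (e.g. the 01 level) and blank lines
        name, kind, length = match.group(1), match.group(2), int(match.group(3))
        fields.append((name, kind, offset, length))
        offset += length
    return fields


def decode_record(record, fields):
    """Slice one fixed-width text record into a dict keyed by field name."""
    row = {}
    for name, kind, offset, length in fields:
        raw = record[offset:offset + length]
        # Numeric (9) fields become ints; alphanumeric (X) fields lose padding.
        row[name] = int(raw) if kind == "9" else raw.rstrip()
    return row


layout = parse_copybook(COPYBOOK)
record = "000042" + "ACME CORP".ljust(20) + "000012550"
row = decode_record(record, layout)
print(row)
```

The point of the sketch is that the copybook itself is the metadata: once the layout has been derived from it, every record in the file can be decoded and registered in a catalog without hand-written mapping code.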
Old Meets New -- How should companies think about integrating their existing landscape of legacy data management platforms and applications with modern big data notions, like the data fabric, or an enterprise-scale data lake?
Valentine: This is a great continuation of the discussion we were just having, that fundamentally, the notion of a data fabric ought to include legacy data. It should not be a separate entity that is distinct from your existing mission-critical legacy systems. It needs to be all-inclusive, able to represent and leverage both net new data sources, including NoSQL data, event-based data, real-time data streams and more, as well as the legacy data, which oftentimes is flat files or XML, or maybe relational databases. The fabric should be an all-encompassing abstraction, with an underlying technology platform that's able to accommodate all of those different data types in a single global namespace. Key to being able to really utilize and leverage all those myriad datasets together is a bird's eye view of all of the data under management. Here is where the metadata catalog that Podium provides is really essential to making sense of it all, leveraging a unified platform in concert with legacy platforms.
Barth: I'd like to add one other class of legacy platform these days, which is the data warehouse. I was just at a customer who was asking about the road map for the data warehouse, and whether we see the data fabric and the data lake overtaking it. And the first thing I say is, "Not right now." There are database products out there with 20 to 40 years of maturity that provide incredible robustness in scale and optimization for a certain kind of data processing, namely SQL queries against a relational model. And there's no reason to rip that out or change that just because we have a new technology. That being said, we are seeing this concept of a data fabric, which includes a lake where data is transformed from raw to ready. We're seeing real-time. We're seeing unstructured and semi-structured data. And we're seeing a much more dynamic environment where people want to go to a catalog and start combining data, finding and shaping new data assets, new data products, out of the data that's already in the marketplace.
What I think is interesting is that when you start looking at this future state, it is one where there's going to be a fit-for-purpose rightsizing of technologies, traditional technologies along with the new technologies, but one that's being managed in a dynamic way. And that's really what a lot of our capabilities are about, like Podium’s Data Conductor, which allows the catalog to register and manage data that's not in the Hadoop ecosystem.