HP HAVEn CTO Mundada on new ways for businesses to gain transformation from big data

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

 

Big data capabilities and advanced business analytics have now become essential to nearly any business development activity.

 

The benefits that enterprises can get if they can get their hands around big data analytics and apply it to business challenges are quickly being documented -- and they come as big new profits and major market advantages. Industries around the world are rapidly seeking transformational projects using big data to gain competitive advantage.

 

As part of the next edition of the HP Big Data Podcast Series, BriefingsDirect sat down with two HP executives to learn how these advanced analytics seekers can best accomplish their goals. The insights gleaned include how companies worldwide are best capturing myriad knowledge, gaining ever deeper analysis, and rapidly and securely making those insights available to more people on their own terms.

 

So join this executive-level discussion highlighting how the latest version of HP HAVEn produces new business analytics value and strategic return with Girish Mundada, Chief Technology Officer for HP HAVEn, and Dan Wood, Worldwide Solution Marketing Lead for Big Data at HP Software. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

 

Here are some excerpts:

Gardner: We’re in a fascinating time because analytics and big data are now top of mind. What was once relegated to a fairly small group of data scientists and analysts as reporting tools -- and I am thinking about business intelligence (BI) -- has really now become a comprehensive capability that’s proving essential to nearly any business strategy.

What’s behind this eagerness to gain big-data capabilities and exploit analytics so broadly?

Wood: We're starting to see some very clear quantification of the value and the benefits of big data. It’s fair to say that big data is probably the hottest topic in the industry.

There’s a lot of talk across all forms of media about big data right now, but what’s happened is that credible publications like the "Harvard Business Review," for example, have started to put solid numbers around the benefits that enterprises can get if they can get their hands around big-data analytics and apply it to business challenges.

For example, Harvard Business Review is saying that, on average, data-driven organizations will be five percent more productive and six percent more profitable than their competitors.

Worth chasing after

Think about that. A six-percent distinct profitability increase would double the stock price for a lot of organizations. So there really is a prize worth chasing after.

What we’re seeing, Dana, is much more widespread interest across the organization and not just within IT. We’re seeing line-of-business leaders understanding and, in many organizations, actually starting to benefit from big-data analytics.

They’re able to analyze the call logs in a call center, better understand the clickstreams on a website, and better understand how customers are using products. All of these are ways of analyzing large amounts of data and directly tying it to specific line-of-business problems.

That’s where we are right now. Industries around the world are going through transformational projects using big data to gain competitive advantage.

Gardner: It’s interesting too, Dan, that they’re not just taking these as individual data sets and handling them individually, but increasingly businesses are combining them, and finding new relationships, and doing things that they really couldn't have done before.

Wood: Absolutely. It’s the idea of a 360-degree view of their internal operations, or of their external customer trends and needs -- and it comes from combining data sets.

For example, they’re combining social media analytics on customers with the call logs into the call center, with internal systems of record around the customer relationship management (CRM) and ongoing customer transactions. It’s by combining all those insights that the real big-data opportunity reveals itself.

Gardner: And the sources for those insights and data, of course, are across almost any type of information asset. It’s not just structured data or data that your applications are standardized around -- it’s getting all the data all of the time.

Wood: That’s right. In some ways, this industry label of big data is perhaps not the most helpful, because it’s not just the volume of data that is the challenge and the opportunity for the business. It’s the variety of sources, as you’ve alluded to, and also the velocity at which that data is moving.

The business needs to get hold of these multiple sources of data and immediately be able to apply the analytics, get the insights, and make the business decisions. This is why the vast majority of the data that’s available to an enterprise still remains dark.

Unused and unexploited

It’s unused and unexploited. Organizations, with their traditional analytics systems, are struggling to get the meaning and insights from all these data types that we mentioned. These include unstructured information, such as social media sentiment, voice recordings, potentially even video recordings, and the structured and semi-structured things like log files and data center data. For many organizations, getting the information quickly enough out of their CRM and enterprise resource planning (ERP) systems is a challenge as well.

Gardner: So we see that there’s a great desire to do this, and there are great returns on being able to do this well. We talked about some of the general challenges. What specifically is holding people up?

Is this an issue of cost, complexity, or skills? Why aren’t companies able to move beyond this small fraction of the available information to which they could be applying such important insight and analytics?

Wood: It’s a complexity and a skills challenge, as you mentioned. The systems they have today, Dana, typically aren’t set up to be able to analyze these vast amounts of unstructured information, and also to be able to analyze the structured data at the speed needed by the organization.

Think about the need to analyze immediately a clickstream from an online shopping application or a pay-to-use application that an organization has. That is, a rapid-scale analysis of a large amount of structured data. Typically, the analytic systems that organizations have had aren’t able to cope with that or with the unstructured human information.

This is why HP has created the HAVEn Big Data Platform, and Girish will talk in more detail about this, and how it brings together the analytics engine needed to address these issues.

Just as importantly, there’s the ecosystem around HAVEn, which includes HP experts and services and services from partners, to bring together the skills needed to turn this data collection into useful information.

And there are skills around data scientists, as well -- skills around understanding the right questions the line of business needs to be asking, and understanding actually how to visualize and represent the data.

Gardner: What were the guiding principles that you were thinking of when HAVEn was being put together?

Talking to customers

Mundada: HAVEn came together not by creating it in a dark room somewhere in the back office. It came together by talking to customers. On a regular basis, I meet with some of HP's largest customers worldwide, getting input from them. And they're telling us what their current problems are.

Let me see if I can describe the landscape in a typical organization, and we can go from there. You'll see why we created HAVEn.

Let’s visualize four different waves of data. Back in the early '60s, '70s, and even part of the '80s, mainframes were the primary way to process data, and we used them for operationalizing certain parts of data processing, where data was extremely high-value. If you look at the cost of the systems, it was phenomenal.

Then came the next wave in the '80s, where we went into what I call client-server computing, and we already know several companies that were created in this space.

I’ve lived in Silicon Valley for almost 30 years now, and a whole bunch of new companies were born in this space. I worked for a company, Postgres, which became Illustra, then became Informix, and became IBM. If you look at that entire wave of OLTP technologies, we created data-processing technologies designed to solve basic business problems.

Application software was created: CRM, supplier relationship management (SRM), you name it. Many companies that did consulting around that were created, too. That was that second wave after the mainframe.

Then came the third wave, where we took this data from all these transactional systems, brought them together to find out some basic analysis, which we now call business analytics, to find out "who is my most profitable customer, what are they buying, why are they buying," and things of that nature.

We created companies for that wave, too, and many technologies. Exadata, Teradata, Netezza, and a whole bunch of companies and applications were born in that space. That wave lasted for quite a while.

What we're seeing now is that from 2003 onward, something very fundamental has happened. At least, that’s the way I've been seeing this. If you look at the three Vs that Dan has described -- volume, velocity, and variety -- we’re talking about volumes that are growing exponentially. In the past, they were growing linearly. That creates a very different kind of requirement.

More importantly, if you look at the variety that Dan mentioned, that’s really the key driver in my mind. People are now routinely bringing in machine data, human data, and your traditional structured warehouses -- all of them together.

If you visualize a bar graph, you would see that 10 percent of the data that we now can monetize is coming from traditional sources, whereas 90 percent of the data that we need to monetize is now sitting in machine data and human data.

High velocity analytics

What we're trying to do with HAVEn is create a combined platform, where you can combine these three different data types and do very high-velocity analytics.

As a simple example, if you look at Apache Web Server logs, that data has historically been used by the security people to see if anybody is breaking in. It has also been used by operations people to see whether machines are overloaded.

More importantly, the digital marketing guys now want to look at that data to see who's coming to their website, what they’re buying, what they’re not buying, why they’re buying, and which geographies they’re coming from. Then, they want to combine all these data sets with their existing structured data to make sense out of it.
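To make the web-log example concrete, here is a minimal Python sketch of combining machine data with structured data. It is an illustration only, not part of HAVEn or any HP product: it assumes logs in the Apache Common Log Format, and the function names, sample log lines, and the `SAMPLE_CUSTOMERS` lookup (standing in for a CRM table) are all hypothetical.

```python
import re
from collections import Counter

# Apache Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_views_by_region(log_lines, customers):
    """Count successful (2xx) requests per customer region.

    `customers` maps an authenticated user name to a region. It is a
    hypothetical stand-in for a structured system of record such as CRM.
    """
    counts = Counter()
    for line in log_lines:
        record = parse_line(line)
        if record is None or not record["status"].startswith("2"):
            continue  # skip malformed lines and failed requests
        region = customers.get(record["user"], "unknown")
        counts[region] += 1
    return counts


# Made-up sample data, for illustration only.
SAMPLE_LOGS = [
    '203.0.113.5 - alice [10/Oct/2013:13:55:36 -0700] "GET /products/1 HTTP/1.1" 200 2326',
    '203.0.113.9 - bob [10/Oct/2013:13:56:01 -0700] "GET /cart HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2013:13:57:12 -0700] "GET /admin HTTP/1.1" 403 128',
    '203.0.113.5 - alice [10/Oct/2013:13:58:40 -0700] "GET /checkout HTTP/1.1" 200 1024',
]
SAMPLE_CUSTOMERS = {"alice": "EMEA", "bob": "APAC"}

if __name__ == "__main__":
    print(dict(page_views_by_region(SAMPLE_LOGS, SAMPLE_CUSTOMERS)))
```

The same parsed records could serve the security and operations uses mentioned earlier, for instance by filtering on the `status` field instead of joining on `user`. A script like this only hints at the idea; at real volumes the join would run inside an analytics engine rather than in application code.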

Today, it's a mess in the market. When we talk to our partners and customers, they’re saying that they have point solutions for each of these things, and if you want to combine that data, it’s really hard. That’s why we had to create HAVEn.

HAVEn is the fourth wave. HAVEn is specifically about big data, the fourth wave. If you look at HP’s portfolio, we sell products and services across each of these waves, and the fastest growing wave right now is the big-data wave. It’s growing at about 35 percent a year, according to Gartner, and that's why we're excited about it.

Gardner: Now we know why you created it and what it’s supposed to do. Tell us a little bit more about what’s included in HAVEn and why it is that you’ve been able to combine product and platform to solve this very difficult task.

Mundada: If you look at what’s required now to process big data in its entirety, one product no longer can do it all. There is a very famous paper written by some university professors titled “One size does not fit all.” It proves that different data structures are able to solve different kinds of data problems far more efficiently.

One way to think about big data is to think of it as a pile of dirt. It’s a big pile. In that pile, there’s gold, silver, platinum, iron, and other metals you don’t even know. If the cost of mining that data is high, obviously you’re going to go after only the platinum and some known objects that you care about, because that’s all you can afford.

HAVEn is about bringing that cost of processing down to a very, very low level so you can go after more metals. That means you have to bring together a set of technologies to be able to solve this. If you look at the last three years, HP has made very significant amounts of investments in the big-data space.

Best of breed

We bought companies that were best of breed to try to solve specific problems. We bought Autonomy, Vertica, ArcSight, Fortify, TippingPoint, 3PAR Data, and Knightsbridge.

Now, we have a set of technologies to be able to combine them into a unique experience. Think of it almost like Microsoft Office. Before you had Microsoft Office, you would buy a word processor from one company, a spreadsheet from another company, and presentation software from a third company.

Let’s say you wanted to create a simple table. If you had created it in a word processor or even a spreadsheet, you couldn’t mix and match that. It was impossible to mix and match very different types.

Then, Microsoft came to the table and said, “Look, here’s a simplified solution.” If you want to create a table, go ahead and create it in PowerPoint. Or if you want to create a more complicated thing, put it in Excel. Then, take that Excel and put it in PowerPoint. Or, you can put the whole thing into a Word document. That was the beauty of what Microsoft did.

We’re trying to do something similar for big data, make it very easy for people to combine all these different engines and the different data types and write simple applications on it.

Gardner: What beyond the products and binding them together makes HAVEn unique?

Mundada: HAVEn is really two different concepts. There’s the HAVEn data platform, which we’ll talk about now, and there’s a HAVEn ecosystem, which I’ll mention in a minute.

HAVEn means Hadoop, Autonomy, Vertica, Enterprise Security, and “n” applications. That’s the acronym. So let’s look at one of these pieces, and why we need an architecture like this.

As I said, today you need to combine different sets of data techniques to solve different problems, and they have to work seamlessly. That’s what we did with HAVEn. I’ve been with HAVEn from day zero, before the project concept started, and I can tell you why and how we added these pieces and how we’re trying to integrate them better.

If you look at Hadoop as an ecosystem part of that HAVEn, our story with Hadoop at HP is that Hadoop is an integral part of HAVEn. We see a lot of our customers and partners betting on Hadoop and we think it’s a good thing to keep Hadoop open and non-proprietary.

You can see the rest of this blog post HERE.

About the Author
Dana Gardner is president and principal analyst at Interarbor Solutions, an enterprise IT analysis, market research, and consulting firm. Ga...

