Debunking myths around big data should be a first step to making better business decisions for improving data analysis and data management capabilities in your company.
As the volume and purpose of data and business intelligence (BI) have shifted dramatically, older notions and misconceptions -- what amount to myths about data infrastructure -- need to be updated and corrected, too.
So we're here to pose some better questions about data, and provide up-to-date answers for running data-driven businesses that can efficiently and repeatedly predict dynamic market trends and customer wants in real time.
As the volume and types of data that are brought to bear on business analytics advance, the means to manage and exploit that sea of data must be neither too costly nor too complex for mid-size companies to master. There are better ways than traditional data architectures.
To help identify what works best around modern big data management, BriefingsDirect interviews Darin Bartik, Executive Director of Products in the Information Management Group at Dell Software. The discussion is conducted by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: Dell is a sponsor of BriefingsDirect podcasts.]
Here are some excerpts:
Gardner: Are people losing sight of the business value by getting lost in speeds and feeds and technical jargon around big data? Is there some sort of a disconnect between the providers and consumers of big data?
Bartik: You hit the nail on the head with the first question. We are experiencing a disconnect between the technical side of big data and the business value of big data, and that’s happening because we’re digging too deeply into the technology.
With a term like big data, or any one of the trends that the information technology industry talks about so much, we tend to think about the technical side of it. But with analytics, with the whole conversation around big data -- what we've been stressing with many of our customers -- is that it starts with a business discussion. It starts with the questions that you're trying to answer about the business; not the technology, the tools, or the architecture of solving those problems. It has to start with the business discussion.
That’s a pretty big flip. The traditional approach to BI and reporting has been one of technology frameworks, and a lot of things that were owned more by the IT group. This is part of the reason why a lot of the BI projects of the past struggled, because there was a disconnect between the business goals and the IT methods.
So you're right. There has been a disconnect, and that’s what I've been trying to talk a lot about with customers -- how to refocus on the business issues you need to think about, especially in the mid-market, where you maybe don’t have as many resources at hand. It can be pretty confusing.
I've been a part of Dell Software since the acquisition of Quest Software. I was a part of that organization for close to 10 years. I've been in technology coming up on 20 years now. I spent a lot of time in enterprise resource planning (ERP), supply chain, and monitoring, performance management, and infrastructure management, especially on the Microsoft side of the world.
Most recently, as part of Quest, I was running the database management area -- a business very well-known for its products around Oracle, especially Toad, as well as our SQL Server management capabilities. We leveraged that expertise when we started to evolve into BI and analytics.
I started working with Hadoop back in 2008-2009, when it was still very foreign to most people. When Dell acquired Quest, I came in and had the opportunity to take over the Products Group in the ever-expanding world of information management. We're part of the Dell Software Group, which is a big piece of the strategy for Dell overall, and I'm excited to be here.
Without disparaging the vendors like us, or anyone else, the current confusion is part of the problem of any hype cycle. Many people jumped on the bandwagon of big data. Just like everyone was talking cloud. Everyone was talking virtualization, bring your own device (BYOD), and so forth.
Everyone jumps on these big trends. So it's very confusing for customers, because there are many different ways to come at the problem. This is why I keep bringing people back to staying focused on what the real opportunity is. It’s a business opportunity, not a technical problem or a technical challenge that we start with.
Gardner: Even the name "big data" stirs up myths right from the get-go, with "big" being a very relative term. Should we only be concerned about this when we have more data than we can manage? What is the relative position of big data and what are some of the myths around the size issue?
Bartik: That’s the perfect one to start with. The first word in the definition is actually part of the problem. "Big." What does big mean? Is there a certain threshold of petabytes that you have to get to? Or, if you're dealing with petabytes, is it not a problem until you get to exabytes?
It’s not a size issue. When I think about big data, it's really a trend that has happened as a result of digitizing so much more of the information that we all have already and that we all produce. Machine data, sensor data, all the social media activities, and mobile devices are all contributing to the proliferation of data.
It's added a lot more data to our universe, but the real opportunity is to look for small elements of small datasets and look for combinations and patterns within the data that help answer those business questions that I was referencing earlier.
It's not necessarily a scale issue. What is a scale issue is when you get into some of the more complicated analytical processes and you need a certain data volume to make it statistically relevant. But what customers first want to think about is the business problems that they have. Then, they have to think about the datasets that they need in order to address those problems.
Big-data challenge
That may not be huge data volumes. You mentioned mid-market earlier. When we think about some organizations moving from gigabytes to terabytes, or doubling data volumes, that’s a big data challenge in and of itself.
Analyzing big data won't necessarily contribute to your solving your business problems if you're not starting with the right questions. If you're just trying to store more data, that’s not really the problem that we have at hand. That’s something that we can all do quite well with current storage architectures and the evolving landscape of hardware that we have.
We all know that we have growing data, but the exact size, the exact threshold that we may cross, that’s not the relevant issue.
Gardner: I suppose this requires prioritization, which has to come from the business side of the house. As you point out, some statistically relevant data might be enough. If you can extrapolate and you have enough to do that, fine, but there might be other areas where you actually want to get every little bit of possible data or information relevant, because you don't know what you're looking for. They are the unknown unknowns. Perhaps there's some mythology about all data. It seems to me that what’s important is the right data to accomplish what it is the business wants.
Bartik: Absolutely. If your business challenge is an operational efficiency or a cost problem, where you have too much cost in the business and you're trying to pull out operational expense and not spend as much on capital expense, you can look at your operational data.
Maybe manufacturers are able to do that and analyze all of the sensor, machine, manufacturing line, and operational data. That's a very different type of data and a very different type of approach than looking at it in terms of sales and marketing.
If you're a retailer looking for a new set of customers or new markets to enter in terms of geographies, you're going to want to look at maybe census data and buying-behavior data of the different geographies. Maybe you want datasets that are outside your organization entirely. You may not have the data in your hands today. You may have to pull it in from outside resources. So there's a lot of variability and prioritization that all starts with that business issue that you're trying to address.
Gardner: Perhaps it's better for the business to identify the important data, rather than the IT people saying it’s too big or that big means we need to do something different. It seems like a business term rather than a tech term at this point.
Bartik: I agree with you. The more we can focus on bringing business and IT to the table together to tackle this challenge, the better. And it does start with the executive management in the organization trying to think about things from that business perspective, rather than starting with the IT infrastructure management team.
Gardner: What’s our second myth?
Bartik: I'd think about the idea of people and the skills needed to address this concept of big data. There is the term "data scientist" that has been thrown out all over the place lately. There’s a lot of discussion about how you need a data scientist to tackle big data. But “big data” isn't necessarily the way you should think about what you’re trying to accomplish. Instead, think about things in terms of being more data driven, and in terms of getting the data you need to address the business challenges that you have. That’s not always going to require the skills of a data scientist.
Data scientists rare
I suspect that a lot of organizations would be happy to hear something like that, because data scientists are very rare today, and they're very expensive, because they are rare. Only certain geographies and certain industries have groomed the true data scientist. That's a unique blend between a data engineer and someone like an applied scientist, who can think quite differently than just a traditional BI developer or BI programmer.
Don’t get stuck on thinking that, in order to take on a data-driven approach, you have to go out and hire a data scientist. There are other ways to tackle it. That’s where you're going to combine people who can do the programming around your information, around the data management principles, and the people who can ask and answer the open-minded business questions. It doesn’t all have to be encapsulated into that one magical person that’s known now as the data scientist.
There are varying degrees of tackling this problem. You can get into very sophisticated algorithms and computations for which a data scientist may be the one to do that heavy lifting. But for many organizations and customers that we talk to every day, it’s something where they're taking on their first project and they are just starting to figure out how to address this opportunity.
For that, you can use a lot of the people that you have inside your organization, as well as, potentially, consultants who can help you break through some of the old barriers, such as thinking about intelligence based strictly on a report and a structured dashboard format.
That’s not the type of approach we want to take nowadays. So often a combination of programming and some open-minded thinking, done with a team-oriented approach, rather than that single keyhole person, is more than enough to accomplish your objectives.
Gardner: It seems also that you're identifying confusion on the part of some to equate big data with BI and BI with big data. The data is a resource that the BI can use to offer certain values, but big data can be applied to doing a variety of other things. Perhaps we need to have a sub-debunking within this myth, and that is that big data and BI are different. How would you define them and separate them?
Bartik: That's a common myth. If you think about BI in its traditional, generic sense, it’s about gaining more intelligence about the business, which is still the primary benefit of the opportunity this trend of big data presents to us. Today, I think they're distinct, but over time, they will come together and become synonymous.
I equate it back to one of the more recent trends that came right before big data, cloud. In the beginning, most people thought cloud was the public-cloud concept. What’s turned out to be true is that it’s more of a private cloud or a hybrid cloud, where not everything moved from an on-premise traditional model to a highly scalable, highly elastic public cloud. It’s very much a mix.
They've kind of come together. So while cloud and traditional data centers are the new infrastructure, it’s all still infrastructure. The same is true for big data and BI, where BI, in the general sense of how can we gain intelligence and make smarter decisions about our business, will include the concept of big data.
Better decisions
So while we'll be using new technologies, which would include Hadoop, predictive analytics, and other things that have been driven so much faster by the trend of big data, we’ll still be working back to that general purpose of making better decisions.
One of the reasons they're still different today is because we’re still breaking some of the traditional mythology and beliefs around BI -- that BI is all about standard reports and standard dashboards, driven by IT. But over time, as people think about business questions first, instead of thinking about standard reports and standard dashboards first, you’ll see that convergence.
Gardner: We probably need to start thinking about BI in terms of a wider audience, because all the studies I've seen don't show all that much confidence and satisfaction in the way BI delivers the analytics or the insights that people are looking for. So I suppose it's a work in progress when it comes to BI as well.
Bartik: Two points on that. There has been a lot of disappointment around BI projects in the past. They've taken too long, for one. They've never really been finished, which of course, is a problem. And for many of the business users who depend on the output of BI -- their reports, their dashboard, their access to data -- it hasn’t answered the questions in the way that they may want it to.
One of the things in front of us today is a way of thinking about it differently. Not only is there so much data, and so much opportunity now to look at that data in different ways, but there is also a requirement to look at it faster and to make decisions faster. So it really does break the old way of thinking.
Slowness is unacceptable. Standard reports don't come close to addressing the opportunity in front of us, which is to ask a business question and answer it with the new way of thinking supported by pulling together different datasets. That’s fundamentally different from the way we used to do it.
People are trying to make decisions about moving the business forward, and they're being forced to do it faster. Historical reporting just doesn't cut it. It’s not enough. They need something that’s much closer to real time. It’s more important to think about open-ended questions, rather than just say, "What revenue did I make last month, and what products made that up?" There are new opportunities to go beyond that.
Gardner: When it comes to these technology issues, do you also find, Darin, that there is a lack of creativity as to where the data and information resides or exists and thinking not so much about being able to run it, but rather acquire it? Is there a dissonance between the data I have and the data I need? How are people addressing that?
Bartik: There is and there isn’t. When we look at the data that we have, that’s oftentimes a great way to start a project like this, because you can get going faster and it’s data that you understand. But if you think that you have to get data from outside the organization, or you have to get new datasets in order to answer the question that’s in front of us, then, again, you're going in with a predisposition to a myth.
You can start with data that you already have. You just may not have been looking at the data that you already have in the way that’s required to answer the question in front of you. Or you may not have been looking at it at all. You may have just been storing it, but not doing anything with it.
Storing data doesn’t help you answer questions. Analyzing it does. It seems kind of simple, but so many people think that big data is a storage problem. I would argue it's not about the storage. It’s like backup and recovery. Backing up data is not that important, until you need to recover it. Recovery is really the game-changing thing.
Gardner: It’s interesting that with these myths, people have tended, over the years, without having the resources at hand, to shoot from the hip and second-guess. People who are good at that and businesses that have been successful have depended on some luck and intuition. In order to take advantage of big data, which should lead you to not having to make educated guesses, but to have really clear evidence, you can apply the same principle. It's more how you get big data in place, than how you would use the fruits of big data.
It seems like a cultural shift we have to make. Let’s not jump to conclusions. Let’s get the right information and find out where the data takes us.
Bartik: You've hit on one of the biggest things that’s in front of us over the next three to five years -- the cultural shift that the big data concept introduces.
We looked at traditional BI as more of an IT function, where we were reporting back to the business. The business told us exactly what they wanted, and we tried to give that to them from the IT side of the fence.
Data-driven organization
But being successful today is less about intuition and more about being a data-driven organization, and, for that to happen, I can't stress this one enough, you need executives who are ready to make decisions based on data, even if the data may be counterintuitive to what their gut says and what their 25 years of experience have told them.
They're in a position of being an executive primarily because they have a lot of experience and have had a lot of success. But many of our markets are changing so frequently and so fast, because of new customer patterns and behaviors, because of new ways of customers interacting with us via different devices. Just think of the different ways that the markets are changing. So much of that historical precedent no longer really matters. You have to look at the data that’s in front of us.
Because things are moving so much faster now, new markets are being penetrated and new regions are open to us. We're so much more of a global economy. Things move so much faster than they used to. If you're depending on gut feeling, you'll be wrong more often than you'll be right. You do have to depend on as much of a data-driven decision as you can. The only way to do that is to rethink the way you're using data.
Historical reports that tell you what happened 30 days ago don't help you make a decision about what's coming out next month, given that your competition just introduced a new product today. It's just a different mindset. So that cultural shift of being data-driven and going out and using data to answer questions, rather than using data to support your gut feeling, is a very big shift that many organizations are going to have to adapt to.
Executives who get that and drive it down into the organization, those are the executives and the teams that will succeed with big data initiatives, as opposed to those that have to do it from the bottom up.
Gardner: Listening to you, Darin, I can tell one thing that isn’t a product of hype is just how important this all is. Getting big data right, doing that cultural shift, recognizing trends based on the evidence and in real time as much as possible is really fundamental to how well many businesses will succeed or not.
So it's not hype to say that big data is going to be a part of your future and it's important. Let's move towards how you would start to implement or change or rethink things, so that you do not fall prey to these myths, but actually take advantage of the technologies, the reduction in costs for many of the infrastructures, and perhaps extend and exploit BI and big data problems.
Bartik: It's fair to say that big data is not just a trend; it's a reality. And it's an opportunity for most organizations that want to take advantage of it. It will be a part of your future. It's either going to be part of your future, or it's going to be a part of your competition’s future, and you're going to be struggling as a result of not taking advantage of it.