HP Big Data Blog
Big Data changes everything. The HP Big Data team is powering data analytics, information governance, and information management across the enterprise. Join the conversation.

Data Protection: So Much Data, So Little Time (to Protect it)

Isn’t it odd how we move blindly through our digital lives, saving remnants of day-to-day activity -- such as email, photos and tweets -- with little regard for the footprint we create? Do we ever stop to think just how much data we are creating and how it will be protected? Nah… we live in a “keep everything forever” society.


What’s a few gigabytes anyway, right? 


The problem is that if we multiply our selfish storage needs by a hundred, a thousand or even a million, we start to see a much different picture of what our world might look like in a year, or perhaps even 30 days from now.



Let’s take a popular topic in the news: surveillance. If you stored metadata about every telephone call made in the US over a single year, it would add up to 9.25 exabytes, or 9,250 petabytes, or 9,250,000,000 gigabytes (in decimal units). That’s a lot of gigabytes.
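For anyone who wants to check the arithmetic, here is a minimal sketch of the unit conversion. The 9.25 exabyte figure is the one quoted above; the function names are made up for illustration. Totals depend on whether each step between units is counted as 1,000 (decimal) or 1,024 (binary), and mixing the two conventions in one calculation gives yet another number, so the sketch keeps them separate.

```python
SI_STEP = 1_000      # decimal convention: 1 EB = 1,000 PB
BINARY_STEP = 1_024  # binary convention: 1 EiB = 1,024 PiB


def exabytes_to_petabytes(exabytes: float, step: int = SI_STEP) -> float:
    """Convert exabytes to petabytes (one step down)."""
    return exabytes * step


def exabytes_to_gigabytes(exabytes: float, step: int = SI_STEP) -> float:
    """Convert exabytes to gigabytes (EB -> PB -> TB -> GB, three steps down)."""
    return exabytes * step ** 3


call_metadata_eb = 9.25  # figure quoted in the post

print(f"{exabytes_to_petabytes(call_metadata_eb):,.0f} PB")
print(f"{exabytes_to_gigabytes(call_metadata_eb):,.0f} GB (decimal)")
print(f"{exabytes_to_gigabytes(call_metadata_eb, BINARY_STEP):,.0f} GiB (binary)")
```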


That’s just one example, one channel. If we think about all the other ways data gets saved unintentionally, it’s easy to imagine the backup nightmare. Which brings us to the topic of the day: Is all data created equal?


The short answer is no. Your Candy Crush scores or your Facebook “likes” may be really important to you, but to your company? Not likely.


However, with respect to business productivity applications, there are many examples of user-generated content that needs to be stored and protected: corporate email, data in business applications, or information that sits within a customer relationship management database.


It’s the job of our loyal backup admins (working with their business owners, of course) to identify what’s critical and what’s not. No small task. One person’s critical is potentially another person’s junk. This is typically a business decision and likely involves some type of records management process. Ideally, data is classified, all parties agree, and the backup admin goes forth and sets up backup and retention policies that reflect these business priorities.


There are many ways to classify data and apply protection levels. My friend and fellow HP Product Manager Scott Baker (@lilbaker83) recently gave a killer presentation at VMworld Barcelona 2013 on managing virtual data protection and how to deal with the deluge of virtual information that faces the typical data center. A slide from his deck makes the point that once critical data has been identified, you can apply both a protection type and a protection frequency.



Critical data intended to support the business -- likely revenue-generating or revenue-impacting information that you can’t live without -- is stored on your most expensive storage solution with the highest levels of availability and resiliency. This type of data is likely replicated in real time and protected with array-based snapshots, and data from those snapshots is backed up to a backup repository nightly or at even higher frequencies.


Operational data can reside on media with somewhat lower performance and availability standards. It may also leverage snapshot integration and traditional backup, but the level and frequency of protection may vary. Protection typically runs daily: a series of daily copies is kept for each week, those are rolled up into a weekly copy, several weekly copies are rolled up into a monthly copy, and so on. So if you need to get back to a particular day in the current week, the daily copies for that week are still available.
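The daily-to-weekly-to-monthly rollup can be sketched in a few lines. This is only an illustration under assumed retention windows (every daily copy for 7 days, one copy per ISO week for 4 weeks, one copy per calendar month after that); the function name and the window sizes are hypothetical and not taken from any HP product.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_days=7, weekly_weeks=4):
    """Return the backup dates retained under a simple rollup scheme.

    Assumed windows (illustrative only): every daily copy for `daily_days`,
    then the newest copy per ISO week up to `weekly_weeks` weeks back,
    then the newest copy per calendar month.
    """
    keep = set()
    weeks_seen = set()
    months_seen = set()
    for d in sorted(backup_dates, reverse=True):  # newest first
        age_days = (today - d).days
        if age_days < daily_days:
            keep.add(d)  # recent: keep every daily copy
        elif age_days < weekly_weeks * 7:
            week = tuple(d.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in weeks_seen:
                weeks_seen.add(week)
                keep.add(d)  # newest copy in this week stands in for the week
        else:
            month = (d.year, d.month)
            if month not in months_seen:
                months_seen.add(month)
                keep.add(d)  # newest copy in this month stands in for the month
    return sorted(keep)


today = date(2013, 11, 30)
history = [today - timedelta(days=i) for i in range(60)]  # 60 nightly backups
kept = backups_to_keep(history, today)
print(f"{len(history)} backups taken, {len(kept)} retained")
```

The point of the scheme is that recent history stays fine-grained while older history gets progressively coarser, which keeps the repository from growing by one full copy per day forever.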


Lastly, legacy data may sit in an altogether different repository and likely leverages traditional backup methods such as disk-to-disk or tape-based media for long-term retention. This is likely data that is important to a business from a historical perspective, but a less frequent copy is typically fine (weekly to monthly, in some cases quarterly or yearly). My colleague, Joe Garber, wrote an excellent post on legacy data.
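One way to picture the three tiers is as a small lookup table that maps a classification to a protection type and frequency. The sketch below is purely illustrative: the class, field names, and interval values are assumptions chosen to mirror the descriptions above, not settings from an actual backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    storage: str                # where the data lives
    methods: tuple              # how it is protected
    backup_interval_hours: int  # assumed longest gap between backups


# Hypothetical policy table mirroring the three tiers described above.
POLICIES = {
    "critical": ProtectionPolicy(
        storage="highest-availability primary storage",
        methods=("real-time replication", "array-based snapshots", "backup"),
        backup_interval_hours=24,  # nightly, possibly more often
    ),
    "operational": ProtectionPolicy(
        storage="lower-tier storage",
        methods=("snapshots", "traditional backup"),
        backup_interval_hours=24,  # typically daily
    ),
    "legacy": ProtectionPolicy(
        storage="separate long-term repository",
        methods=("disk-to-disk backup", "tape"),
        backup_interval_hours=24 * 7,  # weekly here; could be monthly or longer
    ),
}


def policy_for(classification: str) -> ProtectionPolicy:
    """Look up the protection policy for a data classification."""
    try:
        return POLICIES[classification]
    except KeyError:
        raise ValueError(f"unclassified data: {classification!r}") from None


print(policy_for("critical").methods)
print(policy_for("legacy").backup_interval_hours)
```

Raising an error for an unknown classification is a deliberate choice in this sketch: data nobody has classified should prompt a conversation with the business owner rather than silently fall into a default tier.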


So whether it’s your personal cloud (email, tweets, photos, etc.) or critical corporate information, classifying your data by level of importance makes the job of backup easier and more productive.  And if you are lucky, you might just save quite a few gigabytes’ lives in the meantime.


For more information, visit the HP Autonomy Enterprise Data Protection Solutions page.

About the Author
Stephen Spellicy is known as a virtualization and data protection subject matter expert with 17+ years of experience in the technology indus...

The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.