HP Big Data Blog
Big Data changes everything. The HP Big Data team is powering data analytics, information governance, and information management across the enterprise. Join the conversation.

Information Governance: Regaining Control of Dark Data

I recently wrote an article for Consumer Compliance Insights, “Shining a Light on Dark Data.” This is an issue that merits more attention. When we think of data, we mostly focus on new data: the part of the iceberg we see above the water’s surface. But if the Titanic taught us anything, it is that ignoring what lies beneath is foolish. The same is true of legacy, or “dark,” data.


Dark data is often unknown and unmanaged, which increases enterprise risk exposure. For instance, the Ponemon Institute reported that the cost of a data breach averages around $5.5 million per organization, and a breach is much more likely when data is unmanaged and ungoverned. There are also countless examples of fines, sanctions or adverse inference decisions triggered by data that was accidentally lost or mishandled.


This need not be the case. Technology solutions now exist to help you see (and deal with) the legacy data that lies under the surface. Here are some tips to help you regain control of dark data and take a positive step toward information governance:


  • Cast a wide net – Today, the information subject to regulations, eDiscovery or legal holds spans many data types. To avoid unexpected “red alerts,” be cautious of focusing on just one data type (e.g., email).
  • Don’t do it manually – Information is growing so fast that data stores are getting increasingly out of hand. Relying solely on manual processes, in which a human determines the value of each object, simply doesn’t make business sense. Look for a technology that lets you apply automated policies to data in order to optimize efficiency.
  • Think long term – While deleting some portion of the data is one objective in getting control of dark data, it’s generally not the only one. By consolidating the remaining, valuable data in an active repository, organizations can search and leverage this data more efficiently over the long term, and also ensure that it remains accessible even as technology changes.
  • Focus on defensibility – Keep audit trails of the decisions that affected how data was managed. This spares organizations from having to reconstruct those decisions later and helps ensure they can defend their practices if questioned by the courts or regulatory bodies.
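
To make the tips above concrete, here is a minimal Python sketch of an automated policy that decides what to do with each legacy object and records every decision in an audit trail. It is an illustration only, not how ControlPoint or AIO work: the `DataObject` fields, the `decide` and `apply_policy` functions, and the seven-year retention window are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataObject:
    """A legacy item discovered during a scan (hypothetical fields)."""
    path: str
    kind: str              # e.g. "email", "document", "database_export"
    last_accessed: date
    on_legal_hold: bool = False


def decide(obj, today, retention_days=7 * 365):
    """Return (action, reason) for one object.

    The seven-year default is an assumed example value, not a
    regulatory requirement.
    """
    if obj.on_legal_hold:
        return "retain", "legal hold"
    age_days = (today - obj.last_accessed).days
    if age_days > retention_days:
        return "delete", f"untouched for {age_days} days"
    return "consolidate", "within retention window"


def apply_policy(objects, today):
    """Apply the policy to every object and return the audit trail."""
    audit_log = []
    for obj in objects:
        action, reason = decide(obj, today)
        audit_log.append({
            "path": obj.path,
            "kind": obj.kind,
            "action": action,
            "reason": reason,
            "decided_on": today.isoformat(),
        })
    return audit_log


if __name__ == "__main__":
    # Cast a wide net: more than one data type goes through the same policy.
    scanned = [
        DataObject("mail/archive.pst", "email", date(2000, 1, 1)),
        DataObject("legal/contract.doc", "document", date(2000, 1, 1), on_legal_hold=True),
        DataObject("crm/export.csv", "database_export", date(2013, 6, 1)),
    ]
    for entry in apply_policy(scanned, today=date(2014, 1, 1)):
        print(entry)
```

Because each log entry carries the action, the reason and the decision date, the log itself is the record an organization would point to if its practices were later questioned.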


HP Autonomy’s ControlPoint and Application Information Optimizer (AIO) technologies are solid choices for achieving these objectives. Each allows organizations to gain control of legacy data, secure a quick win from an ROI perspective, and then establish a business case for broader information governance objectives.


About the Author
Joe Garber is Vice President of Information Governance at HP Autonomy. In this role, he leads product messaging and go-to-market efforts fo...
