Information Faster Blog

Answering the information explosion “wake-up call”

By Judy Redman

The information explosion should serve as a “wake up call for business enterprises of all sizes,” says an April 8 article. Nearly 90 percent of businesses blame poor performance on data growth, according to the Informatica survey cited in the article. The survey concludes that many businesses’ applications and databases are growing by 50 percent or more annually, leaving them unable to manage this incredible expansion of information.

The article recommends implementing a lifecycle approach to information management. This means managing your applications and data “from development, test and early production all the way through to archive and retirement.” Having an information management strategy is a key success factor in managing the ever-expanding information that the average enterprise produces annually.

HP has an entire portfolio specially designed to help you answer the information explosion wake-up call. HP TRIM Records Management software is designed to capture, manage, and secure business information in order to meet governance and regulatory compliance obligations. HP TRIM 7 is integrated with Microsoft Office and Microsoft SharePoint Server 2007, as well as the upcoming SharePoint Server 2010. The SharePoint integration is especially important, as the Radicati Group predicts SharePoint use will grow at a 25% rate over the next few years. HP Integrated Archive Platform for compliance archiving and HP Clearwell E-Discovery Platform for legal analytics are two additional HP Governance and E-Discovery solutions.

Finally, no Information Management program is complete unless critical business data is backed up and recoverable. HP Data Protector software has more than 35,000 customers who use it to automate high-performance backup and recovery from disk or tape for 24x7 business continuity.

I’d like to hear about your biggest challenges in managing the information explosion. Do you employ a lifecycle strategy to help harness the unwieldy expansion of data at your company?

Save $100 when you register now for HP's Information Management sessions at Software Universe

By Patrick Eitenbichler

HP Software and Solutions’ Information Management suite will be featured at the upcoming HP Software Universe 2010 in Washington, DC, June 15–18, 2010.

The IM suite, including HP Data Protector, HP Email, Database and Medical Archiving IAP, and HP TRIM records management software, will be represented in two tracks:

  • Data Protection

  • Information Management for Governance and E-Discovery

Customer case studies and presentations from product experts will highlight how HP’s Information Management solutions provide outcomes that matter. For more information about this event, or to register, please use code INSIDER to get $100 off the conference rate.

How three very different companies are managing rapid database growth

By Patrick Eitenbichler

I wanted to share three great customer success stories. The companies are very different from each other, but they’re all grappling with business challenges posed by surging data growth: meeting compliance obligations, controlling storage costs, and optimizing performance. All three turned to HP Database Archiving software to solve these problems, and more.

Tektronix, a U.S.-based provider of test and measurement solutions to the electronics industry, improved application and database performance by more than 47%, and aced compliance tests in 29 countries, despite data growth of 1.25 GB per month.

Tong Yang Group, a Taiwanese automotive parts manufacturer, experienced data growth averaging 30-40 GB per month, impacting database performance and causing user-related issues. With HP Database Archiving software, Tong Yang saw an immediate 10% increase in order-handling efficiency and gained the ability to support 7% business growth in 2009 despite the economic recession.

Turkey’s Central Registry Agency is both a private financial services company and the country’s central depository for dematerialized securities. The agency’s database grew 1,000 times over a one-year period, due in part to industry regulations requiring financial services firms to store more data for longer periods of time. With HP Database Archiving software, the agency met its growing data archiving needs while reducing storage costs by 50%.

To learn more about how these companies overcame their database growth challenges, click on their corresponding names above.

Is Exchange 2010 Archiving for You?

By André Franklin

Most Exchange customers are aware that email archiving is a new feature in the recently released Exchange Server 2010 (E2010). The archiving and e-discovery benefits of Exchange 2010 can be summarized as follows:

Less need for PSTs

  • User has a personal archive

  • Rules to move to archive (after a certain time period) 

Improved retention

  • Keep for a certain time period

  • Legal holds

Basic e-discovery services

  • Search across multiple mailboxes

This is Part 1 in a multi-part series. The goal is to help IT departments determine if Exchange 2010 archiving will meet your desired archiving requirements. In a word…is Exchange 2010 archiving for you?


This edition of the multi-part series will explore archiving and retention rules.

Archiving Rules

Exchange 2010 archiving allows for time-based archiving policies only. For example, mailbox messages can be archived if the messages are older than X number of days. This may work for many Exchange customers. Unfortunately, it will NOT work for archiving policies that require additional parameters. Assume a message should be archived based on age only if the word “FINANCIAL” appears in the subject line. If you need multiple conditions in archiving policies, third-party Exchange archiving products should be considered. Archiving based on age alone is perfect for many…including many small businesses.
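To make the limitation concrete, here is a minimal sketch (with made-up message fields and thresholds, not Exchange’s actual policy engine) of the kind of compound age-plus-keyword rule that age-only policies cannot express:

```python
from datetime import datetime, timedelta

# Hypothetical message record; field names are illustrative only.
class Message:
    def __init__(self, subject, received):
        self.subject = subject
        self.received = received

def should_archive(msg, max_age_days=90, required_keyword="FINANCIAL"):
    """Compound rule: archive only if the message is older than max_age_days
    AND the keyword appears in the subject line -- the kind of multi-condition
    policy that requires a third-party archiving product."""
    old_enough = datetime.now() - msg.received > timedelta(days=max_age_days)
    return old_enough and required_keyword in msg.subject.upper()

old_financial = Message("Q3 FINANCIAL summary", datetime.now() - timedelta(days=120))
old_other = Message("Lunch plans", datetime.now() - timedelta(days=120))
print(should_archive(old_financial))  # True: old enough and keyword present
print(should_archive(old_other))      # False: old enough, but no keyword
```

An age-only policy, by contrast, would archive both of these messages.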

Retention Rules

The same restrictions apply to retention rules. Retention rules are based on age only. Exchange 2010 does not support retention policies with multiple conditions (e.g. - Delete message if “FINANCIAL” is in the message header AND if message is older than one year).

Stay tuned for Part 2 in which we’ll discuss online and offline archive access.

Structured Records Management - Taking control of the structured data

In my last post I spoke about how the transfer of structured data from the source system into the records management system works. Now that we have covered this step, let’s look at some of the special features that you need in order to manage structured data as records.

Like any other record, you want to be able to preserve the authenticity, reliability, integrity and usability of the data.  The authenticity is maintained by the system storing an audit trail of the whole transfer process and any subsequent actions taken on the records. The reliability is based on the collaboration of application owners and records managers in the definition and classification of the structured records model, which means that the transferred data is based on a design by people who know all the facts about its source and usage. 

That leaves me to elaborate a bit more about the integrity and usability. 

The structured records get transferred into the records management environment as XML files. Each transfer batch is a self-contained group, consisting of a number of XML files that contain the data and a summary XML file that contains a detailed description of what the data files contain. To keep the data and the summary file usable in the future, each of them is described by an XML schema definition. All of these files together form a single package, and the records management rules are applied at the package level, meaning that the same security and retention rules apply to all files of a single transfer. The integrity of the individual files can be proved at any stage through hash comparison between the summary and the data files.
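As a rough illustration of the hash-comparison idea, the sketch below (assuming SHA-256 and invented file names; HP TRIM’s actual package and summary formats differ) checks each data file against the digest recorded for it in the summary:

```python
import hashlib

# Illustrative only: assumes the summary file records a SHA-256 digest
# per data file.
def file_digest(data):
    return hashlib.sha256(data).hexdigest()

def verify_package(expected_digests, files):
    """expected_digests: {filename: digest taken from the summary file}
       files:            {filename: raw file contents}
       True only if every data file still matches its recorded digest."""
    return all(file_digest(files[name]) == digest
               for name, digest in expected_digests.items())

data = {"batch-001.xml": b"<records>...</records>"}
summary = {"batch-001.xml": file_digest(data["batch-001.xml"])}
print(verify_package(summary, data))                            # True
print(verify_package(summary, {"batch-001.xml": b"tampered"}))  # False
```

Any later modification of a data file changes its digest, so tampering is detectable at any stage of the package’s life.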

Usability means that the structured data is not lost once it resides in the records management environment. Text indexing can be used to provide searching across the contents of the XML files to find batches that include data pertinent to a particular circumstance, e.g. all batches that contain customer number XYZ or order number 123. This is the kind of full-text searching that people use across all machine-readable formats as part of early searches in e-discovery or freedom of information processes. However, structured records should also be available to other methods of searching, e.g. for reporting engines. Having the data in XML format with a full schema description allows us to use our Record Query Server to create an ODBC data source pointing to the XML files, which can then be used by a whole variety of SQL query tools. This is a distinct advantage that you get from storing structured records as XML data, rather than as a flat text file or PDF-formatted report output. If the original application still exists, and its algorithms are desirable in the analysis of the data, the records management system provides a re-load function to send the XML-based data back to the original source database schema.
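To show why schema-described XML lends itself to SQL tooling, here is a small sketch that loads an invented XML batch into an in-memory SQL table and queries it. The real Record Query Server exposes the files through ODBC instead, and the element names here are made up:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Invented XML batch standing in for a transferred data file.
xml_batch = """
<orders>
  <order><customer>XYZ</customer><number>123</number></order>
  <order><customer>ABC</customer><number>456</number></order>
</orders>
"""

# Load the XML records into an in-memory table so ordinary SQL can reach them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, number INTEGER)")
for order in ET.fromstring(xml_batch):
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (order.findtext("customer"), int(order.findtext("number"))))

# Any SQL query tool could now answer questions like this one.
rows = conn.execute("SELECT number FROM orders WHERE customer = 'XYZ'").fetchall()
print(rows)  # [(123,)]
```

A flat text file or PDF report would need parsing before any of this were possible; the schema makes the mapping mechanical.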

In all our design of HP TRIM functionality we pay attention to the characteristics of records as prescribed by ISO 15489: authenticity, reliability, integrity and usability, and as you can see,  structured records management is no exception.  By adhering to this principle we are able to create a truly unified records management environment, encompassing all formats of information, physical, electronic, unstructured and structured, meaning that you can apply a single set of consistent records management policies across all your enterprise content.


Email Management?

By Noel Rath

AIIM (Association for Information and Image Management) has produced an excellent report from their survey, "Email Management - the Good, the Bad, and the Ugly" (© AIIM 2009).

Here are some of the key findings.

  • On average, our respondents spend more than an hour and a half per day processing their emails, with one in five spending three or more hours of their day.

  • “Sheer overload” is reported as the biggest problem with email as a business tool, followed closely by “Finding and recovering past emails” and “Keeping track of actions”.

  • Email archiving, legal discovery, findability and storage volumes are the biggest current concerns within organizations, with security and spam now considered less of a concern by our respondents.

  • Over half of respondents are “not confident” or only “slightly confident” that emails related to documenting commitments and obligations made by staff are recorded, complete, and retrievable.

  • Only 10% of organizations have completed an enterprise-wide email management initiative, with 20% currently rolling out a project. Even in larger organizations, 17% have no plans to do so, although the remaining 29% are planning to start sometime in the next 2 years.

  • Some 45% of organizations (including the largest ones) do not have a policy on Outlook “Archive settings” so most users will likely create .pst archive files on local drives.

  • Only 19% of those surveyed capture important emails to a dedicated email management system or to a general-purpose ECM system. 18% print emails and file them as paper, and a worrying 45% file them in non-shared personal Outlook folders.

  • A third of organizations have no policy to deal with legal discovery, 40% would likely have to search back-up tapes, and 23% feel they would have gaps from deleted emails. Only 16% have retention policies that would justify deleted emails.



SourceOne customer speaks, but HP IAP customers boast

An EMC customer has spoken about the new SourceOne suite, saying:

  • EMC’s solution is not the cheapest.

  • They wonder if they’ve over-bought.

  • They’d like to see EMC add support for end-user access to the archive through mailboxes.
HP customers speak louder. Here’s what HP IAP customers are saying about HP’s comprehensive solution:

Brunel University: “What we have is effectively the best 'find' button on the internet! Beyond just efficiency, the solution has helped Brunel further enhance its reputation for corporate integrity, and you simply can't put a price on that.”

Coscon: “The software and hardware integrated solution delivered by HP has not only mitigated the risks we faced, but also helped us to realise real-time mail data management in an effective manner within a short period of time.”

Dubai International Financial Centre: “With the HP IAP we have peace of mind knowing that we can be in full compliance with legal and financial regulations. It has made it far easier for us to retrieve any email we need—we can now do it in minutes.”


Compliance concerns resulting from cross-border litigation

By Patrick Eitenbichler

On April 7th, the Sarbanes-Oxley Compliance Journal published a great article written by Brandon Cook, Senior Product Marketing Manager at Clearwell Systems.

Brandon describes how the increasing number of business transactions across borders leads to more litigation, government inquiries, and compliance audits spanning international boundaries. Using a number of real-life examples, he shows the implications and then provides recommendations on how to prepare for cross-border e-discovery.

Take a read:  "Why Cross-Border Litigation is a Compliance Concern"


EMC's new services: Not new to HP customers

The EMC announcement of the SourceOne suite includes new consulting services to help customers develop information policies which align with business goals and regulatory requirements.

HP already provides customers with regulatory compliance services, such as compliance and e-discovery workshops, information discovery and classification, business value analysis and requirements development, compliance and data policy assessment, and information policy definition. Furthermore, the May 2008 acquisition of EDS enables HP to deliver a broad portfolio of information technology, applications, and business process outsourcing services to clients in the manufacturing, financial services, healthcare, communications, energy, transportation, and retail industries, as well as government. In fact, HP recently increased its Information Management Services headcount by 10X, to further meet the needs of its customers.


EMC announcement: More like "PromiseOne"

On April 2, 2009 EMC finally announced the long-awaited replacement of EmailXtender.  No surprise.  Actually, it looks like they tried to announce it on April 1 and then pulled all the links—perhaps it was feared it would be seen as an April Fool’s joke.  What isn’t a joke is that this product, called SourceOne Email Management, is actually not a one-source archive solution—yet.  Like its predecessor, it does still archive one overall type of content: messaging.  EMC says that later this year they will release file, XML, and SharePoint archiving.  So, that’s when it will be “one source”?  Not exactly.  Why?  Because the SourceOne product family is not integrated.  Give them twelve to eighteen months—hey, they promised after all.

Bottom line: EMC’s announcement does not compare to the breadth and range of HP’s current offerings, and EMC is more than six months late to market with a product that does not even fulfill what they previously communicated to customers in terms of their key archiving needs.  Furthermore, the release of SourceOne Email Management is a replacement for EmailXtender, and what EMC is delivering with this release is a mere promise of what this product could become in the next year to eighteen months.  In these economic times, we need more than promises to show ROI like what HP IAP customers have been achieving for more than four years:

--Improving staff productivity by up to 80%, and email- and file-based productivity by over 34%

--Lowering email and document processing, review, and production costs by up to 90%

--Reducing time needed to analyze email and documents from weeks to minutes

--Achieving control of their corporate data, improving information governance

To Stub Or Not To Stub, That Is The Question…

By André Franklin

Whether 'tis nobler in the mind to suffer
The slings and arrows of convenient mailbox message stubs,
Or accept the clean simplicity of stub-less archiving…

… well… that is the question…

Ok…enough Shakespeare. What are you talking about?

When email archiving is performed to enable better email management, HP calls it selective archiving. In a word, selective archiving removes mail messages from mailservers. There are two popular selective archiving methods to manage mailservers:

  • with stubs

  • without stubs

So… what is a stub?

A stub is a substitute for a mailbox message that has been removed from a mailserver and placed into a dedicated archive. A stub contains a link to the original message and attachments that reside in the archive. The stub allows the original message and attachments to be retrieved from the archive through a user’s mailbox.

Only after messages are safely archived are they removed from the mailserver. Users remain within their mailbox quota limits as mailserver messages are deleted. This whole process improves email performance and reduces mailserver backup headaches. Assuming archive storage is lower cost per MB than Tier 1 mailserver storage, there are clear capital expenditure benefits for selective archiving.
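The archive-first, delete-second ordering described above can be sketched roughly as follows (stand-in dictionaries, not any real mail or archive API):

```python
# Rough sketch of selective archiving; mailbox and archive_store are
# plain dictionaries standing in for a mailserver and an archive.
def selective_archive(mailbox, archive_store, use_stubs=True):
    for msg_id, msg in list(mailbox.items()):
        archive_store[msg_id] = msg        # 1. copy the message to the archive
        assert msg_id in archive_store     # 2. confirm it is safely stored
        if use_stubs:
            # 3a. replace it with a tiny stub linking to the archived copy
            mailbox[msg_id] = {"stub": True, "link": msg_id}
        else:
            # 3b. or remove it entirely (stub-less archiving)
            del mailbox[msg_id]

mailbox = {"m1": {"subject": "Q3 report", "body": "..."}}
archive = {}
selective_archive(mailbox, archive)
print(mailbox["m1"])  # {'stub': True, 'link': 'm1'}
```

The essential point is the ordering: the message is never removed from the mailserver until the archive copy is confirmed.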

(Note: archiving strictly for compliance purposes never uses stubs, but compliance archiving can be performed in addition to a selective archiving strategy).

What do I gain with each approach?


With stubs

A stubbed representation appears in the same place in a user’s mailbox as the message it represents. It allows for a single integrated list of both mailbox and archived messages. Stub messages are very small in comparison to the messages they replace. When used with policies that automatically remove, archive, and “stub” messages (often based on message age), users can experience a sense of an “infinite mailbox” without the massive mailbox capacity that would give some mail administrators a heart attack.


Without stubs

There is no possibility of “stubbing” software causing problems with the mail client. Mail messages and message classes are not modified. Archived messages that have been removed from mailserver mailboxes are presented to users in a special “archive folder” (separate from the mailbox message view).

We’ll look at more of the benefits and “gotchas” of each approach in my next post.


Go Green; Retire Those Old Energy-Hogging Apps!

by Mary Caplice

I was reading an article today from Forrester (‘Q&A: The Economics Of Green IT’) about how companies can not only save money by going green but may also benefit from government incentives and utility company programs. They may even incur penalties in some regions for not going green in certain areas. The article suggests that there are very compelling reasons for IT leaders to educate themselves on local incentives and penalties. Some green projects require an upfront cost that will pay off later; some require none. For example, GE expects to save millions just by turning on Windows features like standby and hibernate! IT can save capital and operating expenses, cooling costs, DBA time, facility square footage, and license fees for both hardware and software by retiring applications that are being kept alive in case they’re needed down the road for regulatory and compliance reasons. There’s even a secondary market for that retired hardware! One way to go about application retirement is to invest in HP Database Archiving software (excellent ROI potential is discussed in my recent blog ‘Death and Taxes - Maybe one can be avoided’).


Ignore it but it won’t go away!

By Mary Caplice

Although we’re all experiencing the effects of a worldwide economic downturn, organizations are finding that there are certain things they can’t ignore while waiting for the economy to improve. They have no choice but to invest time and technology in reducing the costs and risks associated with increasing data retention regulations and with the need to answer legal discovery requests quickly and efficiently. This problem is of course most concentrated in highly regulated industries such as insurance, financial services, and pharmaceuticals.

HP Database Archiving customers are finding that investing in our technology in these areas can really pay off!

IAP Retention Management – Future Ideas

By Ronald Brown

Today, the HP Integrated Archive Platform (IAP) manages retention at the “archive level” – meaning the archive itself is not only responsible for executing the retention management functionality, but it is also the place where the retention settings are configured.  This means the “archive” administrator has the responsibility to maintain the system in accordance with a company’s record retention strategies.  This is certainly one approach, however, there are others which may give more flexibility.

As the number of applications that move data into an archive grows, it becomes more important to actually understand the business value of that data and to provide more flexible retention policies.   Perhaps the owners of the application data itself should be able to communicate their requirements to the IT personnel responsible for their data.  In this case the application itself should drive the retention policies.  This will help ensure that the retention policies are specific to the application and maintained by the application experts.  The archive itself will be the executor of these policies.  While this affords more flexibility, the downside is that it requires more attention in order to define these policies and maintain them – so sometimes a blanket policy works better, especially for customers who are reluctant to commit the time and effort involved in defining their corporate retention strategy in a granular manner.
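One way to picture application-driven retention with a blanket fallback is a simple policy table, as in this sketch (application names and retention periods are invented for illustration):

```python
# Invented per-application retention policies, with a blanket fallback
# for applications whose owners have not defined one.
POLICIES = {
    "payroll_app": {"retain_years": 7},  # set by the payroll data owners
    "email":       {"retain_years": 3},  # set by the messaging team
    "default":     {"retain_years": 5},  # blanket policy for everything else
}

def retention_years(app_name):
    """The archive executes whichever policy the application declares,
    falling back to the blanket policy when none exists."""
    return POLICIES.get(app_name, POLICIES["default"])["retain_years"]

print(retention_years("payroll_app"))  # 7
print(retention_years("crm"))          # 5, via the blanket fallback
```

The trade-off described above is visible here: each named entry needs an owner to define and maintain it, while the blanket entry costs nothing to administer.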

Another interesting use case is where the archive only retains data that is under active investigation or discovery.  Here, the archive is loaded with, for instance, 3 years of corporate data.  Then, specific queries are performed and the resultant data sets are placed on hold.  After this process is completed, all data not on hold is released and removed from the archive.  This use case serves a specific customer base very well – even though it seems to defeat the intended purpose of the archive.

One can never “predict” what is best for a customer and how they will utilize their technology investments.  The key is to give enough flexibility so that all use cases can be explored.


How do you manage your Non-Records?

It makes me cringe every time I hear someone say "we don't need to worry about these, they are not records." People usually mean information that has been created or used in the line of business but doesn't fall under their organization's "official" definition of business records.

These non-records cannot just be left alone. They still contain evidence and deserve to be treated with respect. The fact is simply that they are not seen as being of high business value and therefore nobody wants to spend time managing them.

And this is where the combination of an archiving system with a records management system makes sense. In our portfolio we have the Integrated Archive Platform, which allows you to set up rules to capture e-mails and files automatically. The IAP doesn't just apply retention rules to them; it also maintains their evidential value by making them searchable and tamper-proof.

Through the integration of the IAP and HP TRIM you can still elevate the status of this non-record information to a business record, if and when required. At that point you add descriptive metadata to capture additional information about how the records were used, and you preserve their integrity and usability as part of a collection in the context of your business activities. If you capture any records right at the point of their creation, they are still stored in the IAP and take advantage of its secure and resilient storage.

The combination of the IAP archiving and the HP TRIM records management technologies allows you to build an uncluttered collection of high-value business records, without running the danger of having out-of-control non-records floating around your network forever.

Email Archiving: Choose Carefully…Very Carefully (Part 5)

By André Franklin

This is Part 5 of a 5-part series on email archiving. In Part 4, we discussed the first five of the following seven principles for choosing an email archiving solution:

  1. Thoroughly understand your email environment

  2. Set clear archiving goals that will still make sense in 5 years or more

  3. Examine scalability in all dimensions

  4. Don’t treat email archiving as a silo. Consider other applications that need (or will need) data archiving

  5. Favor solutions built on standard interfaces for investment protection

  6. Backup and/or replication is more important than any other single feature

  7. Seek references of companies that have similar needs

We’ll take a close look at principles 6 and 7 in this final installment of the series.


Backup and/or replication is more important than any other single feature

The ability to backup and protect the archive is critical. Now…this may sound like a contradiction. Isn’t an archive deemed “safe” by definition? Redundant drives? WORM-like features? Digital signatures and certified tamperproof? In a secure area?

It all sounds so secure…but as my dad used to say, “it ain’t necessarily so”.

Let’s say I have all of the photos of my daughter’s wedding backed up to a CD-ROM. Let’s also assume that I delete these photos from my computer to free up disk space.

One more assumption…

My three year-old grandson finds the CD-ROM on my desk. He thinks it is the most amazing shiny toy he has ever seen.

Need I say more?

Stuff happens.

Non-rewritable CD-ROM is WORM by definition. It’s long lasting. It’s tamperproof. But the one thing CD-ROM is NOT is indestructible. And so it is with an email archive: it may be the safest device on the planet on paper…but backup is not really backup until there are at least two instances of the data…and one instance must be physically removed from the other. A secure data center does not help if multiple failures in the archiving platform itself make data unrecoverable.

Choose an email archiving solution that offers a backup solution…and keep the backup media in a safe place away from the archiving platform.

Remote replication can ease backup requirements. Depending on the vendor…archive platform “A” can be a replication target for archive platform “B”…and vice versa. With a failover scheme and physical separation, many benefits can be realized including greater data protection, greater data availability, and increased disaster recovery. Backup windows for mail servers can be shortened…and backup frequency reduced. So…replication is NOT just about disaster recovery.

The bottom line on backup: You gotta have this! Period.

The bottom line on replication: If there is as little as a 5% chance you’ll need replication in the future…invest in a solution that has it available today. Again…migrating to another vendor’s platform down the road costs more than just money – as we have discussed in previous posts. Evaluate ALL of the benefits of replication before deciding against it…and don’t make the decision on cost alone.


Seek references of companies that have similar needs

It’s not sufficient to buy based on an analyst’s recommendation. Ask vendors for references from customers that “look and smell” like you. If you have 50,000 mailboxes to archive and have compliance and mailbox quota issues to address, ask for a reference of similar size that has solved similar problems with the vendor’s product. You may learn more in a single reference call than in several sessions with sales reps.

In conclusion, if you keep the above seven principles in mind when deciding on an email archiving platform, you are far less likely to find the need to migrate to a different platform in the future. There are no standards adhered to by archiving vendors to facilitate migration from one archiving platform to another. So…be wise and take your time to make sure your choice will meet your needs now…and for years to come.


What does Archive and Records Management really mean?

By Noel Rath 

I’d like to discuss archive management and records management and the widespread use of these terms. The Society of American Archivists states that archives management is “The general oversight of a program to appraise, acquire, arrange and describe, preserve, authenticate, and provide access to permanently valuable records.” It’s not about day-to-day business activities; it is about appraising, acquiring, and preserving records. Records are evidence of actions, decisions, and inaction; they are complete and must also be accessible, reliable, authentic, accurate, and inviolate.

There is one view that a record is created at the end of a process, i.e. when the letter or contract is finalized and therefore declared a record. Maybe this is the reason records management gets confused with archive systems. State Records NSW says that “Records are a valuable corporate asset that by their retention and reuse as evidence of decision making and business activity can improve both the efficiency and effectiveness of an organisation”. Records management is therefore an inextricable part of the day-to-day operations of a business.

The widely adopted ISO 15489 standard defines records management as the “field of management responsible for the efficient and systematic control of the creation, receipt, maintenance, use and disposition of records, including processes for capturing and maintaining evidence of and information about business activities and transactions in the form of records.”

So we can describe records management as the control of evidence of decisions and business activities for the life of those activities, from creation to ultimate disposition, i.e. destruction or archival, whichever concludes the records management process. The ISO definition also refers to the maintenance and use of these records. In the day-to-day operations of a business, these records of transactions have a life cycle and are the lifeblood of the business. So records management is inextricably tied to the business process, and thinking of records management in this way can open up for discussion the value of records management to the business.

For more information about the value of records management, download the whitepaper.


Federal Rule of Evidence 502: Help or Hype?


by Dean Gonsowski, Clearwell Systems, Inc.


There’s a lot of excitement (and corresponding uncertainty) about the recent passage of Federal Rule of Evidence 502 (FRE 502), which was signed into law on Sept 19th. The main reason the legal community is excited about FRE 502 is the potential for cost savings in the e-discovery review process, which is routinely viewed as the most expensive area of the entire e-discovery process.

In combination with the codification of a national standard to determine when a privilege has been waived, FRE 502 is primarily designed to make the use of claw-back agreements a truly viable prospect when doing e-discovery privilege review. It should (ideally) provide some relief from rapidly escalating e-discovery costs. Or, at least that was the impetus behind the rule’s creation, according to the Comments:

“The proposed new rule facilitates discovery and reduces privilege-review costs by limiting the circumstances under which the privilege or protection is forfeited, which may happen if the privileged or protected information or material is produced in discovery. The burden and cost of steps to preserve the privileged status of attorney-client information and trial preparation materials can be enormous. Under present practices, lawyers and firms must thoroughly review everything in a client’s possession before responding to discovery requests. Otherwise they risk waiving the privileged status not only of the individual item disclosed but of all other items dealing with the same subject matter. This burden is particularly onerous when the discovery consists of massive amounts of electronically stored information.”

In short, FRE 502 is designed to establish uniform, nationwide standards for waiver of attorney-client privilege and work product protection, with the main goal being to protect producing parties against the inadvertent disclosure of privileged materials or work product in either federal or state proceedings.  The salient section is subsection (b) which states that when a disclosure of privileged information is made in a federal proceeding or to a federal agency, the disclosure does not constitute a waiver if:

  1. the disclosure is inadvertent;
  2. the holder of the privilege or protection took reasonable steps to prevent disclosure; and
  3. the holder promptly took reasonable steps to rectify the error, including (if applicable) following Federal Rule of Civil Procedure 26(b)(5)(B).

The end game here is presumably to leverage automated review methodologies more heavily to save costs.  But facilitating this type of review without taking on unhealthy levels of risk means that claw-back provisions must be as airtight as possible to prevent inadvertent productions of electronically stored information (ESI).  And yet exactly how FRE 502 will work in practice is open to debate, since there isn’t any case law interpreting it yet.

One area that’s top of mind is how the new rule will impact recent decisions on e-discovery search, including the Victor Stanley case authored by Chief Magistrate Judge Grimm.  Since FRE 502 contains a core “reasonableness” prong in subsection (b), it’s likely that Grimm’s holding on e-discovery search will still be controlling.  Grimm fundamentally had to evaluate whether the producing party’s search protocols and procedures were in fact reasonable.

“Defendants, who bear the burden of proving that their conduct was reasonable for purposes of assessing whether they waived attorney-client privilege by producing the 165 documents to the Plaintiff, have failed to provide the court with information regarding: the keywords used; the rationale for their selection; the qualifications of M. Pappas and his attorneys to design an effective and reliable search and information retrieval method; whether the search was a simple keyword search, or a more sophisticated one, such as one employing Boolean proximity operators; or whether they analyzed the results of the search to assess its reliability, appropriateness for the task, and the quality of its implementation.” (footnotes omitted).

In Victor Stanley, the producing party wasn’t able to demonstrate reasonableness because they neither crafted their search strategy carefully nor conducted any sampling to make sure the e-discovery search worked as designed.  This type of analysis would still seem to come into play under FRE 502, so, as Grimm states, a best-practices or collaborative approach to e-discovery is as important as ever.

Given that backdrop, it’s just as important as ever that parties “show their work” when it comes to e-discovery search.  Whether FRE 502 will really make parties feel safe enough to use automated review processes (thereby reducing costs) remains to be seen.  But this first step toward unified standards and expectations is a very positive one.

This post on electronic discovery was originally published on Clearwell's e-discovery blog: "e-discovery 2.0"


IAP Retention Management Demystified “TechSeries” Part III


By Ronald Brown

The next important topic in retention management is “Litigation Hold” (or “Legal Hold”), an important part of e-Discovery.  This refers to preserving documents or emails during the course of a legal proceeding.  Usually, this requires documents to be retained past their normal retention cycle.  For example: your company has a standard retention policy of three (3) years, and you have configured your HP Integrated Archive Platform (IAP) system accordingly, so the Retention Manager will auto-delete documents and emails that are three (3) years old.  You now receive a request from internal counsel requiring you to preserve email correspondence for several individuals during the calendar year 2007 that contains the phrase "guaranteed return".  These individuals are called custodians; in e-Discovery, a custodian is an individual whose e-mail and data are subject to review.  You need to preserve this data “until further notice”.  What do you do to make sure this data is protected in 2010, when it would normally be deleted?

The feature of the IAP that allows you to process this legal hold request is called “Quarantine”.  The IAP has a unique way of handling these requests.  Some systems let you specify a “custodian” and a “timeframe” and then initiate the hold.  The IAP instead allows query-based legal holds, which give the administrator/compliance officer more granular control.  Basically, the administrator performs a query against the individual custodian repositories using the timeframe as a constraint and the terms "guaranteed return" in the “search for” field.  This allows the administrator to retain ONLY the documents that are relevant, not necessarily all documents.  This is important because you don’t want to retain documents that are not responsive to the legal request, and you don’t want to buy additional storage to retain data you don’t need.  With the custodian/timeframe method, documents that have nothing to do with the legal request may be held.  Additionally, if you happen to have an “audit” repository configured in the IAP, this becomes even easier, since you do not have to access the individual custodian repositories.  An “audit” repository is basically a super repository of all documents and emails in the archive.

So, once an administrator saves the results of the query, he or she can go to the Query Manager page in the PCC and hit the “Quarantine” button.  The hold is then executed in the background, and the administrator/compliance officer can go about their other daily tasks.

After the legal hold (a.k.a. Quarantine) has been completed, the documents/emails are safe until the hold has been “released”.  After the release, the documents are returned to normal control of the Retention Manager.

Another key feature of the system is that it allows for multiple holds to be executed against overlapping data.  In this case, you may have several administrators, compliance officers and paralegals performing legal holds on different but overlapping data sets.  If a specific document has had several holds applied to it, then that document will not be relinquished to normal control of the Retention Manager until “all” holds have been released.
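The overlapping-hold behavior described above can be sketched as simple reference counting. This is an illustrative model only, not the IAP's actual implementation or API; the class and method names are invented:

```python
# Hypothetical sketch of overlapping legal holds: a document returns to
# normal Retention Manager control only after EVERY hold on it is released.

class Document:
    def __init__(self, doc_id):
        self.doc_id = doc_id
        self.holds = set()          # IDs of active holds on this document

    def apply_hold(self, hold_id):
        self.holds.add(hold_id)

    def release_hold(self, hold_id):
        self.holds.discard(hold_id)

    @property
    def under_retention_control(self):
        # Eligible for normal auto-delete only when no holds remain
        return not self.holds

doc = Document("msg-001")
doc.apply_hold("case-A")            # compliance officer's hold
doc.apply_hold("case-B")            # paralegal's overlapping hold
doc.release_hold("case-A")
assert not doc.under_retention_control   # case-B still pending
doc.release_hold("case-B")
assert doc.under_retention_control       # back under Retention Manager
```

The key design point is that releasing one hold never exposes a document that another matter still needs preserved.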

Email Archiving: Choose Carefully…Very Carefully (Part 4)


By André Franklin

In part 3, we discussed seven principles. If these principles are observed, you are unlikely to ever need to migrate to a different archiving platform.

The seven principles are:

  1. Thoroughly understand your email environment

  2. Set clear archiving goals that will still make sense in 5 years or more

  3. Examine scalability in all dimensions

  4. Don’t treat email archiving as a silo. Consider other applications that need (or will need) data archiving

  5. Favor solutions built on standard interfaces for investment protection

  6. Backup and/or replication is more important than any other single feature

  7. Seek references of companies that have similar needs

We examined in detail principles 1 through 3 in part 3. Let’s examine a couple more principles in this post…

Don’t treat email archiving as a silo

We have heard from many users that email is the biggest pain with regard to implementing archiving, whether the archiving is for compliance purposes or simply to lighten the load on mail servers. As such, email is often the first archiving problem to be tackled. It’s a noble deed to take on the toughest problem first, but it’s not a wise one if future archiving needs are not taken into consideration.

What will you need to archive in the future?  Most environments have files. Many use Microsoft SharePoint to share departmental and corporate information and content. Then there are instant messaging systems, text messages, voicemail, and so on. There is also database data that can be selectively archived for improved database performance. To complicate matters, information management systems want to control what is stored, for how long it is stored, and who has access to the stored information. All of this must be taken into account when implementing an archive.

In an ideal world, one can perform a single search across a massively scalable archive to retrieve data of various types from email to media files to financial records, etc.

All future archiving needs should be considered at the time the first archiving problem is tackled. If an archiving solution does not address the breadth of application data that you want or will need to archive, you run the risk of having to migrate your archive data to a new, more scalable archiving platform in the future. As we have discussed in previous posts, “it ain’t gonna be pretty”, so make the right choices upfront.

Favor solutions built on standard interfaces for investment protection

Solutions built around standard interfaces mitigate certain risks with regard to data interchange -- if a migration ever becomes necessary. In addition to standard interfaces, solutions that expose well-documented APIs also mitigate risk, allowing you to roll your own solution and/or interface with other solutions and add-ons. You never really know everything you will want or need in the future, nor can you know of future products that will add value to your existing archiving investment. Standards and APIs help put the odds in your favor.

We’ll examine the remaining two of the seven principles in part 5 of this series.

Death and Taxes: Maybe one can be avoided!

by Mary Caplice

Benjamin Franklin famously said "In this world nothing is certain but death and taxes," but we’ve come across recent evidence from an HP customer that one of the two can be avoided, sort of (and it’s not death, sorry!). This customer has thousands of old, decommissioned applications that they’re keeping on ‘life support’ for compliance and regulatory reasons.

A conservative estimate is that it costs them approximately $10,000 per year in overhead alone to keep each one going, which adds up to $1 million per year per thousand old applications.  They compare it to paying property taxes on an empty house with nobody living in it. How do they plan to avoid paying needless ‘taxes’?  They plan to offload the data residing in these old apps to an XML archive using HP Database Archiving software, then shut down the old applications.

The ROI for the first year alone is HUGE! So, how high are your taxes?

Data Archival = Data Survival

by Ali Elkortobi

Data needs to survive technological evolutions because data may still be needed many years after its creation and active usage. Structured data is particularly challenging since it must be stored in a format that survives:
• Database evolutions and migrations
• Application versions and migrations
• Operating system evolutions and migrations
• CPU/Storage evolutions & migrations

We believe that XML is a great format for storing long-term, survivable data. XML has the advantage of being pure text, where data and its metadata (description of the data) are kept together. XML has become a widely used format adopted by many open source programs and utilities.

• XML is searchable via XQuery and/or text retrieval. Because we are storing structured data as XML, it can also be queried through SQL
• XML is verbose, but can be compressed at significant ratios. Ideally, queries should be possible on compressed XML

HP Database Archiving software fully complies with our proposal for the data life cycle and enables SQL queries on compressed XML. All you need is the data and an enterprise reporting tool. Encapsulating complex business object data inside one XML file opens an opportunity to apply records management to structured data as well as unstructured data. These XML files can be managed by a specialized records management tool such as HP TRIM.
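The idea that XML keeps data and metadata together, as pure text, in a compressible and tool-independent form can be shown with a tiny sketch using only standard-library modules. This is an illustration of the principle, not of HP Database Archiving software itself, and the field names are made up:

```python
# A database row serialized as XML: element names carry the metadata,
# element text carries the data, and the result is pure text.
import gzip
import xml.etree.ElementTree as ET

row = {"invoice_id": "1042", "status": "CLOSED", "amount": "199.00"}

record = ET.Element("invoice")
for column, value in row.items():
    ET.SubElement(record, column).text = value
xml_bytes = ET.tostring(record)

# Long-term storage: text compresses at significant ratios
archived = gzip.compress(xml_bytes)

# Years later: decompress and query with no trace of the original database
restored = ET.fromstring(gzip.decompress(archived))
assert restored.findtext("status") == "CLOSED"
```

Because nothing here depends on the database, application, or operating system that produced the row, the archived record survives all four kinds of migration listed above.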

Long live your data.

How long do you keep your backup tapes?


By Patrick Eitenbichler

Over the past week I talked to a number of customers and industry analysts to better understand whether backup tapes are kept for just a couple of months or for many years.  After all, the specs show that the lifespan of LTO media, for example, is 30 years.

To my surprise, I found that close to half of all customers keep only three months’ worth of data before the tapes get re-used and the data gets overwritten.  The other half uses tape as an archive medium and keeps the cartridges for several years (although I found no one who had a 30-year-old tape in their drawer  :-).

Question is...  Do backup administrators keep tapes for a certain period of time because "it's always been this way" -- or because they're following a recently updated backup strategy?  Through data classification and a proactive data protection and archiving strategy, users can achieve a multitude of goals -- all at the same time:

  • Reduce backup windows and simplify recovery:  If data is only kept for a couple of months, a disk-based backup solution such as HP's StorageWorks VLS or D2D using HP Data Protector software would make the most sense -- leveraging low-bandwidth remote replication to store data "off site", and deduplication to minimize storage costs.

  • After classifying data and determining WHAT data needs to be kept for the long term -- whether it's for compliance, e-discovery or corporate governance reasons -- decisions can be made about what data can be kept on tape (low cost, low energy consumption) vs. an archiving solution such as HP's Integrated Archive Platform (a single, searchable repository for all data types).

In short, users can reduce costs and increase IT productivity by calling a time-out and taking a closer look at "how long you keep your backup tapes" -- and WHY?


5 ways to improve email management

Today, 85 percent of business communications occur through email, as the average email user sends and receives 76 messages per day.  This tidal wave of email is creating enormous challenges for almost everyone touched by Microsoft Exchange or Lotus Domino messaging environments. If you’re a CIO, CSO, IT VP/Director, Director of Messaging, legal/compliance officer or a GC, then you know that compliance, email security and control, mail server performance optimization, storage TCO reductions, and e-Discovery preparedness are top of mind challenges that you must resolve.

Today, up to 90 percent of companies from small (1-100 users) to very large (10,000+ users) who have deployed the market leading email applications don’t have an archive solution.  Yet 100 percent of these companies face requirements, whether internal or external, that dictate the need to ensure that their intellectual property is secure, controlled, and available when needed.  Since you’re in the 100 percent category, can you ensure that your email messages are captured, protected, accessible, and managed? HP Email Archiving software (for both Microsoft Exchange and Lotus Domino environments) can help you overcome all of these challenges. It works exclusively with HP Integrated Archive Platform (IAP) to provide long-term retention and high-speed search and retrieval of messages and attachments to reduce the cost and business impact of e-discovery preparation, legal response, and regulatory compliance. This modular appliance approach eliminates the need to purchase separate archiving client software, servers, operating systems, indexing and search software, and content-addressable storage, helping enterprises improve:

Compliance: Many of the 20,000 compliance requirements across the globe require that you enable access to secure email. HP Email Archiving software with HP IAP helps reduce risk of non-compliance with Compliance Archiving (capture all sent and received email—before it becomes a PST), WORM on disk, encryption, and digital fingerprinting—all standard.

Security and control: 75% of corporate intellectual property is contained in email and other messaging applications. HP Email Archiving software with HP IAP enables email to be continuously controlled, secured, and protected with traceable audit trails and simple administration, all transparent to the end user.

Mail server performance: 183 billion email messages are sent daily. HP Email Archiving software with HP IAP lowers storage burden on the mail server, reducing mail server backup volume and speeding the backup and recovery processes.

Storage TCO: HP Email Archiving software with HP IAP reduces mail server storage burden. By reducing the number of users per mail server, server storage costs can be lowered and CAPEX can be deferred on mail server storage and upgrades.  

E-Discovery readiness: Many companies recognize that the cost of a solution enabling email security and e-Discovery readiness is far lower than legal costs associated with either an internal or external audit or legal discovery event. HP Email Archiving software with HP IAP ensures that the #1 culprit in legal discovery cases (email) is securely archived, searchable, and accessible.  By finding secure email quickly and easily, you’ll reduce the demand on IT, lower internal costs, and eliminate the need for expensive outside consultants.

For more information, visit

New Information Heroes on the IM Digital Hub

By Steve Fink 

We have added five new Information Heroes to our Information Management Digital Hub.  Click here to check out the new members of the Information Hero community.  You’ll find:

  • Marty Loeber who oversees e-Discovery for Valero's legal department, discussing how the USA’s largest oil refiner is lowering the cost of e-Discovery and improving outcomes using HP’s Integrated Archive Platform.

  • David Cohen, co-chair of one of the USA’s largest e-Discovery practices for national law firm K&L Gates, discussing why the link between IT and corporate counsel is key to managing both e-Discovery expense and outcomes.

  • Sue Derison, Director of Information Systems for Forsyth County Schools in Georgia, discussing how they are responding to growth, adapting to changing regulation and saving money by using HP TRIM to manage records.

  • Randy Kahn of Kahn Consulting Inc. talking about the importance of e-Discovery in a down economy.

  • Mark Saussure, Director of Digital Libraries for Penn State, discussing how Penn State is working to drive XAM deployment and improve university information management utilizing HP’s Integrated Archive Platform.

Let us know what you think -- especially if you'd like us to publish best practices from other Information Heroes.

Using Replication with the HP Integrated Archive Platform

By Linda Zhou

The HP Integrated Archive Platform (IAP) supports local and remote replication. There are two replication methods for copying data between two IAP systems: one-way replication and cross replication. These techniques are in addition to the disk-level mirroring built into the IAP.

To illustrate how the replication works, let's look at two examples. Consider two IAP systems: one IAP system, IAP-USA, is in New York City, and the other IAP system, IAP-UK, is in London, UK.  We designate IAP-USA as the master and IAP-UK as the slave. User permissions are maintained at the master and replicated to the slave, for both one-way and cross replication.

One-way Replication

In this scenario, IAP-USA archives emails, but IAP-UK is dedicated only to replicated data from IAP-USA.

IAP-USA first archives emails into its Smartcells. These Smartcells are grouped in primary and secondary pairs. IAP-USA then sends its Smartcell data to IAP-UK to replicate the archived emails. IAP-UK has two options to store the replication data: one Smartcell or a pair of mirrored Smartcells. Email owners and compliance users can search emails in both IAP-USA and IAP-UK. This replication method is also called active-passive replication because IAP-USA is actively ingesting new emails and IAP-UK is passively replicating IAP-USA.

Cross Replication

In this scenario, both systems are archiving new emails, each replicating to the other system. For example, IAP-USA might archive the emails of North American users, while IAP-UK is responsible for archiving European users’ emails. The archived emails are stored in the primary and secondary Smartcells in IAP-USA and IAP-UK. Both email owners and compliance users can search their emails in IAP-USA and IAP-UK. This replication method is also called active-active replication because both IAP-USA and IAP-UK are actively ingesting new emails.

Replication Rates

The effective rate of replication depends on the rate at which new emails are archived and the available network bandwidth between the two IAP systems. Because new email peaks during business hours, when less bandwidth may be available, replication may fall behind. Generally, this is no cause for alarm, as the IAP will catch up during periods of reduced network traffic and email volume (e.g., overnight). However, if the replication backlog is consistently growing, the administrator should consider increasing the network bandwidth available for replication.
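The build-up-then-catch-up behavior of a replication backlog can be modeled with simple arithmetic. The following back-of-the-envelope sketch is illustrative only (the rates are invented, and this is not IAP code); it shows a backlog growing during business hours and draining overnight:

```python
# Hour-by-hour backlog model: backlog grows when ingest exceeds the
# replication link's sustained rate, and drains when traffic drops.

def backlog_over_day(ingest_gb_per_hr, replicate_gb_per_hr):
    """Both arguments are 24-element lists of GB/hour; returns the
    end-of-hour backlog (GB) for each hour of the day."""
    backlog, history = 0.0, []
    for hour, ingest in enumerate(ingest_gb_per_hr):
        backlog = max(0.0, backlog + ingest - replicate_gb_per_hr[hour])
        history.append(backlog)
    return history

# Heavy email 9am-5pm, quiet otherwise; link sustains 6 GB/hr all day
ingest = [1] * 9 + [10] * 8 + [1] * 7
history = backlog_over_day(ingest, [6] * 24)

assert max(history) > 0      # backlog builds during business hours...
assert history[-1] == 0      # ...and is fully drained by end of day
```

The "consistently growing" warning sign in the paragraph above corresponds to the case where `history[-1]` is larger at the end of each successive day, meaning the overnight window can no longer absorb the daytime deficit.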


Email Archiving: Choose Carefully… Very Carefully (Part 3)

By André Franklin

In part 2 of this topic, we raised the question, “How DOES one choose carefully?” We listed several basic principles to consider when choosing. If these principles are observed, you are unlikely to ever need to migrate to a different archiving platform.

The principles we listed were:

  1. Thoroughly understand your email environment.

  2. Set clear archiving goals that will still make sense in 5 years or more.

  3. Examine scalability in all dimensions.

  4. Don’t treat email archiving as a silo. Consider other applications that need (or will need) data archiving.

  5. Favor solutions built on standard interfaces for investment protection.

  6. Backup and/or replication is more important than any other single feature

  7. Seek references of companies that have similar needs.

Let’s examine a few of these principles in this post…

Thoroughly understand your email environment

Without a good understanding of your mail environment, you’ll be playing roulette when it’s time to purchase an archiving solution. You must know certain basics: storage capacity in terabytes, messages per second, average message size, maximum message size, etc. You need to make sure the proposed archiving solution can keep up with traffic in your mail environment, and that it is of sufficient size to accommodate all of your terabytes of messages and attachments… with room for growth over the next several years.
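Those basics feed directly into a sizing estimate. Here is a rough, hypothetical sketch of that arithmetic; every input figure (user count, message volume, peak factor, growth rate) is an invented example, not a vendor recommendation:

```python
# Back-of-the-envelope archive sizing from the basics listed above.

def size_archive(users, msgs_per_user_per_day, avg_msg_kb, years, growth_rate):
    daily_msgs = users * msgs_per_user_per_day
    # Assume traffic concentrates in an 8-hour day with a 3x peak factor
    peak_msgs_per_sec = daily_msgs / (8 * 3600) * 3
    # KB/day -> TB/year (1 TB = 1024**3 KB)
    tb_per_year = daily_msgs * avg_msg_kb * 365 / 1024**3
    # Compound annual growth over the planning horizon
    capacity_tb = sum(tb_per_year * (1 + growth_rate) ** y for y in range(years))
    return peak_msgs_per_sec, capacity_tb

peak, tb = size_archive(users=5000, msgs_per_user_per_day=76,
                        avg_msg_kb=100, years=5, growth_rate=0.2)
print(f"peak ingest ~ {peak:.0f} msgs/sec, 5-yr capacity ~ {tb:.0f} TB")
```

The point of the exercise is the shape of the calculation, not the numbers: an archive sized only for today's capacity, with no peak-rate or growth terms, is exactly the roulette the paragraph warns about.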

Set clear archiving goals that will still make sense in 5 years or more

Are your goals strictly compliance or e-discovery oriented in which you will remove data from the archive after “x” number of years? Are you trying to offload data from mailservers…and expect archived data to grow and grow and grow over the years? Maybe you require all of the above?

If the solution you buy today is already designed to address the picture you have in 5 years… you have reduced much of the risk associated with the purchase of an archiving solution.

Examine scalability in all dimensions

One of the big mistakes many make is assuming that capacity equates to scale. Just because a storage or archiving device can store the amount of data you want does NOT mean you’ll be able to easily retrieve it when needed. Archives must be designed to ingest data quickly (keeping up with traffic in your mail environment) while allowing rapid search access to archived data. With some solutions, message retrieval using search can take hours or days… which may be quite unacceptable if a judge is waiting for you to produce information based on a number of search parameters.

Before you buy… examine scalability in ALL dimensions:

  • Search performance (retrieval time) for the environment size you envision in 5 years

  • Archiving performance (the number of “typical” messages per second that can be archived)

  • Capacity (the number of terabytes of messages the archive can hold)

Stay tuned…more on these seven principles in part 4 of this series…

IAP Retention Management Demystified “TechSeries” Part II

By Ronald Brown 

So… in the last part we introduced some of the places where you can set retention in the IAP.  The first place was the Domain.jcml configuration file, and the second was the PCC (Platform Control Center).

Let’s start with the Domain.jcml file.  Among all the other settings in this file, a few relate specifically to document retention.  This file is typically configured during the initial installation or implementation of the product, as some of its settings are used during the start-up phase.

The following settings exist here:


The best practice for configuring the DomainRetentionPeriodDays setting is to use the “minimum” required document retention period that the company imposes on documents stored in the IAP.  This is important because, when the system calculates whether a document has met its retention period, it uses the “greater” of the domain and repository retention values.  Additionally, you can never set a user’s repository retention to a value lower than this.

The next two settings…


…refer to the “default” retention period that a user’s repository receives when that user is created or “imported” into the system.  There really is no official difference between a Regulated user and an UnRegulated user, except that they can have different retention values for their repositories.  You can assign a user to either group or move them between groups.  Typically, customers assign regular users to “UnRegulated” and users under more stringent regulations to “Regulated” -- for example, users involved in broker-dealer activities.

The next two settings:


…allow you to prevent documents from being deleted by user-initiated actions before a certain timeframe.  The IAP provides functionality which, when enabled, allows user-initiated deletes.  In most environments, this facility is not used, as it is counterintuitive to the concept of an archive.

What do you think?  Should an archival platform allow end-users to be able to delete documents?  In what cases?

The last setting…


…is used to determine what document date will be used to calculate retention.  The two choices are:


IngestDate is the date the document was archived into the IAP.  SendDate is an interesting one: in the case of an email, it is the date the email was “Sent”, and in the case of other documents or files, it is the date it was “last modified”.

The settings in the PCC (Platform Control Center) are much easier to understand and are really only a subset of the settings in the Domain.jcml file.  These settings are mostly intuitive, since they are GUI-based, and allow you to:

    1. Reset the Domain retention period (but not lower than was originally set)

    2. Change the Repository retention period for any user (Regulated or UnRegulated)

    3. Disable Retention (This is a misnomer, as you are not disabling “Retention”, but disabling the auto-delete capability of the Retention manager.)

Let’s go back quickly to the concept of “disposition” -- what happens to a document after it has met its retention period.  Currently in the IAP, the only disposition for such a document is “auto-delete”: a process that runs nightly, determines which documents are eligible for deletion based on the retention settings, and then removes those documents from the archive and indexes.  So when you disable “Retention”, you are really just preventing the Retention Manager from running the auto-delete process.
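The eligibility test the nightly process performs can be sketched in a few lines. This is an illustrative model, not IAP code; the dictionary keys mirror the IngestDate/SendDate choices discussed above, but the function and its signature are invented:

```python
# Nightly auto-delete eligibility: effective retention is the GREATER of
# the domain and repository values, measured from the chosen date basis.
from datetime import date, timedelta

def eligible_for_delete(doc, domain_days, repo_days, basis, today):
    effective = max(domain_days, repo_days)   # the "greater of" rule
    start = doc[basis]                        # "IngestDate" or "SendDate"
    return today - start > timedelta(days=effective)

doc = {"IngestDate": date(2006, 1, 10), "SendDate": date(2006, 1, 5)}
today = date(2009, 1, 8)

# 3-year domain policy (1095 days) beats a 2-year repository setting (730)
assert not eligible_for_delete(doc, 1095, 730, "IngestDate", today)  # 1094 days: keep
assert eligible_for_delete(doc, 1095, 730, "SendDate", today)        # 1099 days: delete
```

Note how the same document can be eligible under one date basis and not the other, which is why the choice of basis matters when the retention policy is written.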

In the next part, we will discuss the system’s ability to put documents on “Litigation Hold”, bypassing the Retention Manager.  This function is crucial when deploying an archive with auto-delete capabilities.

Archiving Databases: Throwing the Baby Out with the Bathwater

By Mary Caplice

When archiving data from relational databases for either compliance or performance reasons, it is standard practice to archive at the business-transaction level of granularity rather than at the table, block or partition level.  In most cases it’s very straightforward to model a transaction for archiving so that the transaction is moved intact without leaving parts of it behind -- except where many-to-many relationships cause transactions to become ‘chained’ to other transactions.

For example, you could have an application containing customers, invoices and payment information. At first it may look like all invoices with a status of ‘CLOSED’ older than one year can be archived. However, payments can be linked across several invoices, and in some cases partial payments are keeping some of those invoices ‘OPEN’. All invoices across the chain of payments must be considered, and any open invoice in a chain should disqualify the entire chain. Without support for chaining, the integrity of the application can be compromised.

Unless your archiving solution can chain these transactions together correctly, you could be left with dangling transactions! 
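The chaining rule described above is essentially a connected-components problem: invoices linked by shared payments form a chain, and one open invoice disqualifies the whole chain. Here is a minimal sketch under that interpretation (illustrative only, not HP Database Archiving's engine; the data model is invented):

```python
# Invoices linked by shared payments are walked as one chain; a chain is
# archivable only if EVERY invoice in it is CLOSED.
from collections import defaultdict, deque

def archivable_invoices(invoice_status, payment_links):
    """invoice_status: {invoice_id: 'OPEN'|'CLOSED'};
    payment_links: (a, b) pairs of invoices sharing a payment."""
    graph = defaultdict(set)
    for a, b in payment_links:
        graph[a].add(b)
        graph[b].add(a)

    archivable, seen = set(), set()
    for inv in invoice_status:
        if inv in seen:
            continue
        # Walk the whole chain with a breadth-first search
        chain, queue = set(), deque([inv])
        while queue:
            cur = queue.popleft()
            if cur in chain:
                continue
            chain.add(cur)
            queue.extend(graph[cur])
        seen |= chain
        # One OPEN invoice disqualifies the entire chain
        if all(invoice_status[i] == "CLOSED" for i in chain):
            archivable |= chain
    return archivable

status = {"I1": "CLOSED", "I2": "OPEN", "I3": "CLOSED", "I4": "CLOSED"}
links = [("I1", "I2")]               # I1 and I2 share a partial payment
assert archivable_invoices(status, links) == {"I3", "I4"}
```

Note that I1 is CLOSED yet cannot be archived, because moving it would orphan the payment it shares with the still-open I2 -- exactly the dangling-transaction problem.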


Compliance – Importance of Green House Gas Regulations

By Reiner Lomb

Compliance continues to rise in importance to the business and to IT. According to the 2008 compliance survey by TechTarget, more than two-thirds of respondents said compliance is more important at their organization this year, and half said they will spend more on IT in the next twelve months to support compliance.

While a broad range of regulations drives the need for increased compliance investments, one new type of regulation needs attention. Based on worldwide efforts to slow global climate change, new regulations are underway to reduce Green House Gas (GHG) emissions. These regulations are emerging at both the state and national levels.

The Western Climate Initiative (WCI) is an example of a regional carbon based cap and trade program. WCI Partners include the western states of the United States and Canada.  WCI partners will begin reporting emissions in 2011 for emissions that occur in 2010.

Australia, one of the countries that signed the Kyoto protocol, provides an example of new greenhouse gas regulation at the national level. On July 1, 2008, an estimated 700 Australian companies kicked off the greenhouse gas emissions reporting process under the National Greenhouse and Energy Reporting Act (NGER) in preparation for the introduction of an emissions trading scheme in 2010.

Other countries that signed the Kyoto protocol, or that will sign a still-to-be-negotiated follow-up agreement, have implemented or will implement similar regulations. This will drive the need for companies worldwide to understand the requirements and deploy GHG emission management solutions.

Over the next weeks and months I will explore what Information Management requirements need to be addressed by GHG emission management solutions. Stay tuned…
