Conquering the Data Mountain

The desktop computer revolution and the arrival of the Internet have produced a seemingly endless amount of data. It is technically possible now to keep virtually every electronic record that comes into existence.

The ability to create, maintain and use this huge volume of data raises important technical and legal issues. Call it the data mountain challenge: a contemporary dilemma with broad-reaching implications, faced by nearly all industries and many government agencies.

So why not just get rid of the data that we think we don't need? Because there are costs and other burdens associated with attempting to eliminate electronic records on a selective basis.

In many instances, there may be legal constraints on attempts to destroy electronic records. Most businesses and institutions keep vast and ever-increasing quantities of outdated and useless records, which, if anyone looked closely, include many records that could, in the light of litigation, be viewed as inappropriate, embarrassing or absolutely devastating.

There have been several recent, headline-grabbing lawsuits and investigations about alleged corruption by corporations and individuals in which electronic records (especially e-mails) have served as vital evidence for one side or another.

Today's law schools teach every student the fundamentals of electronic research, and virtually every lawyer's office now relies on those capabilities. Judges increasingly are comfortable with electronic discovery and know about highly efficient and cost-effective search and "mining" techniques that have made it possible to sort vast quantities of data effectively.

As a result, the data mountain -- especially in the legal world -- no longer is an impossible summit to scale but a vast database that can be mined for secrets and insights that previously were unavailable.

But is it wise to keep excessive data? Should it be -- and can it be -- destroyed? And if so, how?

Retain or delete?

Even if a business has a document retention policy and employees apply the policy correctly by doing their housekeeping (deleting old e-mails and ridding disks of documents), making the data truly disappear is not easy.

"Deleted data" can continue to exist nearly forever in forms that range from immediately available to quite costly to recover -- but recoverable nevertheless. Unlike the infamous 18-minute gap in the Nixon Watergate tapes, even "deleted" e-mail, voice mail and other electronic records may be restored with enhanced computer-data recovery techniques.

Discarded data can lurk in a number of spots that the average user may not even know about. Take the personal computer running Microsoft Windows, for example. Computer professionals can find data in a variety of places including:

"Recycle Bin." Just as with a real trash can, if you accidentally toss something in the Recycle Bin, you can retrieve it.

Even when a user "empties" the Recycle Bin, the electronic data persists. A computer, just like a card catalog in a library, keeps track of shelf space, with a map that indicates which slots on the shelves are occupied and which are empty.

If we tell a computer to "really" delete a document (rather than merely to move it to the Recycle Bin), the computer's map of shelf space is updated to indicate that a spot on the shelves is available for another document. The document itself stays on the shelf until something else is stored in that spot. The catalog "card" used to index the document is not fully destroyed but just marked as "gone." Because such documents often can still be identified by their descriptive names, a computer expert may be able to reconstruct other information about them (such as the date that a document was created and the last time it was modified).

Temporary copies. These often are created during word-processing sessions to archive a document in progress, in case the computer crashes. This means that "shadow" versions of documents exist on the computer's hard drive, in places that the user does not know about but that computer forensics experts do know about. And to twist an old phrase, the shadow knows -- and it will tell on the user.

"Carbon-copy" e-mails. The rapid and uncontrolled dissemination of electronic data through "cc" e-mail lists creates additional problems. Even if one user is diligent in deleting copies of a document, everyone on the list must do likewise or the document never will be safely deleted.

Electronic "shredding" of documents, like shredding of paper, can make reconstruction of records very difficult, and several commercially available tools can help. But such shredding cannot entirely eliminate the possibility of reconstruction. Disk shredders typically attempt to address the persistence of deleted data by "overwriting": swap file residue, deleted files and file names (any of which may contain all or part of deleted documents) are overwritten with random data. Still, overwriting data may not be enough. Experts have shown that magnetic traces may be recoverable (like holding up to the light a "whited-out" typewritten page to see the faint images under the dried goo). It may be effectively impossible to sanitize storage locations by simply overwriting them, no matter how many overwrite passes are made or what data patterns are written.
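For readers who want to see the difference between "deleting" and "shredding" in concrete terms, here is a minimal sketch in Python. It is a toy model only, not how Windows or any real file system is implemented: the ToyDisk class and its shelves, occupied map and catalog are hypothetical names chosen to mirror the library analogy above. The sketch also cannot represent the magnetic traces that experts say may survive an overwrite.

```python
import os


class ToyDisk:
    """Toy model of a disk (hypothetical; not a real file system)."""

    def __init__(self, slots):
        self.shelves = [b""] * slots     # raw storage slots (the "shelves")
        self.occupied = [False] * slots  # map of which slots are in use
        self.catalog = {}                # name -> {"slot": int, "deleted": bool}

    def write(self, name, data):
        # Use the first free slot; raises ValueError if the disk is full.
        slot = self.occupied.index(False)
        self.shelves[slot] = data
        self.occupied[slot] = True
        self.catalog[name] = {"slot": slot, "deleted": False}

    def delete(self, name):
        # Ordinary delete: mark the card as gone and free the slot,
        # but leave the contents of the slot untouched.
        entry = self.catalog[name]
        entry["deleted"] = True
        self.occupied[entry["slot"]] = False

    def shred(self, name):
        # "Shredding": overwrite the slot with random bytes, then delete.
        slot = self.catalog[name]["slot"]
        self.shelves[slot] = os.urandom(len(self.shelves[slot]))
        self.delete(name)

    def visible_files(self):
        # What an ordinary user sees: only cards not marked as gone.
        return [n for n, e in self.catalog.items() if not e["deleted"]]

    def raw_contents(self, name):
        # What an examiner sees by reading the slot directly. The result
        # is only meaningful until the slot is reused by a later write.
        return self.shelves[self.catalog[name]["slot"]]


disk = ToyDisk(slots=4)
disk.write("memo.doc", b"Q3 forecast draft")
disk.write("notes.doc", b"meeting notes")

disk.delete("memo.doc")   # ordinary delete
disk.shred("notes.doc")   # overwrite, then delete

print(disk.visible_files())            # the user sees no files
print(disk.raw_contents("memo.doc"))   # the deleted memo is still readable
print(disk.raw_contents("notes.doc"))  # the shredded notes are random bytes
```

In this model the deleted memo can be read back in full, while the shredded notes come back as random bytes of the same length. A real shredding tool faces complications the toy ignores, including the temporary copies and e-mail copies described earlier.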

For the total elimination of data, most computer professionals would advise following the practices of the U.S. military for classified information stored on any magnetic media: It is destroyed, usually by melting down the data-carrying media -- an extreme but ultimately effective method.

Important concern

Whether you view the growing volume of data as a source of improved efficiency, insight and productivity in business or one of burden and pressure in litigation, it seems clear that the trend toward ever-increasing creation and use of electronic records is unstoppable.

Electronic records management, therefore, should be a major concern for every major business. So make it a priority.

Clearly identify categories of records that should not be retained. Review of legal constraints on document destruction should be an essential part of this task.

Recognize that document-destruction practices, once initiated, may be difficult to stop. Have clear contingency plans that will permit the business to preserve data whenever a dispute arises that may require use of the records, and certainly when a specific demand has been made for records in a lawsuit or by a regulatory agency.

Pay particular attention to concerns about privileged or highly confidential records.

Take steps to ensure that documents that should be destroyed truly are. Identify an "official" location for important records and encourage employees to discard all drafts and copies of the records, other than those in the official location.

Finally, recognize that there are no magical solutions to the problems of managing the data mountain. But by employing the latest practices and by implementing clear policies, you can make the data mountain work for you.

© Copyright 2003 Star Tribune. All rights reserved.

February 2003

As seen in:

Star Tribune
