Business Case: Finding the needle in the e-haystack
Written by Vawn Himmelsbach
Issue Date: August 2007
In the past, corporate counsel hunting for the proverbial needle in a haystack simply searched through filing cabinets or boxes to find relevant documents, such as contracts or letters. But in today’s electronic world, they often have to search through employee e-mail, servers, and back-up tapes as well.
“They can’t just browse through 100 people’s e-mail boxes looking for an e-mail that might be relevant,” says Martin Felsky, CEO of Commonwealth Legal.
In one case, a client had collected 40 terabytes of data, which is twice the size of the U.S. Library of Congress. “Corporate counsel definitely need tools to find documents that might be relevant to a case.”
Most companies don’t have those tools, although they may have basic building blocks in place. If a company is using Microsoft Outlook as its e-mail program, for example, corporate counsel can use Outlook’s built-in search capabilities. But it only searches the body of an e-mail, not attachments.
“It’s not designed for litigation purposes where I need to make sure that I search everything,” says Felsky.
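The gap Felsky describes, a search that looks at message bodies but not attachments, can be illustrated with a minimal sketch in Python using the standard library's email package. The function names and the sample message are hypothetical, and the sketch reads only plain-text parts; attachments such as Word or PDF files would need dedicated text-extraction tools that are not shown here.

```python
from email.message import EmailMessage


def message_mentions(msg, keyword):
    """Return True if keyword appears in any text part of msg, attachments included."""
    needle = keyword.lower()
    for part in msg.walk():
        # Only text parts are decoded here; binary attachments are skipped.
        if part.get_content_maintype() == "text":
            if needle in part.get_content().lower():
                return True
    return False


def body_mentions(msg, keyword):
    """Return True only if keyword appears in the main body, ignoring attachments."""
    body = msg.get_body(preferencelist=("plain",))
    return body is not None and keyword.lower() in body.get_content().lower()


# Hypothetical sample: the relevant word appears only in the attachment.
sample = EmailMessage()
sample["Subject"] = "Follow-up"
sample.set_content("Please see the attached notes.")
sample.add_attachment(
    "Draft severance terms for review.", subtype="plain", filename="notes.txt"
)
```

In this sample, a body-only search for "severance" finds nothing, while the search that walks every part of the message finds it in the attachment.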
Another problem is something he refers to as the “moral hazard.” If a manager is accused of sexual harassment in the workplace, for example, it’s possible he or she will delete or conveniently forget to forward any relevant e-mail to corporate counsel. “You have to take control of the search for these documents,” says Felsky. “You can’t let employees of the company determine what’s relevant and what’s not relevant.”
In 2006, the U.S. brought out new rules of civil procedure to deal with e-mail, and although Canada doesn’t have similar legislation at this point, the implications are global, says Ross Armstrong, senior research analyst with Info-Tech Research Group. The legislation is intended to reduce the length of civil cases, which are often drawn out for months when e-mail is subpoenaed as evidence.
If the defendant is unable to produce these subpoenaed documents in a certain time frame, judges will often rule against them. So, while e-mail evidence is all-important, says Armstrong, being able to find those e-mails quickly is just as critical.
One interpretation of data mining is numeric mining, which involves aggregating and extrapolating statistics and trends. The other, and slightly newer, interpretation is text mining, which uses words or concepts to find documents that might be useful, says Mike Savage, a partner in fraud investigation and dispute resolution at Ernst & Young.
Text mining is much more efficient than manual methods, he says, because you’re using computer horsepower rather than intellectual horsepower. “You’re using the ability of the computer to work all night hunting through search strings and come back with this word close to that word,” he says.
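The "this word close to that word" search Savage describes is often called a proximity search. As a rough illustration only, and not a description of any firm's actual tools, it might be sketched in Python as follows. The function names, the five-word default gap, and the dictionary of message bodies are all assumptions made for the example.

```python
import re


def words_near(text, first, second, max_gap=5):
    """Return (i, j) word positions where first and second lie within max_gap words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [j for j, t in enumerate(tokens) if t == second.lower()]
    return [(i, j) for i in first_pos for j in second_pos if abs(i - j) <= max_gap]


def search_messages(messages, first, second, max_gap=5):
    """Return the ids of messages (a dict of id -> body text) with a proximity hit."""
    return [
        msg_id
        for msg_id, body in messages.items()
        if words_near(body, first, second, max_gap)
    ]
```

Given a dictionary of message bodies, search_messages(messages, "contract", "audit") would return only the messages in which the two words appear within five words of each other, which narrows the pile a reviewer then has to read.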
Human beings are more fallible and might gloss over the keyword they’re looking for. Computers, on the other hand, match every occurrence of the search terms they are given. And today’s software is good at finding deleted files; Microsoft Windows may delete the cross-reference, but it hasn’t wiped the file off the hard drive.
And this can help corporate counsel find that smoking gun e-mail. Or, in some cases, they’re able to clear a client using e-mail correspondence. “Sometimes the defence lies in understanding the context more than just the little sound bite,” says Savage.
KPMG uses data clustering software tools to extract words or concepts from electronic files, including Word documents, Excel spreadsheets, PowerPoint presentations, and e-mail.