Search Terms for Audio: Iterate Your Way to Success

We’re involved in several high-profile matters at the moment, each with thousands of hours of audio, some of it in multiple languages. And I sat through a project team meeting today where we were discussing the set of search terms that one of the law firms had developed to start running against these audio files.

What transpired during this meeting is so common that I thought I would pass it along. You see, quite understandably, the law firm that developed the search terms took them directly from the set of terms that had been developed for the email search. But the reality is that people tend to speak very differently than they write, so I spent a good thirty minutes going over the search terms and providing suggestions to shrink the list and make it more realistic.

Confidentiality prevents me from using any of the real terms from this case, but here are some illustrative examples:

  • People don’t talk like they “text”. You’ll never (well, seldom) hear someone actually say “LOL” or “TTFN”. (Although I have been known to say “WTF” from time to time!) Granted, these aren’t likely to be meaningful search terms themselves, but other such abbreviations that may be used in emails between traders will have a different spoken equivalent.
  • Proper names, especially people’s names, tend to morph quite a bit in spoken form from what you may see in email. Around the office people call me “Der Schlueter” with a really bad German accent. But I can’t remember the last time anybody used either Jeff or Schlueter in an email. Names are often omitted because the recipients are assumed based on the addresses used.
  • Certain types of information have only one form in which they would typically appear in text, but could be spoken in many different ways. Numerical data is like this. Somebody may purchase 1,900 shares of a security, but the trader might say “one thousand nine hundred” or “nineteen hundred”, which in an audio search are two totally different constructs.
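To make the numeric example concrete, here is a minimal sketch of how one written number could be expanded into the spoken forms a reviewer might need to search for. The function names (`spoken_variants` and its helpers) are hypothetical and do not belong to any audio search product; the sketch covers whole numbers from 0 to 9,999 only, and real speech will include further variants such as “nineteen oh five” or “one point nine K”.

```python
# Illustrative only: expand a written number into likely spoken forms.
# All names here are assumptions made for this example.

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def under_100(n):
    """Spell out 0..99, e.g. 42 -> 'forty two'."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def under_1000(n):
    """Spell out 0..999, e.g. 900 -> 'nine hundred'."""
    hundreds, rest = divmod(n, 100)
    if hundreds == 0:
        return under_100(rest)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {under_100(rest)}"


def spoken_variants(n):
    """Return the spoken forms of n (0..9999) that this sketch knows about."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only handles 0..9999")
    thousands, rest = divmod(n, 1000)
    if thousands == 0:
        return [under_1000(n)]

    # Standard form: "one thousand nine hundred"
    standard = f"{ONES[thousands]} thousand"
    if rest:
        standard += " " + under_1000(rest)
    variants = [standard]

    # Colloquial form: "nineteen hundred". Skipped for round thousands,
    # since "twenty hundred" is not something people normally say.
    hundreds, tail = divmod(n, 100)
    if hundreds % 10 != 0:
        colloquial = f"{under_100(hundreds)} hundred"
        if tail:
            colloquial += " " + under_100(tail)
        variants.append(colloquial)
    return variants


print(spoken_variants(1900))
```

Each returned phrase would be entered as its own search term, since an audio search treats the phrases as unrelated.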

During the course of the aforementioned meeting, one of the review team leaders finally came up with the suggestion that I had been hinting at all along: instead of spending a lot of time THINKING about what the search terms should be, the better approach is to simply start searching with a few of the most realistic and highly probable terms that will bring up the responsive files. Then start listening to these files to get a better understanding of the language used and which terms will be the most relevant for searching.

You don’t have to listen to hundreds of hours to do this. In my experience, listening to just one hour of different calls for each major custodian will give you a great idea of the best terms to use.  Develop the term list from there, do some searching, and listen to some more.  You may come up with another set of terms that you can then add to your search criteria and iterate through again. This iterative process is what will help you round out your search term list and be confident in the results.


Audio Discovery: The Real Costs of Human Review

Two excellent reports have come out in the last year or so that address a pair of related issues: the growing costs of e-discovery, and the use of technology-assisted review to help curtail those costs. While neither one addresses audio discovery specifically, the general thesis still applies: technology really can help you do things better, and cheaper. Who doesn’t like better and cheaper?

Well, there is actually an answer to that question which I’ll get back to in a minute. But first, a bit more detail on the two reports I mentioned.

The first is an article from the Richmond Journal of Law and Technology by Maura Grossman and Gordon Cormack. The link will download the entire article for you so I’ll spare you the legal citations, and in a short blog entry I have nowhere near the space to cover all the points. But I will quote the first two sentences in the Conclusion of the report:

Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.

Why does manual review fare so poorly in this competition? Lots of reasons, but a big piece of it is reviewer fatigue, and also that reviewers make mistakes and often don’t agree on the significance of what they’ve read. Shocking! Not everyone thinks alike.  Go figure.

The second report, from the RAND Institute for Civil Justice and titled “Where the Money Goes”, looks at the cost elements involved in discovery. Again, the link is there for you to download a summary or the whole report, so I want to focus on just one element. When looking at the costs of producing electronic documents, their finding was that 73% of the cost came during the Review component of the EDRM. What does that really mean?

It means that no matter how much people gripe about charges from the e-discovery vendors, it’s still all those in-house and outside attorneys, paralegals and other folks who are eyeballing the documents that drive the total cost in the process. And as with the Grossman article, the RAND report provides evidence that technology can help make the whole process better, and cheaper.

How does this apply to audio discovery? For years, if any party presented or requested large bodies of audio evidence for discovery, the expected process for managing this discovery was human review. And it generally takes about 4 hours of human time to review each hour of audio. So if even a bargain-basement contract attorney makes $75/hour, that’s $300 in review costs per hour of audio. Even a fairly small 1,000-hour project would create a $300,000 cost, and most of the time the parties would just cry “unduly burdensome” and whisk it under a rug.
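For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines. The function name `manual_review_cost` and its parameter names are made up for this illustration; the defaults are simply the rough figures quoted above (about 4 reviewer hours per hour of audio, at $75/hour), not measured values.

```python
# Illustrative only: back-of-the-envelope cost of manual audio review.
# Defaults are the rough figures from the paragraph above.

def manual_review_cost(audio_hours, review_ratio=4.0, hourly_rate=75.0):
    """Estimated dollars to have humans listen to `audio_hours` of audio.

    review_ratio: reviewer hours needed per hour of audio (assumed ~4)
    hourly_rate:  reviewer cost in dollars per hour (assumed $75)
    """
    return audio_hours * review_ratio * hourly_rate


per_audio_hour = manual_review_cost(1)
project_total = manual_review_cost(1000)

print(f"Per hour of audio: ${per_audio_hour:,.0f}")
print(f"1,000-hour project: ${project_total:,.0f}")
```

Changing either default shows how quickly the total moves: the review ratio and the hourly rate both scale the cost linearly.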

Fast forward to today, and audio discovery technology exists that has been proven effective in federal regulatory investigations, criminal cases and other litigation matters. It can lower costs by as much as 80%, in much the same way that technology-assisted review lowers other e-discovery costs. And yet, we see an interesting phenomenon: many law firms still espouse the use of manual review to run these audio projects. Who wouldn’t want something better and cheaper?

People often ask me who my competition is in the audio discovery arena. And while there are a few other technology providers in this space, my answer to this question is actually different. My biggest competition is…wait for it…the billable hour.  Law firms make profit on billable hours. They don’t make profit on e-discovery costs (generally speaking).

I realize this is a bold and harsh statement, and I wouldn’t make it so blatantly except 1) I have heard from actual law firms who confirmed it for me, and 2) I’m not sure how many people are reading this blog yet, so I could use some publicity!

But seriously, if you have an opinion on this, weigh in here. Comments are welcome!


Welcome to the Audio Discovery Blog!

Rest assured, we realize the last thing anybody needs right now is yet another boring blogspot to monitor, with esoteric topics that would make watching grass grow seem like a night at the Cineplex. (But hey, watching grass grow is at least a real 3-D activity!)

But the fact remains that audio is a burgeoning source of evidence for both regulatory and litigation investigations, and from all the evidence we’ve compiled in the industry, it is evident that this is one type of evidence that is evidently being ignored WAY TOO OFTEN.

We might argue that this fact is self-evident, but then we’d be taking our puns just entirely too far.

So the purpose of this blog is to help enlighten and educate our audience on the ins and outs of dealing with audio evidence, because one thing is very true: audio is not like email, Word documents, TIFF images or any of the other kinds of electronically stored information (ESI) that make up the rest of the content we deal with in the e-discovery world. Which is why we coined the term…

Audio Discovery.

So, let’s kick this off near the top of the Electronic Discovery Reference Model and talk about how to Identify audio evidence and the likely places it can come from. We have seen projects from many different walks of life: hundreds of hours of body-mic recordings from a personal defamation case; thousands of hours of phone wiretaps in criminal gang activity; even archived radio and TV advertisements (including video) that were searched for false advertising.

But the PRIMARY source of content we see regularly–the content that has literally grown to hundreds of thousands of hours–comes from trading floor activities in both energy and financial services. These tend to be the Big Kahuna matters in the audio discovery world, which if you think about it makes sense. These trading activities are routinely recorded and kept for long periods of time, as in many cases they are the only record of a transaction request. And let’s face it: the last decade has shown that, well, not EVERYONE who engages in this activity has the most stellar reputation. So these recordings have the potential to contain lots of ripe, juicy content that both regulators and litigators would just love to wrap their ears around.

So in upcoming posts, we’ll use these types of matters as the foundation to discuss the elements of audio discovery that will be important for you. Here’s a look at just some of the topics that we’ll cover:

  • Audio is a time-based medium and best measured that way. Measuring projects by the gigabyte could be a big rip-off!
  • Who’s listening? The Federal Regulators, that’s who. And you should be, too!
  • How accurate is “accuracy” in audio discovery? The trade-off between precision and recall.
  • Audio vs. Text Search: All is Not Created Equal

Throughout this blog, our goal will be to educate and make you think about how audio discovery applies in your world, whatever that world is. Whether you are a compliance manager in a financial services firm, an auditor with a government regulator, or an attorney with clients facing litigation or regulatory oversight, you are now–or soon will be–faced with handling audio evidence in one fashion or another.

So we want you to be prepared to do it with aplomb. And that means quickly, accurately AND cost effectively.

Audio no longer has to be the “dirty little secret” that gets swept under the rug during a Rule 26 Meet and Confer. With the tools and techniques we’ll cover, your Audio evidence can rise up and be Discovered!



Can you HEAR me now?

When it comes to audio evidence, the answer is oftentimes “NO!”

And this is unfortunate, because audio evidence (or “sound recordings” as the FRCP likes to say) is becoming a critical source of discovery content in both regulatory and litigation matters. So the purpose of this blog is to help you learn what Audio Discovery is all about and how to do it in the most efficient and cost-effective ways.

As your Bloggist, I bring 20+ years of experience in audio technologies to the table, first in the old Ma Bell system and then later with companies like Cingular Wireless and now Nexidia. So I’ve witnessed first-hand many of the revolutions in digital audio that are now dramatically changing how you manage this important discovery component. In this blog, I will help you navigate these .WAVs so you can be an audio expert too. And if you didn’t get that pun, even more reason to come back often!

Jeff Schlueter
VP/GM, Legal Markets