We’re involved in several very high-profile matters at the moment, each with thousands of hours of audio, some of it in multiple languages. I sat through a project team meeting today where we discussed the set of search terms that one of the law firms had developed to run against these audio files.
What transpired during this meeting is so common I thought I would pass it along. You see, quite understandably, the law firm that developed the search terms took them directly from the same set of terms that had been developed for the email search. But the reality is that people tend to speak very differently than they write, so I spent a good thirty minutes going over the search terms and providing suggestions to shrink the list and make it more realistic.
Confidentiality prevents me from using any of the real terms from this case, but here are some illustrative examples:
- People don’t talk like they “text”. You’ll never (well, seldom) hear someone actually say “LOL” or “TTFN”. (Although I have been known to say “WTF” from time to time!) Granted, these aren’t likely to be meaningful search terms themselves, but other abbreviations used in emails between traders will have a different spoken equivalent.
- Proper names, especially people’s names, tend to morph quite a bit in spoken form from what you may see in email. Around the office people call me “Der Schlueter” with a really bad German accent. But I can’t remember the last time anybody used either Jeff or Schlueter in an email. Names are often omitted because the recipients are assumed based on the addresses used.
- Certain types of information have only one form in which they would typically appear in text, but could be spoken in many different ways. Numerical data is like this. Somebody may purchase 1,900 shares of a security, but the trader might say “one thousand nine hundred” or “nineteen hundred” which in an audio search are two totally different constructs.
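To make the numbers point concrete, here is a minimal sketch of how one written figure can expand into more than one spoken search phrase. The function names and the 1 to 9,999 range are assumptions made for illustration, not part of any particular audio search product.

```python
# Hypothetical helper: expand a written number into plausible spoken forms.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def under_100(n):
    """Spell out 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])


def under_1000(n):
    """Spell out 1..999."""
    hundreds, rest = divmod(n, 100)
    parts = []
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest:
        parts.append(under_100(rest))
    return " ".join(parts)


def spoken_variants(n):
    """Return plausible spoken forms of n, for 1 <= n <= 9999."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only handles 1..9999")
    variants = []

    # Formal form: "one thousand nine hundred"
    thousands, rest = divmod(n, 1000)
    parts = []
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if rest:
        parts.append(under_1000(rest))
    variants.append(" ".join(parts))

    # Colloquial form: "nineteen hundred" (only when the hundreds digit is non-zero)
    hundreds, last_two = divmod(n, 100)
    if n >= 1100 and hundreds % 10 != 0:
        alt = under_100(hundreds) + " hundred"
        if last_two:
            alt += " " + under_100(last_two)
        variants.append(alt)
    return variants


print(spoken_variants(1900))
```

Each variant would then be run as its own audio search phrase. Real speech has more forms than these two (“one point nine K”, for instance), so treat the output as a starting list rather than a complete one.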
During the course of the aforementioned meeting, one of the review team leaders finally came up with the suggestion that I had been hinting at all along: instead of spending a lot of time THINKING about what the search terms should be, the better approach is to simply start searching with a few of the most realistic and highly probable terms that will bring up the responsive files. Then start listening to these files, and get a better understanding of the language used and which terms will be the most relevant for searching.
You don’t have to listen to hundreds of hours to do this. In my experience, listening to just one hour of different calls for each major custodian will give you a great idea of the best terms to use. Develop the term list from there, do some searching, and listen to some more. You may come up with another set of terms that you can then add to your search criteria and iterate through again. This iterative process is what will help you round out your search term list and be confident in the results.
Many audio discovery projects are fairly straightforward. You may have a few hundred or a thousand hours of recorded calls or voicemails and you just need to load them all up and search through them interactively to find what’s relevant, responsive, privileged, or otherwise noteworthy.
But we see some audio projects that start out at a GARGANTUAN stage. Maybe it’s a regulatory request, or an overly zealous opposing counsel that asks for every recording ever made. Whatever the reason, we sometimes see projects that start in the tens of thousands of hours, and even a few that looked to be over 100,000 hours. While it would certainly be lucrative for us to process and host all that audio, the fact is that even with mondo-discounting the price would still rise to a level that some might consider “unduly burdensome.”
Enter stage right: Data culling in audio discovery is here to save the day!
Two years ago the buzzword was Early Case Assessment. Now it’s shifted to Predictive Coding or Technology Assisted Review. Whatever you call it, the process is essentially about using a rules-based approach to screen through content and identify the files that are most likely to be on-target. And this can work as well with audio discovery as it does with its text-based relative.
In our experience, culling data for audio discovery takes two forms:
- Expression based
- Voice Activity based
Expression-based culling is much the same as what you do with textual documents. You simply identify the phrases or concepts that are likely to point to content of interest, and you run these against the full file set to identify your targets. If you’ve read my earlier posts, you know about the differences between searching text and audio, specifically as it relates to precision vs. recall. One of the big challenges with expression-based culling is to identify the thresholds that strike the best balance in the precision vs. recall trade-off. After all, with the culling process you are pulling files out of the mix that won’t be easily available for further review, so you need to be careful. Make sure you or your vendor is using statistically valid methods to test your culling criteria.
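What “statistically valid” looks like will vary by matter and vendor. As one hedged illustration, a common approach is to pull a random sample from the files that were culled OUT, have reviewers listen to them, and put a confidence interval around the rate of responsive files that slipped through. The sketch below assumes that approach and uses the Wilson score interval; both the method choice and the function names are my assumptions for illustration, not a description of any specific vendor's process.

```python
import math
import random


def draw_sample(culled_ids, sample_size, seed=0):
    """Pick a reproducible simple random sample of culled file IDs to listen to."""
    ids = list(culled_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(sample_size, len(ids)))


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence).

    hits: sampled culled files that reviewers judged responsive
    n:    total sampled culled files reviewed
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Example: reviewers listened to 200 culled files and found 5 responsive.
low, high = wilson_interval(5, 200)
print(f"Estimated miss rate in culled set: {low:.1%} to {high:.1%}")
```

If the upper end of that range is higher than the parties can live with, loosen the culling thresholds and test again.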
Voice activity based culling is unique to audio discovery, and it can also be a big time and money saver. The need for this shows up quite often in trading floor investigations, especially when recordings are made from the open-mic or “squawk box” systems that are still in use. These systems can lead to hours and hours of silence, where the logger dutifully keeps making a recording even during off hours when no one is around. So being able to screen for presence of voice, or for a certain percentage of voice during a recording, is critical to screening these calls out and avoiding further processing and hosting charges on them.
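For the curious, here is a rough sketch of the “percentage of voice” idea. It uses a simple energy threshold as a stand-in for real voice activity detection, which is considerably more sophisticated (loud background noise would pass this test, for one). The frame length, threshold, and 5% cut-off are all assumed values chosen for illustration, and the samples are assumed to be plain 16-bit PCM amplitudes.

```python
import math


def voice_fraction(samples, frame_len=400, rms_threshold=500.0):
    """Fraction of fixed-length frames whose RMS energy exceeds a threshold.

    samples: sequence of PCM amplitudes (assumed 16-bit integers)
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    active = sum(
        1 for frame in frames
        if math.sqrt(sum(s * s for s in frame) / len(frame)) > rms_threshold
    )
    return active / len(frames)


def should_cull(samples, min_voice_fraction=0.05, **kwargs):
    """Flag a recording for culling when too little of it contains energy."""
    return voice_fraction(samples, **kwargs) < min_voice_fraction


# Synthetic examples: pure silence vs. a recording that is half "loud".
silence = [0] * 8000
half_active = [0] * 4000 + [2000, -2000] * 2000

print(should_cull(silence))         # silent squawk-box style recording
print(voice_fraction(half_active))  # half of the frames carry energy
```

Any cut-off like this should be checked against a sample of real recordings before files are actually dropped from the project.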
Employing these methods, we have seen projects reduced by 50-90% in terms of total hours that ultimately go into deeper review, which saves time and money for everyone involved. Score another one for the value of technology assisted review!
Two excellent reports have come out in the last year or so that address a pair of related issues: the growing costs of e-discovery, and the use of technology assisted review to help curtail those costs. While neither one addresses audio discovery specifically, the general thesis still applies; technology really can help you do things better, and cheaper. Who doesn’t like better and cheaper?
Well, there is actually an answer to that question which I’ll get back to in a minute. But first, a bit more detail on the two reports I mentioned.
The first is an article from the Richmond Journal of Law and Technology by Maura Grossman and Gordon Cormack. The link will download the entire article for you so I’ll spare you the legal citations, and in a short blog entry I have nowhere near the time to cover all the points. But I quote the first two sentences in the Conclusion of the report:
“Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.”
Why does manual review fare so poorly in this competition? Lots of reasons, but a big piece of it is reviewer fatigue, and also that reviewers make mistakes and often don’t agree on the significance of what they’ve read. Shocking! Not everyone thinks alike. Go figure.
The second report, from the RAND Institute for Civil Justice and titled “Where the Money Goes”, looks at the cost elements involved in discovery. Again, the link is there for you to download a summary or the whole report, so I want to focus on just one element. When looking at the costs of producing electronic documents, their finding was that 73% of the cost came during the Review component of the EDRM. What does that really mean?
It means that no matter how much people gripe about charges from the e-discovery vendors, it’s still all those in-house and outside attorneys, paralegals and other folks who are eyeballing the documents that drive the total cost in the process. And as with the Grossman article, the RAND report provides evidence that technology can help make the whole process better, and cheaper.
How does this apply to audio discovery? For years, if any party presented or requested large bodies of audio evidence for discovery, the expected process for managing this discovery was human review. And it generally takes about 4 hours of human time to review each 1 hour of audio. So even if a bargain-basement contract attorney costs $75/hour, that’s $300 in review costs for each hour of audio. Even a fairly small 1,000-hour project would create a $300,000 cost, and most of the time the parties would just cry “unduly burdensome” and sweep it under the rug.
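The arithmetic is simple enough to put in a few lines. This is only a back-of-the-envelope sketch using the figures above as defaults; the function name is made up, and your own review ratio and hourly rate will differ.

```python
def manual_review_cost(audio_hours, review_ratio=4.0, hourly_rate=75.0):
    """Estimated cost of fully manual review.

    audio_hours:  hours of recorded audio in the project
    review_ratio: human hours needed per hour of audio (about 4, per the text)
    hourly_rate:  reviewer cost per hour in dollars ($75, per the text)
    """
    return audio_hours * review_ratio * hourly_rate


for hours in (1, 1000, 10000):
    print(f"{hours:>6} hours of audio: ${manual_review_cost(hours):,.0f}")
```

Plug in the size of a project in the tens of thousands of hours and it becomes clear why these requests so often ended with an “unduly burdensome” objection.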
Fast forward to today, and audio discovery technology exists that has been proven effective in federal regulatory investigations, criminal cases and other litigation matters. It can lower costs by as much as 80%, in much the same way that technology assisted review lowers other e-discovery costs. And yet, we see an interesting phenomenon, which is that many law firms still espouse the use of manual review to run these audio projects. Who wouldn’t want something better and cheaper?
People often ask me who my competition is in the audio discovery arena. And while there are a few other technology providers in this space, my answer to this question is actually different. My biggest competition is…wait for it…the billable hour. Law firms make profit on billable hours. They don’t make profit on e-discovery costs (generally speaking).
I realize this is a bold and harsh statement, and I wouldn’t make it so blatantly except 1) I have heard from actual law firms who confirmed it for me, and 2) I’m not sure how many people are reading this blog yet, so I could use some publicity!
But seriously, if you have an opinion on this, weigh in here. Comments are welcome!