Who’s Talking Now?
I’ve written before about the importance of meta-data when dealing with audio recordings, especially telephone recordings captured by one of the many “call loggers” that are deployed on trading floors and in call centers around the world. These call loggers can capture a wealth of information about each call, including the date and time of the call, the extension or name of the agent/custodian, the physical location of the phone, and many other variables. Used well, this meta-data is an invaluable way to isolate the recordings that really matter.
But there are times when even this meta-data isn’t enough to truly help you narrow your review content. With telephone recordings, this often happens when you are trying to isolate calls between specific speakers. You may have meta-data for one side of the call…say, the agent whose phone is being recorded…but you may not have any way to identify the other party in the call.
This is where a good Speaker Identification system can really save the day and help turn a three-month project into a three-day project…or even better! So how does it work?
The first step is to build a model of the speakers that you are trying to identify. To do this, you need representative audio samples of these known speakers. And surprisingly, you don’t need to have that much. For example, with Nexidia’s AudioFinder application you need only about 5 minutes of speech for each of the speakers that you want to ID. And this can come either from entire recordings, or from snippets of recordings that you identify during the model-building process.
Building the model is as easy as collecting the samples from each speaker and telling the system “go build my model.” It’s similar to the process used for developing adaptive coding models in the text world, but much quicker and easier to implement. Once you have the model built, you simply run the Speaker Identification routine by applying the model against the audio files in question and out pop the results. In the case of Nexidia, the results are a list of files that are most likely to contain the speakers in question, and illustrated segments within those files where the speakers are found.
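For readers who like to see the shape of the idea, here is a toy sketch of the two steps described above: build a model from known samples, then score files against it. To be clear, this is not how Nexidia's AudioFinder works internally. The three-number "voice features," the speaker names, the file names, and the functions `build_model` and `rank_files` are all hypothetical, made up purely for illustration. A real system would extract far richer features from the audio itself.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_model(samples):
    """Hypothetical 'go build my model' step: one averaged vector per speaker."""
    return {speaker: mean_vector(vecs) for speaker, vecs in samples.items()}

def rank_files(model, speaker, files):
    """Score each file by its best-matching segment, highest score first."""
    target = model[speaker]
    scored = [
        (name, max(cosine(target, seg) for seg in segments))
        for name, segments in files.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up feature vectors standing in for the known speaker samples.
samples = {
    "trader_a": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]],
    "fourth_party": [[0.1, 0.9, 0.7], [0.2, 0.8, 0.8]],
}

# Made-up per-segment feature vectors for two recordings under review.
files = {
    "call_001.wav": [[0.85, 0.15, 0.15], [0.5, 0.5, 0.5]],
    "call_002.wav": [[0.9, 0.1, 0.1], [0.15, 0.85, 0.75]],
}

model = build_model(samples)
ranking = rank_files(model, "fourth_party", files)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The output is a ranked list of files, with the recordings most likely to contain the target speaker at the top, which mirrors the kind of result list described above.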
In one recent case involving a regulatory investigation of a multinational bank, our client was looking for calls between three of its traders and a fourth individual. Meta-data could identify the calls for the three traders, but Speaker ID was used to find the fourth speaker. Sifting through more than 25,000 calls, we were able to identify just over 150 that contained that fourth speaker, and turn those calls back to the client review team for final analysis. The total length of time it took to process these calls and return the results was less than three hours.
And think of the time saved by the final review team…it’s the difference between listening to 150 calls, and listening to 25,000. You can do the math on that!
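If you would rather let the computer do that math, here is a quick sketch using the rounded figures from the case above (25,000 calls in, 150 calls out). The five-minute average call length is purely an assumption for illustration; the actual call lengths in that matter are not stated here.

```python
total_calls = 25_000      # "more than 25,000" calls searched
flagged_calls = 150       # "just over 150" calls returned for review
avg_minutes_per_call = 5  # hypothetical average call length

# Share of the original collection that still needs human review.
fraction_to_review = flagged_calls / total_calls

# How many times smaller the review set is than the full collection.
reduction_factor = total_calls / flagged_calls

# Listening time in hours, under the assumed average call length.
hours_all = total_calls * avg_minutes_per_call / 60
hours_flagged = flagged_calls * avg_minutes_per_call / 60

print(f"Share of calls to review: {fraction_to_review:.1%}")
print(f"Reduction factor: about {reduction_factor:.0f}x")
print(f"Listening time: {hours_flagged:.1f} hours instead of {hours_all:.0f} hours")
```

Whatever the real average call length is, the ratio stays the same: the review team listens to well under 1% of the original collection.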
When it comes to audio evidence, the answer is oftentimes “NO!”
And this is unfortunate, because audio evidence (or “sound recordings” as the FRCP likes to say) is becoming a critical source of discovery content in both regulatory and litigation matters. So the purpose of this blog is to help you learn what Audio Discovery is all about and how to do it in the most efficient and cost-effective ways.
As your Bloggist, I bring 20+ years of experience in audio technologies to the table, first in the old Ma Bell system and then later with companies like Cingular Wireless and now Nexidia. So I’ve witnessed first-hand many of the revolutions in digital audio that are now dramatically changing how you manage this important discovery component. In this blog, I will help you navigate these .WAVs so you can be an audio expert too. And if you didn’t get that pun, even more reason to come back often!
Jeff Schlueter
VP/GM, Legal Markets