I often find myself promoting the fact that Nexidia supports more than 35 languages worldwide, including different “language packs” for both American and British English. People wonder why we bother with this; aren’t they essentially the same language? And since Nexidia is capturing the phonemes, why can’t we just have a standard English language pack and be done with it?
Well, Yanks and Brits can certainly understand each other (for the most part) on each side of The Pond. But that’s because the human brain has an amazing ability to adapt and recognize patterns and nuances and put things into context on the fly. So when an American says “aluminum” but a Brit says “aluminium” most people realize right away they mean the same thing. But these two words do sound different, especially when you factor in the vastly different accents and dialects across the UK. So the reason we have two different language packs for essentially the same language goes back to this: we need to accurately capture the sounds made by speakers of each language, and we need to support the search and retrieval of those sounds using the common text expressions that represent those words and phrases.
Here’s a classic illustration. Let’s ponder the word “advertisement”. It’s spelled the same in both the US and the UK (and Canada…let’s not forget our northern neighbors). But it’s pronounced quite differently.
In the US, it’s ad-ver-TISE-ment.
In the UK, it’s ad-VER-tiz-ment.
So in order to provide the most accurate search possible, the Nexidia engine first captures the spoken sounds (phonemes) that are used to represent this word in a recording. Then, when the user enters the text expression to search, we convert this text back into the appropriate sounds that are representative for the accents and dialects for a particular language and find all the matches. In the North American English language pack, we know to look for ad-ver-TISE-ment, while in the UK English language pack we look for ad-VER-tiz-ment.
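To make that lookup concrete, here is a minimal sketch in Python. The lexicon, the pack names (`en-US`, `en-GB`), the function names, and the made-up recording are all illustrative assumptions built from the respellings above, not Nexidia's actual data or API. A real engine works with phonemes and scores approximate matches rather than comparing sequences exactly.

```python
# Toy pronunciation lexicon keyed by language pack (hypothetical names and data).
# Syllables in capitals carry the stress, mirroring the respellings in the post.
LEXICON = {
    "en-US": {"advertisement": ("ad", "ver", "TISE", "ment")},
    "en-GB": {"advertisement": ("ad", "VER", "tiz", "ment")},
}


def to_sounds(term, pack):
    """Convert a typed search term into the sound sequence for one language pack."""
    return LEXICON[pack][term.lower()]


def find_matches(term, pack, recording):
    """Return the start positions where the term's sounds occur in a recording.

    `recording` is a list of sound units already captured from the audio.
    """
    target = list(to_sounds(term, pack))
    n = len(target)
    return [i for i in range(len(recording) - n + 1) if recording[i:i + n] == target]


# A made-up recording of a British speaker: "the ad-VER-tiz-ment ran"
uk_call = ["the", "ad", "VER", "tiz", "ment", "ran"]

print(find_matches("advertisement", "en-GB", uk_call))  # [1]
print(find_matches("advertisement", "en-US", uk_call))  # []
```

The same typed word hits with the UK pack and misses with the North American one, which is the whole argument for keeping the packs separate.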
I haven’t even touched on the fact that we have yet another English language pack for our Aussie mates (or should I say “Ozzie mites”?). I suspect that Down Under, the word for advertisement is “Foster’s,” beer being the most popular consumer product. (And yes, I know that Foster’s isn’t actually popular in-country…but if I said “Four X” or “Tooheys” the rest of the world wouldn’t get my joke.)
This was obviously just one example of the literally hundreds of thousands of permutations and differences that exist even between varieties of what is essentially the same language. But it helps you better understand the work Nexidia has put in to make sure that this is all transparent to the end user. With that, I’m off to pop open a bottle of Bud, put some prawns on the barbie and settle in to watch some soccer…I mean, football!
We’re involved in several very high-profile matters at the moment, each with thousands of hours of audio, some of it in multiple languages. And I sat through a project team meeting today where we were discussing the set of search terms that one of the law firms had developed to start running against these audio files.
What transpired during this meeting is so common I thought I would pass it along. You see, quite understandably, the law firm that developed the search terms took them directly from the same set of terms that had been developed for the email search. But the reality is that people tend to speak very differently than they write, so I spent a good thirty minutes going over the search terms and providing suggestions to shrink the list and make it more realistic.
Confidentiality prevents me from using any of the real terms from this case, but here are some illustrative examples:
- People don’t talk like they “text”. You’ll never (well, seldom) hear someone actually say “LOL” or “TTFN”. (Although I have been known to say “WTF” from time to time!) Granted, these aren’t likely to be meaningful search terms themselves, but other abbreviations that traders use in emails will have a different spoken equivalent.
- Proper names, especially people’s names, tend to morph quite a bit in spoken form from what you may see in email. Around the office people call me “Der Schlueter” with a really bad German accent. But I can’t remember the last time anybody used either Jeff or Schlueter in an email. Names are often omitted because the recipients are assumed based on the addresses used.
- Certain types of information have only one form in which they would typically appear in text, but could be spoken in many different ways. Numerical data is like this. Somebody may purchase 1,900 shares of a security, but the trader might say “one thousand nine hundred” or “nineteen hundred” which in an audio search are two totally different constructs.
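The numerical point is easy to show in code. Below is a minimal Python sketch that expands a written number into the spoken forms a searcher would need to cover. The function names are hypothetical, the range is limited to 0 through 9999, and it assumes American-style readings with no “and” and no digit-by-digit forms such as “one nine zero zero”.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_hundred(n):
    """Spell out 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def below_thousand(n):
    """Spell out 0..999."""
    hundreds, rest = divmod(n, 100)
    parts = []
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if rest or not parts:
        parts.append(below_hundred(rest))
    return " ".join(parts)


def spoken_variants(n):
    """Return the ways a speaker might say an integer from 0 to 9999."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only handles 0..9999")
    thousands, rest = divmod(n, 1000)
    if thousands == 0:
        return [below_thousand(n)]
    standard = f"{ONES[thousands]} thousand"
    if rest:
        standard += " " + below_thousand(rest)
    variants = [standard]
    # The "nineteen hundred" style only applies when the hundreds digit is non-zero.
    hundreds, remainder = divmod(n, 100)
    if hundreds % 10 != 0:
        alt = f"{below_hundred(hundreds)} hundred"
        if remainder:
            alt += " " + below_hundred(remainder)
        variants.append(alt)
    return variants


shares = int("1,900".replace(",", ""))
print(spoken_variants(shares))  # ['one thousand nine hundred', 'nineteen hundred']
```

One written term becomes two audio search terms, and a term list copied straight from the email review would contain neither.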
During the course of the aforementioned meeting, one of the review team leaders finally came up with the suggestion that I had been hinting at all along: instead of spending a lot of time THINKING about what the search terms should be, the better approach is to simply start searching with a few of the most realistic and highly probable terms that will bring up the responsive files. Then start listening to these files to get a better understanding of the language used and which terms will be the most relevant for searching.
You don’t have to listen to hundreds of hours to do this. In my experience, listening to just one hour of different calls for each major custodian will give you a great idea of the best terms to use. Develop the term list from there, do some searching, and listen to some more. You may come up with another set of terms that you can then add to your search criteria and iterate through again. This iterative process is what will help you round out your search term list and be confident in the results.
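For the programmers in the audience, the search-and-listen loop described above looks roughly like this in Python. Everything here is an assumption for illustration: `search` and `review` are hypothetical stand-ins for running an audio search and for a person listening and taking notes, and the calls and phrases are invented, not taken from any real matter.

```python
def refine_terms(seed_terms, search, review, max_rounds=5):
    """Grow a search-term list by alternating searching and listening.

    search(terms)   -> call ids that hit on any of the terms
    review(call_id) -> terms a reviewer noted while listening to that call
    Both are hypothetical callables supplied by the caller.
    """
    terms = set(seed_terms)
    heard = set()
    for _ in range(max_rounds):
        new_calls = [c for c in search(terms) if c not in heard]
        if not new_calls:
            break  # nothing left to listen to
        discovered = set()
        for call_id in new_calls:
            heard.add(call_id)
            discovered.update(review(call_id))
        if discovered <= terms:
            break  # listening produced no new vocabulary
        terms |= discovered
    return terms


# Invented demo data: phrases spoken on each call, and what a reviewer wrote down.
CALLS = {
    "call-1": {"nineteen hundred shares", "the blue deal"},
    "call-2": {"the blue deal", "der schlueter"},
    "call-3": {"der schlueter", "lunch order"},
}
NOTES = {
    "call-1": {"the blue deal"},
    "call-2": {"der schlueter"},
    "call-3": set(),
}


def search(terms):
    return sorted(cid for cid, phrases in CALLS.items() if phrases & set(terms))


def review(call_id):
    return NOTES[call_id]


print(sorted(refine_terms({"nineteen hundred shares"}, search, review)))
# ['der schlueter', 'nineteen hundred shares', 'the blue deal']
```

Starting from a single realistic term, two rounds of listening surface the nickname and the deal name that nobody would have put on the list up front, and the loop stops once a round turns up nothing new.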