Google introduces audio indexing

The GAudi tech finds videos containing your search term, and even highlights exactly when it's spoken.

If you've always wondered how search engines are likely to cope with the growing amount of non-textual data filling up the tubes, Google is – not surprisingly – working on an answer: audio indexing.

Designed for the company's YouTube video-sharing service, GAudi (not to be confused with a partnership between everyone's favourite search giant and a car manufacturer) was previously available via iGoogle, but has now been deemed accurate enough to warrant its own page on the Google Labs site, first spotted by Blogoscoped.

The premise is simple: a speech recognition engine catalogues all the words uttered during an audio or video clip and adds its findings to a traditional database, searchable in the same way as the main Google engine searches text-based websites. Sounds easy, but as anyone who has used speech recognition programs in the past will tell you: it's very hard to do right.
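To make the premise concrete, here is a minimal Python sketch of the indexing idea, not Google's implementation. It assumes the speech recognition step has already produced a plain-text transcript for each clip; the clip IDs, transcripts and function names are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(transcripts):
    """Map each word to the set of clip IDs whose transcript contains it."""
    index = defaultdict(set)
    for clip_id, text in transcripts.items():
        for word in tokenize(text):
            index[word].add(clip_id)
    return index


def search(index, term):
    """Return the IDs of the clips in which the term is spoken, sorted."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical recogniser output: clip ID -> transcript text.
transcripts = {
    "clip-a": "We will cut taxes for working families",
    "clip-b": "Energy policy and taxes are on the agenda",
    "clip-c": "Thank you all for coming tonight",
}

index = build_index(transcripts)
print(search(index, "taxes"))  # prints ['clip-a', 'clip-b']
```

Once the words are in a structure like this, looking one up is the easy bit. The hard part, as noted above, is getting an accurate transcript in the first place.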

With this in mind, it's perhaps unsurprising that Google has chosen to restrict the beta application to indexing videos of US political candidates' speeches: fairly straightforward stuff from a voice recognition standpoint, with little extraneous noise or background music to deal with. Even so, the technology is already showing promise. Rather than relying on human-driven metatagging, it should make finding video and audio content that much easier: simply search for a key word, and if it is spoken at any time in the video you'll get a hit.

The tech is pretty slick, too: as well as flagging the videos containing the term, the system will also mark the precise moment at which the word is uttered. There's even a little snippet of a transcript with your term highlighted beneath the video. Nice.
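For the curious, marking the moment a word is spoken and showing a highlighted snippet could look something like the Python sketch below. It is an illustration only: it assumes the recogniser emits each word with a start time in seconds, and the sample data, the `find_hits` name and the square-bracket highlighting are invented for the example.

```python
def find_hits(timed_words, term, context=3):
    """Return (seconds, snippet) pairs for each time `term` is spoken.

    `timed_words` is a list of (start_time_in_seconds, word) tuples.
    The snippet shows up to `context` words either side of the hit,
    with the matching word wrapped in square brackets.
    """
    term = term.lower()
    hits = []
    for i, (seconds, word) in enumerate(timed_words):
        # Ignore case and trailing punctuation when comparing words.
        if word.lower().strip(".,!?") == term:
            start = max(0, i - context)
            window = timed_words[start:i + context + 1]
            snippet = " ".join(
                f"[{w}]" if start + j == i else w
                for j, (_, w) in enumerate(window)
            )
            hits.append((seconds, snippet))
    return hits


# Hypothetical recogniser output for one clip.
timed_words = [
    (12.0, "We"), (12.3, "need"), (12.6, "a"), (12.8, "new"),
    (13.1, "energy"), (13.6, "policy"), (14.0, "for"), (14.2, "America."),
]

for seconds, snippet in find_hits(timed_words, "energy"):
    print(f"{seconds:.1f}s: {snippet}")
# prints: 13.1s: need a new [energy] policy for America.
```

A player could then jump straight to the returned timestamp and show the snippet beneath the video.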

Although the current beta is perhaps of more interest to our friends across the pond, the technology is something I'll be keeping my eye on – it's just possible that Google has finally found the killer app that justifies its purchase of YouTube.

Can you see a use for a video indexing service that uses spoken key words, or are you just looking forward to the day when you can compile a list of YouTube videos featuring your favourite swear word? Share your thoughts over in the forums.

Arkanrais 17th September 2008, 11:12
I wonder how this app will get past accents and the way people speak in general. I tried a voice-to-text program and I had to talk pretty much like a news reporter with excellent pronunciation or I would get completely ass-backwards words coming up. The program had me going from hilarity to a 'hulk smash' mood after not too long, and it cost several hundred $$$ for the full version.
All I can say is I feel sorry for the Google techs that have to get this thing working worldwide with all the slurring, drunken crazy people who rant about their goats on YouTube.
RTT 17th September 2008, 11:52
Most google announcements these days make me feel like this:
badders 17th September 2008, 11:54
Like someone's dropped an ice cream on you from a great height?
dyzophoria 17th September 2008, 19:31
I wonder if it'll get past beta :D
Firehed 17th September 2008, 21:44
Originally Posted by RTT
Most google announcements these days make me feel like this:
We definitely need a tinfoil hat smiley in blue.
johnmustrule 18th September 2008, 04:55
Actually google probably uses a really advanced algorithm for their speech recognition. They've got more computing power than they probably need so I expect that they're not using consumer grade software. As far as accents go, it'll probably be like most speech recognition programs and be based off the American dialect/accent, at least for English that is.