The Blizzard Challenge 2008

The entrants in the Blizzard Challenge have all finished their engines, so it's up to us to decide if they've succeeded.

If you're interested in helping advance the science of text-to-speech synthesis, you're needed as part of the Blizzard Challenge.

The Challenge is an annual event hosted by the University of Edinburgh's Centre for Speech Technology Research in which programmers are given 10,000 sentence-length recordings of a person from which they must create a working speech synthesis engine. Once each team has completed their engine, the results are uploaded for people like us to listen to and rate.

Speech synthesis is an important technology, and one which gets criminally overlooked in these days of multi-gigabyte storage and the ability to record voiceover artists in high fidelity. Not only are text-to-speech engines vital for partially sighted people using screen-reader software, which all too often sounds like a cross between Stephen Hawking and a Dalek, but an engine as flexible as a real human voice also holds the promise of massively improved immersion in games, with vast swathes of text transformed into realistic speech without the need to hire actors and expensive studios.

To make things easier for the teams involved, the Blizzard Challenge has traditionally used a neutral voice as the basis for the engines – one without a particularly strong accent and as emotion-free as possible. This time round, however, the Challenge is to create a working engine from a voice sample which has a lot more personality than usual. While this makes things a lot harder for the programmers, it holds the promise of an engine capable of producing a voice that doesn't sound permanently bored.

If your last experience of text-to-speech synthesis was with the Say program on your Amiga 500 Workbench floppy, then you'll be pleasantly surprised by how advanced some of this year's entries are. If you want to participate, you can sign up on the project homepage. It'll only take about an hour of your time, and it's well worth it.

Do we have any partially sighted visitors relying on screen readers, or are we all just looking forward to seeing the technology reach a point where it can be used to put more speech into games like Oblivion? Share your thoughts over in the forums.

freedom810 16th May 2008, 09:30
When i saw the name blizzard i immediately thought of WoW :(
theevilelephant 16th May 2008, 11:47
im just doing it now and im pretty darn impressed with some of them!
DriftCarl 16th May 2008, 12:58
my mum could really use some of this. i have been trying for ages to find something good for her to use, we have just set the screen to use 400% zoom at the moment.
Ill sign up and see how good these people are :)
LordPyrinc 16th May 2008, 22:33
WOT... DU... U... MEEN... THIS... NOT... NOR... MAL... SPEEK?
Phil Rhodes 18th May 2008, 18:04
One of the downsides of this is that the dalek/hawking ones are actually faster to listen to. Anyone who has to use these things generally spends a lot of time trying to make their computer interaction faster - that is, less painfully slow.

Next time I hear Jaws say "Page has three frames and one hundred and three links" I'm going to go to Yahoo and start laying about myself with a baseball bat.

Tab, tab, tab, tab, tab...
Brooxy 19th May 2008, 11:31
Originally Posted by freedom810
When i saw the name blizzard i immediately thought of Starcraft 2 :(

Corrected :D