Read my lips
'Did you say Harriet?' asks the voice-dialling function on my phone in its calm Japanese-American accent. 'Henry,' I repeat, firmly.
'Did you say Helen?' the voice asks. I repeat myself. 'Did you say Herb?' 'Henry!' I yell and, before clicking the escape button, dissolve into a stream of four-letter words.
The fact that my phone's speech-recognition capability struggles to catch a single word raises the question of how any program could precisely transcribe whole sentences onto a screen.
Experience strengthens my doubts. The last time I tangled with speech rec, five years ago, it failed to live up to the hype. But it may deserve a second chance because, in theory, it means you can sit back and relax your shoulders. No more repetitive strain injury. On paper, it could be a true killer application, catching on like Facebook.
So this week, I don the fitness instructor-style headphones and investigate whether 'talk-into-text' technology is ready for action. The area is a cinch to research because it's far from crowded now that IBM's ViaVoice has been licensed to Nuance, which is responsible for Dragon NaturallySpeaking: one of just two serious contenders.
Dragon NaturallySpeaking by Nuance Communications (Windows XP or Vista)
Because reading and rendering the human voice is such a complex activity, speech rec programs need to be big. Dragon is huge - to accommodate it, you must have a gigabyte of free hard disk space.
But the visual interface is minimal. Dictation appears in a floating window as you speak. When you pause for breath, the program transcribes the words at the cursor's location. When you first start speaking into the microphone and see the words that tumble from your mouth appear on the screen, the feeling is eerily magical.
The initial training is not that intensive. You just read one short chapter from Alice in Wonderland, say. Then, to gauge your writing style, the software starts parsing the documents on your computer. Dragon works pretty well if you speak clearly and persist. Now and then, you have to train it to understand specific words, using the function designed for this purpose, which is a hassle but worth it.
It's great for writing something fairly casual such as a letter. But if you are working on a document that you need to edit intensively and you are at all dextrous, you are probably better off using your fingers.
iListen by MacSpeech (Apple operating system)
At least iListen requires less disk space - 200 megabytes. Otherwise, it is hard to find much to praise about the Mac platform rival. The tuition mode features a disembodied head, which tries to convince you all will be well if you just speak clearly.
In fact, iListen requires you to speak so loudly into the mic that you may disturb your neighbours.
During my reading of a trial script, I floundered just trying to make the program correctly hear the words 'full stop'. Other words it struggled with included 'and' and 'in'.
Adding to the fun, after one 15-minute trial transcript reading, it crashed, remembering nothing. Doubtless, it is possible to make iListen work, but it seems to require monumental patience.
iListen is just not on the same level as its rival, no doubt because Dragon devoured IBM's ViaVoice.