Speech technology, previously a niche application, is poised for the mainstream. IBM's popular ViaVoice is embedded in two new personal digital assistants (PDAs), Compaq's latest iPaq and a device from China's Legend Computer. Worldwide spending on and revenues from voice applications will reach US$41 billion by 2005, according to The Kelsey Group, a market research firm.
Two weeks ago, IBM joined Johnson Controls and car-maker DaimlerChrysler in the United States to demonstrate a hands-free, eyes-free phone that is built into the dashboard of a car and uses IBM's speech technology. The product will be available as an add-on next year and as a built-in feature in 2003.
IBM Voice Systems' director of marketing Nigel Beck said there was momentum towards embedding speech technology as processing power on mobile devices increased. 'You are going to start to see a lot of embedded speech devices. There is also an opportunity for speech technology like IBM's WebSphere Voice Response to be deployed by service operators as the main customer interface, which can be more effective than the interactive voice systems used today,' he said.
For years, experts have predicted that in the near future phone numbers and addresses would be dictated into PDAs instead of typed in, that house lights could be turned on by command and that the car could be asked to tune the radio to a particular station. Yet today, it is still necessary to wade through a labyrinth of computer-generated options just to get a phone bill faxed.
While devices embedded with speech technology have limited features, they can perform simple tasks such as retrieving a telephone number or advising on appointments for the day. 'There is little doubt that voice is an easier way to input data and an easier way to control hand-held devices than using the keyboard or handwriting-recognition software,' said Mr Beck. 'The technology is here. It really depends on the processing power of the device and the database it can support.'
He said standardisation in speech technology on VoiceXML would further drive applications and adoption. The W3C (World Wide Web Consortium) has already posted a draft specification for VoiceXML 2.0, which is designed to bring synthesised speech, spoken and touch-tone commands, digitised audio and computer-human conversations to the Web.
Much bigger, longer-term markets for IBM's voice products are in customer-relationship management systems. Interactive voice systems used by banks, telecommunications providers and government agencies are frustrating to customers. 'About 90 per cent of customers will hit "0" to get out of the voice-automated menu to talk to a live operator,' Mr Beck said. He said the cost per call of operating a call centre was about US$10, while using a speech-enabled, Web-based system cost just 20 US cents. 'Now, this issue of cost in customer-interfacing systems such as call centres and operating loyalty programmes is an important opportunity, because we think that using speech technology to interact with the customer is a more comfortable substitute for a live operator than an interactive voice system,' he said.
In China, Tom.com is already using IBM's technology to operate a voice portal in Beijing. Tom Voice gives users voice-activated phone access to the Internet, where they can obtain the latest information on stocks, flight schedules, hotel reservations, meal orders, transport services, world news, movies and weather forecasts. In a demonstration yesterday of Tom Voice's service, IBM officials showed that the system used natural speaking tones rather than the awkward robotic voices that have become familiar. The system can answer a question such as 'What is the arrival time of flight CX877?' The system, which operates in Putonghua, can understand speech with a Cantonese accent.
Mr Beck said more companies such as banks and telecoms operators would find speech-technology-driven systems such as Tom Voice a viable alternative to operating costly call centres.