I.T. doesn't have all the answers

PUBLISHED : Sunday, 25 September, 2011, 12:00am
UPDATED : Sunday, 25 September, 2011, 12:00am


The IBM supercomputer that trounced two human champions in the quiz show Jeopardy! this year raised the spectre of computers outwitting humans in some fields. But that day might still be very far away, one of the computer's key developers told a Hong Kong audience last week.

Scott Spangler, an IBM master inventor who worked on the supercomputer Watson, also discussed wider future uses of the technology at the University of Hong Kong.

His lecture - 'DeepQA: The Technology Behind Watson' - drew an overflow audience that filled Wang Gungwu Lecture Hall and spilled onto nearby stairways.

It attracted computer science majors and others intrigued by a technology that 'could one day control humankind', as one fearful arts major put it.

One professor even asked half-jokingly: 'Can we still keep our jobs?' To that question Spangler, a senior technical staff member and master inventor at IBM's Almaden Research Centre, answered: 'Absolutely.'

'The experts who played this game said if it had been a written test instead of a game show test, they would have beaten the computer. They felt confident that they knew the answer more often than Watson did, but Watson was faster. It was only because of how the game was set up that they were beaten.

'You have to really understand that when it comes to just plain old knowledge, those experts knew more than Watson did. There's no question about that. So we [scientists] are not there yet. All in all, I think humans are still far above these machines.'

Building Watson involved solving the problem of getting a computer to understand natural language, he said. 'To be honest, a lot of researchers weren't very optimistic we could do this, because human language is extremely complicated and computers are not very good at understanding it.'

For example, a computer is easily puzzled by a sentence like: 'If leadership is an art, then surely Welch has proved himself a master painter during his tenure at GE' - a reference to Jack Welch, former chief of General Electric.

Spangler said: 'You can imagine a computer having a lot of trouble with that. Is it talking about painting? 'Prove his leadership' - what is this?'

Competing on Jeopardy! was seen as a major challenge for Watson because of the show's rapid-fire format and clues that rely on subtle meanings, puns and riddles.

'When IBM researchers were given this problem, they really said 'forget it. We can't do this. This is too hard. Not a good idea.' But we kept coming back to it and ... took on this challenge,' Spangler said.

In the end, they produced a technology that can scan and analyse information from many more sources - and far more rapidly - than a person can.

Watson is a breakthrough achievement in the field of question-answering (QA) computer technology. It is built to perform a massive number of simultaneous tasks at high speed, analysing complex language and delivering correct responses to Jeopardy! clues.

Watson is a supercomputer that runs on 10 racks of IBM's commercially available Power 750 servers, using 2,880 Power7 processor cores to run the DeepQA software. It can hold the equivalent of about a million books' worth of information. Programmers fed Watson masses of information ranging from the World Book Encyclopedia to online sources such as Wikipedia and Project Gutenberg books.

A question put to Watson triggers an astonishingly complex process - which it performs in about three seconds. More than 100 algorithms simultaneously analyse the question in different ways and identify various plausible answers. Yet another set of algorithms tests, scores and ranks those answers.

Watson tracks down hundreds of bits of evidence to both support and refute each possible answer - then assesses how strongly the evidence supports the answer. The answer with the strongest evidence assessment earns the most confidence, and it may - or may not - become Watson's final reply.

If Watson's best answer still ranks too weakly on confidence during a game of Jeopardy!, the computer will refuse to submit it to avoid the risk of losing money with a wrong answer.
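The generate, score, rank and decide steps described above can be pictured with a minimal Python sketch. It is not IBM's DeepQA code: the Candidate class, the evidence numbers, the logistic squashing function and the threshold values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    """One plausible answer plus its evidence scores (hypothetical values).

    Positive numbers stand for supporting evidence, negative for refuting.
    """
    answer: str
    evidence: List[float] = field(default_factory=list)


def confidence(candidate: Candidate) -> float:
    """Map net evidence to a 0..1 confidence.

    A logistic function is used here purely as an assumption; the article
    does not say how Watson combines its evidence.
    """
    net = sum(candidate.evidence)
    return 1.0 / (1.0 + math.exp(-net))


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from most to least confident."""
    return sorted(candidates, key=confidence, reverse=True)


def decide(candidates: List[Candidate], threshold: float) -> Optional[str]:
    """Return the top answer, or None to stay silent when confidence is low."""
    if not candidates:
        return None
    best = rank(candidates)[0]
    return best.answer if confidence(best) >= threshold else None


if __name__ == "__main__":
    pool = [
        Candidate("answer A", [1.0, 0.5, -0.2]),  # mostly supported
        Candidate("answer B", [0.3, -0.4]),       # slightly refuted
    ]
    print(decide(pool, threshold=0.7))   # confident enough to answer
    print(decide(pool, threshold=0.9))   # too risky, so no answer is given
```

Raising the threshold makes the sketch more cautious, which mirrors the trade-off in the game: a wrong answer costs money, so staying silent can be the better move.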

'Currently, Watson uses English only, but there's nothing that prevents us from using the same approach in other languages,' Spangler said. 'But you need to develop language-specific parsers to do the kinds of things that Watson does with English.'

Other possible applications for Watson's technology include dealing with big sets of data commonly found in the legal and financial worlds, and making medical diagnoses.

IBM and health insurer WellPoint announced on September 12 an agreement to develop the first commercial application for Watson, one intended to assist medical professionals in diagnosing and treating patients. 'Watson is expected to serve as a powerful tool in the physician's decision-making process,' they said.

Indiana-based WellPoint, which has 34 million members, said it would 'develop and launch Watson-based solutions to help improve patient care', and IBM would supply the technology.

Spangler said: 'Wouldn't it be wonderful if a doctor could consult Watson and say, 'Here's this patient with these symptoms and these problems: what's the most effective treatment for that individual?' And Watson could say, 'With this level of confidence, I think you should recommend this treatment'.'

'It's not going to replace the doctor at all, but it's going to make the doctor that much more effective in getting the right answer to the given question. It's going to be digesting all sorts of information: patient history, medications, tests, huge volumes of journal references.

'It's not something Watson can do now, but it's something we are training Watson to begin to do in the future,' he said.

Spangler said the next 'grand challenge' is to enable Watson to interact in virtual conversations. 'DeepQA is one question and one answer; it's not a dialogue. You cannot go back and ask Watson to 'tell me more about that',' he said.

'[This will] require deeper levels of understanding ... but we are considering 10 years from now it would make sense for us to tackle the problem.

'The current system is based on finding out what we already know, but we can imagine learning things by looking for interesting relationships we didn't know existed before. It's very possible we could invent knowledge with this technology.'