
Alibaba’s algorithm-powered machine outperforms humans in understanding visuals for the first time
- Alibaba’s algorithm recorded an 81.26 per cent accuracy rate in answering questions related to images, compared to a rate of 80.83 per cent for humans
- VQA technology can be used in a wide range of areas, including searches for products on e-commerce sites and supporting analysis of medical images
For the first time, a machine has outperformed humans at understanding images to answer text questions, after Chinese e-commerce giant Alibaba Group Holding’s model “AliceMind” earned the top spot in the global Visual Question Answering (VQA) Challenge 2021.
Alibaba’s algorithm recorded an accuracy rate of 81.26 per cent in answering questions related to images, compared with 80.83 per cent for humans, according to a statement from Alibaba on Thursday. The annual VQA Challenge has been run since 2015 by the Conference on Computer Vision and Pattern Recognition (CVPR).
This year the challenge contained more than 250,000 images and 1.1 million questions. The evaluation presents an image and a related question, to which participants are asked to provide an accurate answer. Alibaba’s result, which was updated eight days ago, beat other global players including US tech giant Microsoft, the Hangzhou-based company said.
“We are proud that we have achieved another significant milestone in machine intelligence, which underscores our continuous efforts in driving the research and development in related AI fields,” said Si Luo, head of Natural Language Processing (NLP) at Alibaba DAMO Academy. “This is not implying humans will be replaced by robots one day. Rather, we are confident that smarter machines can be used to assist our daily work and life, and hence, people can focus on the creative tasks that they are best at.”
Computer vision is one of the most active areas of AI research and development in China, although an early emphasis on surveillance applications and the impact of the US-China tech war have prompted a search for new commercial growth drivers.
VQA technology can be used in a wide range of areas, said Si, including product searches on e-commerce sites, the analysis of medical images for initial disease diagnosis, and smart driving.
Alibaba has already used VQA in multiple application scenarios, including its intelligent chatbot Alime Shop Assistant, which is used by tens of thousands of merchants on Alibaba’s retail platforms every day.
Alibaba owns the South China Morning Post.
