Chinese scientists’ attack on ChatGPT shows how criminals can use weak spots to exploit AI
- Using the researchers’ attack method, Bard misclassified a giant panda’s face as a woman, and Bing Chat misclassified a bald eagle as a cat and a dog
- Researchers home in on models’ vulnerabilities as world powers sign up to the Bletchley Declaration at the UK summit on AI safety

Zhang Tong in Beijing
An AI model under attack could mistake giant pandas for humans or fail to detect harmful content, according to a research team in Beijing that says it discovered an effective method for attacking ChatGPT and other popular commercial AI models.
The doctored images used by the researchers appeared almost identical to the originals, yet they effectively circumvented the models’ mechanisms designed to filter out toxic information.
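The article does not spell out the team’s exact method, but the description matches the well-studied family of adversarial-example attacks, in which tiny pixel-level changes invisible to humans flip a model’s output. Below is a minimal sketch of one classic technique of this kind, the fast gradient sign method (FGSM); the model, tensors and epsilon value are illustrative assumptions, not the researchers’ actual code.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Fast gradient sign method (Goodfellow et al., 2015).

    Nudges every pixel by +/- epsilon in the direction that increases
    the classification loss, yielding an image that looks unchanged to
    a human but can change the model's prediction. A generic
    illustration of adversarial perturbation, not the Beijing team's
    attack, which the article does not detail.
    """
    # Track gradients with respect to the input image itself.
    image = image.clone().detach().requires_grad_(True)
    # true_label is a tensor of class indices, e.g. "giant panda".
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step each pixel along the sign of the loss gradient.
    adversarial = image + epsilon * image.grad.sign()
    # Keep pixel values in the valid [0, 1] range.
    return adversarial.clamp(0, 1).detach()
```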
The findings highlight significant security concerns within artificial intelligence and help shed light on the vulnerabilities of commercial multimodal large language models (MLLMs) from tech giants including Google, Microsoft and Baidu.
At the inaugural AI Safety Summit, held at Bletchley Park in the UK last week, representatives from the US, Britain, the European Union, China and India signed the Bletchley Declaration, an unprecedented deal to encourage the safe and ethical development and use of AI.
Wu Zhaohui, China’s vice-minister of science and technology, took part in the summit and presented proposals advocating stronger technical risk controls in AI governance.