2023-10-31
Chinese AI start-up Baichuan claims to beat Anthropic, OpenAI with model that can process 350,000 Chinese characters
- The Beijing-based company says the latest version of its large language model has a bigger ‘context window’ than its foreign competitors
- The company also says its model surpasses Anthropic’s Claude 2 in its quality of responses, as well as its understanding of long text
Chinese artificial intelligence start-up Baichuan has launched an AI model that it said can digest and summarise novels, making it the world’s most powerful model in handling long text prompts.
The Beijing-based company, established by Chinese search engine Sogou’s founder Wang Xiaochuan, on Monday announced Baichuan2-192k, the latest iteration of its large language model (LLM), saying its “context window” can handle around 350,000 Chinese characters.
A context window is the combination of input and output text that a model can process during conversations with users.
For comparison, Claude 2, introduced in July by Amazon.com-backed Anthropic as the world’s most advanced AI model in terms of the number of words that users could include in their chat queries, was said to have a context window of around 75,000 English words, corresponding to hundreds of pages of documents or a book.
The context window of the Baichuan model is 14 times bigger than that of OpenAI’s GPT-4-32k, according to a WeChat post by the Chinese company.
Baichuan also said its model surpassed Claude 2 in its quality of responses, as well as its understanding and summarisation of long text, citing test results from LongEval, a project launched by the University of California, Berkeley and other US institutions to evaluate how well LLMs handle large prompts.
Baichuan said a larger context window will make its AI model useful to businesses that need to process and generate long text on a daily basis, such as the legal, media and finance industries. The company has started internal testing of the model with industrial partners, Baichuan said.
Still, joint research by scholars from Stanford University and UC Berkeley suggests that the capacity to process more information does not necessarily make an AI model better than its peers.
“Performance substantially decreases as the input context grows longer, even for explicitly long-context models,” researchers wrote in their study.
Baichuan faces heightened competition from Chinese rivals that are racing to draw users to their AI models and applications.
The cloud division of Alibaba Group Holding, owner of the South China Morning Post, on Tuesday announced an update to its Tongyi Qianwen model, which has hundreds of billions of parameters.
Tongyi Qianwen 2.0 outperforms OpenAI’s ChatGPT 3.5 and Meta Platforms’ Llama2, and has narrowed its gap with ChatGPT 4, said Zhou Jingren, the technology chief of Alibaba Cloud, at the company’s annual partner event.
Meanwhile Zhipu AI, a start-up backed by Alibaba and Tencent Holdings, last week debuted its ChatGLM3 model with various improvements, including faster inference speed, lower training costs and the addition of a coding assistant.
The company also launched a smaller version of the model, which was designed for use in personal electronic devices such as laptop computers and smartphones.