
DeepSeek speculation swirls online over Chinese AI start-up’s much-anticipated R2 model

The latest speculation includes R2’s imminent launch and the new benchmarks that it set in terms of cost-efficiency and performance

DeepSeek’s R2 artificial intelligence model was said to have been trained on a server cluster of Huawei Technologies’ Ascend 910B chips, according to Chinese social-media posts. Photo: Shutterstock
Ben Jiang in Beijing
Chinese start-up DeepSeek is seeing rampant speculation on social media, raising anticipation for its next open-source artificial intelligence (AI) model, as the company continues to keep the industry guessing about its progress amid an intensifying US-China tech war.
The latest speculation about DeepSeek-R2 – the successor to the R1 reasoning model released in January – surfaced over the weekend and included claims of the product’s imminent launch and the new benchmarks it had purportedly set for cost-efficiency and performance.
That reflects heightened online interest in DeepSeek after it generated worldwide attention from late December 2024 to January by consecutively releasing two advanced open-source AI models, V3 and R1, which were built at a fraction of the cost and computing power that major tech companies typically require for large language model (LLM) projects. LLM refers to the technology underpinning generative AI services such as ChatGPT.
According to posts on Chinese stock-trading social-media platform Jiuyangongshe, R2 was said to have been developed with a so-called hybrid mixture-of-experts (MoE) architecture, with a total of 1.2 trillion parameters, making it 97.3 per cent cheaper to build than OpenAI’s GPT-4o.

MoE is a machine-learning approach that divides an AI model into separate sub-networks, or experts – each focused on a subset of the input data – that jointly perform a task. This is said to greatly reduce computation costs during pre-training and to speed up performance at inference time.
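The routing idea described above can be sketched in a few lines of toy code. This is purely illustrative of how a gate activates only one expert per input – the router rule and experts here are made-up assumptions, not DeepSeek's actual architecture.

```python
# Minimal mixture-of-experts (MoE) sketch: a gate routes each input to
# one expert sub-network, so only a fraction of the model's total
# parameters are exercised per input. Illustrative assumptions only.

def gate(x):
    # Hypothetical router: negative inputs go to expert 0, the rest to expert 1.
    return 0 if x < 0 else 1

EXPERTS = [
    lambda x: -x,      # expert 0: handles negative inputs
    lambda x: x * 2,   # expert 1: handles non-negative inputs
]

def moe_layer(x):
    # Only the routed expert runs, which is why compute cost per input
    # can stay low even as the combined parameter count of all experts grows.
    return EXPERTS[gate(x)](x)

print([moe_layer(x) for x in (-3, 4)])
```

In a real LLM the gate is itself a small learned network and typically routes each token to the top few of dozens of experts, but the cost-saving principle is the same: total parameters grow while per-token computation does not.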


In machine learning, parameters are the internal variables an AI system adjusts during training; they determine how input prompts are mapped to the desired output.
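A tiny worked example makes the definition concrete. Here a single parameter `w` is adjusted by gradient descent to fit made-up data – the data, learning rate, and model are illustrative assumptions, unrelated to any DeepSeek system.

```python
# Toy illustration of a "parameter": fitting w in y = w * x to data by
# gradient descent. Training nudges w until predictions match targets.
# All values here are made up for illustration.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # each target is 2 * input

w = 0.0    # the single trainable parameter, before training
lr = 0.05  # learning rate

for _ in range(200):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # training updates the parameter

print(round(w, 2))  # the learned parameter settles near 2.0
```

An LLM works on the same principle, only with billions (or, as claimed for R2, over a trillion) of such parameters adjusted simultaneously.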

Video: South Korea says DeepSeek sent data to ByteDance-owned servers in China without consent