
DeepSeek’s upgraded foundational model excels in coding and maths

The new version and DeepSeek V3 are both foundation models trained on vast data sets that can be applied in different use cases

The DeepSeek logo is seen in this illustration taken January 27, 2025. Photo: Reuters
Coco Feng in Guangdong

Chinese artificial intelligence (AI) star DeepSeek has upgraded its open-source V3 large language model by adding parameters and improving capabilities in coding and solving mathematical problems.

DeepSeek-V3-0324, named after its predecessor and its launch date, has “enhanced reasoning capabilities, optimised front-end web development and upgraded Chinese writing proficiency”, according to a notice on the company’s website.

The new version and DeepSeek V3 are both foundation models trained on vast data sets that can be applied in different use cases, including that of a chatbot. DeepSeek R1, the reasoning model, is based on DeepSeek V3.

The updated foundation model improved on several benchmarks, most notably the American Invitational Mathematics Examination (AIME), where it scored 59.4 compared with 39.6 for its predecessor. It also gained 10 points on LiveCodeBench to reach 49.2, according to DeepSeek data.

This illustration photograph taken on January 29, 2025 shows screens displaying the logos of DeepSeek and OpenAI’s AI chatbot ChatGPT. Photo: AFP

Whereas DeepSeek V3 has 671 billion parameters and uses the company’s own commercial licence, the new 685-billion-parameter model is released under the MIT software licence, the most popular licence on developer platform GitHub.

Launched on AI community Hugging Face as well as the company’s own website, DeepSeek-V3-0324 is now the top trending model on Hugging Face, where it has received positive comments on its performance.

Coco Feng
Coco Feng joined the Post in 2019, covering the technology and internet sector from the Greater Bay Area. Previously, she worked at the Post's Beijing bureau.