Tencent’s upgraded LLM for text-to-image generation released on open source platforms

The eight-month-old Hunyuan large language foundation model underwent a major upgrade, which enhanced its text-to-image model’s overall performance by 20 per cent

The complete source code of its text-to-image LLM has been released on US open source platforms Hugging Face and GitHub ‘to benefit the industry as a whole’

Signage at the Tencent booth at the Smart China Expo in Chongqing, China, Sept. 4, 2023. Photo: Bloomberg

Published: 7:00am, 16 May 2024
Updated: 5:15pm, 17 May 2024

Chinese video gaming and social media giant Tencent Holdings launched an upgraded version of its large language model (LLM) with text-to-image generation that is open source for enterprises and individuals.

The eight-month-old Hunyuan large language foundation model developed by Tencent underwent a major upgrade earlier this year, which enhanced its text-to-image model’s overall performance by 20 per cent compared with the previous version, according to a statement posted on Tuesday on the official WeChat account of Tencent Cloud, the company’s cloud-computing services arm.

Tencent said the latest text-to-image function employs the DiT model architecture, which is also used by OpenAI’s text-to-video tool Sora. The company added that its primary database is in Chinese, enabling the tool to effectively and accurately understand Chinese-language commands.

The complete source code of its text-to-image LLM has been released on US open-source platforms Hugging Face and GitHub “to benefit the industry as a whole and build an open source ecosystem for next-generation vision generation”, according to the statement.

That means both individuals and enterprises can access the program’s code, modify or share its design, fix broken links, or scale up its capabilities.

Since launching Hunyuan last September, Tencent has integrated its LLM into the company’s various business units, including Tencent Cloud, Tencent Games and super app WeChat. The company said the AI-powered tool has also been provided to over 20 media outlets and advertising firms to facilitate their work.

The launch of the upgraded version came a day after Microsoft-backed OpenAI unveiled its newest GPT model, GPT-4o, which is capable of natural human-computer interaction across text, image, video, and audio.