China’s national library to archive 200 billion Weibo posts in project to preserve country’s digital heritage
- Sina, which had 462 million active users on Weibo at the end of 2018, was chosen as first partner for the initiative because of its enormous trove of data
The National Library of China will archive over 200 billion public posts on Weibo, the country’s popular Twitter-style microblogging site, as part of an initiative to preserve the digital heritage of the world’s biggest internet population.
More than 210 million news stories published on Sina.com, the news portal operated by Weibo’s parent company, Sina Corp, together with 200 billion public posts on Weibo, will be archived under a non-profit project by the national library, according to a statement on Weibo’s official account on government services.
The goal of the project is to chronicle the evolution of civilisation in the internet era for the “long term development of information security and digitisation of the country”, Rao Quan, director of the national library, said in the statement on Monday.
Sina, which had 462 million active users on Weibo at the end of 2018, was chosen as the first partner for the initiative because of its enormous trove of data, which records both significant social events and public reaction to them, the company said. Other Chinese internet companies are being invited to participate in the initiative.
Data collected from Weibo and Sina.com will be stored on company servers while the national library and Sina will jointly analyse the data for policymaking and academic purposes.
Weibo did not immediately respond to an emailed request for comment.
Web archiving has been a common practice for countries around the world but accessing social media content has posed a challenge due to the overwhelming size of the data as well as privacy policies of the internet platforms.
The US Library of Congress, which had archived all posts on Twitter since the platform’s inception in 2006, decided to preserve tweets only on a selective basis from December 2017 after the volume increased dramatically.
Academics are also calling on Facebook to make its data more open to the research community after the social media giant tightened restrictions on third party data use following a public backlash over data privacy last year.
China’s internet space, which had 829 million internet users by the end of 2018, is a massive trove of data subject to government control, with social media and other online content heavily regulated and censored.
Under Chinese President Xi Jinping, the ruling Communist Party has tightened its grip on the internet through an ongoing drive to crack down on content deemed unsuitable by the authorities, including pornography, gambling, fake news and political dissent. The crackdown has intensified amid the growing popularity of new social media platforms such as live-streaming services, short video platforms and microblogs.
Last week Sina voluntarily suspended its flagship news app and other products after it was summoned by China’s internet regulator for spreading untrue and vulgar information, and for being a “bad influence” on public opinion.