The promise of Google Flu Trends and other attempts to deliver public health breakthroughs by analysing huge amounts of medical data has been overstated, according to a new study.
"If we actually began relying on the claims made by big data surveillance in public health, we would come to some peculiar conclusions," said John Ayers, a research professor at San Diego State University who was lead author on the study.
"Some of these conclusions may even pose serious public-health harm."
The findings, published in the American Journal of Preventive Medicine, cast doubt on claims made by Google's chief executive Larry Page that more analysis of people's health data could save up to 100,000 lives per year.
Speaking this year, Page said that excessive worries about privacy were holding back developments in the field. Page has not specified how the figure for lives saved was calculated.
Ayers said: "Big data has big value, and that includes saving lives. But to realise these gains we need better science."
However, the authors of the new study point out that even one of Google's simplest health data-mining systems, Google Flu Trends, has consistently failed to provide useful forecasts of flu cases in the US. The system uses searches made through Google to predict forthcoming numbers of influenza cases.
In March, David Lazer, an associate professor at Northeastern University in Boston, published a paper showing that Google Flu Trends had overestimated flu levels in 100 out of 108 weeks when compared with authoritative figures from the US Centers for Disease Control and Prevention.
However, Ayers' team showed they could use open-source, publicly available data from Google's archive to significantly improve the accuracy of the flu prediction.
Rather than monitoring a particular group of influenza-related queries - as Google does - they monitored how the queries changed, and gave some queries more weight than others.
They showed that, week by week, this approach was more accurate than Google's model during the flu seasons of both 2009 and 2012.
"With these tweaks, Google Flu Trends could live up to the high expectations it originally aspired to," Ayers said.
The researchers pointed out that in the 2012 to 2013 season, Google's system predicted that 10.6 per cent of the population had flu - compared to just 6.1 per cent according to patient records, an overstatement of 73 per cent. The revised model suggested 7.7 per cent infection - a 26 per cent overstatement.
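The overstatement figures can be checked with a short calculation. The sketch below is illustrative only and does not come from the study: the function name `relative_overstatement` is hypothetical, and it works from the rounded percentages quoted in this article. With those inputs it gives a little under 74 per cent for Google's model rather than the reported 73 per cent, a gap that may reflect the researchers working from unrounded values.

```python
def relative_overstatement(predicted: float, observed: float) -> float:
    """Return how far a prediction exceeds the observed value, as a percentage of the observed value."""
    if observed <= 0:
        raise ValueError("observed value must be positive")
    return (predicted - observed) / observed * 100


# Rounded figures for the 2012 to 2013 season, as quoted in the article.
observed_rate = 6.1   # per cent with flu, according to patient records
google_rate = 10.6    # per cent predicted by Google Flu Trends
revised_rate = 7.7    # per cent predicted by the revised model

print(f"Google Flu Trends: {relative_overstatement(google_rate, observed_rate):.1f}% overstatement")
print(f"Revised model: {relative_overstatement(revised_rate, observed_rate):.1f}% overstatement")
```

Dividing by the observed rate rather than the predicted one expresses the error relative to the patient records, which is the comparison the researchers describe.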
"Big data is no substitute for good methods, and consumers need to better discern good from bad methods," Ayers said.