The Hunt for Insights in the Online Chatter About Swine Flu

Sunday, May 3, 2009

“Swine flu” was the most searched term on Yahoo last week, displacing “American Idol.”

The Wikipedia page Swine influenza had 1.3 million page views on Wednesday and on Thursday. At the same time, Twitter was estimated to be transmitting 125,000 tweets a day mentioning swine flu, or 1 percent of all the chatter there, overwhelmingly from users concerned about the outbreak’s potential to do harm rather than, say, describing stomach pains.

In short, the Internet allows us to take the temperature of society as never before. And we could conclude last week that the public was hot and bothered by the recent outbreak of swine flu, but was not yet showing a fever.

And that, in a nutshell, is the wrinkle facing Google Flu Trends, an innovative project from the philanthropic arm of Google, Google.org, that is intended to give the public a heads-up on influenza outbreaks, often beating government predictions by a week or more.

Because the number of actual swine flu cases in the United States is still very small — especially compared with seasonal influenza outbreaks — the interactive map at the Flu Trends Web site shows a shockingly docile United States, with low “flu activity” detected in Texas, New York and California, states saturated with news coverage on the topic.

By contrast, a map of the United States from Facebook illustrating how often swine flu is mentioned on the “walls” of users’ profile pages shows dark-blue hot spots in those states.

The Google folks are clearly aware that a look at Flu Trends may seem like a disappointment for the flu-obsessed visitor. (Last week, the company also introduced an experimental Flu Trends for Mexico.)

“Flu Trends was never designed to catch only a few dozen cases of flu,” Jeremy Ginsberg, lead engineer of the project, wrote in an e-mail message. “Because we’re looking at trends across a large population of users, we’re best equipped to provide up-to-date estimates of activity when the number of people affected is a bit higher.”

In other words, we are seeing in sharp relief the difference between the Internet as an alarm bell — OMG, everybody is freaking out! — and the Internet as a mirror to society that potentially can reveal so much more about how people are living.

“There is a tradeoff between sensitivity and specificity,” said Dr. Philip Polgreen, an assistant professor at the University of Iowa who has made a study of Yahoo search data as a predictor of influenza outbreaks. “Right now we are finding out that Google Flu Trends is very specific, but it might not be that sensitive. But it is still too early to tell.”
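
The tradeoff Dr. Polgreen describes can be made concrete with a small calculation. The sketch below is an illustration only: the weekly signal values, the outbreak labels and the two thresholds are invented, and the function name is hypothetical rather than anything used by Flu Trends. Sensitivity is the share of real outbreak weeks that get flagged; specificity is the share of quiet weeks that are correctly left alone.

```python
def sensitivity_specificity(signal, outbreak, threshold):
    """Return (sensitivity, specificity) when weeks with signal >= threshold are flagged."""
    tp = fn = tn = fp = 0
    for value, is_outbreak in zip(signal, outbreak):
        flagged = value >= threshold
        if is_outbreak and flagged:
            tp += 1          # outbreak week, correctly flagged
        elif is_outbreak:
            fn += 1          # outbreak week, missed
        elif flagged:
            fp += 1          # quiet week, false alarm
        else:
            tn += 1          # quiet week, correctly ignored
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical weekly data, invented for illustration only.
weekly_signal = [1, 2, 2, 3, 5, 6, 8, 9]
real_outbreak = [False, False, False, False, True, False, True, True]

low_bar = sensitivity_specificity(weekly_signal, real_outbreak, threshold=4)
high_bar = sensitivity_specificity(weekly_signal, real_outbreak, threshold=7)
print(low_bar, high_bar)
```

With these made-up numbers, the lower threshold catches every outbreak week but raises one false alarm, while the higher threshold raises no false alarms but misses one outbreak week. A tool tuned the second way would be "very specific" but "not that sensitive."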

The stakes are high. Unlike efforts to use Internet buzz to predict which “American Idol” contestant is likely to win, tools to predict when seasonal influenza will peak (usually between November and April) can save lives and certainly money, as the authorities can schedule inoculations, increase staff at hospitals and order delivery of treatments.

“This is why everyone is looking,” Dr. Polgreen said. “With better information, you can plan for the future, especially in the context of influenza, where there is a vaccination and medicine. If you have one to two weeks, you can make sure you have the right kind of antivirals.”

True to Google’s belief in the algorithm, Flu Trends takes reams of data and applies a formula to try to impose some meaning.

In an article published in Nature in February, Mr. Ginsberg and his five fellow authors described collecting two piles of information — five years of material from the government tracking how often patients reported flulike symptoms and five years of Google search data — and then trying to find some overlap. Does, say, a spike in interest in what is the best type of thermometer, or where the nearest emergency room is, really mean more people are visiting doctors with flulike symptoms?

The connection merely had to show up over and over. If it turns out that before a flu takes hold, people have a mysterious hankering for frozen raspberries, Flu Trends would add that search to the algorithm.
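
The search for overlap described above can be sketched as a correlation ranking. This is a simplified illustration in Python, not the method from the Nature article: the query names, the weekly counts and the doctor-visit rates are all made up, and the helper `pearson` is a hypothetical name. The idea is only that a query earns a place in the model if its weekly volume rises and falls with the government's flu data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no information about the trend
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly share of doctor visits for flulike symptoms (percent).
flu_visits = [1.0, 1.5, 2.5, 4.0, 3.0, 1.5]

# Hypothetical weekly search counts for three queries.
query_counts = {
    "thermometer": [10, 14, 26, 41, 29, 16],
    "frozen raspberries": [5, 5, 6, 5, 4, 5],
    "american idol": [50, 48, 30, 20, 35, 52],
}

# Rank queries by how closely their volume tracks the flu data.
ranked = sorted(
    ((query, pearson(counts, flu_visits)) for query, counts in query_counts.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for query, r in ranked:
    print(f"{query}: {r:+.2f}")
```

In this toy data only the thermometer query tracks the flu series closely, so it alone would be kept. A real system would need far more weeks of data and many more candidate queries before a connection could be said to show up "over and over."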

This method, which health care professionals call surveillance, is not new. In the past, surveillance meant looking for spikes in purchases of cold medicine at the pharmacy. And the challenge in both offline and online surveillance is to ignore the “noise” and focus on what really matters. Offline noise could be a coupon discounting cough medicine, which leads to a sharp increase in sales that has nothing to do with the health of the population. Online, noise is everywhere.

“The problem is always the noise,” said Alessio Signorini, a Ph.D. candidate in computer science at the University of Iowa who is analyzing Twitter for what it shows about the outbreak. “Looking for the term ‘swine flu’ is not what you would look for if you wanted to predict an outbreak. You want to look for symptoms.”

But, Mr. Signorini added: “Twitter gives you a unique possibility. People are more likely to say I feel terrible on Twitter, rather than going to the doctor.”

He says he is continuously amazed at what people will share. “On Twitter, nobody says they have an S.T.D., but they say they are taking the medicine for chlamydia and complain about the side effects.”

And it is that candid behavior that gives Mr. Signorini hope that social networking sites will be an important tool for public health officials.

In the view of Mr. Ginsberg of Google, Flu Trends’ sober response to the swine flu outbreak last week was very useful to the public.

“When early reports of possible swine flu cases were circulating last week,” he wrote in an e-mail response to questions, “it could have been the case that hundreds of thousands of people would fall ill within a matter of days. This did not transpire, and by looking at Flu Trends data last week, we could gain some confidence that the epidemic was not likely spreading at this pace through the population.”