Better Predicting Flu Outbreaks with Wikipedia

Los Alamos, NM - Scientists at Los Alamos National Laboratory have the ability to forecast the upcoming flu season and other infectious diseases by analyzing views of Wikipedia articles. “The ability to more accurately forecast the flu season and other infectious diseases will transform the way health departments prepare for and respond to epidemics, ultimately saving lives,” scientist Sara Del Valle said.

Del Valle and her team recently published  “Forecasting the 2013-2014 Influenza Season using Wikipedia,” in the Public Library of Science.

“Infectious diseases are one of the leading causes of morbidity and mortality around the world; because of this, forecasting their impact is crucial for planning an effective response strategy,” Del Valle said.

According to the Centers for Disease Control and Prevention, seasonal influenza effects up to 20 percent of people in the United States and causes major economic impacts resulting from hospitalization and absenteeism.

Understanding influenza or other infectious disease dynamics and forecasting their impact is fundamental for developing prevention and mitigation strategies. To do this, researchers at Los Alamos combined modern data assimilation methods with Wikipedia access logs and CDC influenza-like illness (ILI) reports to create a weekly forecast for seasonal influenza.

“We used techniques often seen in weather forecasting to iteratively tune a model of influenza dynamics based on Wikipedia observations so that our forecast agrees with the most current ILI data,” said researcher Kyle Hickmann, who is the lead author of the paper and a member of Del Valle’s team.

These methods were applied to the 2013-2014 influenza season but are standard enough to forecast any disease outbreak, given incidence or case count data, the team believes. Del Valle and her team adjusted the initialization and parameterization of a disease model, which allows them to determine systematic model bias. They also provided a way to determine where the model diverges from observation, and evaluate forecast accuracy.

Wikipedia article access logs are shown to be highly correlated with historical ILI records and allow for accurate prediction of ILI data several weeks before it becomes available. The results showed that prior to the peak of the flu season, their forecasting method projected the actual outcome with a high probability.

“Disease forecasting is still in its infancy and there is much more to learn in this field,” Del Valle said. “We are continuing to refine our approach so our forecasts can be used for actionable decision-making.”

The scientific paper “Forecasting the 2013-2014 Influenza Season using Wikipedia” is in the Public Library of Science journal Computational Biology. It is by Kyle Hickmann, Geoffrey Fairchild, Reid Priedhorsky, Nicholas Generous, James Hyman, Alina Deshpande, and Sara Del Valle of Los Alamos National Laboratory’s Defense Systems and Analysis Division.

About Los Alamos National Laboratory Los Alamos National Laboratory, a multidisciplinary research institution engaged in strategic science on behalf of national security, is operated by Los Alamos National Security, LLC, a team composed of Bechtel National, the University of California, The Babcock & Wilcox Company, and URS for the Department of Energy's National Nuclear Security Administration.

Los Alamos enhances national security by ensuring the safety and reliability of the U.S. nuclear stockpile, developing technologies to reduce threats from weapons of mass destruction, and solving problems related to energy, environment, infrastructure, health, and global security concerns.