Technology to advance infectious disease forecasting for outbreak management

George, Dylan B.; Taylor, Wendy; Shaman, Jeffrey; Rivers, Caitlin; Paul, Brooke; O’Toole, Tara; Johansson, Michael A.; Hirschman, Lynette; Biggerstaff, Matthew; Asher, Jason; Reich, Nicholas G.

doi:10.1038/s41467-019-11901-7

Comment
Open access
Published: 02 September 2019

Dylan B. George ORCID: orcid.org/0000-0002-1943-5309¹,
Wendy Taylor²,
Jeffrey Shaman³,
Caitlin Rivers⁴,
Brooke Paul⁵,
Tara O’Toole¹,
Michael A. Johansson ORCID: orcid.org/0000-0002-5090-7722⁶,
Lynette Hirschman⁷,
Matthew Biggerstaff⁸,
Jason Asher⁹ &
Nicholas G. Reich ORCID: orcid.org/0000-0003-3503-9899¹⁰

Nature Communications volume 10, Article number: 3932 (2019)

Forecasting is beginning to be integrated into decision-making processes for infectious disease outbreak response. We discuss how technologies could accelerate the adoption of forecasting among public health practitioners, improve epidemic management, save lives, and reduce the economic impact of outbreaks.

“Data gaps undermine our ability to target resources, develop policies and track accountability. Without good data, we’re flying blind. If you can’t see it, you can’t solve it.” Kofi Annan¹

Data, analytics are force multipliers for outbreak response

Present capacity to develop, evaluate, manufacture, distribute and administer effective medical countermeasures (e.g., vaccines, diagnostics, therapeutics) is inadequate to meet the burden of both recurrent and emerging outbreaks of infectious diseases. When such interventions are unavailable, public health measures (e.g., contact tracing, outbreak investigations, social distancing) and supportive clinical care remain the only feasible tools to slow an emerging outbreak. Decision-making under such circumstances can be greatly improved by the use of appropriate data and advanced analytics such as infectious disease modeling or machine learning. Furthermore, these analyses can guide decision-making when medical countermeasures become available, allowing them to be used in more effective ways. Data analyses already underpin public health actions such as anticipating resource requirements, refining situational awareness and monitoring control efforts^2,3,4,5. New applications of data science and statistical analyses to disease outbreaks could provide support to decision-makers during public health crises.

Forecasting is an emerging analytical capability that has demonstrated value in recent outbreaks by informing policy and epidemic management decisions in real-time outbreak response. During the 2014–2016 Ebola virus disease (EVD) outbreak in West Africa, there was a strong push to use clinical trials to confirm that Ebola vaccines could be safe and efficacious (J. Asher, personal communication). Real-time forecasts generated during the outbreak highlighted challenges for the design of the planned clinical trials. These studies showed, based on forecasted incidence rates of EVD, that there was a strong possibility that the trials being proposed during September 2014 would not have sufficient case numbers to demonstrate significant results. This forecasting sped up discussions among senior leaders to pursue more productive, alternative trial designs (J. Asher, personal communication).

In this Comment, we discuss major limitations of the current set of tools used in forecasting outbreaks and highlight existing and emerging technologies that have the potential to significantly enhance forecasting capabilities. We focus on forecasting for outbreak management, specifically the capacity to predict short-term (i.e., days to weeks) trends of disease activity or incidence (i.e., the number and location of new cases) in an ongoing outbreak. We do not address the prediction of outbreak emergence, which is a separate endeavor with its own opportunities⁶ and challenges⁷, nor do we consider projecting multi-year trends of disease burden⁸.

From a data science perspective, the forecasting workflow encompasses three general categories: data, analytics, and communication (Fig. 1). Each step in the process has challenges and opportunities.

Data collection

Effective data collection and curation is essential for analytics and efficient outbreak management. Yet, for infectious disease forecasting, data quantity, quality and timeliness persist as significant challenges. Few epidemiological data are consistently reported, broadly shared, and available for decision-making during outbreak responses, especially early in outbreaks. Data collection can be a slow process, particularly in low-resource settings lacking sufficiently trained staff, with sporadic communications, limited healthcare systems, and inconsistent electrical power. Improving collection systems and advancing forecasting approaches that address these limitations and leverage existing surveillance data are necessary.

Improving diagnostic capabilities at scale should be a priority area of development. Recent advances have introduced the capacity to collect and share near real-time diagnostic results. For example, Quidel’s Sofia platform⁹ and BioFire’s FilmArray multiplex PCR¹⁰ both provide rapid diagnostic tests for respiratory pathogens that are wirelessly connected to cloud-enabled databases. These early examples demonstrate how rapid, aggregated, and geo-coded diagnostic test results could improve real-time tracking of population health trends. Additionally, they could enable timely and targeted clinical trial recruitment. Determining how to scale these capabilities could provide a significant source of data to improve forecasts.

Data cleaning

Collected data are usually not in a form amenable to immediate analysis that could support decision making; they must first be processed and cleaned. In outbreak forecasting efforts, data cleaning has largely been a manual, ad hoc process. Technologies that automate data cleaning would therefore be particularly valuable for forecasting.

Technologies that translate raw, unprocessed data into structured formats would be particularly useful. For instance, software could extract data from line lists of cases or clinical notes in electronic health records, or convert data stored in non-standard formats into machine-readable data. Digitizing handwritten text reliably, quickly and securely from clinical or epidemiological records will be a persistent need for the foreseeable future.
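As a minimal sketch of the kind of conversion described above, the following Python function turns a case line list (one row per case) into weekly case counts in a structured, machine-readable form. The column names `district` and `onset_date` and the ISO 8601 date format are assumptions made for illustration; real line lists vary widely and typically need far more validation than is shown here.

```python
import csv
import io
from collections import Counter
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def weekly_counts(line_list_csv: str) -> list:
    """Aggregate a case line list (one row per case) into weekly counts.

    Assumes hypothetical columns `district` and `onset_date` (ISO 8601);
    real line lists will differ and usually need more validation.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(line_list_csv)):
        raw_date = (row.get("onset_date") or "").strip()
        try:
            onset = date.fromisoformat(raw_date)
        except ValueError:
            continue  # skip rows whose onset date is missing or malformed
        # Normalize spacing and capitalization so spelling variants merge.
        district = (row.get("district") or "").strip().title()
        counts[(district, week_start(onset))] += 1
    return [
        {"district": d, "week_start": w.isoformat(), "cases": n}
        for (d, w), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Illustrative rows only, not real surveillance data.
    sample = (
        "district,onset_date\n"
        "district a,2014-09-01\n"
        "District A ,2014-09-03\n"
        "District B,unknown\n"
    )
    for record in weekly_counts(sample):
        print(record)
```

Silently skipping rows with unparseable dates keeps the sketch short; in practice such rows would more likely be logged and reviewed, since discarding them can bias incidence estimates.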

Data sharing

Although tools are improving, epidemiological data sharing remains a problem. Public health agencies provide data via their websites and situational reports^11,12. These efforts are critical for supplying information to the public, but the formats often pose challenges for quantitative analysts. Typically, these reports are provided with a considerable time lag and are neither machine-readable nor provided in standard formats with metadata. This impedes sharing and use of these data.

In some instances, epidemiological data have been made available via informal networks of people sharing spreadsheets (D. B. George, personal communication); secure CSV file transfers¹³; or unofficial APIs^14,15. These approaches should be lauded, but they are not long-term, enterprise solutions.

Open-science approaches to sharing data have shown promise in recent outbreaks. Epidemiologists and modelers have begun using publicly available repositories, such as GitHub, to aggregate and share digitized data in standardized formats^16,17,18. This paradigm shift resulted in a rapid improvement in data-sharing capability during the 2014–2015 West Africa Ebola outbreak (D. B. George, personal communication). A team of influenza forecasters in the U.S. also has used GitHub to share forecast data to facilitate the creation of multi-model ensemble forecasts^19,20. The shift from informal means of sharing data to robust technologies using standardized, machine-readable formats enables more rapid and meaningful engagement of a broader group of analysts. Structured open-science approaches to data sharing that are specifically tailored to forecasting applications should be further supported and explored.
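To illustrate why a shared, machine-readable forecast format matters for ensembles, here is a minimal Python sketch that combines probabilistic forecasts from several models by taking an equal-weight average within each outcome bin. Equal weighting is only one simple way to build an ensemble and is not presented here as the method used by the influenza forecasting team; the function name, model names, and bin labels are all hypothetical.

```python
from typing import Dict


def equal_weight_ensemble(
    forecasts: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Average predicted probabilities bin by bin across models.

    `forecasts` maps a model name to that model's probability for each
    outcome bin. Every model must report exactly the same bins, which is
    what a standardized submission format is meant to guarantee.
    """
    if not forecasts:
        raise ValueError("at least one model forecast is required")
    bins = set(next(iter(forecasts.values())))
    for model, distribution in forecasts.items():
        if set(distribution) != bins:
            raise ValueError(f"model {model!r} uses a different set of bins")
    n_models = len(forecasts)
    return {
        b: sum(dist[b] for dist in forecasts.values()) / n_models
        for b in sorted(bins)
    }


if __name__ == "__main__":
    # Hypothetical models and bins, for illustration only.
    submissions = {
        "model_a": {"low": 0.2, "high": 0.8},
        "model_b": {"low": 0.6, "high": 0.4},
    }
    print(equal_weight_ensemble(submissions))
```

The bin check is where standardization pays off: if every team submits the same bins in the same format, combining models reduces to a few lines, whereas ad hoc formats would require custom parsing for each contributor.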

Analytics: training models

Over the past several years, academic research on infectious disease forecasting has grown and models have successfully generated predictions for pathogens such as influenza^19,20,21, dengue¹³, Zika²², and Ebola². But scaling academic research to support public health decision-makers in real time has received little attention and relatively scarce resources.

The U.S. Department of Health and Human Services has built models for recent outbreaks using a combination of extramural and internal analytical resources. However, the federal government and state and local public health agencies find it difficult to recruit and retain scientists capable of developing, interpreting, and communicating quantitative results. Formalized training in “outbreak science” for public health practitioners will be a vital component in ensuring that the public- and private-sector workforce can respond quickly to an emerging epidemic threat^23,24. Even when scientists are available in public health agencies, the long and bureaucratic processes for acquiring and securing software and data technologies present significant challenges to using current and emerging data science tools.

Analytics: forecasting

The U.S. government wisely spent decades developing weather forecasting capabilities and continues to invest in advancing the personnel, infrastructure, data, analytics and decision frameworks necessary for supporting these activities²⁵. Similar efforts to develop infectious disease forecasting capabilities need to occur. To succeed, the technological architecture supporting forecasting must be evaluated in the context of ongoing outbreak response. To this end, since 2013 the U.S. Centers for Disease Control and Prevention (CDC) has fostered an open collaboration, called FluSight⁴, to improve the science and usability of epidemic forecasts of influenza for public health decision-making^21,26,27. However, many public health agencies have limited technical expertise or capacity to adopt, advance, and modify analytical approaches and technologies by themselves. Maintaining progress will require sustained, collaborative work and resources from public health agencies, academia, and the private sector. Few research funding agencies provide substantial and sustained support for this type of translational work, despite a strong track record of research productivity emerging from the CDC FluSight challenge and other governmental forecasting challenges²⁸. Nor have donor foundations shown leadership in this crucial area of epidemic response. If not provided with sufficient resources, public health will remain decades behind most other sectors in its use of advanced analytics.

Visualization and communication

Forecasting results must be communicated effectively to ensure they produce actionable insights. Visualizations play a key role. Academic groups have built data visualization tools to communicate forecasts²⁹, but these largely rely on customized code. Analysts who develop forecast models typically have limited time to spend on visualization and lack advanced design skills. This can lead to hard-to-understand visualizations and misinterpretation of results when used to support decision making. However, recent work by CDC has progressively refined information from forecasting results on seasonal influenza and translated that information into actionable risk communications⁴. Such efforts should be encouraged and supported.

Conclusions

Experience from the successful application of analytical technologies across multiple industries can inform the development of technologies for infectious disease forecasting and outbreak science. Improving technologies across the forecasting workflow will significantly advance forecasting capabilities, enable involvement from multiple stakeholders (e.g., industry, government, and academia), and allow the field to develop a robust forecasting architecture. Such advances will improve public health response to outbreaks, mitigate economic losses, and save lives.

References

1. Annan, K. Data can help to end malnutrition across Africa. Nature 555, 7 (2018).
2. Chretien, J. P., Riley, S. & George, D. B. Mathematical modeling of the West Africa ebola epidemic. Elife https://doi.org/10.7554/eLife.09186 (2015).
3. Rainisch, G. et al. Estimating Ebola treatment needs, United States. Emerg. Infect. Dis. 21, 1273 (2015).
4. CDC. FluSight: Flu Forecasting. https://www.cdc.gov/flu/weekly/flusight/index.html (2019).
5. Meltzer, M. I. et al. Modeling in real time during the ebola response. Cent. Dis. Control Prev. Mortal. Morb. Wkly. Rep. 65, 85–89 (2016).
6. Camacho, A. et al. Cholera epidemic in Yemen, 2016–18: an analysis of surveillance data. Lancet Glob. Health 6, e680–e690 (2018).
7. Holmes, E. C., Rambaut, A. & Andersen, K. G. Pandemics: spend on surveillance, not prediction. Nature 558, 180–182 (2018).
8. Foreman, K. J. et al. Forecasting life expectancy, years of life lost, and all-cause and cause-specific mortality for 250 causes of death: reference and alternative scenarios for 2016-40 for 195 countries and territories. Lancet (Lond., Engl.) 392, 2052–2090 (2018).
9. Quidel. https://www.quidel.com/immunoassays/sofia-tests-kits (2019).
10. Meyers, L. et al. Automated real-time collection of pathogen-specific diagnostic data: syndromic infectious disease epidemiology. J. Med. Internet Res. 20, 1–29 (2018).
11. CDC. Weekly U.S. Influenza Surveillance Report. https://www.cdc.gov/flu/weekly/index.htm (2019).
12. World Health Organization. Influenza surveillance and monitoring. https://www.who.int/influenza/surveillance_monitoring/en/ (2019).
13. Reich, N. G. et al. Challenges in real-time prediction of infectious disease: a case study of dengue in Thailand. PLoS Negl. Trop. Dis. 10, 1–17 (2016).
14. Rudis, B. cdcfluview: Retrieve ‘U.S’. Flu Season Data from the ‘CDC’ ‘FluView’ Portal. R package version 0.7.0. https://cran.r-project.org/package=cdcfluview (2019).
15. CMU-Delphi. https://github.com/cmu-delphi/delphi-epidata (2019).
16. Rivers, C. M. cmrivers github. https://github.com/cmrivers/ebola (2019).
17. CDC. cdcepi github. https://github.com/cdcepi/zika (2019).
18. CDC. Epidemic Prediction Initiative. https://github.com/cdepit/FluSight-forecasts (2019).
19. Tushar, A. et al. FluSightNetwork/cdc-flusight-ensemble: end of 2017/2018 US influenza season. https://doi.org/10.5281/ZENODO.1255023 (2018).
20. Reich, N. G. et al. A collaborative multi-model ensemble for real-time influenza season forecasting in the U.S. bioRxiv 566604 https://doi.org/10.1101/566604 (2019).
21. McGowan, C. et al. Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016. Sci. Rep. 9, 683 (2019).
22. Kobres, P.-Y. et al. A systematic review and evaluation of Zika virus forecasting and prediction research during a public health emergency of international concern. bioRxiv 634832 https://doi.org/10.1101/634832 (2019).
23. Polonsky, J. A. et al. Outbreak analytics: a developing data science for informing the response to emerging pathogens. Philos. Trans. R. Soc. B Biol. Sci. 374, 20180276 (2019).
24. Rivers, C. et al. Using “outbreak science” to strengthen the use of models during epidemics. Nat. Commun. 10, 3102 (2019).
25. Nelson, B. et al. Forecasting Success: Achieving U.S. Weather Readiness for the Long Term; U.S. Congressional Committee on Commerce (2013).
26. Biggerstaff, M. et al. Results from the centers for disease control and prevention’s predict the 2013–2014 Influenza Season Challenge. BMC Infect. Dis. 16, 1–10 (2016).
27. Biggerstaff, M. et al. Results from the second year of a collaborative effort to forecast influenza seasons in the United States. Epidemics https://doi.org/10.1016/j.epidem.2018.02.003 (2018).
28. National Science and Technology Council. Toward Epidemic Prediction: Federal Efforts and Opportunities in Outbreak Modeling (2016).
29. Tushar, A. & Reich, N. G. flusight: interactive visualizations for infectious disease forecasts. J. Open Source Softw. 7, 2016–2018 (2017).

Acknowledgements

The findings and conclusions in this comment are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention. We thank Stephanie Rogers, Kevin O’Connell, and Joe Buccina for insightful discussions on early drafts, and Christyn Zehnder for skilled, enthusiastic assistance with figures. Wendy Taylor is the former director of the Center for Accelerating Innovation and Impact at the U.S. Agency for International Development. The views expressed here are the author’s alone.

Author information

Authors and Affiliations

BNext, IQT Labs, Arlington, VA, USA
Dylan B. George & Tara O’Toole
Rockefeller Foundation, New York, NY, USA
Wendy Taylor
Climate and Health Program, Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA
Jeffrey Shaman
Center for Health Security, Johns Hopkins University, Baltimore, MD, USA
Caitlin Rivers
Taivara Ltd, Columbus, OH, USA
Brooke Paul
Division of Vector-Borne Diseases, Centers for Disease Control and Prevention, San Juan, PR, USA
Michael A. Johansson
The MITRE Corporation, Bedford, MA, USA
Lynette Hirschman
Influenza Division, Centers for Disease Control and Prevention, Atlanta, GA, USA
Matthew Biggerstaff
Leidos, Reston, VA, USA
Jason Asher
Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, MA, USA
Nicholas G. Reich

Contributions

D.B.G. and N.G.R. conceived of and drafted the paper. W.T., J.S., T.O., C.R., B.P., M.A.J., L.H., M.B., and J.A. contributed formative ideas, recommendations, and assisted with drafting and editing the paper.

Corresponding author

Correspondence to Dylan B. George.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

George, D.B., Taylor, W., Shaman, J. et al. Technology to advance infectious disease forecasting for outbreak management. Nat Commun 10, 3932 (2019). https://doi.org/10.1038/s41467-019-11901-7

Received: 01 April 2019
Accepted: 07 August 2019
Published: 02 September 2019
DOI: https://doi.org/10.1038/s41467-019-11901-7
