A review of epidemiological parameters from Ebola outbreaks to inform early public health decision-making

The unprecedented scale of the Ebola outbreak in West Africa has, as of 29 April 2015, resulted in more than 10,884 deaths among 26,277 cases. Prior to the ongoing outbreak, Ebola virus disease (EVD) caused relatively small outbreaks (maximum outbreak size 425 in Gulu, Uganda) in isolated populations in central Africa. Here, we have compiled a comprehensive database of estimates of epidemiological parameters based on data from past outbreaks, including the incubation period distribution, case fatality rate, basic reproduction number (R0), effective reproduction number (Rt) and delay distributions. We have compared these to parameter estimates from the ongoing outbreak in West Africa. The ongoing outbreak, because of its size, provides a unique opportunity to better understand transmission patterns of EVD. We have not performed a meta-analysis of the data, but rather summarize the estimates by virus from comprehensive investigations of EVD and Marburg outbreaks over the past 40 years. These estimates can be used to parameterize transmission models to improve understanding of initial spread of EVD outbreaks and to inform surveillance and control guidelines.

care, insufficient protective measures among health care workers in health care settings 6,24,25) or with fatal EVD patients in preparation for burial 19,20. Control measures for EVD are well documented and include identification, isolation and care of suspected patients, strict infection prevention and control among those caring for patients, and safe burials 26,27. At the start of an infectious disease outbreak, it is critical to understand the transmission dynamics of the pathogen and to determine those at highest risk for infection or severe outcomes in the population(s) affected 28,29. This information is needed to develop interventions to reduce the spread of disease and to reduce morbidity and mortality in the affected populations. Real-time analysis of detailed information collected on the confirmed, probable and suspected cases and deaths in an ongoing outbreak provides an opportunity to determine the stages of disease and areas where control measures can be applied. For example, knowledge of the incubation period distribution of the pathogen will inform how long the contacts of cases need to be followed up to evaluate whether or not they become secondary cases. Additionally, information on the timing of symptom onset, isolation, hospitalization and outcome (either death or recovery) is important for understanding EVD progression. Mathematical models that use data available early in an outbreak to estimate the outbreak's potential impact are increasingly used by public health policy makers to inform decision making around emerging and re-emerging pathogens 28-30.
The purpose of this review was to collect all published epidemiological parameter estimates (reprinted in detailed tables containing estimates, and corresponding confidence intervals) estimated from past EVD outbreaks. Our aim was not to perform a meta-analysis, but rather to compile and document the available parameter estimates based on data from EVD outbreaks over the past 40 years. In order to estimate any of the parameters referenced in our manuscript, we would need detailed case data of each of the cohorts studied in the original papers, which we do not have. We also reprint parameter estimates from past Marburg outbreaks and the ongoing outbreak in West Africa for comparison. This information is valuable for public health organizations that need to quickly evaluate the early behavior of a new outbreak and estimate the potential impact, in terms of morbidity, mortality and geographic spread. We highlight how the parameter estimates we have examined improve our understanding of EVD epidemiology. Our results help to put the ongoing EVD outbreak in West Africa into context and to assess the likely effects of ongoing and novel interventions.

Data collection
Searches using the following terms (Ebola, Marburg, EHF, EVD, MHF, EBOV, Ebola Zaire, Ebola Sudan, Ebola Reston, Ebola Bundibugyo, outbreak, model, parameterization, incubation period, case fatality rate (CFR), risk factors, basic reproduction number, R0, effective reproduction number, serial interval, delay distributions, generation time) were carried out on 1 August 2014, 15 September 2014 and again in February 2015 using the following databases: ScienceDirect, ResearchGate, Google, Google Scholar, BioOne, Web of Science and PubMed. Our searches aimed to find primary reports describing and analyzing data collected from investigations of EVD and Marburg outbreaks since the virus was identified in 1976. The criteria for inclusion were: a sample size of at least five EVD cases described in the study; studies of human outbreaks; and, for studies that evaluated potential risk factors, reporting of prevalence proportion ratios, odds ratios or relative risks. Reviews, commentaries, case reports on individual cases, and policy pieces were excluded. Additionally, literature evaluating non-human outbreaks or the potential for international (human) spread of EVD outside of an outbreak zone was excluded.
Using these search terms, a total of 49 papers were determined eligible for inclusion. In addition, for context we included additional published information on EVD including the final outbreak sizes as reported by the World Health Organization (WHO) Disease Outbreak News following declaration that each outbreak was over.
From the relevant EVD and Marburg literature, we extracted the following details for all parameter estimates (as provided): point estimates, confidence intervals, ranges, sample size used to estimate the parameter (total numbers of cases encompassing confirmed, suspected, and retrospectively diagnosed cases, depending on the study), EVD virus, and inferential methods. We then compiled the parameter estimate database into tables. Table 1 and Data Citation 1 list the human outbreaks of Ebola Zaire, Ebola Sudan and Ebola Bundibugyo that have occurred in Africa from 1976 to present. We have not provided detailed information on the outbreaks as these have been previously described 9 . Table 2 (available online only) summarizes the literature we used in this review.
Our manuscript and tables include estimates, confidence intervals and ranges obtained from the referenced publications (Table 2 (available online only) and Data Citation 2).

Definition of key parameters recorded
The incubation period is the interval between exposure to a pathogen and initial occurrence of symptoms and signs 28,29 . The incubation period distribution is usually characterized using the mean or the median incubation period.
www.nature.com/sdata/ SCIENTIFIC DATA | 2:150019 | DOI: 10.1038/sdata.2015.19
The CFR is the proportion of cases (infected symptomatic individuals) within a designated population who die as a result of their infection. For past EVD and Marburg outbreaks, we report on the CFR estimated after the outbreak was declared over (estimated at least 42 days after the last case experienced symptom onset) by taking the number of deaths among cases divided by the total number of cases recorded during the outbreak. However, during outbreaks, the CFR is often estimated before all cases have been identified and before some cases have either recovered or died.
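The difference between a final CFR and an interim CFR estimated during an outbreak can be illustrated with a minimal Python sketch. The function names and the case counts below are hypothetical and chosen only for illustration; they are not taken from any outbreak in this review.

```python
def final_cfr(deaths, total_cases):
    """CFR after the outbreak is over: deaths among cases / all recorded cases."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return deaths / total_cases


def interim_cfr(deaths, recovered):
    """Interim CFR restricted to cases with a definitive outcome."""
    resolved = deaths + recovered
    if resolved <= 0:
        raise ValueError("no cases with a definitive outcome yet")
    return deaths / resolved


# Hypothetical mid-outbreak snapshot: 100 recorded cases, of which
# 40 have died, 20 have recovered and 40 have no outcome yet.
naive = final_cfr(40, 100)           # treats unresolved cases as survivors
resolved_only = interim_cfr(40, 20)  # uses only cases with a known outcome
```

In this hypothetical snapshot, dividing deaths by all recorded cases gives a lower value than restricting the denominator to cases with a definitive outcome, which is why the denominator used should always be stated alongside a CFR reported during an outbreak.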
Risk factors for infection include demographic factors, medical conditions and behavioral exposures or practices that are associated with an individual's risk of becoming infected with Ebola.
The basic reproduction number (R0) is used to measure the transmission potential of a disease. It is the average number of secondary infections produced by an infected case in a susceptible population 31. If R0 > 1, then once established the outbreak will continue, whereas if R0 < 1, then the outbreak will die out.
The effective reproduction number (Rt) is similar to R0 but relates to a particular calendar time t (after the start of the outbreak). Like R0, if Rt > 1, then the outbreak will continue, whereas if Rt < 1, then the outbreak will die out. Rt can be reduced through the use of successful control measures (e.g., by limiting contacts between susceptible and infectious individuals). Rt can also be reduced due to the depletion of susceptible individuals whether through extensive transmission or through the immunization of susceptible individuals 32.
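The two routes by which the effective reproduction number falls can be sketched in Python. The multiplicative relationship used here (R0 scaled by the susceptible fraction and by the proportion of contacts that remain after control measures) is a common simplification that assumes homogeneous mixing; it is an assumption of this sketch, not a formula taken from the studies reviewed, and the function and parameter names are hypothetical.

```python
def effective_r(r0, susceptible_fraction=1.0, contact_reduction=0.0):
    """Approximate effective reproduction number under homogeneous mixing (assumed).

    r0                   -- basic reproduction number
    susceptible_fraction -- proportion of the population still susceptible (0 to 1)
    contact_reduction    -- proportional reduction in infectious contacts
                            achieved by control measures (0 to 1)
    """
    if not 0.0 <= susceptible_fraction <= 1.0:
        raise ValueError("susceptible_fraction must lie in [0, 1]")
    if not 0.0 <= contact_reduction <= 1.0:
        raise ValueError("contact_reduction must lie in [0, 1]")
    return r0 * susceptible_fraction * (1.0 - contact_reduction)


def outbreak_continues(r):
    """Threshold rule from the definitions above: growth only if R exceeds 1."""
    return r > 1.0


# Illustrative values only: R0 = 2.0 in a fully susceptible population.
uncontrolled = effective_r(2.0)
controlled = effective_r(2.0, contact_reduction=0.6)
```

With these illustrative values, a 60% reduction in infectious contacts is enough to bring the effective reproduction number below the threshold of 1, whereas the uncontrolled value remains above it.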
The serial interval is the interval between symptom onset in an index case and symptom onset in a secondary case infected by that index case 33 .
The generation time is the interval between infection of an index case and infection of a secondary case infected by that index case. The serial interval is more frequently estimated than the generation time and is often assumed to be the same duration as the generation time 34 .

Delay distributions
Symptom onset to hospitalization (also referred to as onset to clinical assessment): The interval between symptom onset and hospitalization.
Hospital admission to day of first blood sample: The interval between admission to hospital or medical facility for treatment of EVD and when a biological sample is collected for diagnosis.
Symptom onset to recovery/discharge: The interval between symptom onset and recovery or hospital discharge.
Symptom onset to death: The interval between symptom onset and death.
Duration of admission (survivors), hospitalization to discharge: The interval between admission to a hospital or medical facility for treatment of EVD and discharge from the facility.
Duration of admission (fatal cases), hospitalization to death: The interval between admission to a hospital or medical facility for treatment of EVD and death.

Data Records
The data from this analysis are summarized in two formats. Four data tables detail the methods and parameter estimates from each study included in our review. Of these data tables, Table S1 (Human Outbreaks of Ebola Zaire, Ebola Sudan and Ebola Bundibugyo from 1976) presents compiled data on the year and location of each human outbreak, the Ebola virus causing the outbreak and the number of cases reported (Data Citation 1). Using these four tables, we then summarized the parameter database in six tables and two figures presented in this article. The parameters estimated for Ebola Zaire, Ebola Sudan and Ebola Bundibugyo outbreaks, including the incubation period distribution, serial interval distribution, R0, delay distributions and CFR, are shown in Tables 3 (available online only), 4 (available online only) and 5, respectively. Parameter estimates for the ongoing outbreak in West Africa are summarized in Table 6 (available online only) and those for Marburg outbreaks are presented in a single table (Table 7). Risk factors for Ebola and Marburg infection are summarized in Table 8 (available online only). Estimates of the incubation period distribution and CFR are presented in Figs 1 and 2, respectively.

Incubation period distribution
The incubation period distribution of EVD has been estimated for past EVD outbreaks (Fig. 1 and Tables 3 (available online only), 4 (available online only) and 5; minimum sample size n = 5, maximum sample size n = 1,798). The mean (or median) incubation period (Fig. 1) The mean incubation period for the ongoing Ebola outbreak in West Africa has been estimated to be between 9 and 12 days (Table 6 (available online only)) 16,17,41,48. The range of incubation periods observed in past EVD outbreaks supports the policy of contact tracing for 21 days following contact with an EVD patient. An outbreak is officially declared over when no new cases have been identified for 42 days (twice the 21-day maximum incubation period) after the last EVD case is found.
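The 21-day and 42-day windows described above translate directly into date arithmetic. The following Python sketch shows one way to compute them; the function names and the example date are hypothetical, and operational guidance may define the reference date for the last case differently.

```python
from datetime import date, timedelta

MAX_INCUBATION_DAYS = 21  # maximum incubation period used for contact tracing


def contact_followup_end(last_contact):
    """Last day of follow-up for a contact: 21 days after the last contact."""
    return last_contact + timedelta(days=MAX_INCUBATION_DAYS)


def earliest_end_declaration(last_case_found):
    """Earliest date an outbreak can be declared over: 42 days (2 x 21)
    after the last case is found, provided no new cases are identified."""
    return last_case_found + timedelta(days=2 * MAX_INCUBATION_DAYS)


# Hypothetical example date, for illustration only.
followup_end = contact_followup_end(date(2015, 1, 1))
declaration_date = earliest_end_declaration(date(2015, 1, 1))
```

For the hypothetical date of 1 January 2015, follow-up of a contact would end on 22 January 2015 and the outbreak could be declared over no earlier than 12 February 2015.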

Case fatality rate (CFR)
In Fig. 2 and Tables 3 (available online only), 4 (available online only) and 5, we reprint the estimated CFR for each Ebola outbreak (by virus) and for Marburg virus. The Ebola Zaire virus is the most lethal, with an overall estimated CFR ranging from 69 to 88% 2,5,25,38,43,49,50 (Table 3 (available online only)). The CFR of outbreaks due to Ebola Sudan virus ranged from 53 to 69% 1,24,51-53 (Table 4 (available online only)), and the CFR of outbreaks due to Ebola Bundibugyo ranged from 34 to 42% 19,46,47 (Table 5). For the ongoing outbreak in West Africa due to Ebola Zaire, the estimated CFR, as measured among confirmed and probable cases with definitive outcome (recovered or fatal), is approximately 70%, and varies little among the three most affected countries (Guinea, Liberia and Sierra Leone; Table 6 (available online only) and Data Citation 2) 38. The CFR among EVD cases reported by Nigeria (n = 20) was 40% 54. A second, unrelated EVD outbreak occurred in Équateur province, DRC between July and October 2014, resulting in 69 confirmed and probable cases with a CFR of 74% 49. The CFR for Marburg is approximately 80% 55-57.
In the ongoing outbreak in West Africa, R0 and Rt have been estimated for all countries combined, as well as separately for Guinea, Liberia, Nigeria and Sierra Leone 16,30,36,41,48,54,61-69. All estimates of Rt are provided in Table 6 (available online only), Data Citation 2 and Data Citation 3. Gomes et al. 62 reported several all-country R0 estimates (means ranging from 1.8 to 2.1), depending on model choice and assumptions. Fisman et al. 30 reported all-country and country-specific R0 estimates that depended on assumptions, including action taken to mitigate infection. For the most part, R0 estimates for Guinea, Liberia and Sierra Leone ranged from 1.2 to 2.5, with the striking exception of the Fisman et al. 30 R0 estimate of 8.3 for Sierra Leone. Nishiura and Chowell 65 estimated Rt fluctuating around 1 for Guinea, 1.7 for Liberia and 1.4 for Sierra Leone. The WHO Ebola Response Team 16 estimated Rt for Guinea (ranging from 1.6 to 2.0), for Liberia (ranging from 1.4 to 1.6) and for Sierra Leone (ranging from 1.3 to 1.5).
Several groups have also estimated R0 for specific geographic areas within the region (full details in Table 6

Serial interval distribution
The serial interval, defined as the time interval between symptom onset in an index case and symptom onset in a secondary case infected by that index case, has been infrequently estimated due to the paucity of data on epidemiologically linked pairs of index and secondary cases. For Ebola Zaire (Table 3 (available online only)), the mean serial interval was estimated to be 10-16.1 days 5,49,60,70 . In the ongoing outbreak in West Africa, the mean serial interval has been estimated to be approximately 14-15 days 16,17,30,41 (Table 6 (available online only)).
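When epidemiologically linked pairs are available, the serial interval follows directly from the two symptom onset dates. The Python sketch below shows the calculation for a small set of hypothetical index and secondary case pairs; the dates are invented for illustration and do not correspond to any cases in the reviewed studies.

```python
from datetime import date
from statistics import mean


def serial_intervals(pairs):
    """Days between symptom onset in the index case and in the secondary case.

    pairs -- iterable of (index_onset, secondary_onset) date tuples
    """
    return [(secondary - index).days for index, secondary in pairs]


# Hypothetical linked pairs: (index case onset, secondary case onset).
linked_pairs = [
    (date(2014, 8, 1), date(2014, 8, 13)),
    (date(2014, 8, 5), date(2014, 8, 21)),
    (date(2014, 8, 10), date(2014, 8, 24)),
]

intervals = serial_intervals(linked_pairs)
mean_serial_interval = mean(intervals)
```

With so few pairs, a mean computed in this way would carry wide uncertainty, which reflects the paucity of linked-pair data noted above.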

Generation time distribution
Closely related to the serial interval, the generation time is defined as the time interval between infection of an index case and infection of a secondary case infected by that index case. Because the time of infection is rarely observed directly, the generation time distribution nearly always needs to be inferred indirectly from serial interval observations and knowledge of the incubation period distribution. We found one such estimate of the mean generation time, for Marburg, of 9 days (95% CI 8.2, 10.0) 55.

Delay distributions
For Ebola Zaire, including the ongoing outbreak, the mean time from symptom onset to hospitalization (Table 3 (available online only) and Table 6 (available online only)) ranged from 3.2 to 5.3 days 5,16,17,20,38,41,48, whereas the mean time from symptom onset to death ranged from 6 to 10.1 days 5,17,25,37-39,41,49,61. For Ebola Sudan (Table 4 (available online only)), the mean time from symptom onset to hospitalization was 2 days (range 0-8) 51 and the median time from symptom onset to death was 9 days (range 5-15) 24. The mean time from hospitalization to discharge for Ebola Sudan ranged from 8 to 10 days 51,53, whereas the mean time from hospitalization to death was 6.1 days (range 2-13) 51. For Ebola Bundibugyo (Table 5), the median time from symptom onset to hospitalization was 3.5 days (range 0-8) 19 and the median time from symptom onset to death was 9-10 days (range 3-21 days) 19,47.

Risk for developing EVD
Risk factors for human-to-human transmission of EVD or Marburg were evaluated by comparing the exposures, behaviors and practices of cases with those of controls (including unaffected controls, defined as suspected cases with negative serologic test results) and were described using a prevalence proportion ratio, an odds ratio or a relative risk (and the corresponding confidence interval). Significant risk factors associated with developing EVD are reported in Table 8 (available online only) and Data Citation 4 and include direct physical contact (sharing a bed, touching a cadaver or funeral preparations for an EVD patient, nursing care and contact with bodily fluids) and non-physical contact (sharing a meal, contact with a hospital where EVD patients were treated) 24,39,45,47,56,71.

Usage Notes
The data presented in this review summarize estimates of the epidemiological parameters of EVD and Marburg. These results can facilitate parameterization and sensitivity analysis of transmission models examining surveillance, control and treatment strategies. The results can also inform epidemiological studies investigating human-to-human transmission during Ebola and Marburg outbreaks, deepening our understanding of the transmission process.
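As a minimal sketch of how such estimates might feed a sensitivity analysis, the Python code below runs a simple deterministic SEIR model over several R0 values. The R0 values and the 10-day mean incubation period fall within the ranges reported in this review for the outbreak in West Africa; the 7-day infectious period, the population size, the time step and the assumption of homogeneous mixing without interventions are hypothetical choices made for illustration. It is not the model used in any of the studies reviewed.

```python
def seir_attack_rate(r0, incubation_days=10.0, infectious_days=7.0,
                     population=100_000, initial_infectious=1,
                     days=2000, dt=0.1):
    """Proportion of the population ever infected in a deterministic SEIR model.

    Uses a forward Euler scheme with step dt (days). infectious_days and
    population are illustrative assumptions, not estimates from the review.
    """
    sigma = 1.0 / incubation_days   # rate of leaving the exposed class
    gamma = 1.0 / infectious_days   # rate of leaving the infectious class
    beta = r0 * gamma               # transmission rate implied by R0

    s = float(population - initial_infectious)
    e = 0.0
    i = float(initial_infectious)

    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed

    return (population - s) / population


# Sweep R0 across values within the range reported for West Africa.
attack_rates = {r0: seir_attack_rate(r0) for r0 in (1.2, 1.8, 2.5)}
```

Because the sketch ignores control measures, behavior change and population structure, the resulting attack rates are upper-bound illustrations of how sensitive model output is to R0 rather than projections for any real outbreak.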
The number of parameters estimated for each outbreak has generally increased with time (Table 2 (available online only)). While the incubation period distribution was consistently assessed, R0 has increasingly been estimated, most notably with the ongoing outbreak in West Africa (Table 6 (available online only) and Data Citation 2). Figure 1 shows the central estimates and ranges for the different studies that estimated the incubation period distribution for EVD outbreaks. While there are small differences in the central estimates of the incubation period distribution of the four Ebola viruses, the ranges around the mean or median are consistent, with a maximum of ≤21 days. Current EVD guidance states that EVD has an incubation period of 2-21 days, which is the basis for the recommended duration of contact tracing of 21 days 26,27. This is supported by the findings in our review. Figure 2 shows the CFR for different Ebola Zaire, Ebola Sudan, Ebola Bundibugyo and Marburg outbreaks. While the CFRs for Ebola and Marburg are high (compared to other infectious diseases), outbreaks caused by Ebola Zaire and Ebola Sudan have experienced the highest CFRs among the three Ebola viruses causing outbreaks in humans 1,2,5,19,24,25,38,43,46,47,50-53,72-76. The estimated CFR for the ongoing outbreak in West Africa, due to Ebola Zaire, is approximately 70% 16,17,36, which falls within the range of CFRs for past outbreaks due to this virus 2,5,25,38,43,50. Figure 2 also illustrates that recent CFR estimates for Ebola Zaire remain comparable to those observed in the 1970s. While there are ongoing efforts to develop medical treatments for EVD, treatment remains mainly supportive. The massive scale of the ongoing outbreak has highlighted the urgent need to develop new treatments and to fast track the use of experimental medical interventions 77.
The transmission potential, as measured by R0, is fairly consistent among the three Ebola viruses, ranging from just above 1 to approximately 4 (also mentioned in ref. 78). Previously, EVD typically affected villages in remote areas of central Africa 35,38,42,44,58,59, and while outbreaks were devastating in these areas, the populations at risk were generally limited in number. The ongoing EVD outbreak had circulated for at least three months prior to discovery 22,79, which allowed spread of the virus to go unchecked while it infected people in an area of Guinea that shares borders with Sierra Leone and Liberia. Recent experience in Nigeria has shown that an Ebola virus with R0 > 1, even in a population of over 20 million, can be controlled with vigorous application of control methods 49,54.
Differences in estimates of R0 and Rt are likely, at least in part, to be the result of the quality of data available and the inferential method. The focus on R0 estimation together with serial interval estimates may reflect a shift from data collection purely for surveillance to recognition of the epidemiological value of such data.
The specific factors that result in an EVD outbreak have been under investigation since the emergence of this virus and include examination of human and susceptible non-human populations living in close proximity with each other in remote areas of central Africa. Recent investigations into the first cases of the ongoing outbreak found that the outbreak may have begun in Meliandou, Guinea in a village where the inhabitants frequently came in contact with fruit bats in a hollowed out tree 80 . Although the current focus is on limiting human-to-human transmission and treating the infected, the challenging underlying factors that led to this large outbreak in West Africa will require long-term investments to improve both health care and surveillance for infectious diseases.
Our dataset is the most complete collection of published epidemiological parameter estimates from EVD outbreaks available at the time of writing and provides an evidence-based foundation for both retrospective analyses and responses to future outbreaks.