Modeling Zika Virus Transmission Dynamics: Parameter Estimates, Disease Characteristics, and Prevention

Because of limited data, much remains uncertain about parameters related to transmission dynamics of Zika virus (ZIKV). Estimating a large number of parameters from the limited information in data may not provide useful knowledge about the ZIKV. Here, we developed a method that utilizes a mathematical model of ZIKV dynamics and the complex-step derivative approximation technique to identify parameters that can be estimated from the available data. Applying our method to epidemic data from the ZIKV outbreaks in French Polynesia and Yap Island, we identified the parameters that can be estimated from these island data. Our results suggest that the parameters that can be estimated from a given data set, as well as the estimated values of those parameters, vary from Island to Island. Our method allowed us to estimate some ZIKV-related parameters with reasonable confidence intervals. We also computed the basic reproduction number to be from 2.03 to 3.20 across islands. Furthermore, using our model, we evaluated potential prevention strategies and found that peak prevalence can be reduced to nearly 10% by reducing mosquito-to-human contact by at least 60% or increasing mosquito death by at least a factor of three of the base case. With these preventions, the final outbreak-size is predicted to be negligible, thereby successfully controlling ZIKV epidemics.


Results
Identification of parameters that can be estimated from island data. We fitted our model to the cumulative new infection data from each of six islands of French Polynesia (Tahiti, Sous-le-vent, Moorea, Tuamotu-Gambier, Marquises, Australes) and Yap island. We first estimated five parameters along with their respective standard errors (Table 1). With these parameters, the model simulations exhibited reasonable agreement with the data (see Supplementary Fig. S1). However, as we can observe from Table 1, the standard errors for the estimated parameters are very large, giving a negative lower limit of 95% confidence intervals. The reason for the large standard errors could be that the data may not have enough information to estimate all five parameters and/or the model solution, P, may not be sensitive to all five parameters 22 . This uncertainty embedded in larger confidence intervals can be reduced by using less number of free parameters during data fitting process. As successfully implemented in many previous studies [23][24][25][26][27] , the number of free parameters can be reduced without violating the significance of data-fitting by fixing the parameter which has the least impact on the model solution.
To use the similar technique, we computed the sensitivity matrix (see "Methods" section), which allowed us to www.nature.com/scientificreports www.nature.com/scientificreports/ identify parameters that can be fixed and obtain reasonably smaller confidence intervals without violating the significance of data-fitting.
As discussed in "Methods" section, we computed the standard errors and the sensitivity value of P, i.e. ∂ , at the estimated parameter values using the second-order accurate complex-step approximation technique (Fig. 1). Note that the bigger the overall sensitivity value, ∂ ∂Φ P j , the more sensitive P is to the parameter Φ j . As seen in Fig. 1, the magnitude of the sensitivity of P to one of the parameters (mostly η) is bigger in a multiple of magnitudes than each of the rest. In addition, the model solution is sensitive to most of the other parameters only for short periods of time. For each island, we identified the least sensitive parameter and fixed it during the data fitting process. We used the fixed value as an average estimate of all islands from Table 1 and later performed the sensitivity analysis of these chosen values. Then we www.nature.com/scientificreports www.nature.com/scientificreports/ refitted the model to the data to estimate the remaining four parameters (see Supplementary Table S1). We repeated the process by increasing the number of fixed parameters one at a time until each estimated parameter has a confidence interval less than a threshold value (see Supplementary Tables S2 and S3). Since we are interested in 95% confidence intervals, which corresponds to t-value of about 2.1 from student's t-table for this data, we used a threshold ϑ = 1/(2.5) = 0.4 for our estimation (see "Methods" section). However, a lower value can be used if a higher confidence level is desired.

∂Φ
For the data considered here, the standard errors along with sensitivity results suggest that the data sets do not contain sufficient information to estimate more than three parameters in islands of French Polynesia and more than two parameters in Yap island with a reasonable degree of certainty attached to the estimates (Table 2). Interestingly, even in the islands where the equal number of parameters can be estimated, the parameters that can be estimated differ from island to island. For example, the data sets of both Tahiti and S-L-V allow to estimate 3 parameters, but (β m , α h , η) can be estimated from Tahiti while (β m , γ h , η) can be estimated from S-L-V (Table 2).
Final parameter estimates. As identified above, the given island data sets allow us to estimate three parameters for Tahiti, S-L-V, Moorea, T-G, Marquises, Australes, and two parameters for Yap with a reasonable confidence interval. Note that the parameters that can be estimated from these data set with a reasonable confidence interval differ from island to island. The final parameters obtained for each island along with their 95% confidence intervals are given in Table 2. The standard errors across all 7 islands have significantly decreased in our final estimates (see Supplementary Fig. S2).
To assure that the reduction of free parameters does not provide a poor fitting, we performed F-test 27 . In each island, we found that increasing the number of free parameters did not improve the statistical significance of the model fitting (p-value > 0.05 in each case). This shows that choosing the fixed parameters in a way as done in our case provides smaller confidence intervals without violating the significance of the data-fitting. With the final estimated parameters, the model prediction along with the survey data for each island is shown in Fig. 2 (left column).
In order to investigate whether the final estimated parameters are affected by the choice of values at which the fixed parameters are set, we performed a sensitivity analysis of the fixed parameters on the estimated parameters. In this analysis, we randomly chose 200 different values for each 'fixed' parameter from the uniform distribution of the values over the range of estimate in Table 1. Then for each of 200 sets of fixed parameters, we estimated the free parameters through data fitting. We obtained that the estimated parameters are less sensitive to these fixed values ( Supplementary Fig. S3), showing the robustness of our final parameters. This observation is aligned with the fact that fixing the less sensitive parameter at a reasonable value would not significantly affect the estimated parameters.
Note that we estimated parameters based on cumulative data as it provided a simple model formulation. In addition, we also computed the weekly new infection predicted by the model and compared them with the experimental weekly raw data (Fig. 2, right column). The model predictions from these final parameters estimated provide excellent agreement with the experimental weekly data from each of the 7 islands considered. To observe whether these final parameter estimates are affected when weekly raw data are used for fitting as in the early epidemics of Ebola virus 28 , we also fitted our model directly to the weekly raw data and found that the final estimates are not affected much in these island data sets (Supplementary Table S4 and Fig. S4).

Characteristics of ZIKV transmission dynamics.
Note that the mosquito-to-human transmission rate, βˆh, could be estimated with reasonable confidence from only Yap island data. Based on this estimate, we obtained the mosquito-to-human transmission rate to be 0.50 (95% CI: 0.46-0.53) per day for Yap island ( Table 2). On the other hand, we could estimate the human-to-mosquito transmission rate, β m , from all islands except Yap, and  www.nature.com/scientificreports www.nature.com/scientificreports/ found that β m ranges from 0.04 (95% CI: 0.03-0.06) per day in Marquises to 0.13 (95% CI: 0.09-0.18) per day in Moorea. It shows that the per day rate of mosquito-to-human transmission is about 4 to 12 times higher than that of human-to-mosquito. Our predicted human incubation period (1/α h ) is about 4 to 12 days and can be estimated from Tahiti, Moorea, T-G and Australes. The predicted infectious (1/γ h ) period from our model is about 12 days that was estimated from S-L-V and Marquises islands ( Table 2). These predictions are consistent with some previously measured laboratory data 29,30 . www.nature.com/scientificreports www.nature.com/scientificreports/ Estimated values of η, which could be estimated from the data sets of all islands, indicate that only a small portion of predicted Zika infection was reported to the health sentinel sites. The reported cases ranged from 2.85% in Moorea to 19.99% in Yap. This shows that an actual epidemic size of the ZIKV could be significantly higher than that seemed in the reported cases. This is in agreement with the fact that individuals infected with ZIKV usually do not show any symptoms or show only mild symptoms and are most likely to be unreported.
Basic reproduction number. The Basic Reproduction Number, R 0 , is defined as the average number of secondary cases generated by a typical infectious individual in a fully susceptible population 17 . The disease dies out if R 0 < 1 and the epidemic occurs if R 0 > 1. We calculated R 0 for our model using the next generation operator approach 31 . We obtained the basic reproduction number for our model as follows: Using the estimated parameters in Eq. (1), we obtained the basic reproduction number, R 0 , with a value ranging from 2.03 in Australes to 3.20 in Yap island ( Table 2). Based on the parameter estimates, the range of R 0 for each island is also presented in Table 2. The model predicts R 0 > 1 in each island, and there were ZIKV epidemics, which is consistent with the observations in the data collected.
We further examined the effects of the parameters on the reproduction number R 0 using the normalized forward sensitivity index S x given by 32 : where x is one of the parameters whose sensitivity on R 0 is sought. This index implies that the higher the value in its magnitude, the more sensitive R 0 is to the parameter. Also, the positive (or negative) sign indicates that R 0 increases (or decreases) as x increases. Our result shows that the basic reproduction number is more sensitive to mosquito lifespan than any other parameters (Fig. 3), suggesting that prevention programs focused on reducing mosquito lifespan can be more effective for avoiding ZIKV infection. To a lesser extent, R 0 is also sensitive to β m , βˆh, and γ h . Such measurements can be useful to identify and quantify the effective prevention strategies.
Disease outcomes: prevalence and outbreak size. While we acknowledge that the same parameters may not be suitable for all the islands, we take the average of the values in Table 2 for our base case computations and simulation study purposes. With these parameters, our model predicts the mean prevalence of infection to be at its peak between the initial 8 to 10 weeks of infection. The amplitude of the peak suggests that during the peak time of infection 30-35% of the total population will be affected (Fig. 4). The model also suggests that after  www.nature.com/scientificreports www.nature.com/scientificreports/ approximately 20 weeks, the ZIKV epidemic will be over, even if no prevention program is implemented. Since our model does not include demographic birth-death and the disease death terms, the final outbreak size can be calculated by integrating the term β S t I t ( ) ( ) h h m from the beginning of infection to the time when epidemic ends. We found that the final size of the epidemic can reach nearly 100% without prevention indicating that almost the entire island population can be infected with ZIKV during the epidemic period.
Effect of prevention programs on disease outcomes. We evaluated two illustrative prevention programs: one that reduces contact between mosquito and human, and another that decreases the mosquito lifespan. Reducing the contact between mosquito and human refers to a variety of programs, such as wearing skin-covering clothes and using mosquito repellents. Similarly, decreasing the mosquito lifespan refers to the program such as the use of insecticides or other chemicals which aim to inhibit mosquito population growth. In our model, mosquito-to-human (βˆh) and human-to-mosquito transmission rate (β m ) are the parameters related to prevention programs that focus on reducing contact between humans and mosquitos, while the mosquito life-span (λ m ) can be associated with the preventive measures that aim to destruct the mosquito population.
If φ with 0 ≤ φ ≤ 1 is an effectiveness of the first prevention program (i.e. the reduction of contact between human and mosquito), implementing such programs causes the following transformation of our model: m m . Our model suggests that reducing mosquito and human contact by at least 60% (i.e., when φ ≥ 0.6)would decrease the prevalence of ZIKV to an almost negligible level (Fig. 5). In this case, the final outbreak size reduces dramatically from 100% to nearly 10%.
Similarly, a decrease in mosquito lifespan (the second prevention program) with effectiveness θ, i.e., the reduction of mosquito lifespan by θ times, changes our model causing λ θλ → m m . With such prevention programs, the prevalence of ZIKV decreases to a negligible level when mosquito death is increased by at least a factor of three, i.e., θ ≥ 3 (Fig. 5). Also, this prevention effort can reduce the final outbreak size from about 100% to nearly 10%.

Discussion
In this study, we developed a sensitivity analysis based method, which utilizes the transmission dynamics model of ZIKV infection and the recently expanded complex-step approximation technique 22,33 , to identify parameters that can be estimated from the available limited data set. Using the estimated parameters by this technique, we also computed the basic reproduction number for ZIKV transmission dynamics and performed analysis and simulation of the models to investigate the disease outcomes and the effectiveness of prevention programs on controlling ZIKV infections.
Implementing our technique to seven island data (six French Polynesia and one Yap), we identified that these data sets do not contain sufficient information to estimate more than three parameters in islands of French Polynesia and more than two parameters in Yap island with a reasonable degree of certainty attached to the estimates. Note that previous studies 21 used some of these island data to estimate up to six parameters. However, the previous study 21 used the stochastic approach with a Bayesian fitting procedure and whether this approach experiences similar effects is not known. Importantly, our analysis also showed that the number of estimable parameters and the estimated values varies from island to island, suggesting that the same set of parameters cannot be estimated from every island and thus attempting to estimate the same parameters across all islands may not provide reasonable information about the ZIKV transmission dynamics. Identification of parameters that can be estimated as done in our study may help to obtain important information about parameters related to ZIKV transmission dynamics. As a result, our method provides reasonably small confidence intervals implying more reliability to the estimated parameters (Table 2) while assuring significantly well model fitting to the island data. www.nature.com/scientificreports www.nature.com/scientificreports/ Compared to a previous study 21 that used the six islands of French Polynesia, some of the estimates from our method are quite different. In general, our estimates provided higher βˆh, lower β m , and lower η than the previous estimates.
We found that only a small portion of infections was reported (2.85-19.99%) as suspected cases across the islands (Table 2). This indicates that actual epidemic size could be quite larger than the documented epidemic size. Those non-reported zika infections might be either asymptomatic infections and/or infections with mild symptoms that did not enter the healthcare system. This phenomenon was supported by the household survey following the Yap island outbreak in 2007 9 . Having a large number of non-reported cases estimated in this study warns higher severity of zika burden in epidemic regions and underscores a need for better surveillance and detection strategies.
Computed basic reproduction number, R 0 , from our study slightly varies from island to island ( Table 2) and reflects that the ZIKV spreads rapidly throughout the islands. Based on a sensitivity analysis of basic reproduction number, we found that the value of R 0 is mostly dependent on the mosquito life span, though other parameters can also have some impacts on R 0 . This indicates that the most effective prevention strategy to avoid zika epidemics could be the control of mosquito growth or life span.
Our investigation on prevalence and infection provided some valuable implications to ZIKV epidemics. The prevalence started to increase at the beginning and reached its peak in between 8 to 10 weeks of the outbreak (Fig. 4). Then, it gradually decreased since more humans were recovering from the virus than those with the new ZIKV infections. Our study found that almost 100% of the island people were infected during the outbreak and the result is consistent with the other studies 21 . Since mosquito bite is the main reason for disease transmission, our result showed that reducing the human and mosquito contact could create a safe environment. Reducing the contact about 60% between human and mosquito can drastically reduce both peak prevalence and final outbreak size and almost eradicate the ZIKV infection (Fig. 5). The outcome is almost identical with the reduction of the mosquito lifespan (Fig. 5). The disease can completely be exterminated by lowering the mosquito lifespan by a factor of 3 to 4 times its base case. We note that the evaluation of these prevention programs was based on the sensitivity of prevention-related model parameters. Further evaluations with detailed models and the data related to the prevention programs are necessary before recommending these programs to practical applications.
We acknowledge some limitations of our study. In this study, we modeled the island situation in which humans and mosquitos usually have close proximity to one another. While our study is relevant to many settings that share characteristics of our population, including military units, college campus, nursing homes, boarding schools, and other rural communities, these results may not be generalizable to other conditions where uniform mixing between humans and mosquitos is not the case. Secondly, we did not consider the seasonal variation in transmission in our analysis as a result of climate factors. However, the outbreaks ended before there was a substantial seasonal change in rainfall or temperature and hence might have very less influence on disease transmission. If the outbreaks had ended because of seasonality rather than the depletion of susceptible populations, it would reduce the estimated proportion of the infected population. Our parameter estimates and related confidence intervals are based on limited data sets, thus there might be some quantitative difference between our predictions and the real scenarios. Our results on prevention programs are based on the parameter sets averaged over islands. However, we acknowledge that the same parameters may not provide reasonable outcomes for all islands, or even for the same island at different time points. We have also ignored potential stochastic effects in ZIKV transmission, which may be important, particularly during the early phase of the infection. The estimates may be improved by incorporating stochastic effects in our model 28 . However, our data contain entire epidemic periods, rather than only initial growth, thereby reducing the stochastic effects. Further study with stochastic modeling is necessary to accurately evaluate the stochastic effects on ZIKV dynamics of these islands.
The main goals of this study were to gain deeper insight into the epidemiological parameters of ZIKV transmission and to evaluate appropriate prevention strategies. The results identified the importance of the information contained in the data in estimating the ZIKV related parameters from the available limited data. This work offered novel insights into ZIKV related parameters as well as ZIKV infection dynamics and effect of prevention programs on disease outcomes, which might be useful for developing ideal prevention and control strategies.  9 . In the ZIKV outbreak data of French Polynesia, clinical cases were defined as suspected cases if they were presented to health practitioners with rash and/or mild fever and at least two of the following signs: conjunctivitis, arthralgia, and edema. In total, 8,744 suspected cases were reported from the health sentinel sites. Similarly, in the Yap Island data, researchers reviewed medical records and conducted prospective surveillance at the hospital and all four health centers on Yap to identify patients with suspected ZIKV disease 9 . Suspected cases had the following characteristics: acute onset of generalized macular or papular rash, arthritis or arthralgia, or nonpurulent conjunctivitis. Out of the total 1,276 households tested on Yap Island, 185 cases were identified as suspected ZIKV disease, which we extrapolated for the whole population of Yap Island. We obtained population data for these Islands from the 2012 French Polynesia Census 35 and the Federated States of Micronesia 2000 Census 36 .
Mathematical model. We developed a compartmental mathematical model to describe the ZIKV transmission dynamics, similar to the ones previously used for vector-borne transmission 37,38 . The humans were modeled using a susceptible-exposed-infectious-recovered (SEIR) framework, whereas the mosquitos were modeled as www.nature.com/scientificreports www.nature.com/scientificreports/ susceptible-exposed-infectious (SIE) framework (Fig. 6). In this model, exposed classes were incorporated to include delays as a result of intrinsic (human) and extrinsic (mosquito) incubation periods.
In the model system, S h represents the number of susceptible humans, E h is the number of humans currently in their incubation period, I h is the number of infectious humans, and R h is the number of humans that have recovered from the ZIKV infection. Similarly, S m , E m , and I m represent the susceptible, exposed, and infectious mosquito populations, respectively. The dynamics of our ZIKV epidemiological model are governed by the following system:  www.nature.com/scientificreports www.nature.com/scientificreports/ incubation period, 1/γ h represents the human infectious period, and 1/λ m is the mosquito life-span. In this model, susceptible humans get infected through the bites by infected mosquitos at a mosquito-to-human transmission rate β h and a susceptible mosquito get infected when it bites infected humans at the human-to-mosquito transmission rate β m . We presented our model with density-dependent infection rate from mosquito to human transmission. In this model, the total human population and the total mosquito population remain constant over time, . Therefore, with scaling β β → N h h h , the density-dependent infection rate and the frequency-dependent infection rate are equivalent, and with this scaling, our model can easily recover the model with frequency-dependent rate.
Since the death due to ZIKV was not reported during the period of epidemics, we have ignored disease death rate terms in the model. We also consider a closed population (i.e. a population with no births, deaths or continual immigration), since the mean human lifespan is much longer than the outbreak duration, and entry and exit of people inside the island are negligible during this short period of the outbreak. We assumed all people transmitted at the same rate, regardless of whether they displayed symptoms or were reported as cases. We considered that no transmission typically occurs before the exposed individuals enter the infectious class.
We  Initial population, mosquito lifespan and mosquito incubation period. Serological analysis of samples from blood donors between July 2011 and October 2013 suggested that only 0.8% of the population of French Polynesia were seropositive to ZIKV 39 . We, therefore, assumed that the population was fully susceptible initially. We also assumed that the outbreak began with one initial exposed and one infectious human (i.e., e h (0) = i h (0) = 1/N H ), and one exposed and one infectious mosquito (i.e., e m (0) = i m (0) = 0.005). The mosquito lifespan and the mosquito incubation period were previously estimated to be 10 days 40,41 and 15 days 41,42 , respectively. Therefore, we took constant values of 1/α m = 10 days and 1/λ m = 15 days for all islands. With these parameters and initial conditions fixed, the remaining five model parameters, α h , γ h , βˆh, β m and η (η is the proportion of case reported) are required to be estimated using epidemic data from ZIKV outbreaks in Yap island and the islands of French Polynesia.
Model fitting to the data. We fitted the model to cumulative weekly new infection data. The cumulative new infections predicted by our model, P(t), are given by the solution of the following equation: h h h We solved the system of differential equations numerically using a fourth order Runge-Kutta method. Assuming that the errors are independent and normally distributed with mean zero, we used the solutions to obtain the best-fit parameters via a nonlinear least squares regression method that minimizes the following sum of the squared residuals.
is a set of m parameters to be estimated; P t k and P t k are cumulative infected population values predicted by the model and those obtained from the survey data, respectively. Here, n represents the total number of data points available for the model fitting. All computations were carried out in MATLAB (The MathWorks, Inc.). In addition to fitting the model to cumulative data, we also fitted the model directly to weekly new infection data (see "Final parameter estimates" section).
Computation of confidence intervals. To obtain confidence limits for the estimated parameters, we compute standard errors for Φ by using similar ideas as described in Banks, et al. 26 . For this, we first compute the sensitivity matrix Ψ of the parameters. www.nature.com/scientificreports www.nature.com/scientificreports/ , , from our model, we use the following complex-step approximation to compute the partial derivatives described briefly below and in Supplementary Materials. We consider the Taylor expansion of P t n using a complex step ih, where h is taken to be a small positive constant (h = 10 −40 in our computations) and i is the unit imaginary number.
Taking the imaginary part of both sides of the above equation and dividing by h gives