Error associated with estimates of Minimum Infection Rate for Endemic West Nile Virus in areas of low mosquito trap density


West Nile Virus (WNV) is a mosquito-borne infection that can cause serious illness in humans. Surveillance for WNV primarily focuses on a measure of infection prevalence in the Culex spp. mosquitos, its primary vectors, known as the Minimum Infection Rate (MIR). The calculation of MIR for a given area considers the number of mosquitos tested, but not the relative effort to collect mosquitos, leading to a potential underestimation of the uncertainty around the estimate. We performed Value of Information analysis on simulated data sets including a range of mosquito trap densities in two well-studied counties in Illinois between 2005 and 2016 to determine the relative error introduced into MIR associated with changing the density of mosquito traps. We found that low trap density increases the potential for error in MIR estimation, and that it does so synergistically with low true MIR values. We propose that these results could be used to better estimate uncertainty in WNV risk.


West Nile Virus (WNV) causes an infectious disease in birds, horses and humans that is transmitted by the bite of infected mosquitoes1. In 80% of human cases, the disease does not produce any symptoms, but in a subset of patients it can cause febrile illness, joint pain, fatigue and weakness and about 1 in 150 people develop encephalitis or meningitis1. The disease first emerged in the United States in 1999, following which it has spread across the contiguous states and is now considered established. The first report of WNV in the state of Illinois was among dead birds in 2001; it has since spread to almost all counties in the state. From 1999 until 2017, the CDC reports that Illinois has had 2,458 human cases of WNV, the 5th highest number in the nation, including 1,553 cases of neuro-invasive disease1. It is a nationally notifiable disease and the state of Illinois has a surveillance system to monitor this pathogen, resulting in reports in 2017 of 90 human cases, 8 human deaths, 25 birds with WNV positivity and 2,022 positive mosquito batches2.

Birds serve as the main reservoir hosts for WNV. The virus, which belongs to the Flaviviridae family, is transmitted by mosquitoes in the genus Culex. Several species of Culex mosquitoes, such as C. tarsalis, C. quinquefasciatus, C. stigmatosoma, C. thriambus, C. pipiens, and C. nigripalpus, have been found to be able to transmit WNV3. Mosquitoes become infected when they feed on birds that harbor the virus; they are then capable of transmitting WNV to humans and horses1.

Surveillance for WNV is conducted primarily by testing mosquitoes, monitoring birds, particularly in the family Corvidae, and sero-surveillance of sentinel chicken flocks and equines. The Illinois Department of Public Health monitors WNV by testing groups of up to 50 mosquitoes, dead perching birds (such as crows, blue jays, and robins), and sick horses and humans with West Nile virus-like symptoms2. Since individual collection and testing of mosquitoes can be an expensive and arduous process, pooled samples are used to detect the presence of pathogens within the species, a concept first introduced by Dorfman in 19434. Mosquitoes are thus collected for testing by setting traps throughout the state; the placement of traps and number of mosquitoes submitted for testing are determined by local public health departments or mosquito abatement districts. The mosquito abundance in traps can be affected by factors such as temperature5, rainfall6, structure of urban landscapes7,8, vegetation9 and climatic variability10. Mosquitoes from these traps are collected in pools of up to 50 for viral testing. The results of mosquito testing are used to calculate the Minimum Infection Rate (MIR) for West Nile Virus, which is defined as the number of positive pools of a particular mosquito species over a defined time period and area divided by the total number of mosquitoes in those pools. The underlying assumption of MIR is that there is just one infected individual within each positive pool of mosquitoes. An alternative is the Maximum Likelihood Estimate (MLE), defined as the infection rate most likely to have produced the observed testing results under an assumed probabilistic model (i.e., a binomial distribution of infected individuals in a positive pool)11. An increase in MIR is often assumed to indicate an increased risk of disease transmission to humans.
The Centers for Disease Control & Prevention have shown that MIR is an important indicator in WNV surveillance systems that can be helpful in predicting patterns in virus activity, and thereby human cases in a given area12. Previous studies that used MIR to detect WNV include Bernard et al.13, Kulasekera et al.14 and Hadler et al.15. Therefore, prediction of MIR is a public health priority in areas endemic for WNV.
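As a minimal sketch of the two estimators described above, the Python snippet below computes MIR from pooled test results and, for comparison, the closed-form MLE that applies only when every pool has the same size. The function names, the per-1,000 scaling, and the example numbers are illustrative assumptions and are not taken from this study.

```python
def minimum_infection_rate(positive_pools, mosquitoes_tested, per=1000):
    """MIR: number of positive pools divided by total mosquitoes tested.

    Scaling by `per` (here, per 1,000 mosquitoes) is a common reporting
    convention and an assumption of this sketch.
    """
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    return per * positive_pools / mosquitoes_tested


def mle_equal_pools(positive_pools, total_pools, pool_size, per=1000):
    """Closed-form MLE of the infection rate when all pools share one size.

    Assumes each mosquito is infected independently with probability p,
    so a pool of size m tests negative with probability (1 - p) ** m.
    """
    if total_pools <= 0 or not 0 <= positive_pools <= total_pools:
        raise ValueError("need 0 <= positive_pools <= total_pools, total_pools > 0")
    negative_fraction = 1 - positive_pools / total_pools
    return per * (1 - negative_fraction ** (1 / pool_size))


# Hypothetical week: 60 pools of 50 mosquitoes each, 4 pools positive.
mir = minimum_infection_rate(positive_pools=4, mosquitoes_tested=60 * 50)
mle = mle_equal_pools(positive_pools=4, total_pools=60, pool_size=50)
print(f"MIR = {mir:.3f}, MLE = {mle:.3f} per 1,000 mosquitoes")
```

In this hypothetical example the MLE is slightly higher than the MIR, because the MLE allows for more than one infected mosquito in a positive pool.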

Under-sampling is believed to be a significant obstacle in developing robust prediction models for MIR. Under-sampling could be caused by not having enough traps set out within the geographic area of study, by not testing all mosquitoes collected in the traps, by low mosquito abundance in the traps, or by a combination of any of those factors. Additionally, MIR may not be the most appropriate measure if the virus to be detected is common or if pool sizes are too large11. Without sufficient data, MIR estimates are likely to be inaccurate and appropriate public health efforts will be difficult to determine. However, there is no current method for determining the error introduced into MIR by under-sampling.

Value of information analysis is a quantitative method to estimate the return on investment (value) produced by research16. This concept can be applied to compare the results of using an artificially reduced data set with that of the full data set to determine how much data are required to meet a specific criterion. The Value of Information (VOI) approach is increasingly becoming a useful tool with applications in prioritizing research decisions17, economic design of clinical trials18, as well as in treatment interventions19 and social sciences20. Because VOI is rooted in statistics and has wide-ranging applications, it is used as a guiding concept in this paper to address the error associated with MIR. In this case, the value we are interested in estimating is the accuracy and precision of the MIR estimate, while the information we are using is varying mosquito trapping densities.

The objective of this study was to determine the error associated with calculation of WNV MIR at the county level in cases of low mosquito trap density. This analysis considered data obtained from mosquito traps that were set in different locations in Illinois between 2005 and 2016. Utilizing the value of information concept as outlined above, we determined the impact of low trap density (under-sampling) on the accuracy of MIR results.


After selecting only weeks in which at least 50 pools of mosquitos were tested and at least one pool was positive, a total of 240 weeks of data were available for Cook County and 182 weeks of data were available for DuPage County (Table 1).

Table 1 Description of observed and simulated data (median [interquartile range (IQR)]) used for estimating baseline MIR and to determine effect of lower density sampling.

After randomly sampling subsets of mosquito trap data, the absolute relative error in estimated MIR for a county, \({E}_{p}=|\frac{MI{R}_{p}-MI{R}_{100}}{MI{R}_{100}}|\), was clearly skewed, with a high concentration near 0 and a long tail (Fig. 1). There was a secondary peak in frequency near Ep = 1 due to the bounded distribution of MIR, which cannot fall below 0; any iteration in which no positive traps were sampled (probabilities shown in Table 1) would result in an Ep value of 1. This also resulted in a skew in the relative error, \(\frac{MI{R}_{p}-MI{R}_{100}}{MI{R}_{100}}\) (Fig. 2), towards more positive outliers.

Figure 1

Distribution in simulated absolute relative error (\({E}_{p}=|\frac{MI{R}_{p}-MI{R}_{100}}{MI{R}_{100}}|\)) in MIR associated with sampling different proportions of mosquito trap data in Cook and DuPage counties on a weekly basis between 2005 and 2016. Each proportion of traps was randomly selected from all trap data available in a given week for 50 iterations.

Figure 2

Simulated relative error in MIR for Cook and DuPage counties created by randomly sampling only a subset of mosquito trap data. Color indicates the number of traps per square mile in the simulated data. MIR100 is the observed MIR using all data.

The distribution of relative error in MIR was clearly wider when the density of traps decreased, and also when the observed MIR100 was low (Fig. 2). Results of a lognormal regression to determine the effect of trap density and MIR100 on relative absolute error showed that there was a significant synergy between the variables (Table 2). Likewise, the range of the 95% confidence interval around the MIR estimate tended to increase more when the density of the traps decreased (Fig. 3), but a higher estimate of MIR100 was associated with a higher range around the MIR estimate, although this effect was somewhat decreased with high trap density (Table 3). Information about model fit can be found in the Supplementary Information.

Table 2 Results of lognormal regression for absolute relative error in estimated MIR, \(log\,({E}_{p}+0.00001) \sim {\beta }_{0}+{\beta }_{1}MI{R}_{100}+{\beta }_{2}Density+{\beta }_{3}\,MI{R}_{100}\,\ast \,Density+{\varepsilon }_{i}\).

Figure 3

Change in the range of the 95% confidence interval around the simulated MIR for Cook and DuPage counties created by randomly sampling only a subset of mosquito trap data, as a function of the number of traps per square mile in the simulated data. Color represents MIR100, the observed MIR using all data.

Table 3 Results of linear regression for range of the 95% confidence interval around estimated MIR, \((Range-Rang{e}_{100}) \sim {\gamma }_{0}+{\gamma }_{1}MI{R}_{100}+{\gamma }_{2}Density+{\gamma }_{3}\,MI{R}_{100}\,\ast \,Density+{\varepsilon }_{i}\).

The equation from Table 2 could be used to calculate the expected potential absolute relative error (ARE) around an observed MIR by using the equation

$${E}_{p}=exp[{\beta }_{0}+{\beta }_{1}MIR+{\beta }_{2}Density+{\beta }_{3}\,MIR\,\ast \,Density]-0.00001$$

where \({E}_{p}=|\frac{MIR-MI{R}_{100}}{MI{R}_{100}}|\). The expected potential range of MIR100 can be calculated as \(\frac{MIR}{1\pm {E}_{p}}\). Likewise, the expected change in the 95% confidence interval range can be calculated from the equation in Table 3. When the regression equation from Table 2 is applied to the observed data for Cook and DuPage counties, the range of predicted values is shown to be small (Fig. 4): at current trap density, the median range around the MIR is 0.18, which is much smaller than the observed 95% confidence interval, which has a median of 2.1. This is likely due to the high trap density, large number of pools, and high MIR100 in Cook County for this period. When the model is used to predict potential for error in example years from Will, McHenry, and Lake Counties (Fig. 5), where trap density is lower, the error associated with MIR calculated from all traps is quite high; the median range around the MIR is 2.0. This is only slightly less than the observed 95% confidence interval, which has a median of 3.8, likely due to the small number of pools tested.
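The back-transformation and the range \(\frac{MIR}{1\pm {E}_{p}}\) described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, and the coefficient values below are placeholders chosen only to make the example run; the fitted values are those reported in Table 2.

```python
import math

SHIFT = 0.00001  # constant added to E_p before the log transformation


def predicted_relative_error(mir, density, b0, b1, b2, b3):
    """Back-transform the Table 2 regression into an expected E_p."""
    linear = b0 + b1 * mir + b2 * density + b3 * mir * density
    # Clamp at zero: exp(linear) - SHIFT can dip just below 0 for tiny values.
    return max(math.exp(linear) - SHIFT, 0.0)


def mir100_range(mir, e_p):
    """Range of MIR_100 implied by MIR / (1 +/- E_p).

    When E_p >= 1 the upper bound is unbounded and returned as infinity.
    """
    lower = mir / (1 + e_p)
    upper = mir / (1 - e_p) if e_p < 1 else math.inf
    return lower, upper


# Placeholder coefficients for illustration only (NOT the fitted values).
coefs = dict(b0=-1.0, b1=-0.2, b2=-3.0, b3=0.1)
e_p = predicted_relative_error(mir=2.0, density=0.1, **coefs)
low, high = mir100_range(2.0, e_p)
print(f"E_p = {e_p:.3f}; MIR_100 plausibly between {low:.2f} and {high:.2f}")
```

Returning infinity for the upper bound when E_p reaches 1 is a design choice of this sketch; it reflects that the formula no longer constrains MIR100 from above in that case.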

Figure 4

Predicted error associated with decreased West Nile Virus sampling in Cook and DuPage County for 2 example years. Black solid lines show the observed MIR with error bars showing the 95% confidence intervals, while the red shaded bar is the predicted error around the mean and the red dashed lines show the predicted 95% confidence intervals around the upper and lower bounds of the potential MIR estimate.

Figure 5

Predicted MIR error associated with trap density in Will, McHenry, and Lake Counties for one example year each. Black solid lines show the observed MIR with error bars showing the 95% confidence intervals, while the red shaded bar is the predicted error around the mean and the red dashed lines show the predicted 95% confidence intervals around the upper and lower bounds of the potential MIR estimate.

One factor of interest to mosquito control officials is the ability to detect a rise in MIR. We examined the simulated data for false negatives, circumstances in which the observed MIR100 was non-zero but the simulated MIRp was zero (Fig. 6). As with the absolute relative error, the probability of a false negative was significantly increased with low trap density and with low MIR100, and the effect of trap density and MIR100 was synergistic (see Supplementary information). However, the effect of MIR100 was much greater than that of trap density.
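To illustrate why false negatives become more likely as fewer traps are retained and as fewer traps are positive, the sketch below computes the exact probability that a random subset of traps contains no positive trap, using the hypergeometric distribution. This is a simplified illustration rather than the simulation and regression used in this study; the function name and the trap counts are hypothetical.

```python
from math import comb


def prob_false_negative(total_traps, positive_traps, sampled_traps):
    """Probability that a random subset of traps contains no positive trap.

    Hypergeometric probability C(N - K, n) / C(N, n) for N traps, of which
    K are positive and n are retained at random without replacement.
    """
    if not 0 <= positive_traps <= total_traps:
        raise ValueError("positive_traps must be between 0 and total_traps")
    if not 0 <= sampled_traps <= total_traps:
        raise ValueError("sampled_traps must be between 0 and total_traps")
    return (comb(total_traps - positive_traps, sampled_traps)
            / comb(total_traps, sampled_traps))


# Hypothetical county-week with 100 traps.
for positives in (2, 10):
    for kept in (10, 50):
        p = prob_false_negative(100, positives, kept)
        print(f"{positives} positive traps, {kept} kept: P(miss all) = {p:.3f}")
```

In this toy setting, moving from 10 to 2 positive traps raises the miss probability at either sampling level, which matches the direction of the effects reported above.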

Figure 6

Simulated trap density in which the simulated MIR was 0 (in yellow) or non-zero (in blue) when the observed MIR100 was non-zero.


There are various environmental factors that contribute to arbovirus transmission. Low levels of the virus might thrive within the host and reservoir populations and the exact conditions that lead to widespread outbreaks are difficult to pinpoint. Therefore, constant surveillance of mosquitoes and the sentinel organisms is a critical aspect of public health activities, especially for West Nile Virus (WNV).

We found that if the true MIR is low, higher trap density is needed to accurately estimate MIR. This is intuitive; the sample size necessary to find a disease in a population is inversely proportional to the prevalence of the disease. Current estimates of MIR error are based on basic statistical theory, with sample size (total number of pools) responsible for much of the estimation21. This may result in underestimation of error around low MIR values and areas with low trap density but high trap numbers (such as a large geographical area).
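The inverse relationship between prevalence and required sample size can be made concrete with a standard detection calculation. The sketch below assumes each mosquito is infected independently with the same probability and asks how many must be tested to have a given chance of including at least one infected individual; the function name and the example prevalences are illustrative assumptions, not quantities from this study.

```python
import math


def mosquitoes_needed(prevalence, confidence=0.95):
    """Smallest n such that P(at least one infected among n) >= confidence.

    Solves 1 - (1 - prevalence) ** n >= confidence for n, assuming
    independent infection status across mosquitoes.
    """
    if not 0 < prevalence < 1 or not 0 < confidence < 1:
        raise ValueError("prevalence and confidence must lie in (0, 1)")
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))


# Prevalence of 1 per 1,000 versus 10 per 1,000 mosquitoes.
print(mosquitoes_needed(0.001))  # 2995
print(mosquitoes_needed(0.01))   # 299
```

For small prevalence p the requirement is approximately -ln(1 - confidence) / p, so a tenfold drop in prevalence calls for roughly ten times as many mosquitoes.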

Our study also provides an algorithm by which MIR error can be estimated. We believe that this error calculation can be incorporated into public health planning, giving decision makers a better sense of the potential range in MIR. Importantly, we have found that the probability of failing to detect a non-zero MIR was significantly impacted by the density of traps, especially when the actual MIR is low.

Our study was limited by the use of observed MIR, rather than true MIR, to calculate error. However, the mosquito surveillance systems of Cook and DuPage counties are extremely comprehensive, and the resulting calculated MIRs are likely to approach the true value in most cases. In fact, our model predicts low error in Cook County with only 10% of existing trap data, indicating that the full data set may be sufficient for the purposes of this study.

Researchers working on surveillance of arboviruses are constantly trying to optimize the tools and methods to better estimate disease risk and transmission in order to inform public health measures. Under-sampling has previously been identified as one of the most common sources of error in determining the mosquito infection rates22. For instance, DeFelice et al. (2017) used data assimilated from MIR and human case reports to model WNV transmission in New York and to generate retrospective forecasts of past WNV outbreaks in Long Island23. The authors realized for the model to work effectively in predicting outbreaks, it relied on timely availability of mosquito infection rate data, which can vary depending on the methods used for mosquito sampling. Bustamante and Lord (2010) also discussed other sources of error that are introduced while conducting mosquito surveillance such as temperature, trapping methods used for mosquito sampling, assays used for virus detection and the MIR vs MLE approach24. Thus, they suggest using other surveillance indicators such as historical baseline data and mosquito population size along with MIR for determining arboviral disease transmission.

The results of this study demonstrate that WNV surveillance in mosquitoes can be affected by under-sampling. However, the effect of this under-sampling on error in MIR estimates may now be calculated using our approach. It is important to note that we assumed that the spatial distribution of the traps was not a factor; this is a simplification that should be addressed in future studies. Gu et al. (2008) provide recommendations to better estimate MIR when there are not enough samples or in areas with low level transmission, recommending either “targeted surveillance (increased sampling at locations of higher transmission likelihood) or estimating MIR during periods of high transmission, thereby shifting from detection of mosquito infection to estimation of the transmission intensity, while expanding the number of sampling sites to evaluate the range of arboviral transmission”25. The placement of traps should also take into account landscape features and other factors likely to affect surveillance efforts.

Public health surveillance of diseases like West Nile virus and other mosquito-borne infections is essential and can be conducted smoothly via concerted efforts of public health agencies, health departments and research institutions. Setting up an adequate number of traps in different counties to regularly monitor the MIR in mosquito populations is critical so that necessary public health efforts can be initiated before outbreaks occur. This paper shows that in areas where trap density is low, the method used here can estimate the error in MIR, which can inform appropriate disease control and prevention measures.


Data were obtained from the Illinois Department of Public Health mosquito surveillance database, which collects mosquito trap testing information from major stakeholders such as public health departments and mosquito abatement districts in the state of Illinois. Mosquito trap data for 4 counties in Illinois (DuPage, Cook, Will, and Lake) were obtained for the years 2005 to 2016. Trap density (traps per square mile) was calculated for each week by dividing the number of traps tested by the total area of the county. All analyses were performed in R26. Analysis was performed using data from Cook and DuPage counties; all other county data were used for illustration of potential impact. Only weeks in which at least 50 pools were tested were included in the analysis.

For each county in each week, all data from p percent of traps (p ∈ {50%, 100%}) were removed at random and the remaining trap data were used to calculate the simulated MIRp using the binGroup package27. Simulated MIRp was compared to the observed MIR with 100% of data, MIR100, for that county-week combination, and absolute relative error was calculated as \({E}_{p}=|\frac{MI{R}_{p}-MI{R}_{100}}{MI{R}_{100}}|\). This was repeated 50 times for each county-week combination. Weeks in which MIR100 was 0 were removed from the analysis.
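The resampling procedure above can be sketched as follows. This is a simplified illustration, not the analysis code: it computes MIR as a plain ratio rather than with the binGroup package, it is parameterized by the fraction of traps kept rather than removed, and the data layout (`mosquitoes`, `positive_pools`), function names, and example week are all hypothetical.

```python
import random


def mir(traps, per=1000):
    """Simple MIR: positive pools / mosquitoes tested, scaled by `per`."""
    tested = sum(t["mosquitoes"] for t in traps)
    positive = sum(t["positive_pools"] for t in traps)
    return per * positive / tested if tested else 0.0


def simulate_relative_error(traps, keep_fraction, iterations=50, seed=0):
    """Absolute relative error E_p of MIR from random subsets of traps."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    mir_100 = mir(traps)
    if mir_100 == 0:
        raise ValueError("weeks with MIR_100 = 0 are excluded from the analysis")
    n_keep = max(1, round(keep_fraction * len(traps)))
    errors = []
    for _ in range(iterations):
        subset = rng.sample(traps, n_keep)  # sample traps without replacement
        errors.append(abs(mir(subset) - mir_100) / mir_100)
    return errors


# Hypothetical county-week: 20 traps of 100 mosquitoes, 3 with a positive pool.
week = [{"mosquitoes": 100, "positive_pools": 1 if i < 3 else 0}
        for i in range(20)]
errors_half = simulate_relative_error(week, keep_fraction=0.5)
print(sum(e == 1.0 for e in errors_half), "of 50 iterations sampled no positive trap")
```

Any iteration that draws no positive trap yields an error of exactly 1, which is the source of the secondary peak near Ep = 1 described in the results.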

The error Ep was examined visually for each level of p and determined to be log-normally distributed with zero-inflation using the fitdistrplus package28. Due to the zero-inflation, we performed a shifted log transformation by adding a conservative number (0.00001) to all Ep prior to log transformation. The impact of trap density and MIR100 on log(Ep) was analyzed using mixed linear regression modeling with the lme4 package29, using the county-week combination as a random effect to account for repeated sampling. Two-way interactions were included, and all effects were considered significant at the α = 0.05 level. The 95% confidence intervals around MIR100 and MIRp were calculated using the binGroup package27, and the difference in the confidence interval range was calculated as range100 − rangep. The impact of trap density and MIR100 on the difference in the confidence interval range was analyzed using mixed linear regression modeling, as described above. All figures were created using the ggplot2 package30.

Data availability

All data and analysis scripts are available at


  1. Centers for Disease Control and Prevention. West Nile Virus-Symptoms, Diagnosis and treatment. Available at, (2018).

  2. Illinois Department of Public Health. West Nile Virus (WNV). Available at, (2018).

  3. Colpitts, T. M., Conway, M. J., Montgomery, R. R. & Fikrig, E. West Nile Virus: biology, transmission, and human infection. Clin. Microbiol. Rev. 25, 635–648 (2012).

  4. Dorfman, R. The Detection of Defective Members of Large Populations. Ann. Math. Stat. 14, 436–440 (1943).

  5. Beck-Johnson, L. M. et al. The importance of temperature fluctuations in understanding mosquito population dynamics and malaria risk. R. Soc. open Sci. 4, 160969 (2017).

  6. Epstein, P. R. Climate change and emerging infectious diseases. Microbes Infect. 3, 747–754 (2001).

  7. Deichmeister, J. M. & Telang, A. Abundance of West Nile virus mosquito vectors in relation to climate and landscape variables. J Vector Ecol. 36, 75–85 (2010).

  8. Keating, J. et al. Spatial and temporal heterogeneity of Anopheles mosquitoes and Plasmodium falciparum transmission along the Kenyan coast. Am. J. Trop. Med. Hyg. 68, 357–365 (2003).

  9. Gardner, A. M. et al. Terrestrial vegetation and aquatic chemistry influence larval mosquito abundance in catch basins, Chicago, USA. Parasit. Vectors. 6, 9 (2013).

  10. Chaves, L. F. et al. Climatic variability and landscape heterogeneity impact urban mosquito diversity and vector abundance and infection. Ecosphere 2, 1–21 (2011).

  11. Gu, W., Lampman, R. & Novak, R. J. Problems in estimating mosquito infection rates using minimum infection rate. J. Med. Entomol. 40, 595–596 (2003).

  12. Centers for Disease Control and Prevention. West Nile Virus in the United States: Guidelines for Surveillance, Prevention, and Control. Available at, (2013).

  13. Bernard, K. A. et al. NY State West Nile Virus Surveillance Team. West Nile virus infection in birds and mosquitoes, New York State, 2000. Emerg. Infect. Dis. 7, 679–685 (2001).

  14. Kulasekera, V. L. et al. West Nile virus infection in mosquitoes, birds, horses, and humans, Staten Island, New York, 2000. Emerg. Infect. Dis. 7(4), 722 (2001).

  15. Hadler, J. et al. West Nile virus surveillance in Connecticut in 2000: an intense epizootic without high risk for severe human disease. Emerg. Infect. Dis. 7(4), 636 (2001).

  16. Wilson, E. C. F. A practical guide to value of information analysis. PharmacoEconomics 33, 105–121 (2015).

  17. Minelli, C. & Baio, G. Value of Information: A Tool to Improve Research Prioritization and Reduce Waste. PLoS Med 12(9), e1001882 (2015).

  18. Claxton, K. & Posnett, J. An economic approach to clinical trial design and research priority-setting. Health economics. 5(6), 513–24 (1996).

  19. Dong, H., Coyle, D. & Buxton, M. Value of information analysis for a new technology: computer-assisted total knee replacement. Int. J. Technol. Assess. Health Care. 23(3), 337–42 (2007).

  20. Mohseninejad, L., van Baal, P. H., van den Berg, M., Buskens, E. & Feenstra, T. Value of information analysis from a societal perspective: a case study in prevention of major depression. Value Health. 16(4), 490–7 (2013).

  21. Bross, I., Anderson, R. L. & Bancroft, T. A. Statistical Theory in Research. I.: Basic Statistical Theory, Statistical Theory in Research. II, Analysis of Experimental Models by Least Squares. Q. Rev. Biol. 29(1), 100–101 (1954).

  22. Katholi, C. R. & Unnasch, T. R. Important experimental parameters for determining infection rates in arthropod vectors using pool screening approaches. Am. J. Trop. Med. Hyg. 74(5), 779–785 (2006).

  23. DeFelice, N. B., Little, E., Campbell, S. R. & Shaman, J. Ensemble forecast of human West Nile virus cases and mosquito infection rates. Nat. Commun. 8, 14592 (2017).

  24. Bustamante, D. M. & Lord, C. C. Sources of error in the estimation of mosquito infection rates used to assess risk of arbovirus transmission. Am. J. Trop. Med. Hyg. 82(6), 1172–1184 (2010).

  25. Gu, W., Unnasch, T. R., Katholi, C. R., Lampman, R. & Novak, R. J. Fundamental issues in mosquito surveillance for arboviral transmission. Trans. Royal Soc. Trop. Med. Hyg. 102(8), 817–822 (2008).

  26. The R Development Core Team. R: A Language and Environment for Statistical Computing. (2009).

  27. Zhang, B., Bilder, C., Biggerstaff, B.J., Schaarschmidt, F. & Hitt, B. binGroup: Evaluation and Experimental Design for Binomial Group Testing (2018).

  28. Delignette-Muller, M. & Dutang, C. fitdistrplus: An R package for fitting distributions. J. Stat. Softw. 64, 1–34 (2015).

  29. Bates, D. et al. Linear mixed-effects models using Eigen and S4 (2014).

  30. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. (Springer-Verlag, 2016).


The authors would like to acknowledge the contribution of Dr. Marilyn O’Hara Ruiz, who made essential contributions to the conception and implementation of this study prior to her death. This publication was supported by Cooperative Agreement #U01 CK000505, funded by the Centers for Disease Control and Prevention. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention or the Department of Health and Human Services.

Author information


R.L.S. conceived the study and conducted the statistical analysis. S.C. conducted the VOI analysis. Both S.C. and R.L.S. interpreted the results and wrote the manuscript.

Corresponding author

Correspondence to R. L. Smith.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

About this article

Cite this article

Chakraborty, S., Smith, R.L. Error associated with estimates of Minimum Infection Rate for Endemic West Nile Virus in areas of low mosquito trap density. Sci Rep 9, 19093 (2019).
