Introduction

A novel avian influenza A (H7N9) virus infecting humans has emerged in mainland China1, causing global concerns about its potential to start an influenza pandemic2,3. Since the National Health and Family Planning Commission of China (NHFPC) announced the emerging infectious disease on March 31, 2013, a total of 131 confirmed cases, including 39 deaths, have been reported in eight provinces and two municipalities as of May 304. Fortunately, thus far there has been no evidence of sustained person-to-person transmission5. However, unlike the highly pathogenic avian influenza A viruses such as H5N1 and H7N7, for which outbreaks in poultry precede human infections and indicate where the public health threat lies6,7, the novel H7N9 virus causes no or only mild disease in birds8,9. This means that the virus is likely to spread silently in birds or other animal reservoirs. Human infections are therefore the sentinel events, and the quick geographical expansion of human cases indicates that a hidden epidemic in birds is well underway3 and that many parts of the country offer a favorable breeding ground for the virus to circulate. So far, however, apart from birds and the contaminated environments at the live poultry markets9,10, the sources of infection remain elusive11. It is unclear how the emerging avian influenza A (H7N9) virus is spreading in China3, and which underlying risk factors are involved in cross-species transmission and how. The lack of such knowledge has made it difficult to refine prevention and control strategies, which so far have relied primarily on health education and closing wet markets in affected areas12. The objectives of this study were to understand the spread dynamics of the novel avian influenza A (H7N9), to identify the agro-ecological, environmental and meteorological factors favoring the occurrence of human infections and thus to devise targeted surveillance and control efforts in both human and animal populations.

Results

A total of 131 cases of avian influenza A (H7N9) virus infection, which had been confirmed by laboratory tests according to NHFPC guidelines13 and reported to the China Information System for Diseases Control and Prevention (CISDCP) since the emergence of the virus on February 19, 2013, were included in the analyses. Each reported case was geo-referenced and linked to a digital map of China (www.geodata.cn) according to its onset location using Geographic Information System (GIS) technologies. A thematic map was then created by displaying the cumulative number of cases in each affected county (Fig. 1), rather than merely providing a crude sketch of the epidemic situation in affected provinces5,14. The map shows that although most (107 of 131) cases occurred in Shanghai Municipality, Jiangsu Province and Zhejiang Province around the Yangtze River delta, where the outbreak was initially recognized, the remaining cases were sporadically distributed across the adjacent provinces, even extending northward to Beijing, with a total of 68 counties affected (Fig. 1). The patient detected in Taiwan on April 23 was an imported case who was infected and developed symptoms in Suzhou City, Jiangsu Province. Hence, this case was indicated on the map as an imported case and was mapped in Suzhou City according to the patient's symptom onset location. An asymptomatic human case from Beijing was displayed on the map (Fig. 1), but not included in any analysis.

Figure 1

The thematic map of the avian influenza A (H7N9) epidemic in China.

Distribution of cumulative numbers of cases with reported onset dates from February 19 to May 30, 2013 in the eight provinces and two municipalities. Colored gradients reflect the number of cases at county level. An asymptomatic infection reported in Beijing and an imported case reported in Taiwan are also shown on the map. The map was created in ArcGIS 9.2 software (ESRI Inc., Redlands, CA, USA).

To illustrate the spreading dynamics of human infection with avian influenza A (H7N9) virus, distinct epidemic curves for the three major endemic provinces and other provinces were plotted (Fig. 2). About a week after the closure of live poultry markets, no new cases occurred in Shanghai Municipality and Zhejiang Province. Although new cases ceased for a few days, the disease reappeared in a new area of Jiangsu Province. The majority of cases in Shanghai Municipality and Jiangsu Province were living in urban areas, whereas more than half of the cases in Zhejiang Province and other provinces were living in rural areas. We also noticed that, after the closure of live poultry markets, more cases occurred in rural areas and were more scattered (Fig. 2).

Figure 2

Epidemic curves of avian influenza A (H7N9) for selected provinces or municipality and the whole country.

The red and black bars represent the daily number of urban and rural cases according to their onset date, respectively. The date of closure of live bird markets is marked by the vertical dashed line. (A) Shanghai City; (B) Jiangsu Province; (C) Zhejiang Province; (D) All other provinces; (E) The whole country.

To explore potential breeding grounds for the novel H7N9 virus to circulate, we overlaid confirmed human cases onto maps displaying the distribution of live poultry markets as well as the population densities of poultry, pigs and humans throughout mainland China (Fig. S1 in the Supplementary Information). Using buffer zone analysis, we counted a total of 1838 live poultry markets and estimated about 248 million poultry, 25 million pigs and 179 million persons within a 50-kilometre radius of the 131 human cases that had been reported as of May 30, 2013.

We then integrated data on other potentially influential factors including poultry trade routes, water bodies, wetlands, land use and meteorological conditions (Table S1) and examined the contribution of each variable to the occurrence of human H7N9 infection using a boosted regression tree (BRT) model. Each variable's importance was assessed by its estimated weight; the weights of all variables of a BRT model sum to 1. The occurrence of human infection with avian influenza A (H7N9) was found to be significantly associated with the number of ambient live poultry markets, human population density, coverage of irrigated croplands and built-up lands, relative humidity and temperature, all with BRT weights > 5.0 (Table 1, Fig. S1–S2 in the Supplementary Information). The number of live poultry markets in each county was the most important variable for predicting the risk of the disease in the model, followed by temperature, percentage coverage of irrigated croplands and built-up lands, relative humidity and human population density. The risk functions plotted according to the BRT model showed that the occurrence of human infection increased with the number of live poultry markets, population density, percentage coverage of irrigated croplands and built-up land, and relative humidity (Fig. 3A–E). The risk of human infection was low when the temperature was below 12°C, increased as the temperature rose, peaked around 15°C and then dropped thereafter (Fig. 3F).

Table 1 Results of the boosted regression trees applied to the occurrence of human infection with avian influenza A (H7N9) virus data
Figure 3

Relationship between risk factors and avian influenza A (H7N9) risk.

The avian influenza A (H7N9) risk based on the BRT model is plotted as a function of (A) number of live poultry markets (LPM), (B) population density (PD_3), (C) percentage coverage of irrigated croplands (IC), (D) percentage coverage of built-up lands (BU), (E) relative humidity (RH) and (F) temperature (TP). The curves are the average predicted lines for the six variables above over 50 repeats of the bootstrapping procedure.

To map the probability of occurrence or re-emergence of human H7N9 infection, we conducted a bootstrapping procedure for the BRT model to generate robust estimates of predicted probability. The map predicting the risk of occurrence of avian influenza A (H7N9) was created based on the estimated probability (Fig. 4A). The map indicated that the highest-risk areas covered a wide range of eastern China extending from the Yangtze River delta, and that cases might occur even in the far northwest, in Xinjiang Autonomous Region. To assess discriminatory ability, the receiver-operating characteristic (ROC) curve was produced for the BRT model and the area under the curve (AUC) was calculated (Fig. 4B). The estimated AUC value of 0.974 (95% CI 0.963–0.986) indicated an excellent prediction of the risk of emergence of human infections with avian influenza A (H7N9) virus. As expected, the AUC estimated from the training data (0.984, 95% CI 0.975–0.991) was higher than that estimated using the test dataset (0.932, 95% CI 0.905–0.959). Goodness of fit of the BRT models was also evaluated using the Hosmer-Lemeshow test, showing decent risk discrimination between counties with H7N9 cases and “control” areas (median χ² = 12.36, P-value = 0.14). To avoid overconfidence in the model's ability to discriminate actual areas of increased risk, which is possible considering the enormous heterogeneity in all variables across the whole country, we also performed a sensitivity analysis using importance sampling, with all “controls” restricted to counties in the 10 provinces with H7N9 cases. The resulting ROC curve and AUC (estimated AUC value of 0.952, 95% CI 0.932–0.972) were similar to those of our current model, which samples the whole country (Fig. S3).
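As an illustration of the AUC reported above, the following minimal sketch computes it as the probability that a randomly chosen positive county receives a higher predicted risk than a randomly chosen negative county, with ties counted as one half. The function name, labels and scores are hypothetical and are not taken from the study data or its software.

```python
def auc_from_scores(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    labels: 1 for counties with cases, 0 for control counties.
    scores: predicted probabilities from any classifier.
    Tied scores contribute 0.5 to the count.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted risks, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = auc_from_scores(labels, scores)
print(round(auc, 3))
```

In a bootstrap setting, this quantity would be computed once per repeat on the held-out test set and then summarized across repeats.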

Figure 4

The predicted risk map of occurrence of human H7N9 infection in mainland China and receiver operating characteristic (ROC) curves of the predicted risk.

(A) Predicted risk of occurrence of human H7N9 infection is displayed by different color grades, according to BRT modeling. The map was created in ArcGIS 9.2 software (ESRI Inc., Redlands, CA, USA). (B) ROC curves for BRT models: the grey lines are the ROC curves for each repeat, and the solid, dashed and dotted lines indicate the average ROC curves of 50 repeats of the bootstrapping procedure for the training set, test set and prediction, respectively.

Discussion

The outbreak of human infections with the novel avian influenza A (H7N9) virus has lasted for over three months since the virus emerged in mainland China. Fortunately, thus far there has been no sustained human-to-human transmission5, although the virus has genetic characteristics that suggest it could effectively replicate in mammals1,15. Currently, H7N9 virus infection is primarily zoonotic. In this study, we used GIS-based spatial analysis to map the spatial distribution of human infections with H7N9 virus. The thematic map displaying the distribution of human cases indicates that although most human cases were concentrated in the Yangtze River delta on China's eastern seaboard, sporadic cases were distributed across large areas of adjacent provinces, even spreading northward to Beijing and being exported to Taiwan (Fig. 1). As recommended by the international H7N9 assessment team convened by the World Health Organization12, it is critical to continue and strengthen both epidemiological and laboratory-based surveillance in humans and animals in all provinces of China in order to identify changes in the virus that would allow it to infect people more easily.

From examining the epidemic curves (Fig. 2), we detected a lull following the closure of live poultry markets in Shanghai, Jiangsu and Zhejiang, during which human cases greatly decreased or disappeared. This observation supports the leading hypothesis that live poultry markets and environments contaminated by H7N9 virus are the most likely sources of human infection9. However, according to data from the Ministry of Agriculture, tens of thousands of birds and other animals on farms in many places in China have all tested negative10. The exposure sources of human H7N9 infections are often not clearly identified, especially in rural areas. The fact that human infections have quickly expanded over a large geographical range suggests that a hidden epidemic in animals is well underway and that the countryside may offer favorable conditions for the novel virus to circulate further. As estimated in this study, the regions where H7N9 is now circulating have large populations of poultry, pigs and humans. This may provide good opportunities for the H7N9 virus to adapt to mammals and to reassort with other endemic human- or pig-adapted influenza viruses, as previously identified for other avian influenza viruses16.

It seems likely that animal husbandry practices, poultry trading, agro-ecological systems, land use and meteorological factors (all of which are considered to have an impact on the distribution of zoonoses) could have contributed to the spread, and hence they were included in our analyses. We used a BRT model to determine the contribution of each variable, estimate the discriminatory ability of the models and map the areas most at risk of avian influenza A (H7N9)17,18. The modeling approach gave excellent predictions of the risk of avian influenza A (H7N9), with an estimated AUC value of 0.974. The model revealed that the existence of live poultry markets, human population density, coverage of irrigated lands and built-up land, high humidity and an atmospheric temperature around 15°C were predictive factors for the risk of avian influenza A (H7N9) (Table 1).

Our findings indicate that high human population density and a high proportion of built-up land are associated with a high risk of H7N9 infections. Our interpretation is that built-up areas with high population density are usually home to poultry-related trading or farming, which may promote transmission of the pathogen among animal reservoirs and increase the chance of humans acquiring H7N9 infection. Another possibility is that patients are more likely to be detected in places with more people and better medical facilities. Along with the asymptomatic case reported in Beijing and several mild cases already reported through active surveillance, the data suggest that the virus might have been more widespread among humans than the number of reported cases indicates. In addition, agricultural and cultural practices in mainland China put people and domestic animals in close proximity to one another19, not only increasing the risk of animal-to-person transmission but also likely accelerating the virus's adaptation to humans.

The association between the occurrence of human cases and the proximity of irrigated lands indicates that waterfowl such as ducks and geese may play a role in the transmission of H7N9, as they do for other avian influenza A viruses. Although waterfowl infections with H7N9 virus have not been reported so far, waterfowl are well known as asymptomatic reservoirs for H5N1 virus20 and should be susceptible to H7N9 infection according to virology studies21. In that case, waterfowl might shed the virus through their salivary and nasal secretions and feces into the irrigated lands that they inhabit. Domestic ducks are known to be able to excrete large amounts of virus in their breeding places22. Birds and other animals may also transport the virus on their feathers or fur to a water source after coming into contact with an infected animal or contaminated surface on a farm23. Avian influenza virus has been cultured from water for up to 100 days24. People may be infected without noticing through exposure to contaminated surfaces or infected birds.

In contrast to the transmission of 2009 pandemic influenza A (H1N1)25, BRT modeling showed that higher relative humidity was associated with a higher risk of avian influenza A (H7N9). This observation somewhat contradicts previous animal experiments on seasonal influenza viruses26. The estimated effect of temperature is consistent with the fact that the number of H7N9 cases dropped and disappeared as the temperature rose in the summer, as has occurred with many other influenza viruses.

It should be clarified that the BRT model was used to predict the probability of re-emergence or increased occurrence of human H7N9 infection in the coming months rather than to make causal inferences. All the identified variables are merely good predictors of H7N9 infections, while causality could be tested with future experimental, field or analytical work. The BRT analysis, which used 75% of the data to train the model and the remaining 25% to make predictions, might suffer from over-fitting, because the data from a single outbreak were limited in space and time. Validating this model would require true out-of-sample data, i.e. data from another outbreak, which unfortunately do not exist for this emerging infectious disease. In addition, it is likely that many mild cases have already occurred but were not detected27. Because it does not account for the incomplete coverage of the sentinel surveillance network or for the possibility that not all patients with influenza-like illness will seek medical care, our model, which focuses only on laboratory-confirmed H7N9 cases, probably underestimates the risk of the disease in mainland China. In conclusion, mapping offers a valuable approach to describing the spread of and risk factors for avian influenza A (H7N9). The predictive risk map of human H7N9 infections established for mainland China on the basis of modeling could be useful for identifying the areas where surveillance efforts and preventive interventions should be targeted. As more is learned about H7N9 infections in birds and other animals, such models could be improved and provide more valuable scientific support for decision-making on how to control the emerging virus.

Methods

Data regarding reported human cases of avian influenza A (H7N9) from February 19 (onset date of the first case) to May 30, 2013 were collected from the China Information System for Diseases Control and Prevention (CISDCP) and official reports by NHFPC4. A confirmed case was defined according to previously published NHFPC guidelines28, i.e. a person with influenza-like symptoms or evidence of pneumonia and one or more positive tests for H7N9 infection5. Influenza-like symptoms were defined according to the World Health Organization criteria29. As has been done for previous influenza outbreaks23,25, each reported case was geo-referenced and linked to a digital map of China (www.geodata.cn) according to its symptom onset location using GIS technologies.

Data concerning agro-ecological, environmental and meteorological factors were collected for the model. The map of locations of live poultry markets in mainland China in 2012 was obtained from AutoNavi, a Chinese location-based service (www.autonavi.com). Raster data with 3′ (about 5 km × 5 km) resolution regarding the density of poultry and pigs were derived from the Food and Agriculture Organization of the United Nations30. The population of each county was obtained from the National Bureau of Statistics of China31, from which population densities were computed based on the area of each county. Data on live poultry transportation routes, such as freeways and national highways, and data regarding water bodies (lakes, reservoirs and rivers) were directly derived from the digital map of China (www.geodata.cn). In addition to the above-mentioned water bodies, data on wetlands were obtained from the State Key Laboratory of Remote Sensing Science in Beijing, China32. Land use data for 2009 with 300 m × 300 m resolution were derived from the Global Land Cover Facility Data Products and Satellite Imagery (http://due.esrin.esa.int/globcover). Data on meteorological variables, including daily temperature and relative humidity during the study period, were obtained from the Chinese Academy of Meteorological Sciences (www.cams.cma.gov.cn). In this study, 15 explanatory variables possibly contributing to the occurrence of human H7N9 infection were calculated for each county by overlaying these maps of agro-ecological, environmental and meteorological factors on the map of counties in China and by using spatial analytic approaches in the ArcToolbox in ArcGIS 9.2 software (ESRI Inc., Redlands, CA, USA). Details on these variables and their estimates in the modeling analysis are defined and described in Supporting Information Table S1.

A thematic map of cumulative case number in affected counties was produced in ArcGIS 9.2 software (ESRI Inc., Redlands, CA, USA) to characterize the spatial distribution of the avian influenza A (H7N9) outbreak during the study period. Epidemic curves (daily numbers of cases over time) were plotted to describe the temporal dynamics of the zoonosis. We calculated the numbers of live poultry markets, domestic birds, pigs and humans within a 50-kilometre radius of each human case by buffer zone analysis.
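The buffer zone analysis described above can be sketched as follows. This is not the ArcGIS workflow used in the study: it is a minimal stand-in that substitutes great-circle (haversine) distances for projected buffers, and it assumes that overlapping buffers are merged so that each feature is counted once. The function names and all coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_within_buffers(cases, features, radius_km=50.0):
    """Count features lying within radius_km of at least one case.

    A feature covered by several overlapping case buffers is counted
    once (assumption: buffers are merged before counting).
    """
    return sum(
        1
        for f_lat, f_lon in features
        if any(haversine_km(c_lat, c_lon, f_lat, f_lon) <= radius_km
               for c_lat, c_lon in cases)
    )


# Hypothetical (lat, lon) points for illustration only, not study data.
cases = [(31.23, 121.47), (30.27, 120.15)]
markets = [(31.30, 121.50), (30.30, 120.20), (39.90, 116.40)]
n_markets = count_within_buffers(cases, markets)
print(n_markets)
```

For raster layers such as poultry or pig density, the same idea would apply with cell values summed over the cells whose centres fall inside the merged buffers.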

A BRT model was built at the county level in this study to identify risk factors associated with the occurrence of human H7N9 infections and to predict the high-risk areas. All 68 counties with reported avian influenza A (H7N9) cases were considered as the positives and the negatives were selected randomly from all 2854 counties without reported human infections. BRT modeling is efficient for predicting distributions of organisms while accounting for non-linear relationships and interactions between covariates17,18 and it has proved successful for mapping the distribution of HPAI H5N1 risk33. For the BRT model, a bootstrapping procedure was employed to provide a robust estimation of model parameters. A tree complexity of 4, a learning rate of 0.005 and a bag fraction of 75% were used to identify the optimal tree for each bootstrap dataset. The weight of each variable was estimated from the identified trees and served as an indicator of each variable's importance for predicting H7N9 presence/absence. One should note that these weights are not absolute metrics and the weights of all variables of a BRT model sum to 1. In the bootstrapping procedure, the following sequential steps were repeated 50 times. Firstly, 340 counties were randomly selected without replacement from all 2854 counties without H7N9 cases throughout mainland China and were then combined with the 68 counties with H7N9 cases to form a balanced bootstrap dataset (1-to-5 case-to-control ratio). Secondly, a training dataset with 75% of the points and a test dataset with 25% of the points were randomly selected from the current bootstrap dataset for building and validating the model, respectively. Thirdly, a BRT model was built using the training set and validated using the test set; its discriminatory ability was assessed using ROC curves and areas under the curve (AUC), and goodness of fit of the model was evaluated using the Hosmer-Lemeshow test34. Finally, a risk map predicted by the model was created. The mean value and standard deviation of parameter estimates over the 50 resampled datasets were calculated and the risk map in relation to the presence of avian influenza A (H7N9) was created based on the average predicted probabilities33. In addition, we performed a sensitivity analysis for the BRT models using importance sampling, with all “controls” restricted to counties in the 10 provinces with H7N9 cases.
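The resampling steps of the bootstrapping procedure can be sketched as below. The sketch covers only the construction of each bootstrap dataset and its 75%/25% split; model fitting is omitted. It assumes a simple (non-stratified) random split, which the text does not specify, and the county identifiers, function name and random seed are hypothetical.

```python
import random


def make_bootstrap_split(case_ids, control_ids, n_controls=340,
                         train_frac=0.75, rng=None):
    """Build one bootstrap dataset and split it into training and test sets.

    Controls are drawn without replacement and combined with all cases;
    the combined records are shuffled and split by train_frac.
    Each record is a (county_id, label) pair with label 1 = case county.
    """
    rng = rng or random.Random()
    sampled = rng.sample(control_ids, n_controls)  # without replacement
    data = [(cid, 1) for cid in case_ids] + [(cid, 0) for cid in sampled]
    rng.shuffle(data)
    n_train = round(train_frac * len(data))
    return data[:n_train], data[n_train:]


# Hypothetical county identifiers; only the counts come from the text.
case_ids = [f"case_{i}" for i in range(68)]
control_ids = [f"ctrl_{i}" for i in range(2854)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
splits = [make_bootstrap_split(case_ids, control_ids, rng=rng)
          for _ in range(50)]

train, test = splits[0]
print(len(train), len(test))
```

With 68 case counties and 340 sampled controls, each repeat yields 408 records, of which 306 go to training and 102 to testing. A model would then be fitted on each training set and scored on the matching test set, and the predicted probabilities averaged over the 50 repeats.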