Avian Influenza (AI) is an infectious viral disease of domestic and wild birds. Depending on the strain and the severity of the clinical symptoms in naïve chickens, AI is pathotyped into low pathogenic avian influenza (LPAI) or highly pathogenic avian influenza (HPAI)1. Mild symptoms are observed in poultry infected with LPAI, and wild birds are usually asymptomatic. HPAI, however, is a re-emerging, highly contagious and economically devastating viral infection with severe socio-economic consequences for the poultry industry2. HPAI is also considered an important public health concern due to the potential contribution of AI viruses (AIV) to the emergence of human influenza pandemics3. So far, subtypes H5 and H7 of the virus are recognized to cause HPAI, but not all H5 and H7 serotypes are virulent4. AIV undergo frequent genetic re-assortment, and LPAI may mutate to HPAI5. Little is known about the origin of the LPAI-to-HPAI mutation, but it has been linked to the introduction of LPAI viruses from wild birds into poultry farms6. The more LPAI viruses circulate and replicate in poultry-dense areas, the higher the risk of mutation to HPAI viruses6. LPAI viruses circulate in wild bird populations, primarily in waterfowl and migratory water birds belonging to the orders Anseriformes and Charadriiformes, which are believed to be the major natural LPAI virus reservoirs7. In contrast, the main source of HPAI transmission, particularly in Asia and Europe, has been associated with the trade of live poultry and poultry products and with bird smuggling8. There have been rare reports of HPAI circulating in wild birds, and the prevalence and pathogenicity in those cases have been shown to vary widely depending on the infected species9.

Historically, the US has been affected by several HPAI epidemics; the most important ones occurred in 1924, 1983, 2004 and, most recently, in 2014–201510,11. There were no reports of significant human illness resulting from any of these outbreaks12. The 1924 H7 outbreak involved East Coast live bird markets13,14 but was successfully controlled and eradicated at an estimated cost of $13 million15. The 1983–84 H5N2 outbreak resulted in the destruction of approximately 17 million chickens, turkeys and guinea fowl in the northeastern US to contain and eradicate the disease16. In 2004, the USDA confirmed an H5N2 outbreak in chickens in the southern US17,18. The outbreak was quickly controlled thanks to close coordination and cooperation between the USDA and state and industry leaders. The recent 2014–2015 HPAI epidemic affected over 21 states, where 242 poultry farms (221 commercial, 21 backyard) and around 100 wild birds were confirmed to be HPAI positive19,20. The estimated costs to control the outbreak exceeded $950 million19. Although the main sources of disease introduction and spread in the 2014–2015 epidemic are still under investigation, genetic analysis revealed the presence of different HPAI subtypes, including H5N8, H5N2 and a novel H5N1, which has been identified as the likely cause of the recent epidemic19. This new H5N1 strain is distinct from those found in other parts of the world. USDA investigations suggest that this new H5N1 is the result of a re-assortment between Asian H5N8 HPAI strains and North American LPAI viruses that probably occurred during the summer Arctic migration, when Asian and Alaskan migratory birds intermingle.

The first confirmed case of the US 2014–2015 epidemic occurred in December 2014 at a small-scale backyard operation in Oregon consisting of approximately 130 birds21. The first wild bird cases were detected in Washington state in December 201422. Several outbreaks were subsequently reported in wild bird populations as well as in multiple backyard and commercial poultry flocks on the West Coast. In April 2015, cases were observed in the Midwest, in Wisconsin, where a State of Emergency was declared in response to the rapid spread of the AI viruses23. The states of Iowa, Minnesota and Wisconsin were the most affected, representing 90% of the total number of birds infected across the US. Overall, over 232 outbreaks occurred during 2014–2015, affecting almost 50 million birds24. Preliminary studies show that the recent HPAI outbreaks are linked to the North American Migratory Flyways (NAMF)25,26. These flyways are migratory paths used by bird species moving between Canada, Latin America or Asia in search of water and food. Interestingly, some AIV strains are found only in specific flyways27. Flyways have specific climatic and environmental characteristics that may impact the AI epidemiological cycle in these areas. A better understanding of the factors contributing to LPAI presence and to the occurrence of HPAI outbreaks in the different flyway sub-regions, as well as the identification of areas at highest risk of AIV transmission, especially at the wild-domestic interface, is needed. This knowledge could be used to develop and implement solutions for improved prevention and rapid control of future HPAI epidemics in the US.

Species Distribution Modeling (SDM) has proven effective for identifying the most important predictors contributing to HPAI outbreaks and for determining likely areas of HPAI occurrence in previous studies in Japan and the Middle East28,29. Moreover, areas combining high poultry density with suitable conditions for LPAI may be at higher risk of mutation from LPAI to HPAI. In this study we aim to demonstrate the value of using LPAI suitability maps based on wild bird LPAI surveillance, in addition to climatic, environmental and poultry demographic factors, to identify areas at high risk of HPAI outbreak occurrence. A “presence-only” SDM30 was used to generate four suitability maps for LPAI, one for each of the four NAMF: Atlantic, Mississippi, Central and Pacific. These four sub-regions were chosen because (1) they capture the natural migratory bird flyways (i.e., the routes birds follow to migrate between nesting and wintering areas); (2) they correspond to established, non-overlapping administrative areas31 with independent Councils (i.e., representatives from each state and territorial agency and technical committees within each flyway) used to facilitate the management of migratory birds and their habitats; and, therefore, (3) we believe they offer the best regionalization both for capturing the distinct eco-epidemiological characteristics of AI and for facilitating decision making and the implementation of potential risk-reduction activities or policies. Results of this study will provide further insights into the epidemiology of AIV in the US and will inform the design of more cost-effective, risk-based surveillance programs specific to each NAMF sub-region for better prevention and control of future HPAI epidemics in the US.


The LPAI dataset used in this study consisted of 7,714 positive samples, distributed across the four flyway sub-regions as follows: Pacific (37.9%), Central (9.47%), Mississippi (38.96%) and Atlantic (13.67%) (Fig. 1).

Figure 1

Distribution of the LPAI cases in the United States.

The different colors represent the geographic areas covered by the four NAMF according to the U.S. Fish & Wildlife Service31. The black dots represent the spatial distribution of LPAI cases reported by the Influenza Research Database from Jan 2005 to Feb 2015. Maps were created using ArcMap version 10.3 (Environmental Systems Research Institute,

Spatial distribution of poultry demographics is shown in Fig. 2. Backyard chicken farms were well spread across the country with a higher density in the East Coast region; the Midwest and the Rocky Mountain area have the highest density (Fig. 2E). Poultry farms are abundant in the Atlantic and the Mississippi flyway, Texas and Oklahoma in the Central flyway and the coastal region of the Pacific flyway (Fig. 2D). Poultry density is high in the Atlantic and the Mississippi flyway (Fig. 2B). Broiler farms are located in the South Atlantic in the Atlantic flyway and East South Central in the Mississippi flyway (Fig. 2C).

Figure 2

Spatial distribution of the LPAI outbreaks in wild birds from Jan 2005 to Feb 2015 and HPAI 2014–2015 outbreaks in the US.

(A) LPAI samples (green points) and centroids of the HPAI 2014–2015 outbreaks (red points); (B) poultry density; (C) broiler farm density; (D) poultry farm density; (E) backyard farm density. The thick black boundaries represent the limits of each of the four North American Migratory Flyways. In maps (B–E), the color gradient of each pixel represents the density gradient, from light blue shading (low density) to bright blue shading (high density). Maps were created using ArcMap version 10.3 (ESRI, Environmental Systems Research Institute,

Important predictors and high suitable areas for LPAI

Five environmental variables were necessary to determine suitability for LPAI outbreaks in the Pacific flyway with an AUCc of 0.98: altitude, NDVI, mean temperature of the warmest quarter, land cover and backyard chicken farm density (Table 1). Areas of high suitability in the Pacific flyway were the coastal areas in the Pacific North West, the Sacramento Valley and Southern California, the central region in Alaska and the Mid-Atlantic (Fig. 3A). For the Central model, four environmental variables provided a model with an AUCc of 0.94: merged migratory bird abundance, altitude, backyard chicken density and land cover (Table 1). Areas with highest suitability are the eastern part of the coastal plain in Texas and the east areas of South and North Dakota (Fig. 3A). NDVI, land cover, distance to water surfaces and altitude were the variables contributing the most for the Mississippi flyway resulting in a model with an AUCc of 0.88 (Table 1). This model estimated that the North Central States of the Midwest (i.e. Minnesota, Iowa, Michigan, Illinois, Indiana and Ohio) are highly suitable for LPAI (Fig. 3A). For the Atlantic flyway, mean temperature of the warmest quarter, distance to water surfaces, merged migratory bird abundance, backyard chicken farm density and land cover were the most important predictors resulting in a model with an AUCc of 0.96 (Table 1). LPAI suitable areas in the East coast were mostly around the Delaware Bay (Fig. 3A). Overall, very low correlation was observed between variables in each reduced model as shown in the Spearman correlation plots (see Supplementary Figure 1). Response curves also suggested important differences in the ranges in which those variables are important (see Supplementary Figure 2). For example, in the Pacific Flyway, highest relative probability of LPAI occurrence lies between 0 and 1500 meters above sea level, but above 1500 m the relative probability of LPAI occurrence is negligible. 
Those response curves also reflect a contrast in the range of values that are important when comparing different flyways. For example, type of land cover with the highest relative probability of LPAI presence differs between the four flyways.

Table 1 Percent relative contributions of the selected environmental variables to the MaxEnt models.
Figure 3

Validation of the merged migratory flyways suitability map using the HPAI 2014–2015 outbreak data. The green points represent the centroids of the 2014–2015 HPAI outbreaks. The color gradient of each pixel represents the LPAI presence probability at the county level, from light red shading (low presence probability) to bright red shading (high presence probability). The map was created using both RStudio (RStudio Team, 2015) and ArcMap version 10.3 (ESRI, Environmental Systems Research Institute,

Models’ diagnostics, validation and HPAI prediction ability

Model performance was considered good based on the ROC curves. Jackknife tests on training data, used for model selection, and response curves, used to characterize the areas of highest and lowest relative probability of presence, are shown in Supplementary Figure 2 and Supplementary Figure 3. Up to 89% (n = 278/312) of the 2014–2015 HPAI cases were located in counties classified as “highly suitable” in the merged map of the four flyway models.


Results suggest that LPAI surveillance data, together with wild and domestic bird demographics and climatic and environmental factors, can be used to accurately detect suitable areas for LPAI presence (AUCc from 0.88 to 0.98). Moreover, 89% of the HPAI outbreaks reported during 2014–2015 were located in counties highly suitable for LPAI. These results reinforce the hypothesis that the 2014–2015 HPAI outbreaks may have been associated with LPAI-to-HPAI mutation and transmission at the wild-domestic interface in high bird density areas where high levels of AI viral circulation and replication were occurring. Suitable areas for LPAI presence were concentrated primarily in the North Central region (Minnesota, Iowa, Michigan, Illinois, Indiana and Ohio). Although a previous study in the US reported these states to be at high risk32, this study highlights several suitable areas that have not been previously identified (e.g., the Coastal Plains of Texas and the Sacramento Valley in California) and that may be under-sampled and under-represented in current surveillance programs. Moreover, the SDM models provide higher resolution and granularity than previous studies, which were conducted at the county level. Splitting the US into four distinct ecologic and geographic areas following the NAMF is an effective way to produce models that capture the particularities of AI epidemiology in those specific regions. In addition, the characterization of different risk factors in each flyway may facilitate the implementation of specific preventive and risk-mitigation strategies by each of the individual flyway Councils. The bird density rasters generated for this study, particularly the backyard poultry farm density raster, provide a novel way to estimate bird populations. Estimates of backyard chicken farm density in California in this study are consistent with surveys conducted by extension specialists to estimate the census of California backyard poultry33.
The authors believe the backyard bird density raster will benefit future poultry production studies.

Overall, ecological and topographic variables were the main predictors in all of the models presented. More specifically, land cover, altitude, distance to water surfaces and NDVI were the factors with the highest contribution to LPAI suitability. Results indicate a higher LPAI suitability in low-altitude areas, cultivated croplands and woody wetlands with a mean temperature of the warmest quarter between 10 and 25 °C. This supports findings of similar studies that considered locations with these conditions as optimal habitats for the aggregation of migratory waterfowl and thus at higher risk of AIV transmission between different bird species28,34. Moreover, humid areas and colder temperatures have been described as favorable for viral persistence in the environment35. Land cover was an important contributing factor in all models. The presence of cultivated croplands and woody wetlands seems to offer the most suitable environment for LPAI. In Alaska, areas with Dwarf scrub and Shrub seem to be correlated with LPAI presence36. Low altitude was consistently associated with AI outbreaks in three of the four models (Pacific, Central and Mississippi). This reinforces results from previous studies performed in different countries that showed an inverse relationship between altitude and AIV outbreaks34,35,37. Distance to water contributed to the Mississippi and Atlantic models. Areas close to inland water in the Mississippi region tend to be more suitable for LPAI presence. The Mississippi River basin and associated wetlands have previously been identified as major hotspots for HPAI outbreaks32,38. These areas contain many shallow bodies of water, which have been shown to play an important role in the rapid transmission of AIV39. The Northeast of the US, specifically the areas around the Delaware Bay and the Delaware River, seems to be highly suitable for LPAI.
The Delaware Bay is on the Ramsar List of Wetlands of International Importance40, as it is highly frequented by shorebirds and waterfowl. Those areas were also described as “hotspots” for AI41. Areas with moderate NDVI are suitable for LPAI presence in the Mississippi region, and areas with low to moderate NDVI seem suitable for LPAI presence in the Pacific region. Low to moderate NDVI values are often associated with areas where herbaceous plants and grasses are abundant. This type of vegetation represents an optimal food source for some waterfowl. Higher NDVI values are often associated with larger plants and trees34. Another predictor associated with LPAI suitability was the density of backyard chicken farms in the US42, particularly in the Pacific, Central and Atlantic models (4.7%, 17.1% and 13.1% contribution, respectively). This agrees with previous studies in Asia, the Middle East and Africa that indicate a direct association between a high density of backyard chickens and the occurrence of AI outbreaks43,44,45. A better characterization of the spatial distribution and biosecurity of backyard poultry farms would allow refinement of current suitability maps and better identification of the specific areas and time periods where AI outbreaks are more likely to occur.

The major challenge faced during this study was obtaining good-quality data for the hypothesized important predictors. In particular, the incorporation of more accurate poultry farm location data instead of poultry farm density estimates would improve the ability of the models to identify areas where LPAI transmission from wild birds to poultry, or from poultry to wild birds, is most likely to occur. Suitability map precision could also be improved with better LPAI and HPAI presence data. Even though certified laboratories collect LPAI data, sampling bias is likely to be present, as surveillance efforts are not uniform across all US states (i.e., regions with more wildlife services stations and higher poultry production may tend to have more intensive surveillance programs and collect higher numbers of wild bird samples). Furthermore, presence-only models relying on pseudo-absence data were used to correct for the fact that negative samples tend to be under-reported by diagnostic laboratories. Positive cases were readily available and are likely to be more complete than negative results. However, adding unbiased negative results to the models could potentially improve the models’ predictions (i.e., using presence-absence MaxEnt). Model validation is restricted by the spatial scale of the HPAI outbreak data available (i.e., data were only available at the county level). Exact HPAI outbreak locations would not only allow a more accurate validation of the LPAI suitability maps but would also be useful for directly generating models using HPAI instead of LPAI data. Similarly, the lack of census data on backyard poultry in the US led us to generate an estimate based on the population census and socio-economic characteristics.
The authors believe that using more accurate data on poultry demographics, particularly backyard poultry farms, will increase the predictive ability of the models and help identify priority areas where outreach and communication strategies should be conducted for backyard producers. Also, the reduced set of wild bird species selected in this study was based on the LPAI prevalence from the Influenza Research Database, which is likely biased and thus may lead to underestimation of LPAI suitability in some areas. Furthermore, future studies should consider incorporating other sources of information on migratory bird abundance. The North American Breeding Bird Survey is likely to underrepresent species such as breeding waterfowl, which require a different type of sampling. While this dataset might be useful for investigating trends in a proportion of the population over time, it may not be a good indicator of abundance for specific water birds. This could explain why migratory bird density was not as important a contributing factor as the authors expected. Future studies should aim to include species-specific wild bird density from sources other than the Breeding Bird Survey and to evaluate the relative contribution of each species for each of the migratory flyway models and for different seasons and time periods.

To our knowledge, this is the first study applying SDM to generate high-resolution AI suitability maps for the US. These results can be used to better prevent and control future HPAI outbreaks in the US, and the approach could easily be extended to other regions in North America. The identification of the main predictors and highly suitable areas can be used to implement risk-based, more cost-effective surveillance strategies and to increase awareness among poultry producers located in highly suitable zones. The results of this study have been included in a public database within the Disease BioPortal46 to allow the visualization of area-specific predictions and the integration of periodic updates, and to facilitate decision making to minimize the impact of future HPAI epidemics in the US.

Material and Methods

Wild bird LPAI presence-only data

Wild bird LPAI positive data was retrieved from the Influenza Research Database (FluDB)47. Briefly, FluDB is an influenza database consisting of georeferenced collection coordinates, species names, AI-positive results, viral subtypes and many other collection specifics regarding each bird tested. Samples collected between January 2005 and February 2015 were considered for our study. Samples from domestic species and samples from unknown species or without latitude or longitude coordinates were disregarded (0.2%). A case was defined as a unique animal collected from a specific geographic location that is confirmed to be LPAI positive by necropsy and laboratory testing in one of the NIAID-funded Centers of Excellence for Influenza Research and Surveillance (CEIRS). The CEIRS is a multidisciplinary, collaborative network that aims to control AI nationally and internationally via global surveillance, pathogenesis research and training48.

Environmental data

In total, 14 variables were considered as potential predictors for the models (Table 2). The variables fell into three categories: (1) climatic, (2) environmental/topographic and (3) domestic and wild bird demographics. All variables were selected based on a literature review, as they have been described as playing an important role in AIV dynamics and distribution in other countries49,50,51.

Table 2 Variables considered in the model.

Spatial data were collected for each of the predictors, and rasters with a 500 m × 500 m cell size were created for each of them. This resolution was chosen as it offers a good balance between adequate resolution for decision-making and reasonable processing time (40 minutes on average to run one flyway model). All predictor rasters were standardized to the same map extent and a common projection (NAD_1983_Albers) using ArcGIS 10.352 and RStudio Version 0.98.110253.

Climate data layers were selected from the standard 19 Bioclim variables from the WorldClim organization54,55. Bioclim, a set of interpolated climate data, is the result of 50 years of ground-based weather measurements. It contains means of monthly minimum and maximum temperatures and precipitation in a grid format at several resolutions. As described in previous studies43, all the Bioclim variables were included in a preliminary maximum entropy model in order to select the most important predictors. Based on the result of this preliminary model, the mean diurnal range (Bio2) and the mean temperature of the warmest quarter (Bio10) were the only predictors selected for inclusion in the final model.

Environmental/topographic variables consisted of altitude, distance to water points, the Normalized Difference Vegetation Index (NDVI) and land cover. Specifically, distance to water consisted of the Euclidean distance to open water (oceans) and to inland water points (rivers, wetlands and lakes). Inland water layers (US Lakes and US Rivers) were obtained from the US Geological Survey56, and the open water layer was obtained from the Natural Earth database48. NDVI represents an estimate of vegetation activity measured by the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard NASA’s Terra satellite. The NDVI raster incorporated in the models consisted of an average of the vegetation activity estimated over the study period. Land cover at a 5 arc-minute resolution was obtained from the United States Geological Survey (USGS) database56. More details about the different land cover classes and their pixel values in the raster are presented in Supplementary Table 1.

Domestic and wild bird densities were included in the analysis to account for the potential transmission of AIV from wild birds to poultry and vice versa. Bird density variables included poultry and poultry farm densities in total and by type of production system, the backyard chicken farm density and the migratory bird density. Commercial poultry farm density per km² was generated by the Models of Infectious Disease Agent Study (MIDAS)57 using 2002 Census of Agriculture poultry farm counts. This layer contains locations of poultry farms by production system and species (broiler, duck, layer, pullet, turkey). These data were integrated in a kernel density function with a search radius of 3 km to generate a global density raster for all poultry farms, a raster for the global poultry density and a raster specific to the poultry density in each commercial poultry farming system (broiler, duck, layer, pullet, turkey).
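As an illustration of the kernel density step, the following Python sketch computes a farm density at a single location. It is not the GIS routine used in the study: the quartic (biweight) kernel, the function name and the planar kilometre coordinates are assumptions made for the example; only the 3 km search radius comes from the text.

```python
import math

def quartic_kernel_density(x, y, farms, radius_km=3.0):
    """Density (units per km^2) at (x, y) from weighted points.

    farms: iterable of (x_km, y_km, weight) tuples (hypothetical layout).
    Assumes a quartic kernel; the source does not state which kernel was used.
    """
    r2 = radius_km ** 2
    total = 0.0
    for fx, fy, weight in farms:
        d2 = (x - fx) ** 2 + (y - fy) ** 2
        if d2 < r2:  # only points inside the search radius contribute
            total += (3.0 / math.pi) * weight * (1.0 - d2 / r2) ** 2
    return total / r2

# One hypothetical farm at the origin, evaluated at the farm and at the radius.
at_farm = quartic_kernel_density(0.0, 0.0, [(0.0, 0.0, 1.0)])
at_edge = quartic_kernel_density(3.0, 0.0, [(0.0, 0.0, 1.0)])
```

With this kernel, each farm's contribution integrates to its weight over the 3 km disc, so summing the surface over space recovers the total count.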
Backyard chicken farm density per km² was estimated based on two sources: the percentage of households that owned chickens by race/ethnicity, determined using the USDA Agriculture Ownership in Four Inspection U.S. Cities survey conducted during 2010–2012 in Denver, Miami, New York City and Los Angeles58, and the 2010 human demographic census per census tract59. Equation (1), used to estimate the number of backyard chicken farms by census tract, is:

Equation (1): Backyard chicken estimation by census tract

where Ej is the total estimate of the number of backyard chicken farms per census tract per km²; λj is the area of census tract j in km²; βi is the percentage of race/ethnicity-specific households that owned chickens, with values of 0.7 for Asian households (i = 1), 0.1 for African-American (i = 2), 1.4 for Hispanic/Latino (i = 3), 0.7 for White (i = 4) and 1.1 for Multiracial (i = 5); and Hij is the number of households of race/ethnicity i in census tract j. Since the census does not directly provide the number of households by race/ethnicity, only the total number of households and the total population by race/ethnicity were used: Hij was estimated by multiplying the total number of households by the proportion of the population belonging to a particular race/ethnicity in each census tract.
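The estimation described above can be sketched in Python. This is a minimal sketch rather than the authors' code: it assumes Equation (1) has the form E_j = (1/λ_j) Σ_i (β_i/100) H_ij, which is one reading of the definitions given, and it approximates H_ij as total households multiplied by the population share of group i. The function name and the example tract are hypothetical.

```python
# Ownership rates in percent of households, as listed in the text.
BETA_PERCENT = {
    "asian": 0.7,
    "african_american": 0.1,
    "hispanic_latino": 1.4,
    "white": 0.7,
    "multiracial": 1.1,
}

def backyard_farm_density(total_households, population_by_group, area_km2):
    """Estimated backyard chicken farms per km^2 for one census tract.

    Assumed form: E_j = (1 / lambda_j) * sum_i (beta_i / 100) * H_ij.
    """
    total_population = sum(population_by_group.values())
    if total_population == 0 or area_km2 <= 0:
        return 0.0
    farms = 0.0
    for group, beta in BETA_PERCENT.items():
        share = population_by_group.get(group, 0) / total_population
        households_in_group = total_households * share  # H_ij
        farms += (beta / 100.0) * households_in_group
    return farms / area_km2

# Hypothetical tract: 2,000 households on 5 km^2.
example = backyard_farm_density(
    total_households=2000,
    population_by_group={
        "asian": 500,
        "african_american": 500,
        "hispanic_latino": 1000,
        "white": 2500,
        "multiracial": 500,
    },
    area_km2=5.0,
)
```

For this hypothetical tract the sketch yields about 16.4 farms in total, or roughly 3.3 farms per km².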

Regarding migratory bird density, only the most relevant wild bird species in terms of AIV risk were considered, i.e., species with an LPAI apparent prevalence (number of birds testing positive/total number of birds sampled) of at least 10% during the study period, as observed in the FluDB data. As a result, eight bird species were selected: Podiceps grisegena (LPAI prevalence = 0.316); Bucephala islandica (LPAI prevalence = 0.182); Aechmophorus occidentalis (LPAI prevalence = 0.167); Bubulcus ibis (LPAI prevalence = 0.154); Anas platyrhynchos (LPAI prevalence = 0.131); Anas rubripes (LPAI prevalence = 0.120); Anas acuta (LPAI prevalence = 0.102); Anas discors (LPAI prevalence = 0.101). Abundance rasters for each of the selected bird species were created from abundance shapefiles extracted from the USGS website60. The shapefiles were based on the North American Breeding Bird Survey (1966–2003). Abundance rasters were created for each of the eight migratory bird species using RStudio (RStudio Team, 2015). Finally, for simplicity, all the individual migratory bird rasters were merged into a single global migratory bird density raster by summing the abundance from each individual raster.
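The species selection rule and the raster merge can be illustrated with a short Python sketch. It is an assumption-laden toy, not the study's R workflow: rasters are represented as plain nested lists, the function names are hypothetical, and the counts are invented for the example rather than taken from FluDB.

```python
def apparent_prevalence(positives, sampled):
    """Apparent prevalence = birds testing positive / birds sampled."""
    return positives / sampled if sampled else 0.0

def select_species(counts, threshold=0.10):
    """Names of species whose apparent prevalence is at least `threshold`."""
    return sorted(
        name
        for name, (positives, sampled) in counts.items()
        if apparent_prevalence(positives, sampled) >= threshold
    )

def merge_abundance(rasters):
    """Cell-wise sum of equally sized abundance grids (lists of rows)."""
    first = rasters[0]
    merged = [[0.0] * len(first[0]) for _ in first]
    for grid in rasters:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                merged[r][c] += value
    return merged

# Hypothetical (positives, sampled) counts; not FluDB values.
counts = {"species_a": (13, 100), "species_b": (5, 100), "species_c": (20, 100)}
selected = select_species(counts)
merged = merge_abundance([[[1, 2], [3, 4]], [[10, 20], [30, 40]]])
```

Summing abundances weights every selected species equally, which matches the simplification described in the text.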

Analysis: Species Distribution Model (SDM)

A presence-only maximum entropy SDM was used to detect areas with a high relative probability of LPAI presence and to identify the top contributing predictors of LPAI suitability. MaxEnt, a maximum entropy approach to presence-only distribution modeling, was used for the analysis via the “dismo” package in R software61. MaxEnt uses occurrence data (i.e., LPAI confirmed cases) and a set of environmental predictors (climatic, topographic, economic) to determine the geographic distribution of a specific group of individuals in a specific area (here, the LPAI confirmed birds in each of the NAMF). A total of 10,000 background points were randomly chosen for each model. The algorithm behind MaxEnt is described elsewhere62,63. A default convergence threshold of 0.00001, a regularization of 1 and 500 iterations were chosen for this study. In addition, a logistic output was used so that the predicted spatial suitability per map cell lies between 0 and 1. In this study, four models were generated to predict suitable areas for LPAI in four distinct geographic areas in the US following the NAMF: the Atlantic, Central, Mississippi and Pacific flyways31. The NAMF correspond to the administrative flyway areas that were established in 1948 based on the migration routes of North American waterfowl and are defined by Councils and Technical Committees that facilitate waterfowl management and conservation across the continent (e.g., hunting regulations, research and habitat management). The goal behind having a model specific to each of these regions is to detect contributing risk factors that may differ or be specific to each sub-region and thus facilitate the implementation of customized interventions for each of them.
The administrative flyways of the US are defined as follows. The Atlantic flyway includes the states of Connecticut, Delaware, Florida, Georgia, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, New York, North Carolina, Pennsylvania, Rhode Island, South Carolina, Vermont, Virginia, and West Virginia. The Central flyway includes the states of Montana, Wyoming, Colorado, New Mexico, Texas, Oklahoma, Kansas, Nebraska, South Dakota, and North Dakota. The Mississippi flyway includes the states of Alabama, Arkansas, Indiana, Illinois, Iowa, Kentucky, Louisiana, Michigan, Minnesota, Mississippi, Missouri, Ohio, Tennessee, and Wisconsin. The Pacific flyway includes the states of Alaska, Arizona, California, Idaho, Nevada, Oregon, Utah and Washington. These flyways are associated with the major topographic characteristics of North America (Fig. 1). For each of the four flyways, all 14 variables were first included in a “full” model. Subsequently, variables that contributed 5% or more to the respective “full” model were retained and run again in a so-called “reduced” model. The risk factors selected for each of the “reduced” models were then used to generate the corresponding four suitability maps (Pacific, Central, Mississippi and Atlantic), which were merged using the Mosaic to New Raster function in ArcMap version 10.3 (ESRI, Environmental Systems Research Institute) to evaluate the overall AI suitability for the US. Spearman’s rank correlation was used to evaluate correlation between predictors and to avoid collinearity problems. Variables with a correlation of 0.5 or higher were considered highly correlated. When two variables were highly correlated, the variable less related to the outcome was removed.
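The two screening rules above (at least 5% contribution, Spearman correlation below 0.5) can be sketched in Python as follows. This is an illustrative sketch only: it uses the percent contribution as a stand-in for "relation to the outcome", which is an assumption, and the variable names and sample values are hypothetical.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def reduce_variables(contributions, samples, min_contribution=5.0, max_rho=0.5):
    """Keep variables with >= min_contribution percent, then drop any variable
    correlated at |rho| >= max_rho with an already kept, higher-contribution one.
    Using contribution as the tie-breaker is an assumption of this sketch."""
    candidates = sorted(
        (v for v, pct in contributions.items() if pct >= min_contribution),
        key=lambda v: -contributions[v],
    )
    kept = []
    for var in candidates:
        if all(abs(spearman(samples[var], samples[k])) < max_rho for k in kept):
            kept.append(var)
    return kept

# Hypothetical contributions (percent) and predictor values at five locations.
contributions = {"altitude": 40.0, "ndvi": 30.0, "slope_proxy": 20.0, "precip": 2.0}
samples = {
    "altitude": [1, 2, 3, 4, 5],
    "ndvi": [5, 1, 4, 2, 3],
    "slope_proxy": [2, 4, 6, 8, 11],
    "precip": [3, 3, 2, 5, 1],
}
reduced = reduce_variables(contributions, samples)
```

In this toy example, "precip" is dropped for contributing less than 5% and "slope_proxy" is dropped because it is perfectly rank-correlated with "altitude".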

Model performance, validation and HPAI prediction ability

Jackknife training gain tests were used to determine which variables contributed most to each model. The jackknife tests were run in three ways: (i) using all variables, (ii) dropping one variable at a time, and (iii) running the model using only one variable. Variables with the highest training gains, or those that reduced the training gain the most when left out of the model, were considered the most valuable to the model. Using the k-fold method from the “dismo” package61, 80% of the records were used in the construction of the MaxEnt niche models. The remaining 20% of the records were set aside for external validation. The maximum number of iterations for each model was set to 1,000. Both the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve and the calibrated AUC (AUCc) were generated using the “dismo” package61 in RStudio53 and used to assess the models. The AUCc, previously described64, was used to check for spatial sorting bias (i.e., an AUCc close to one indicates the absence of spatial sorting bias).
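The 80/20 partition and the AUC can be illustrated with a small Python sketch. It mirrors the idea only and is not the “dismo” implementation: the helper names are hypothetical, the split uses five random groups with one held out, and the AUC is the plain rank-based version (the calibrated AUCc is not reproduced here).

```python
import random

def kfold_groups(n_records, k=5, seed=0):
    """Assign each record to one of k equally sized random groups (1..k)."""
    groups = [(i % k) + 1 for i in range(n_records)]
    random.Random(seed).shuffle(groups)
    return groups

def auc(presence_scores, background_scores):
    """Probability that a presence point outscores a background point
    (ties count as one half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hold out group 1 (20%) for validation and train on the rest (80%).
groups = kfold_groups(100, k=5)
test_idx = [i for i, g in enumerate(groups) if g == 1]
train_idx = [i for i, g in enumerate(groups) if g != 1]

# Hypothetical suitability scores at presence and background points.
example_auc = auc([0.9, 0.8, 0.4], [0.1, 0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to a model that ranks presence and background points no better than chance, and 1.0 to perfect separation.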

USDA official notification data of the US HPAI 2014–2015 epidemic were used to evaluate the HPAI outbreak detection rate of the map generated with LPAI data (i.e., the merged flyways map). As the HPAI outbreak locations are only available at the county level (county centroids), the first step was to adapt the merged LPAI suitability map by assigning to each county the maximum suitability present in it (i.e., a worst-case assumption). Counties were then classified as “highly suitable” or “low suitable” for LPAI based on the median value of the whole dataset. The median for the merged flyways map was 0.3; therefore, a county was considered “highly suitable” for LPAI outbreaks if its LPAI suitability was above 0.3. In a second step, HPAI outbreak centroids were overlaid on the county-level LPAI suitability map. The number of HPAI outbreaks occurring in “highly suitable” counties was then calculated.
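The county-level validation described above can be sketched in Python. This is a minimal sketch under stated assumptions: raster cells are given as hypothetical (county, suitability) pairs, the function names are invented for the example, and the numbers are toy values rather than study data.

```python
from statistics import median

def county_max_suitability(cells):
    """Maximum suitability per county from (county_id, suitability) pairs."""
    result = {}
    for county, value in cells:
        if county not in result or value > result[county]:
            result[county] = value
    return result

def detection_rate(county_suitability, outbreak_counties, threshold=None):
    """Share of outbreaks located in counties above the threshold.

    The threshold defaults to the median county suitability, as in the text.
    """
    if threshold is None:
        threshold = median(county_suitability.values())
    hits = sum(
        1 for county in outbreak_counties
        if county_suitability.get(county, 0.0) > threshold
    )
    return hits / len(outbreak_counties), threshold

# Hypothetical cells and outbreak locations (one entry per outbreak).
cells = [("A", 0.1), ("A", 0.7), ("B", 0.2), ("B", 0.25),
         ("C", 0.3), ("D", 0.05), ("D", 0.9), ("E", 0.4)]
by_county = county_max_suitability(cells)
rate, cutoff = detection_rate(by_county, ["A", "A", "D", "B"])
```

Applied to the figures reported in the Results (278 of 312 outbreaks in “highly suitable” counties), the same ratio is about 0.89.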

Additional Information

How to cite this article: Belkhiria, J. et al. Application of Species Distribution Modeling for Avian Influenza surveillance in the United States considering the North America Migratory Flyways. Sci. Rep. 6, 33161; doi: 10.1038/srep33161 (2016).