Bumble bees in landscapes with abundant floral resources have lower pathogen loads

The pollination services provided by bees are essential for supporting natural and agricultural ecosystems. However, bee population declines have been documented across the world. Many of the factors known to undermine bee health (e.g., poor nutrition) can decrease immunocompetence and, thereby, increase bees’ susceptibility to diseases. Given the myriad of stressors that can exacerbate disease in wild bee populations, assessments of the relative impact of landscape habitat conditions on bee pathogen prevalence are needed to effectively conserve pollinator populations. Herein, we assess how landscape-level conditions, including various metrics of floral/nesting resources, insecticides, weather, and honey bee (Apis mellifera) abundance, drive variation in wild bumble bee (Bombus impatiens) pathogen loads. Specifically, we screened 890 bumble bee workers from varied habitats in Pennsylvania, USA for three pathogens (deformed wing virus, black queen cell virus, and Vairimorpha (= Nosema) bombi), Defensin expression, and body size. Bumble bees collected within low-quality landscapes exhibited the highest pathogen loads, with spring floral resources and nesting habitat availability serving as the main drivers. We also found higher loads of pathogens where honey bee apiaries are more abundant, a positive relationship between Vairimorpha loads and rainfall, and differences in pathogens by geographic region. Collectively, our results highlight the need to support high-quality landscapes (i.e., those with abundant floral/nesting resources) to maintain healthy wild bee populations.

The pollination services provided by bees are critical for supporting healthy and diverse natural and agricultural ecosystems 1 . Continuing declines documented in populations of wild and managed bees across the world thus pose significant threats to the stability of these systems 2,3 . Declines in bee populations have been attributed to several factors 4 . Extensive habitat loss and degradation results in a dearth of floral resources and nest sites which has contributed to loss of wild bee abundance and diversity 5,6 . Bee losses, especially those of honey bees (Apis mellifera) and bumble bees (Bombus spp.), have more recently been ascribed to rising levels of novel bee pathogens 3,7 . For example, the microsporidian Vairimorpha (= Nosema) bombi (hereafter, Vairimorpha) has been identified as a major contributor to bumble bee population losses, leading to the extirpation and near extinction of several bumble bee species in North America 8,9 . Several bee viruses also contribute to declining honey bee survival 7 . These honey bee viruses often spill over into native bee populations [10][11][12] , where they impose negative effects on wild bee health 7,13 . In the US, the toxic load to bees of insecticides applied in agricultural landscapes has increased significantly over the last few decades 14 . Exposure to insecticides, particularly neonicotinoids, can lead to a variety of negative effects on native bees and managed honey bees alike, including compromised learning and foraging capabilities and reduced reproductive output 2,[15][16][17] . More recently, climate change has been added to the list of major threats to global bee populations, as large-scale geographic analyses identified climate shifts as a contributing factor to local bumble bee extirpation 18 . Given the myriad of threats to pollinator populations, it can be difficult to tease apart the relative importance of each of these factors to bee health.
Determining the leading threats to bee populations is further complicated by interactions among these stressors in the landscape, specifically many of the factors known to undermine bee health (such as poor nutrition or

Results
Pathogen incidence. For pathogen analysis, we collected workers of B. impatiens, as it is the most abundant species across our Pennsylvania study region. We collected 890 workers from sites spanning a diversity of habitats across Pennsylvania, USA (Fig. 1A), for 2-3 weeks during the peak of bumble bee abundance (late Junemid July), and across 2 years, including 21 sites in 2018 (n = 310 bees) and 41 sites in 2019 (n = 580 bees). Screens of loads of pathogens among pooled bees (n = 5 bees/pool) from each site (typically 3 pools/site), performed via quantitative PCR, revealed that most bumble bee samples (97%) contained BQCV and that BQCV exhibited a broad range of pathogen load values. DWV was much less common, with several samples (29%) lacking notable DWV detection (Fig. 2), and most of those samples that did test positive for DWV exhibiting low loads. Honey bee samples from the same sites (unpublished data) had much higher loads of DWV, suggesting that low loads were not due to issues with our detection protocol, but rather that B. impatiens tends not to harbor high loads of this pathogen. Vairimorpha detections were sporadic, detected in about 40% of samples, but usually at low loads. All bumble bee samples exhibited some degree of expression of the immune gene, Defensin (Fig. 2).
Landscape correlates with pathogen loads. Pathogen loads inferred for DWV, BQCV, Vairimorpha, and a combined pathogen index were modeled against several landscape predictor variables in two separate modeling analyses (tiers) using ranked candidate sets of linear mixed-effects models (see "Statistical analyses" section, below). Our first model tier (Tier 1) examined associations between pathogen loads and key landscape indices: floral resource quality in spring and summer, nesting resource quality, insecticide loading, and honey bee abundance, with each year modeled separately. Tier 1 models revealed higher loads of BQCV at sites with lower quality spring floral resources (both years), lower quality nesting resources (2018), and higher honey bee colony density (2019 ; Supplementary Table S2). These models suggested higher DWV loads at sites with lower quality summer floral resources (2018), lower quality nesting resources (2019)  Our second model tier considered the same covariates as model Tier 1, and also considered additional variables that might be useful predictors of bumble bee health: latitude, longitude, elevation, weather variables, catch-per-effort, local Bombus spp. diversity (H′; measured from the first 20 workers collected per site) and land cover variables (i.e., NLCD percent cover). These models revealed positive relationships between BQCV loads and both longitude (2018) and developed land cover (2018), indicating that more eastern-and developed sites hosted the most BQCV-infected bumble bees (Supplementary Table S2). Likewise, we found relationships between BQCV loads and latitude (negative; 2018), longitude (positive; 2018), forest cover (negative; 2018), natural cover  Similarly, no significant relationships were found between pathogen loads and bee species diversity or bee abundance per unit sampling time.
Although the variables that were most significant varied between years by pathogen, most trends found in both Tier 1 and Tier 2 variables were consistent in directionality between years, indicating that results were repeatable across years (Fig. 4). These models sought to find the variables that explained the data the most, allowing up to 2 variables to be combined within a model, without including correlated variables in a given combinatorial model. Several of our variables, however, were correlated and may alternatively explain patterns. For example, percent 'natural' cover was positively correlated (|r|> 0.70) with the spring floral resource and nesting indices, as well as percent forest cover (Fig. 4). These two variables were also both negatively correlated with percent 'arable' land cover. Spring GDD and summer GDD were positively correlated and both were negatively correlated with percent forest cover, percent natural cover, latitude, and elevation (Fig. 4).

Discussion
Our study is among the first to evaluate how different aspects of landscape quality correlate with the distribution and loads of key pathogens and parasites in wild bees. We used generalizable landscape indices of key landscape characteristics known to influence bee health (e.g., forage resource quality, nesting resource quality, and insecticide toxic loads) in our analyses, which allowed us to assess pathogen load patterns over diverse and geographically distributed landscapes. Our results demonstrate that bees sampled from lower quality landscapes have higher loads of pathogens and parasites, with spring floral resources, nesting habitat availability, and honey bee colony density driving the strongest patterns. While clearly these patterns need to be verified across a larger spatio-temporal scale and with more bee species, our results suggest it may be possible to predict potential risks from pathogens and parasites based on these landscape indices. These indices and models can thus help inform decisions as to where habitat restoration and conservation practices should be applied.
Inadequate forage (i.e., floral dearth) has been cited, along with emerging infectious disease, as among the most important threats to global bee populations 4 . While these two stressors clearly affect bees independently 32,41 , www.nature.com/scientificreports/ our study provides support for the hypothesis that nutritional stress (via lack of floral resources) and disease infection may also interact to impact wild bee populations 19,42 . This is consistent with laboratory-based studies that suggest nutritionally-stressed bees exhibit compromised immunocompetence 38 and greater pathogen loads 43 , and analogous results have been reported in vertebrate systems 44,45 . Additionally, wild bees may benefit from having a choice of floral secondary plant compounds that may be constrained in conditions of floral dearth, as some such compounds are known to reduce pathogen loads 32 . Although a negative relationship between floral availability and disease infection seems intuitive, alternative hypotheses predict the opposite pattern whereby abundant floral resources facilitate more interaction between managed-and wild bee populations 4,46 . Although floral resources were strongly correlated with pathogen loads, we were surprised that features like urban/agricultural development did not predict pathogen loads (Table 1). One potential explanation for this pattern was that such habitats host adequate floral resources to support healthy wild bee populations 47 . Our results indicate that spring resources, as opposed to summer resources, were more strongly correlated with pathogen loads. The spring is a vulnerable period for bumble bees, as resources can be patchy and phenologically mismatched, while queen bees emerge partially depleted of resources from diapause and must find sufficient resources to successfully start their colonies 48 . More attention should be drawn to the abundance of spring resources for supporting bumble bees and other wild bees.  www.nature.com/scientificreports/ Honey bees, which are not native to North America, are well-known disease vectors with the potential to impact wild bees through the introduction of novel pathogens or promotion of those already present within populations 4,35 . In further support of this, our analyses reveal a positive correlation between managed honey bee colony density and some bumble bee pathogen loads (Table 1). Honey bees host high pathogen loads as compared to 'background' wild bee populations, especially for DWV and BQCV (but not Vairimorpha bombi 12,36 ). Pathogen loads present within managed honey bee colonies can be transmitted to wild bees through honey bee foraging activities which usually extend several kilometers around each colony 49 . While our analysis suggested that higher honey bee density was associated with higher bumble bee pathogen loads, patterns varied by year and disease. Although directionality of honey bee density effect was similar across pathogens and years, honey bee impacts were only significant in 2018 and only for BQCV and our combined pathogen index, but not DWV, Vairimorpha, or our non-pathogen metrics (Defensin and marginal cell length). Moreover, our tier 2 models indicated that variables like latitude and longitude were better explanators of pathogen loads than was honey bee colony density. It is important to note that the presence of viruses commonly found in honey bees within wild bees is not necessarily indicative of pathogenicity, though it is possible that these viruses can become more problematic if wild bees are stressed or immunocompromised due to other factors 13,50 .
Pesticides pose a significant threat to wild bee populations 4,51 and the aggregate hazard of insecticides applied to agricultural landscapes across North America has risen in recent decades as a result of widespread use of neonicotinoid seed treatments in common field crops 14 . However, our analyses revealed few clear relationships between pathogen loads and insecticide loading, with several possible explanations. To observe an impact on bumble bee pathogen loads, pesticide exposure likely needs to be high enough to suppress bees' immune systems while remaining sublethal 42 . To this end, we may not expect a linear increase in pathogen loads with increasing insecticide loading because highly toxic sites have few bees 42 . Furthermore, Pennsylvania may not have enough variation in insecticide use across our study sites (Fig. 1C) because most of the state is forested 52 and such natural habitats typically have relatively low insecticide loads 53 . Finally, the insecticide index considered here is based on annual, state-wide insecticide application data for croplands, and it is possible that field-and time-specific insecticide data-or data including non-agricultural insecticide uses-may reveal clearer relationships with pathogen loads. Still, even under the best circumstances, linking pesticides to bee health has proven challenging for observational studies like ours 51 .
In this study, we quantified expression of an immune gene Defensin, considering that this may be a direct metric of overall pathogen strain on the bee. Defensin is one of four antimicrobial peptides produced in bees as an immune response to pathogen infection from bacteria, protozoa, fungi, and viruses, and thus can serve as an indicator of pathogen load 40 . We, however, found that Defensin expression did not correlate well with any Table 1. Habitat variables associated with metrics of bumble bee (Bombus impatiens) health across varied landscapes in Pennsylvania 2018-19. Variables shown below are those with statistical support from our mixed-effects regression analyses (Appendices 1-6) based on Akaike's Information Criterion adjusted for small sample size and β coefficient 95% confidence intervals (those overlapping zero considered to be weak relationships) in either year. www.nature.com/scientificreports/ of our landscape factors. The use of immune genes for understanding pathogen loads has had mixed results in the literature, as high levels of nutritional resources can boost both immunocompetence and immune gene expression 40 . Immune gene expression in response to pathogens is therefore likely to be complex. One of the most consistent patterns we observed in our study was that pathogen loads varied geographically across our study area. This finding echoes the spatially heterogeneous disease patterns reported for wild bees by other studies 12,32 . With this in mind, latitude and longitude are likely proxies for variables (or suites of variables) not fully measured in our study. For instance, both BQCV and DWV were least common in northern Pennsylvania. Indeed, northern Pennsylvania supports a suite of landscape characteristics expected to support healthy bee populations; these landscapes tend to support more native habitat, host better floral resources, and fewer managed honey bee colonies (Fig. 1B-E). Indeed, past predictive models of bee abundance suggest northern Pennsylvania as among the highest quality regions in the eastern United States 54 and our data seem to support this notion, at least from a disease perspective. Ultimately, the geographic variation we observed in bumble bee pathogen loads during our study highlights the importance of accounting for regional variation in assessments of disease risk for wild bees, especially for widely-dispersed species like B. impatiens 55 .
Weather patterns can have profound impacts on animal disease dynamics because weather impacts disease agents, vectors, and even host activity 56,57 . For example, Retschnig et al. 58 observed a negative relationship between ambient temperature and Vairimorpha infection rates in honey bee colonies because colder temperatures kept bees from foraging outside the colony and within close proximity to nest-mates. The opposite pattern has been reported for DWV virus titers across a temperature gradient 59 , however, higher temperatures also reduced bee survival rates. Given the complexity of relationships between insect pathogens and environmental conditions, it is not entirely surprising that our models suggested only modest correlations between weather and pathogen loads ( Table 1). The clearest pattern we observed was a positive relationship between spring precipitation and infection by Vairimorpha bombi. Indeed, several previous studies have indicated that spore viability and infection for other Vairimorpha spp. to be positively correlated with rainfall (e.g., Ref. 60 ), presumably because rainfall Floral"), nesting resources ("Nesting"), insecticide loading ("Insecticides"), honey bee (Apis mellifera) colony density ("Honey Bees"), latitude, longitude, percent shrubland (within 500 m; "Shrubland"), percent urban development ("Developed"), percent arable ("Arable"), percent natural ("Natural"), percent grassland/pasture ("Grass/pasture"), percent forest ("Forest"), and elevation. Additionally shown are weather variables: spring growing degree days ("Spr. GDD"), summer growing degree days ("Sum. GDD"), amount of spring precipitation ("Spr. Precip. "), and amount of summer precipitation ("Sum. Precip. "). Finally, we also included Bombus spp. www.nature.com/scientificreports/ enhances transmission rates 61 (but see Ref. 62 ). Future studies should model more detailed weather parameters than we have to examine their relative impacts. Although our study marks an important exploration into the macro-ecological patterns of pathogen prevalence in a wild bee population, it is important to keep several important caveats in mind with the interpretation of our results. We observed year-to-year variation in the strength, but not direction, of habitat trends (i.e., Fig. 4). The consistency in general patterns highlights the value of multi-year studies for revealing reliable trends, but the variance suggests that we may have uncovered additional patterns had our study continued beyond two years. Additionally, Pennsylvania is a highly forested state 51 ; analyses of data similar to those presented here from other regions with different ecosystem types (e.g., grasslands, intense agriculture, etc.) would help to decouple co-correlated variables and better understand the relative impacts of different habitats. Furthermore, we did not sample pathogens from sites with very few bees (i.e., those where we could not collect ≥ 5 B. impatiens), which could have hampered our ability to assess some of the effects of the lowest quality sites on bees. Finally, many variables in our analyses were correlated (|r|≥ 0.70) and could not be modeled simultaneously. Because correlated variables were included as separate variables in each tier, competing models often included different members of a set of correlated variables (e.g., Supplementary Table S2), indicating uncertainty as to which variable best predicted disease loads. Future analyses might consider methods that account for correlated variables (e.g., ordination) or are robust to variable correlation (e.g., random forest, etc.).
Collectively, our results highlight the need to support high-quality landscapes (i.e., those with abundant floral/nesting resources) to maintain healthy wild bee populations. These results are particularly timely in light of widespread population declines in many insect groups 63 , especially pollinators like bumble bees 2 . In addition to helping support healthy pathogen-free bees, conservation of low-pathogen habitats identified here is also important from other perspectives; landscape features that support abundant floral and nesting resources also host more abundant and diverse pollinator populations and communities 54,64 , and the critical ecosystem services they provide (Russo et al. 2013). The model results presented here could also be mapped across landscapes and incorporated into conservation/planning tools (e.g., The 'Beescape' program; beescape.org). Indeed, spatially explicit conservation tools like Beescape that map biological patterns in space can be instrumental in ensuring conservation success for sensitive species like bees 65,66 . With that in mind, the analytical approaches used here coupled with the habitat indices we incorporated could be applied to assess wide array of conditions and study areas, thus allowing even broader insights to relative value of different landscape features for pollinators.

Methods
Site selection. Our sample sites (21 sites in 2018 and 41 sites in 2019) includes 38 of the 67 counties across Pennsylvania (Fig. 1A). These sites were selected to represent evenly the span of the state while also collecting from a wide suite of habitat types and land use patterns. The central Appalachian Mountains run through the center of the state and, therefore, vary in elevation from relatively low (~ 100-200 m) to moderately high (> 500 m 67 ). Pennsylvania is heavily dominated by mature forest 68 with northern hardwood (e.g., Acer spp.), Appalachian oak (Quercus spp.), and coniferous (e.g., Tsuga canadensis) among the most common forest types 69,70 . Although high-elevation regions in Pennsylvania remain largely forested 52 , lower elevations host fertile soils on which row cropping and livestock-based agriculture are important land uses, as is human development 71 . As a result, a range of values for each of nesting habitat, floral resources, and agricultural insecticide loading was available for sampling (Fig. 1).
Bee sampling. Upon arriving at each site, we conducted an unlimited-length visual encounter survey (VES) for bumble bees, where 1-2 surveyors examined all available flowers as evenly as possible for bumble bees, ignoring bumble bee species identity, until we caught 20 Bombus workers. In most sites, bees were captured within 100-200 m of the site centroid. Gathered bumble bees were identified in the field and retained for lab identification if identity was uncertain (rarely necessary). Given the difficulty of identifying B. perplexus, B. vagans, and B. sandersoni workers, these were lumped and treated as a single taxon. This 20-bee survey allowed us to obtain a diversity metric for each site (standardized Shannon Diversity Index (H′) 72 ). All bees were captured legally under Pennsylvania special use permit #2019-75.
For pathogen analysis, we sought to collect 15 B. impatiens from each site; if our initial sample of 20 Bombus included fewer than 15 B. impatiens workers, we continued sampling until 15 B. impatiens workers were collected. During 2019 sampling, we recorded the start-and end times at each site which allowed us to calculate 'catch per unit effort' for B. impatiens in 2019 as: (# B. impatiens within the first 20 bees sampled/survey duration)/the number of surveyors. A few sites where 15 B. impatiens could not be obtained across many hours of sampling were sampled for 5 or 10 bees. If a site yielded < 5 B. impatiens workers across several hours of sampling, it was not included in the study, thus some of the poorest quality sites were potentially excluded. Bees were gathered into vials in groups of five and stored immediately onto dry ice in the field, followed by transfer to a -80 °C freezer. Most sites thus contained three replicate vials with five individuals each for subsequent pooled pathogen screens.
Molecular quantification of pathogens. Abdomens were removed from workers while frozen, placed in a 5 ml tube with 5 metal beads and 2.0 ml of Qiazol (2018) or 2.5 ml Trizol (2019) buffer and homogenized using an Omni Bead ruptor for 35 s on 'low' . After brief centrifugation to remove particulate matter, 350 μl of homogenate was used for RNA extraction. RNA was extracted using the standard protocol for the Direct-zol RNA Miniprep Kit (Zymo Research, Irvine, California) including incorporated DNA removal steps with DNa-seI. RNA samples eluted in water were quantified and quality assessed using a Nanodrop One (Thermo Fisher Scientific, Waltham, Massachusetts). Then, 500 ng (2018) or 400 ng (2019) was taken through cDNA synthesis using standard protocols of the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific) ( www.nature.com/scientificreports/ RNA + 10 µl Mastermix reaction, proportional to recommended quantities), followed by a 1(cDNA):4(water) dilution. Quantitative PCR was performed using this cDNA to assess loads Vairimorpha bombi (SSU 16S ribosomal RNA, BOMBICAR primers 73,74 ) and loads of single stranded positive-sense RNA viruses BQCV and DWV (primers from Ref. 75 ). We also quantified expression of the immune gene, Defensin, using primers designed for bumble bees and previously used on B. impatiens (Bi' def-278F, Bi' def-397R 76 . Bee housekeeping gene Elongation factor 1α (EF-1α) was used as a control gene for normalization (primers sequences and conditions in 78) of immune genes and pathogen loads against available bee tissue. For qPCR, 1 µl of cDNA product, 5 µl of SYBR green, and 0.3 µl each of 10 µM primer were combined and run on an 7900RT Fast Real-time PCR system machine (40 cycles, 60 °C annealing temperature; Applied Biosystems, Waltham, Massachusetts) using dissociation curves to ensure no non-specific products were generated. qPCRs were run in triplicate using standard curves and no template controls. Standard curves included eight serial fivefold dilutions of original PCR amplified or concentrated cDNA product. Within-site samples were randomized across extraction and cDNA runs and plates, with 2018 and 2019 samples run separately. Extent of gene expression was calculated from qPCR C T values by first adjusting C T values based on inferred amplification efficiency of standard curves. Standard curves showed good primer efficiency for all genes, with linear increase of C T values closely matching doubling of sampling concentration across values. The resulting adjusted values were normalized against the standard curve adjusted expression of the EF-1α control gene (subtracting EF-1α C T values from C T values). Samples which deviated strongly from the others for EF-1α (most samples were within 2 C T values of one another) were removed from analysis. Samples where no expression was detected, which was found in some pathogen samples, were assigned a final C T value of the sample of lowest detected expression level (highest C T ) rounded up to the nearest integer which included C T cycles of 35 (BQCV), 37 (Vairimorpha), and 39 (DWV). For statistical analyses, we used the normalized values (normalized to the amount of EF-1α in each sample) for DWV, BQCV, Vairimorpha and defensin. To quantify body size, we measured the marginal (radial) cells on the right forewing of each worker (to the nearest 0.1 mm) using a dissecting microscope 77 .
Landscape data metrics. We characterized broad land cover categories around each sampling location from the most recent version (2016) of the National Land Cover Database (NLCD 69 , 30 m resolution). Using the raster package in program R version 3.6.1 78,79 , we extracted percent area for forest cover (deciduous + mixed + coniferous), developed cover ('open space' through-'intensely developed'), pasture, row crop, and shrubland, within 2 km radius of the starting location for each study site. We also created a category for all cultivated crops combined (a sum of all crop categories; hereafter, 'arable' land cover) and all 'natural'/seminatural communities combined (e.g., the sum of: forest, shrubland, wetland, etc.; hereafter, 'natural' land cover). Although bumble bees are known to occasionally use habitat at broader spatial scales 5,80 , the majority of foraging activity for B. impatiens and other bumble bee species likely occurs within 2 km of colonies 81,82 suggesting this as a useful scale for analysis.
Although we consider broad land cover classes (e.g., NLCD, described above; Fig. 1), a primary interest of ours was to assess the value of published habitat indices of bee habitat quality. Of particular focus herein were the spring and summer floral quality indices published by Koh et al. 54 , their nesting habitat quality index, and the insecticide loading index published by Douglas et al. 14 . To predict floral and nesting resources at each of our study sites, we used the Integrated Valuation of Environmental Services and Tradeoffs (InVEST v. 3.1.0) crop pollination model 83 . The floral and nesting indices available through the InVEST crop pollination model were developed to explain and predict the relative value of different habitats to pollinator communities from the perspectives of floral resource availability and nest site availability 53,82 . These habitat indices were developed by surveying a panel of experts on wild bee ecology regarding their understanding of relative resource quality among vegetation communities in the NLCD. Expert ranks were then averaged across each land cover category to produce a re-class table which can be used to translate the NLCD into maps of predicted habitat quality for components of bee habitat 53 . As in Kammerer et al. 84 , we translated land use into relative value of nesting and floral resources with reclassification tables from Koh et al. 54 , and, for distance-weighting procedures within the model, assumed a Bombus foraging range of 2 km. To generate maps of agricultural insecticide loading (bee lethal doses applied per ha), we obtained year × state × crop reclassification tables presented by Douglas et al. 14 [Douglas et al., unpublished data] through email correspondence with the corresponding author (M. Douglas). As with our floral and nesting indices, we used the Douglas et al. 14 reclassification tables to scale land cover into insecticide toxic load and applied a distance-weighting procedure to more heavily weight insecticide application proximate to our study sites. To calculate nesting, floral, and insecticide indices, we used the 2018-19 Cropland Data Layer (USDA NASS 2018) as our land-use layer, which includes natural habitats from the NLCD and greater resolution of agricultural crop types (Ref. 85 ; Fig. 1C,D).
In addition to data on cover type and resource availability indices, we incorporated data related to site-specific weather conditions during each sampling year. Specifically, we quantified cumulative precipitation and growing degree day (GDD; base = 0 °C) maps for our study area using the publicly available PRISM daily weather dataset (Ref. 86 ; 4 km resolution). From these maps, we extracted precipitation and GDD values at each sampling location for 'spring' (March-May) and 'summer' (June-August) of each survey year. We quantified honey bee colony density using the Pennsylvania Department of Agriculture registered apiary database (K. Roccasecca, unpub. data) with each apiary buffered by 5 km 49 , scaled by the number of colonies in each apiary (Fig. 1B).

Statistical analyses.
Prior to analysis we scaled all variables around a mean of zero to improve model convergence using the scale function in R. We assessed the relationships between bumble bee metrics of pathogen prevalence (Vairimorpha, BQCV, DWV, and a combined pathogen metric), expression of the immune gene Scientific Reports | (2020) 10:22306 | https://doi.org/10.1038/s41598-020-78119-2 www.nature.com/scientificreports/ Defensin, and body size, against landscape features described above using an information-theoretic approach 87 . Our use of the information-theoretic approach involved ranking a series of models with the same response variable (hereafter, a 'set' of models), each of which is a 'candidate' for the position of top rank in a statistical hierarchy, as determined by an information criterion like Akaike's Information Criterion (AIC 87 ). All models presented here are analyzed as ranked candidate model sets. Our combined pathogen metric was the mean normalized pathogen value averaged across our three pathogens, Vairimorpha, BQCV, and DWV. Specifically, we developed candidate sets of mixed-effects linear models using the lme4 package in R 88 . All models included a random effect for 'site' to account for potential non-independence of individual samples within each site 89 and each year was modeled separately. In each model, bumble bee pathogen metrics for each group of five workers were used as response variables with site-specific habitat characteristics as predictors. For Defensin expression, we also considered pathogen loads as predictor variables. Within each model set, we considered all possible combinations of 0-2 predictor variables with both linear (x) and quadradic (x 2 ) relationships. To assess the extent to which both independent and dependent variables were correlated prior to analyses, we created correlation matrices of all pairwise variables. We considered two variables to be correlated if |r|> 0.70 90 . Because there were numerous covariates that were correlated (Fig. 2) and correlated variables impact numerical optimization of linear models 89 , we specified only models where correlated predictors did not occur together in a single model. Additionally, in all model sets, we also specified a 'null' model that included only the random effect for 'site' with no fixed effects. We specified a total of 22 candidate model sets with 11 sets in our first tier and 11 analogous sets in our second tier. The 11 model sets in each tier consisted of our six response variables: (1) Vairimorpha, (2) BQCV, (3) DWV, (4) Defensin, (5) combined pathogen load, and (6) marginal cell size analyzed separately for each year (marginal cell size only for 2019). Within each candidate model set, we assessed models and associated covariates using AIC adjusted for small sample size (AIC c ; models < 2.0 AIC c considered equivalent) applying β 95% confidence intervals (with those overlapping zero considered weak effects 87 .

Data availability
Upon acceptance, the datasets generated and analyzed during the current study will be publicly available in the Dryad data repository.