Climate change, brought about by increased human activity in the last few decades, has a number of effects on planetary and human health1. Increased human activity has led to increases in a number of greenhouse gases such as carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O), and ozone (O3). The global average atmospheric CO2 concentration in 2018 was 407.4 ppm, which is higher than at any point in at least the past 800,000 years2. Global average temperature increased by about 1.0 °C from 1901 to 20163 and continues to increase. The last 5 years, 2015–2019, have been the hottest years ever recorded. Climate change has led to increases in extreme weather events, such as flooding, wildfires, and thunderstorms4. The Centers for Disease Control and Prevention lists health effects of climate change, including increased risk of atopic diseases such as allergic rhinitis and allergic asthma5. This trend is especially concerning due to the high prevalence of atopic disorders. Currently, approximately a quarter of individuals in developed countries6 are affected by allergic disease, and these numbers are expected to increase with climate change. Temperature, rainfall, and other climate variables have been shown to indirectly affect allergies and asthma through their effects on pollen and molds7,8. Pollen from trees, grasses, and weeds and spores from mold are sources of allergens. Changes in vegetation, increased pollen/mold spore concentrations, and prolonged pollen seasons are linked to climate change. Increases in pollen/mold spores from climate change lead to allergies and asthma; the effects of climate change on human health are well documented9, especially in the case of allergies10.
For example, the distribution of the common ragweed (Ambrosia artemisiifolia L.) has been expanding from Central to Northern and Eastern Europe due to changes in climate (rising temperatures, favorable precipitation), and increases in CO2 have been increasing ragweed pollen production and allergies in these regions11. Following thunderstorms, a record-breaking number of visits to the emergency department for respiratory issues was observed in Australia in 201612. During thunderstorms, whole pollen grains are swept into the clouds, where they are broken up into smaller allergenic pollen fragments and eventually carried back to ground level13. The smaller size of these fragments permits their entry deep into the lungs. The mechanisms hypothesized for the fragmentation of pollen during thunderstorms include mechanical friction from wind gusts, electrical build-up and discharge incurred during conditions of low relative humidity, and lightning strikes14. Air pollutants and CO2 levels have also been shown to affect the prevalence of aeroallergens15.

Airborne pollen and mold contribute significantly to adverse health outcomes in allergy and asthma. Increased pollen counts in spring are associated with increases in over-the-counter allergy medication sales and increases in emergency visits due to asthma exacerbations16,17. Pollen and molds are key triggers for allergic rhinoconjunctivitis and asthma flares. Increases in molds, caused by heavier rainfall and higher temperatures, can cause respiratory and asthma-related conditions as well as allergic bronchopulmonary aspergillosis, allergic fungal rhinosinusitis, and hypersensitivity pneumonitis18. There is also growing evidence that changes in the climate may be contributing to the rising incidence of food allergy due to changes in the distribution of sensitizing plants and possibly due to a direct alteration in the allergenicity of plants with rising CO2 levels19. Since different patients are sensitive to different levels of atmospheric allergens, it is important to understand how pollen and mold activities are changing over time. Patients who are sensitive to even small amounts of pollen and mold spores could benefit from knowing how pollen and mold activity behaves both during and outside peak allergy seasons, and how it varies with climate change.

As pollen and molds exhibit geographical variations, we sought to understand the effects of climate on common environmental pollens and mold spores in a specific region in the San Francisco Bay Area (Los Altos Hills, CA). In addition to measurements of maximum temperature, carbon dioxide level, and precipitation, we also compare the change in pollen or mold spore concentration with wildfire smoke exposure, as our area of study has been experiencing increasing exposure to wildfire smoke in recent years. An important gauge of the impact of climate change lies in the phenology of pollen and mold exposure, due to changes in pollen seasons and intensity of exposures20,21,22,23. We therefore evaluated a long-term dataset of outdoor pollen and mold observations over an 18-year period (2002–2019) using an in-depth analysis across the spectrum of aeroallergens (tree, grass and weed pollens and mold spores) contributing to allergic disease. The region is bounded by the Santa Cruz mountain range to the west and the San Francisco Bay to the east and has a Mediterranean climate. It is one of the most populated ecoregions in the United States and lies within the San Francisco-San Jose metropolitan area24. The average land change footprint of the area between 1973 and 2000, as determined by 11 land-cover classes (e.g., mining, forest, agriculture, wetland), was estimated at 9.9%24. The vegetation is a mixture of grasslands, shrublands, and various forest types, the dominant among which is the evergreen forest25. In addition to the mixed evergreen forests, the coastal areas are covered by coastal scrub26. The region is undergoing changes in land cover due to rapid urbanization. A 19-year study (1984–2002) found that the population increased by 30% and the urban area increased by 73%, leading to a 17% increase in impervious land cover and a 27% decrease in pervious surfaces27.


Annual and seasonal trend analysis

A summary of the terms used to measure pollen and spore activity is presented in Table 1 and described in Methods.

Table 1 Measures of pollen and spore activity, and their description.

For all three groups (major allergens, selected species, and commonly observed species; see Methods), we analyzed annual trends in pollen and mold concentrations, seasonality, and activity. All statistical analyses were performed in the Python programming environment (Python Software Foundation) and p values < 0.05 were considered statistically significant.

Summary statistics for annual and seasonal characteristics of major allergens are presented in Table 2. The week in which pollen concentrations peak for each type of allergen is given in the “Peak Week” column, which shows the distinct seasonal pattern of each type of major allergen. Tree pollens peak in Spring, grass pollens peak in late Spring and early Summer, weed pollens peak in Summer, and mold concentrations peak in Fall. To quantify annual trends for each observation, the annual average values for major allergens as well as the annual trends for the different climate variables were analyzed and plotted (Supplementary Figs. S1 and S2). Statistical significance was calculated by fitting linear trends using first-order linear regression. Supplementary Fig. S3 shows statistically significant increasing trends for TMax and CO2, while there was no such trend for precipitation. In Supplementary Tables S1-S3, we present the temporal trends for major allergens, selected species, and most commonly observed species. The annual average concentrations of all major allergens except weeds showed a decreasing trend (Supplementary Table S1), but the trend was statistically significant (p value < 0.05) only for tree and grass pollens (coefficients of linear trend: −3.16 and −0.19, respectively). For details about individual tree, weed, grass, and mold species refer to Fig. 2 and Supplementary Tables S3-S5. We also analyzed how season length changed over the years for different pollen and mold spore types. We found increases in the season length for tree pollens (0.38 weeks). Given the increasing season lengths for some pollens despite the decrease in average annual pollen counts, the number of active weeks was also investigated. The annual linear trends for these values are shown in the third column of Supplementary Tables S1-S3.
The number of active weeks significantly increased for tree pollens and molds. To examine whether pollen and mold seasons were starting sooner and extending further into the year, the weeks of the year when the pollen seasons and mold seasons start and end were calculated (seasons were calculated using an established procedure, Methods); the coefficients of linear trends are shown in the fourth and fifth columns of Supplementary Tables S1-S3. For tree pollens, we observed a significant delay in the end of season (0.29 weeks).
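The trend coefficients reported above come from first-order linear regressions of an annual measure against the year. A minimal sketch of that calculation follows, assuming ordinary least squares. The function name `linear_trend` and the input values are hypothetical and are not taken from the study data. P-values are not computed here; a routine such as `scipy.stats.linregress` could supply them.

```python
def linear_trend(years, values):
    """Return (slope, intercept) of the least-squares line values ~ years."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative, made-up season lengths (weeks), one value per year.
example_years = [2002, 2003, 2004, 2005, 2006]
example_season_weeks = [20.0, 20.5, 21.0, 21.5, 22.0]
slope, intercept = linear_trend(example_years, example_season_weeks)
print(f"trend: {slope:.2f} weeks per year")
```

The slope is the quantity reported as the "coefficient of linear trend": the average change in the measure per year over the study period.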

Table 2 Summary statistics for major allergens (2002–2019), showing information about seasonality, number of active weeks (NAW), Seasonal Pollen Integrals (SPIn), Annual Pollen Integrals (APIn), Maximum Pollen Concentrations (MPC), and Peak Week. Mean, Standard Deviation (SD), Minimum, and Maximum values are shown for each measure.

Supplementary Tables S2 and S3 reveal properties of annual concentrations that differ from what was observed for major pollens and mold. For both mold and weed species, there were increases in annual average concentrations (although not statistically significant), while all tree species showed a decreasing trend. The pollen season is getting longer and starting earlier for a majority of species, but the trends were not statistically significant. Similarly, the season is ending later for a majority of species, but the trend was not statistically significant. In Supplementary Table S3, the most commonly observed species (all of which are molds) demonstrate increasing trends for the number of active weeks. The top-two most commonly observed species were active for an average of half a week more than prior years. Figs. 1 and 2 show the change in seasonal characteristics and number of active weeks for major allergens and commonly observed species. Supplementary Fig. S1 shows these results for those selected species whose season length we were able to calculate for at least 10 years during our study duration.

Figure 1

Coefficient estimates and 95% confidence intervals for change in season length, number of active weeks, start of season, and end of season for major allergens.

Figure 2

Coefficient estimates and 95% confidence intervals for change in season length, number of active weeks, start of season, and end of season for the most commonly observed species. Italics: Trees, Normal: Molds, Bold: Weeds.

Association of pollen counts with climate variables

To study the association of pollen concentrations with patterns of climate variables, we used the well-established Autoregressive Integrated Moving Average (ARIMA) method. For details on ARIMA, please refer to the Methods section.

Associations of pollen and mold with four environmental variables (maximum temperature, precipitation, carbon dioxide, and smoke area) are shown in Table 3. The columns in the table show the association of each variable at different lags; e.g., TMax (0–24) shows the association of maximum temperature over the prior 24 weeks (roughly six months) with the pollen and mold concentrations. In other words, these values show how peak pollen and mold concentrations are related to the lagged values of different climate variables. As the results show, climate variables immediately before, as well as a year before, can be strongly associated with pollen and mold concentrations.

Table 3 Summary of the multivariate ARIMA model with the explanatory climate variables for major species (2002–2019). The best fitting univariate ARIMA(p, d, q)(P, D, Q, m) model parameters were used to estimate the coefficients and p-values for different lags of the climate variables: Smoke Area, CO2, precipitation, and TMAX. AIC denotes Akaike’s Information Criterion. Lags indicate values averaged over the indicated prior durations: week 0–1 (immediate), week 0–4 (short-term), week 0–12 (seasonal), week 0–24 (pre-seasonal), week 0–52 (annual), and week 53–104 (previous year). Numbers in bold represent statistically significant associations (p value < 0.05).

The strongest association with recent temperature changes was observed in the concentrations of tree and weed pollens. For tree pollens, the association is positive with changes in temperature in the same week (lag 0–1), whereas the association is negative with changes in temperature over longer timeframes (lag 0–12 and 0–24). For weed pollens, the associations are positive across immediate, short-term, seasonal, and annual timeframes (lag 0–1, 0–4, 0–12, 0–52). Results in Supplementary Table S1 revealed that tree and weed pollens peak in Spring and Summer, respectively. This suggests that peak tree pollen concentrations are associated with rising Spring temperatures (likely associated with the blooming season) and falling Winter temperatures. Similarly, peak weed pollen concentrations are associated with rising Spring and Summer temperatures.

Peak values of tree pollen concentrations were also associated with lagged values of precipitation (negative at lag 0–1 and positive at lag 0–4, 0–12, 0–24). This suggests that Winter rains are associated with increased tree pollen concentrations a few weeks later in Spring, but with decreased tree pollen concentrations in the week immediately after. For details about individual tree, weed, grass, and mold species refer to Fig. 2 and Supplementary Tables S3-S5.

Mold concentrations were also observed to be significantly associated with lagged values of precipitation (lag 0–1, 0–4, 0–12, 0–24), suggesting that increased rainfall leads to an increase in mold spores up to six months later. In the case of grasses, whose pollen concentrations peak in late Spring and early Summer, the association was found to be strongest with lagged values of temperature and precipitation from up to 6 months in the past (lag 0–24). This suggests that an increase in Winter rain or a decrease in Winter temperature is associated with higher grass pollen concentrations in the next season. Grass pollens were also observed to be negatively associated with lagged values of temperature from the previous year (lag 53–104) and positively associated with temperature of the same week. This suggests that increases in Summer temperature are associated with higher grass pollen concentrations in the same week.

In this dataset, strong associations between atmospheric CO2 and pollen and mold counts were not observed. Previous studies also found it difficult to separate the influence of rising CO2 from temperature change on growth or floral phenology of plants28. In our study, wildfire smoke exposure was also not found to be associated with pollen or mold concentrations for any of the major allergens.


In this retrospective analysis of pollen and mold concentrations in the San Francisco Bay Area during the past two decades, we observed that whereas average concentrations for most species are decreasing over time, the season length and number of active weeks are increasing. Further, these observations are statistically significant for the most commonly observed species and are also correlated with observed maximum temperature and precipitation in the region. While previous studies on this subject have looked at a limited number of species, our analysis covers more than twenty species observed in the studied region over a long time period.

Some of our findings are consistent with the observations made in other studies. These include lengthening pollen seasons and their association with observed climate variables. However, we found that the average annual concentrations of most species in our study region have been decreasing over the years. Prior studies of annual trends in pollen concentrations show mixed results, with increases in some areas and decreases in others. Notably, a study of pollen counts in different areas of the United States29 observed that the annual concentrations were increasing significantly in northern latitudes, but not in southern latitudes. In our study, we observed increasing periods of activity for several species even as we observed a decrease in their average annual concentrations, suggesting that pollen and mold activities are increasing outside their peak seasons. Rapid urbanization and land-use change could be a possible reason for the decreasing trend of pollen concentration in the area under study. Moreover, changes in climate variables like temperature could be due to either local changes in land use (e.g., urbanization) or global climate patterns. Both climate change and land-use change could bring about changes in the species of trees and plants in a region due to species migration or changes in architecture and landscaping preferences28.

While indoor molds are known to be present throughout the year, our study concerns outdoor molds, whose season peaks in Summer and Fall30 and which are known to cause allergic reactions. In the region of our study, mold species were the most commonly observed, and both the season length and the number of active weeks for the most frequent among them have increased in the past two decades.

A major difference between our study and previous studies is the wide range of pollen and mold species covered in our analysis. We also examine the changing trends in the number of active weeks of pollen and mold, in addition to their seasons. Seasons are the durations when pollen and mold concentrations reach their peaks. However, pollens and molds are active outside of those peak durations as well, and knowing how these activities are changing could help improve care for specific groups of people. Antihistamine and anti-inflammatory allergy medications can take up to 4 weeks to be fully effective31. Because individuals could be sensitive to even small amounts of pollens and molds, our study could help both patients and physicians prepare ahead of peak seasons.

The relationship between climate change and phenology in a variety of plant species has been an area of increasing interest22,32. Previous studies have shown an advancement in the onset of pollen seasons in plants33,34. The U.S. Environmental Protection Agency has acknowledged the role of changing climate on pollen season35. A recent study using more than 20 years of airborne pollen data from across 13 countries in the Northern hemisphere demonstrated the effect of changing temperature on pollen season and load28. The International Phenological Gardens, a European network, has reported that since the 1960s, growing seasons have increased by approximately 11 days20. Ziello et al. reported an increase in atmospheric pollen of multiple types between 1977 and 2009 across Europe36.

Previous studies have shown that temperature and water availability correlate with pollen production intensity37,38. Increases in temperature directly increase pollen production, both in the year prior to the pollen season and in the month preceding flowering. A study from Spain examining the pollen trends of olive trees found that increases in temperature were correlated with an earlier start and a later end to the pollen season each year between 1982 and 2011, demonstrating an increase in pollen production. Modeling suggested significant changes in the reproductive cycle of the olive tree due to climate change33. Several studies have demonstrated a relationship between higher temperatures and sun exposure in one year and higher daily pollen concentrations the following year39,40. The previous summer’s temperature influences the intensity of pollen production because pollen grains are formed the year prior and depend on the photosynthates from the summer to reproduce in the spring. Studies have also found that higher temperatures in the month leading up to flowering directly correlate with higher pollen concentrations41,42. Fungal spore concentrations increase with increased temperature43.

The relationship between rainfall, water availability, and the concentration of pollen has been variable. Soil moisture is needed for seed germination, but precipitation during flowering and pollen dispersal can wash out pollen and lower counts. Water deficits have been shown to delay olive flowering44,45. Drought conditions have been shown to decrease pollen in Switzerland and the Mediterranean46,47. In North America, tree pollen increases with increasing precipitation. However, Rasmussen found that precipitation from the previous year was negatively correlated with average birch pollen concentration, although this was postulated to be due to a negative correlation between temperature and precipitation45. Increased water and soil moisture stimulate fungal growth and spore dissemination.

In our study, we also found significant associations of temperature and precipitation with the pollen activities of multiple pollen species, consistent with prior work in this area. Additionally, this work sheds light on the role of changing climate with regard to individual species, as well as its short- and long-term influences (a few weeks to a year). Since different geographical areas have different prevalence of plant species that contribute to pollen activity, this analysis helps clarify the unique characteristics of the San Francisco Bay Area’s pollen seasons and their changing nature.

CO2 is the source of carbon for photosynthesis. Ziello et al. suggested that rising CO2 concentrations may be responsible for pollen increases36. Increases in CO2 are also thought to contribute to mold growth. Zhang et al. used Bayesian modeling and found that annual mean CO2 concentrations were significantly related to birch pollen levels and projected rising pollen counts in the next century29. Growth chamber experiments in which trees, grasses and weeds are exposed to higher levels of CO2 show increases in pollen production48,49,50. Experiments have also shown that increasing CO2 increases mold spore production51. However, it has also been noted that the influence of rising carbon dioxide on pollen activity is hard to separate from that of temperature28. In our study, with regard to climate variables, we found, as expected, that both CO2 and maximum temperature show statistically significant increasing trends. For the pollen and mold types we studied in the San Francisco Bay Area, we found that the annual average concentrations show a decreasing trend over the years, with grass pollens and some frequently occurring molds and tree pollens showing statistically significant trends. For trees and several molds, the average number of active weeks shows a strong increasing trend over the years.

Future changes due to climate change are expected to further impact pollen production. Hamaoui-Laguel et al., using models to predict ragweed pollen concentrations in Europe, found an anticipated four-fold increase in airborne pollen levels by 205052, which has been predicted to increase rates of pollen sensitization53. Similar results have been found in Italy, with increasing tree pollen counts and an associated increase in patients sensitized to pollen54. A better understanding of the impact of climate change on pollen and mold spore production can guide predictive modeling to forecast pollen and mold production, improving public health measures to prevent asthma and allergy flares and to prepare resources to respond to events that cause spikes in pollen and mold levels.

A limitation of this study was that we used a single site of pollen and mold collection and analysis. As pollen and mold spore concentrations are influenced by changes in the local environment and changes in landscaping, additional sites of collection would further strengthen the reliability of the data and interpretations. Thus, the results of this study provide insight into only the local region of the San Francisco Bay Area. The decreasing annual average concentrations for pollens and molds could be due to several of these reasons, including the rapid urbanization and change in vegetation cover in the area of our study. However, the findings are consistent with other studies examining phenology and climate change and suggest broad implications and a global impact of climate change on allergen activity.

Given the observational nature of the study, multiple environmental factors may be contributing to the observed findings. Given the complicated nature of plant biology, other factors are difficult to account for, such as masting behavior (the production of many seeds by a plant). Furthermore, local atmospheric changes and soil composition may have affected pollen activity and thereby influenced our findings, but these variables could not be tested due to the lack of a suitable dataset. In addition to smoke exposure, particulate matter could also have influenced pollen activity and should be evaluated in future studies.

In future studies, we plan to examine the change in pollen concentrations and activities and their relationships with clinical outcomes. By combining datasets of electronic health records (EHRs), we could study how changing climate patterns and pollen activities affect patient visits as well as prescription of allergy medications. Additionally, datasets of land cover could be used to study the association between change in land-use with the changes in the activities and concentrations of pollens and molds. Although the most commonly observed species are specific to the study location, future studies could look into how the activities of some of the species observed in our study area have changed in other similar geographical locations across the world.

Extant research has largely focused on individual species or on a few taxa. We provide detailed analysis of pollen and mold activity for the twenty most frequent species in the area of our study, as well as for selected species of clinical significance. The long temporal span of this dataset (18 years) lends itself to studying the effect of changing climate on pollen and mold activity. As temperatures are increasing, the length of the pollen season for several species is significantly increasing. Similarly, there are strong associations between multiple pollen and mold species and climate variables, although the direction of these associations is not uniform across species.


Collection and counts of pollen and mold spores

We used a database spanning 18 years (2002–2019) of weekly pollen and mold spore concentrations for an area in the San Francisco Bay Area (Los Altos Hills, Santa Clara County, CA, USA) obtained from a National Allergy Bureau (NAB) certified pollen counting station. The location of the pollen collection site and neighboring areas in the San Francisco Bay Area is shown in Supplementary Fig. S5. Concentrations of outdoor pollens and mold spores were obtained with a Burkard Spore Trap (Burkard Collector) and were identified by species and also categorized as tree pollen, grass pollen, weed pollen or mold spore. The Burkard Collector is a volumetric air sampler and a standard device for monitoring airborne pollen and spores. This device draws in air at regular intervals and, as a result, any airborne particles with enough inertia are captured on a surface inside the device, e.g., a greased tape or a microscope slide. The capturing surface moves at a steady speed, allowing newer samples to be collected. The device also has a wind vane and the ability to rotate, keeping it oriented into the wind. The Burkard Collector can collect particles up to 3.7 µm and has been used in prior studies55.

Time series analysis (ARIMA)

ARIMA is a well-established method for time-series analysis and has been used to find associations between climate variables and health outcomes56. The pollen timeseries datasets have a seasonal component, as can be observed in the decomposed time series plots (Supplementary Figs. S6-S9; for more details on time-series decomposition, see Supplementary Appendix Section “Time Series Decomposition”). For this reason, we used SARIMA, the seasonal variant of ARIMA, which has the flexibility to control for the seasonality and autocorrelation in the timeseries. In ARIMA(p, d, q) models, the target variable is predicted using three components: (1) past values (lags) of the target variable (AR or autoregressive), (2) differencing of the timeseries, and (3) a moving average model (MA or moving average) on past forecast errors. The parameters for these three components together define the order of an ARIMA(p, d, q) model, where p, d, q correspond to the first, second, and third components, respectively. The seasonal ARIMA(p, d, q)(P, D, Q, m) model has an additional seasonal order, where the parameters P, D, Q similarly refer to the seasonal variants of the first, second, and third components, and m refers to the seasonal period of the timeseries (the number of observations per seasonal cycle). This model is written in short as ARIMA(p, d, q)(P, D, Q, m). All statistical analyses were performed in the Python programming environment (Python Software Foundation) and p values < 0.05 were considered statistically significant. In all ARIMA models, the Box-Ljung test was used to test the null hypothesis that the autocorrelations of the residuals equal zero, and the augmented Dickey–Fuller test was used to test whether the timeseries was stationary.
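For reference, the seasonal model described above can be written compactly with the backshift operator $B$, defined by $B y_t = y_{t-1}$. This is the standard textbook form of a multiplicative seasonal ARIMA model, offered as a notational aid rather than as the exact parameterization used in the analysis. The symbols $\phi$, $\Phi$, $\theta$, $\Theta$, and $\varepsilon_t$ do not appear in the original text and are introduced here for illustration only.

```latex
\phi_p(B)\,\Phi_P(B^{m})\,(1-B)^{d}\,(1-B^{m})^{D}\,y_t
  \;=\; \theta_q(B)\,\Theta_Q(B^{m})\,\varepsilon_t
```

Here $y_t$ is the weekly concentration, $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving-average polynomials of orders $p$ and $q$, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts of orders $P$ and $Q$, $d$ and $D$ are the orders of non-seasonal and seasonal differencing, $m$ is the seasonal period, and $\varepsilon_t$ is a white-noise error term.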

First, univariate ARIMA(p, d, q)(P, D, Q, m) models of different orders were fitted for the timeseries of pollen and spore concentrations of each major allergen (Trees, Weeds, Molds, Grasses) using the Box-Jenkins approach57. Additional information on species can be obtained from Tables S3/S4. The best performing ARIMA models for each allergen were chosen based on the Akaike Information Criterion (AIC), and they are presented in Table 4.
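Selecting among candidate orders by AIC, as described above, amounts to choosing the fitted model with the smallest value of AIC = 2k − 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood. A minimal sketch follows, assuming each candidate has already been fitted elsewhere. The candidate orders, the seasonal period of 52, and the log-likelihood values are made up purely for illustration and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(candidates):
    """Return the candidate dict with the lowest AIC."""
    return min(candidates, key=lambda c: aic(c["log_likelihood"], c["n_params"]))


# Hypothetical fitted candidates: (p, d, q), (P, D, Q, m), ln L, and k.
candidates = [
    {"order": (1, 0, 0), "seasonal_order": (1, 0, 0, 52),
     "log_likelihood": -510.0, "n_params": 3},
    {"order": (1, 0, 1), "seasonal_order": (1, 0, 0, 52),
     "log_likelihood": -505.0, "n_params": 4},
    {"order": (2, 0, 1), "seasonal_order": (1, 0, 1, 52),
     "log_likelihood": -504.5, "n_params": 6},
]

best = best_by_aic(candidates)
print(best["order"], best["seasonal_order"], aic(best["log_likelihood"], best["n_params"]))
```

In this toy example the third candidate has the highest likelihood, but the penalty on its extra parameters means the second candidate is preferred.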

Table 4 Summary of the univariate ARIMA(p, d, q)(P, D, Q, m) model fitting parameters on the timeseries datasets for major allergens (2002–2019). The best fitting models are chosen based on Akaike’s Information Criterion (AIC).

Next, the best-fitting ARIMA model was examined together with different climate variables. The statistical significance of the climate variables was then determined using these multivariate ARIMA models. Given prior findings in the literature that pollen activity can be influenced by climate factors from earlier seasons, climate variables at different lags (earlier periods) were included to check the associations of immediate, short-term, seasonal, and pre-seasonal climate variations with peak pollen and spore concentrations. The values of each climate variable were averaged for the following lagged durations: week 0–1 (immediate), week 0–4 (short-term), week 0–12 (seasonal), week 0–24 (pre-seasonal), week 0–52 (annual), and week 53–104 (previous year).
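The lagged averages listed above can be sketched as follows. This is an illustrative reading, not the study's code: it assumes that week 0 is the week of the pollen observation and that each window includes both of its endpoints. The names `LAG_WINDOWS`, `lagged_mean`, and `lag_features` are hypothetical, and the example series is synthetic.

```python
# Lag windows in weeks before the observation week: (start, end), inclusive.
LAG_WINDOWS = {
    "immediate": (0, 1),
    "short_term": (0, 4),
    "seasonal": (0, 12),
    "pre_seasonal": (0, 24),
    "annual": (0, 52),
    "previous_year": (53, 104),
}


def lagged_mean(series, t, start, end):
    """Mean of series over weeks t-end .. t-start (inclusive).

    Returns None when the window reaches back before the first observation.
    """
    lo, hi = t - end, t - start
    if lo < 0:
        return None
    window = series[lo:hi + 1]
    return sum(window) / len(window)


def lag_features(series, t):
    """All lagged averages of one weekly climate series for week index t."""
    return {name: lagged_mean(series, t, s, e) for name, (s, e) in LAG_WINDOWS.items()}


# Synthetic weekly climate series whose value equals its week index.
weekly_values = [float(week) for week in range(200)]
features = lag_features(weekly_values, 110)
print(features)
```

Each pollen observation then carries six averaged values per climate variable, which can be entered as explanatory variables alongside the selected ARIMA model. Weeks too early in the record to fill a window yield `None` and would need to be excluded.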

Environmental data

Environmental data were collected from a variety of databases. These environmental variables and datasets have been used in prior studies on environmental health58,59. The daily maximum temperature TMAX (measured in degrees Fahrenheit) and precipitation data (measured in inches) were collected from the National Climatic Data Center of the National Oceanic and Atmospheric Administration (NCDC/NOAA). NCDC publishes historical climate observations for several monitoring sites across the United States. The San Jose monitoring site was selected because of its proximity to the site of the pollen and mold spore collection and because its coverage spanned the period (2002–2019) during which the pollen and mold spore data were collected. For atmospheric CO2 data, none of the monitoring sites in California had observations for the complete period (2002–2019); therefore, the CO2 dataset (measured in parts per million) from the Mauna Loa Observatory (MLO) of NOAA, located in Kona, Hawaii, was used. This dataset informs us about the changes in CO2 trends in the earth’s atmosphere. As a cross-check, we compared the MLO dataset with the CO2 observations during 2008–2017 from the Humboldt State University observatory in Northern California. We found a correlation coefficient of 0.85 for the 7-day moving averages in the two datasets. These datasets are overlaid in Supplementary Fig. S10 and the linear relationship between these two datasets is shown in Supplementary Fig. S11, revealing a highly linear trend. For data on wildfire smoke exposure, we used the Hazard Mapping System (HMS) dataset developed by the National Oceanic and Atmospheric Administration (NOAA) of the United States government. On a daily basis, trained analysts use visible satellite imagery, satellite-based automatic fire detections, and infrared images to annotate fire locations and the perimeters of smoke plumes60.
They also annotate smoke density (as low, medium, or high); this dataset is available from 2010 onwards. From the daily dataset of smoke plume perimeters, we first identified smoke plumes that intersected with Santa Clara County. For those intersecting smoke plumes, we calculated the total area of smoke plume that lay inside the county boundaries, resulting in a daily time series of the smoke plume area to which the county was exposed.
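The CO2 cross-check between two monitoring sites described above can be sketched as follows. This is an illustration only: the daily readings are synthetic, the moving average is taken as a trailing window (the text does not say whether the window is trailing or centered), and the function names are hypothetical.

```python
from math import sqrt

def moving_average(values, window=7):
    """Trailing moving average; the first window-1 days are dropped."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Synthetic daily CO2 readings (ppm) at two sites on the same dates.
site_a = [400.0 + 0.01 * d for d in range(60)]
site_b = [402.0 + 0.012 * d + (0.3 if d % 2 else -0.3) for d in range(60)]

# Correlate the smoothed series rather than the raw daily values.
r = pearson(moving_average(site_a), moving_average(site_b))
```

Smoothing before correlating reduces the influence of day-to-day measurement noise at either site, so the coefficient mainly reflects agreement in the underlying trend.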

Dataset preparation

The pollen and mold observations include weekly concentrations of several commonly observed species in the collection area, although some weeks contain more than one observation. Weeks with no pollen counts were treated as missing data. The pollen and mold observations were resampled to obtain a dataset with weekly concentrations. In addition, three datasets of pollen and mold spore concentrations were extracted from the raw observation files. The first dataset, called “Major allergens”, summarizes the concentrations for four major pollen and mold categories: trees, grasses, weeds, and molds. The second dataset (“Most-active species”) includes the concentrations for the twenty most active species. To identify the most-active species, the species were ordered by the total number of weeks in which each species had a concentration greater than zero during the complete observation period (2002–2019). Then, the top twenty species were selected from this list. The third dataset (“Selected species”) includes seven species that were picked based on their known importance in allergic outcomes61,62. Unlike pollen and mold concentrations, which have a weekly frequency, the climate variables have a daily frequency. Moving seven-day averages of the climate observations were created to help offset short-term measurement effects and outliers, which is a standard practice in time-series analysis63. For smoke plumes, we converted the daily time series to a weekly one by taking the 7-day maximum area of smoke exposure. Finally, the weekly time series from the climate datasets were overlaid with the three pollen and mold spore datasets based on their dates. This generated combined time-series datasets of pollen and mold concentrations with corresponding climate variables. The annual average values for major allergens and different climate variables are plotted in Supplementary Figs. S1 and S2.
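A minimal sketch of the daily-to-weekly alignment described above is given here. It assumes that each weekly window covers the seven days ending on the pollen observation date, which the text does not specify; all data values and the names `weekly_value` and `combine` are hypothetical.

```python
from datetime import date, timedelta

def weekly_value(daily, week_end, how="mean"):
    """Aggregate a {date: value} series over the 7 days ending on week_end."""
    days = [week_end - timedelta(days=k) for k in range(7)]
    window = [daily[d] for d in days if d in daily]
    if not window:
        return None  # no daily observations: keep the week as missing
    return max(window) if how == "max" else sum(window) / len(window)

def combine(pollen_weekly, tmax_daily, smoke_daily):
    """Overlay weekly pollen counts with weekly climate values by date."""
    rows = []
    for week_end in sorted(pollen_weekly):
        rows.append({
            "date": week_end,
            "pollen": pollen_weekly[week_end],
            # Climate variables: seven-day average.
            "tmax_7d_mean": weekly_value(tmax_daily, week_end, "mean"),
            # Smoke plumes: seven-day maximum area.
            "smoke_7d_max": weekly_value(smoke_daily, week_end, "max"),
        })
    return rows

# Hypothetical inputs covering two weeks.
start = date(2015, 1, 1)
tmax_daily = {start + timedelta(days=d): 60.0 + d for d in range(14)}
smoke_area = [0, 0, 5, 0, 0, 0, 0, 0, 12, 0, 0, 3, 0, 0]
smoke_daily = {start + timedelta(days=d): a for d, a in enumerate(smoke_area)}
pollen_weekly = {date(2015, 1, 7): 30.0, date(2015, 1, 14): 45.0}

rows = combine(pollen_weekly, tmax_daily, smoke_daily)
```

Using the maximum rather than the mean for smoke keeps a single day of heavy plume coverage visible in the weekly record.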

Pollen species

We found over 100 different species of pollen and mold spores in our dataset, which are listed in Supplementary Table S5. Since some species are observed more often than others, a list of the 20 most commonly observed species was created. To create this list, the species were ordered by the number of weeks in which each had a concentration greater than zero. The list of the 20 most commonly observed species is shown in Supplementary Table S4. Similarly, owing to the clinical importance of some species, a list of “selected species” was created containing Alternaria spp., Penicillium/Aspergillus, Quercus spp. (oak), Cupressaceae spp. (incl. junipers/cedars), Betulaceae spp. (birch), Artemisia spp. (sage), and Ambrosia spp./Franseria spp. (ragweed). Additionally, the counts for the four major pollen and mold categories were summarized as: trees, grasses, weeds, and molds.
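The ranking used to build the list of most commonly observed species can be sketched as below. The weekly counts are invented for illustration, missing weeks are represented as `None`, and breaking ties alphabetically is an assumption, since the text does not say how ties were handled.

```python
def most_active_species(weekly_counts, top_n=20):
    """Rank species by the number of weeks with a concentration greater than zero."""
    active_weeks = {
        species: sum(1 for c in counts if c is not None and c > 0)
        for species, counts in weekly_counts.items()
    }
    # Most active first; ties broken alphabetically (an assumption).
    ranked = sorted(active_weeks, key=lambda s: (-active_weeks[s], s))
    return ranked[:top_n]

# Hypothetical weekly concentrations for three taxa over four weeks.
counts = {
    "Quercus": [0, 3, 5, 2],
    "Alternaria": [1, 1, 1, 1],
    "Ambrosia": [0, 0, None, 4],
}
top_two = most_active_species(counts, top_n=2)
```

Note that this ranking depends only on how often a species is detected, not on how high its concentrations are.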

Concentrations and activity for pollens and mold spores

Pollen or mold spore concentrations refer to the observations recorded by the counting device during a given time period. When the concentrations are averaged annually, we call them annual average concentrations (AAC), and when they are averaged weekly, we call them weekly average concentrations (WAC). Seasonal and annual pollen or spore integrals refer to sums of WAC values over a season (SPIn) and a calendar year (APIn), respectively.

Further, we differentiate weeks with pollen and mold spore concentrations greater than zero from those in which the concentrations are zero. We refer to weeks in which the pollen and spore concentrations are greater than zero as active weeks and to the total number of active weeks in a calendar year as the Number of Active Weeks (NAW). In other words, active weeks correspond to duration, whereas integrals correspond to the quantitative extent of that activity. Finally, a pollen or mold season is the continuous period during a calendar year when observations are most concentrated. Each season has a starting week and an ending week, and the season length refers to the encompassing number of weeks. To calculate season length, we followed the procedure described in Ziska et al.28. To identify the start of the season, we take the first continuous 4-week period of the year in which the concentrations are greater than zero and use the last week of that 4-week period. To identify the end of the season, we take the last continuous 4-week period of the year in which the concentrations are greater than zero and use the last week of that 4-week period. For some species in some years, there was no continuous 4-week period with concentrations greater than zero. In those cases, we took a second approach and considered the fourth week in which the concentrations are greater than zero as the start of the season and the fourth-from-the-end week in which the concentrations are greater than zero as the end of the season. Even after this procedure, for some species in some years, we observed fewer than 4 weeks in which the concentrations are greater than zero. For those, we took a third approach and considered the first week in which the concentrations are greater than zero as the start of the season and the last such week as the end of the season. In Figs. 12 and Supplementary Fig. S4, only those species for which we could calculate the season using the first or second approach for at least 10 years are shown.
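The three-step season procedure can be written as a short sketch, under the assumption that the input is the list of active week numbers for one species in one calendar year. The function name `season_bounds` and the returned approach index are hypothetical conveniences, not part of the original analysis.

```python
def season_bounds(active_weeks):
    """Return (start_week, end_week, approach) for one species-year.

    active_weeks: week numbers with a concentration greater than zero.
    Returns None when there are no active weeks.
    """
    weeks = sorted(set(active_weeks))
    if not weeks:
        return None
    active = set(weeks)
    # Approach 1: weeks that close a run of 4 consecutive active weeks.
    # The first such week is the start, the last such week is the end.
    window_ends = [w for w in weeks
                   if all((w - k) in active for k in range(4))]
    if window_ends:
        return window_ends[0], window_ends[-1], 1
    # Approach 2: fourth active week and fourth-from-the-end active week.
    # With 4 to 6 active weeks this gives a start later than the end;
    # the text does not address that case, so it is left as-is here.
    if len(weeks) >= 4:
        return weeks[3], weeks[-4], 2
    # Approach 3: first and last active week.
    return weeks[0], weeks[-1], 3

# Active weeks taken from the worked example in the text.
example = [8, 15, 16, 17, 18, 20, 22, 25, 26, 27, 28, 35]
start, end, approach = season_bounds(example)
```

Returning the approach index alongside the bounds makes it straightforward to keep only species-years resolved by the first or second approach, as done for the figures.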

Consider the following example to illustrate the differences between these measures. If in a given year pollen concentrations are greater than zero in weeks 8, 15, 16, 17, 18, 20, 22, 25, 26, 27, 28, and 35, the value of NAW is 12. The value of APIn is the sum of the pollen concentrations over all 12 of these weeks, and the AAC is the average of these values. The season starts in week 18, ends in week 28, and the season length is 10 weeks. The value of SPIn is the sum of pollen concentrations from week 18 to week 28.
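The same example can be traced numerically. In the sketch below the active weeks and season bounds come from the example above, while the concentration values are invented purely for illustration; the function name `annual_metrics` is hypothetical, and AAC is computed over the active weeks only, following the wording of the example.

```python
def annual_metrics(weekly_conc, season_start, season_end):
    """Compute NAW, APIn, AAC, SPIn and season length for one species-year.

    weekly_conc: {week_number: concentration}; the season bounds are given.
    """
    active = {w: c for w, c in weekly_conc.items() if c > 0}
    naw = len(active)
    apin = sum(active.values())
    # AAC averaged over active weeks, as in the worked example (an
    # interpretation; an average over all observed weeks would differ).
    aac = apin / naw if naw else 0.0
    spin = sum(c for w, c in active.items()
               if season_start <= w <= season_end)
    return {
        "NAW": naw,
        "APIn": apin,
        "AAC": aac,
        "SPIn": spin,
        "season_length": season_end - season_start,
    }

# Hypothetical concentrations for the active weeks listed in the example.
conc = {8: 2, 15: 10, 16: 20, 17: 30, 18: 40, 20: 50,
        22: 60, 25: 40, 26: 30, 27: 20, 28: 10, 35: 3}
metrics = annual_metrics(conc, season_start=18, season_end=28)
```

With these invented values, the isolated detections in weeks 8 and 35 and the early weeks 15 to 17 contribute to APIn but not to SPIn, which is the distinction the example is meant to convey.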